Python requests: Timeouts, Retries, and Backoff Done Right
The first time you call an API in Python, the code looks harmless. You import requests, fetch a URL, and read the response. It works on your laptop, against a fast endpoint, on a good connection. Then it ships, the network has a bad afternoon, and that same line quietly hangs or crashes a real user's request.
The fix is not a bigger try/except. It is three habits that production code almost always has and tutorials almost always skip. A timeout on every call. Automatic retries for the failures worth retrying. And backoff, so those retries do not turn into an accidental denial-of-service attack on the very server you are trying to reach.
This guide walks through all three. You will see why the defaults are dangerous, how to set timeouts that actually protect you, and how to wire up retries with exponential backoff using the tools that come with requests, ending with a reusable session you can lift straight into a project. It assumes Python 3.10 or later and a working pip install requests, which also installs urllib3 as a dependency. urllib3 is the package that does the real retry work under the hood.
The request that hangs forever
Here is the call almost every tutorial starts with. It fetches some JSON and prints it.
import requests
response = requests.get("https://api.example.com/data")
print(response.json())
The problem is invisible until it bites. By default, requests has no timeout. If the server accepts the connection and then never sends a response, your program waits. Not for thirty seconds, not for a minute. Forever. The thread sits there blocked, and if this is running inside a web app, that is one of your workers gone until someone restarts the process.
The default is to wait forever
A server that accepts the connection and then goes quiet does not raise an error in requests. It simply makes your code wait, with no timeout and no upper bound. The signature is a request that never returns and never errors, which is the failure mode hardest to spot in testing and most painful in production.
Timeouts done right
The fix is one keyword argument. Pass timeout to every request you make.
import requests
response = requests.get("https://api.example.com/data", timeout=10)
print(response.json())
A single number is not one budget for the whole call. requests applies the same value to two separate clocks. One measures how long Python waits to establish the connection. The other measures how long it waits for the server to send data once the connection is open, including the very first byte of the response. With timeout=10, each clock gets its own ten seconds. You can set them separately by passing a tuple.
# (connect timeout, read timeout)
response = requests.get(
    "https://api.example.com/data",
    timeout=(3.05, 27),
)
The first value is the connect timeout. Give up if the server has not even accepted the connection within about three seconds, because a server that cannot say hello quickly is usually down. The second is the read timeout. Allow longer for the response itself, since a real query might genuinely take time to compute. The oddly specific 3.05 is a known trick. Connect timeouts that sit just above a multiple of three play more nicely with the way TCP resends lost packets.
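To see which of the two clocks expired, you can catch the two timeout exceptions separately. The sketch below is one possible shape, not a prescribed pattern. The function name fetch_json, the returned dictionary layout, and the injectable get parameter are all assumptions made for illustration. The get parameter exists only so the function can be exercised without a network.

```python
import requests


def fetch_json(url, timeout=(3.05, 27), get=requests.get):
    """Fetch JSON and report which timeout clock expired, if any.

    `get` is a hypothetical injection point so the function can be
    tested with a stand-in instead of a live server.
    """
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.ConnectTimeout:
        # The connection was not established within the connect timeout.
        return {"ok": False, "reason": "connect timeout"}
    except requests.exceptions.ReadTimeout:
        # Connected, but the server stayed silent past the read timeout.
        return {"ok": False, "reason": "read timeout"}
    response.raise_for_status()
    return {"ok": True, "data": response.json()}
```

Both exceptions inherit from requests.exceptions.Timeout, so code that does not care about the difference can catch that single class instead.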
The one rule that always holds is this. Set a timeout on every single request, not most of them. A short connect timeout with a more generous read timeout is a sensible starting point, and you can raise the read value for endpoints you know are slow.
One subtlety worth knowing. The read timeout is not a total deadline for the whole download. It resets every time new data arrives, so it measures silence, not total elapsed time. For most APIs returning a single JSON body this never matters, but a server that dribbles out one byte at a time can technically keep a connection alive past it.
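If you do need a hard ceiling on the whole download, one option is to stream the body and check a wall clock between chunks. The following is a minimal sketch under that assumption. The names read_with_deadline and DeadlineExceeded are hypothetical, and the helper only checks the clock when a chunk arrives, so the real upper bound is roughly the deadline plus one read timeout.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when the whole download takes longer than the allowed budget."""


def read_with_deadline(chunks, max_seconds, clock=time.monotonic):
    """Join an iterable of byte chunks, aborting once max_seconds have elapsed.

    `clock` is injectable so the behaviour can be checked without real waiting.
    """
    deadline = clock() + max_seconds
    body = bytearray()
    for chunk in chunks:
        # Checked once per chunk: this measures total elapsed time,
        # unlike the read timeout, which only measures silence.
        if clock() > deadline:
            raise DeadlineExceeded(f"download exceeded {max_seconds} seconds")
        body.extend(chunk)
    return bytes(body)


# Intended use with requests (not executed here):
#   with requests.get(url, stream=True, timeout=(3.05, 27)) as response:
#       body = read_with_deadline(response.iter_content(chunk_size=8192), 60)
```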
Why one attempt is not enough
A timeout stops you waiting forever, but it turns a slow call into a failed one. That is an improvement, yet many of those failures are not real. They are transient. A connection reset, a momentary DNS hiccup, a load balancer returning 503 Service Unavailable for a second while it shuffles traffic. Run the exact same request a moment later and it succeeds.
This is the case for retries. Not retrying everything blindly, but recognising that a whole category of network failures clears up on its own if you simply ask again after a short pause.
- Connection errors. The socket dropped or the server refused the connection. Often gone on the next try.
- Read timeouts. One slow response does not mean the next will be slow.
- Server errors in the 5xx range. A 502, 503, or 504 usually signals a temporary problem on their side, not yours.
- Rate limiting. A 429 Too Many Requests is the server explicitly telling you to wait and try again.
The distinction that matters is timing versus request. Retry the failures that are about timing. Do not retry the failures that are about your request, because a 404 Not Found or a 401 Unauthorized will return the same answer every time, and retrying it just wastes time and hammers the server.
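The timing-versus-request rule can be written down as a small predicate. This is only an illustration of the idea. The name is_retryable_status and the constant are hypothetical, and the status set is an assumption that matches the one this guide later passes to status_forcelist.

```python
# Statuses that usually describe a temporary condition on the server side.
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})


def is_retryable_status(status_code):
    """Return True when asking again later has a realistic chance of succeeding."""
    return status_code in RETRYABLE_STATUSES


print(is_retryable_status(503))  # True: temporary, about timing
print(is_retryable_status(404))  # False: about the request itself
```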
The fragile hand-rolled retry
The obvious first instinct is to write the loop yourself. It looks reasonable.
import time
import requests

for attempt in range(5):
    try:
        response = requests.get("https://api.example.com/data", timeout=10)
        response.raise_for_status()
        break
    except requests.exceptions.RequestException:
        time.sleep(2)
else:
    raise RuntimeError("All attempts failed")
It works, but it is doing a lot of things slightly wrong. It retries every error, including the 404s and 401s that will never recover. It waits a flat two seconds each time instead of backing off. It has no jitter, so ten machines running this code retry in lockstep. And this is the simple version. Add per-status rules, a growing delay, and a cap, and you have quietly reimplemented a library that is already installed alongside requests, only with more bugs. The retry logic in urllib3 is battle-tested and plugs in with a few lines, which turns maintaining retry code into simply configuring it.
The right way: HTTPAdapter and Retry
The proper tool is built from three pieces you mount together once. A Retry object describes the policy, an HTTPAdapter applies that policy to outgoing requests, and a Session carries the adapter so every call through it inherits the behaviour.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
retry = Retry(
    total=5,
    backoff_factor=0.5,
    status_forcelist=[429, 500, 502, 503, 504],
    allowed_methods=["GET", "HEAD", "OPTIONS"],
    raise_on_status=False,
)
adapter = HTTPAdapter(max_retries=retry)
session = requests.Session()
session.mount("https://", adapter)
session.mount("http://", adapter)
response = session.get("https://api.example.com/data", timeout=(3.05, 27))
response.raise_for_status()
print(response.json())
Every call you make through session now retries automatically, with no loop in your own code. We mount the adapter for both https:// and http:// so the policy applies whichever scheme a URL uses. Notice that timeout still lives on the individual request. The retry policy controls how many attempts happen and how they are spaced. The timeout controls how long each individual attempt is allowed to take. They are separate jobs, and you need both.
Create the session once and reuse it for many requests. Beyond carrying the retry policy, a Session reuses the underlying TCP connection across calls to the same host, which is faster than opening a fresh connection every time. Creating a new session per request throws that benefit away.
Backoff: spacing out the retries
The single most important setting above is backoff_factor. Without it, retries fire back to back, and if the server is already struggling, a wall of immediate retries is the last thing it needs. Backoff makes each retry wait a little longer than the one before.
With backoff_factor=0.5, the delay doubles each time. The first retry fires immediately, and the growing pause kicks in from the second retry onward. From that point urllib3 sleeps for backoff_factor * 2 ** (n - 1) seconds before retry number n. This is exponential backoff, and it gives a struggling server room to recover instead of piling on.
| Retry | Wait before it (backoff_factor = 0.5) |
|---|---|
| 1st retry | 0 seconds (immediate) |
| 2nd retry | 1 second |
| 3rd retry | 2 seconds |
| 4th retry | 4 seconds |
| 5th retry | 8 seconds |
A cap stops this growing without bound. The ceiling on any single wait is 120 seconds by default, and urllib3 2.0 and later let you change it with the backoff_max setting, so even a long retry sequence never sleeps for an unreasonable stretch.
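To check a schedule before deploying it, you can compute the waits by hand. The helper below is a sketch that mirrors the formula in the urllib3 documentation as I understand it: no wait before the first retry, then backoff_factor * 2 ** (n - 1) before retry n, capped at the maximum. The function name backoff_schedule is hypothetical, and it is worth confirming the numbers against the urllib3 version you actually have installed.

```python
def backoff_schedule(retries, backoff_factor, backoff_max=120.0):
    """Return the wait, in seconds, before each retry (1st, 2nd, ...)."""
    delays = []
    for n in range(1, retries + 1):
        if n == 1:
            delay = 0.0  # the first retry is not delayed
        else:
            delay = backoff_factor * (2 ** (n - 1))
        delays.append(min(backoff_max, delay))
    return delays


print(backoff_schedule(5, 0.5))       # [0.0, 1.0, 2.0, 4.0, 8.0]
print(sum(backoff_schedule(5, 0.5)))  # 15.0 seconds of sleeping in the worst case
```

The sum matters as much as the individual waits. Add it to the per-attempt timeouts and you have the longest a caller can be kept waiting.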
There is one more failure mode worth designing for. If dozens of machines run the same code and all fail at the same instant, pure exponential backoff makes them all retry at the same instant too. That is the thundering herd. urllib3 2.0 and later can scatter the retries by adding randomness with the backoff_jitter setting, so the load spreads out instead of arriving in synchronised waves.
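The effect of jitter is easy to see in isolation. This sketch adds a random amount between zero and jitter to a base delay, which is my understanding of what urllib3 does when backoff_jitter is non-zero. The function name jittered_delay and the injectable rng parameter are hypothetical and exist only for illustration.

```python
import random


def jittered_delay(base_delay, jitter, backoff_max=120.0, rng=random.random):
    """Add up to `jitter` seconds of randomness to a delay, then apply the cap."""
    delay = base_delay + rng() * jitter
    return max(0.0, min(backoff_max, delay))


# Five machines that all computed the same 2 second backoff now
# retry at slightly different moments instead of in lockstep.
print([round(jittered_delay(2.0, 1.0), 2) for _ in range(5)])

# With urllib3 2.0 or later the equivalent configuration is a single
# argument, for example Retry(total=5, backoff_factor=0.5, backoff_jitter=1.0).
```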
What to retry, and what not to
Two settings decide which failures get a second chance. Getting them right is what separates a smart retry policy from one that wastes time on hopeless requests.
status_forcelist picks the status codes worth retrying
By default, Retry only retries connection-level failures, not bad HTTP responses. Listing [429, 500, 502, 503, 504] tells it to also retry those specific statuses, which are the ones that tend to clear up on their own. A 404 or 401 stays out of the list, because retrying it is pointless.
allowed_methods controls which HTTP verbs may retry
This is about idempotency. A GET can be repeated safely, because asking for the same data twice changes nothing. A POST often cannot, because it might create a second order or charge a card again. By default urllib3 retries idempotent methods such as GET and HEAD but deliberately leaves POST out.
The POST trap
Do not add POST to allowed_methods unless you are certain the operation is safe to repeat. A retried POST that the server actually received the first time, but whose response got lost on the way back, can double-submit. The signature is a duplicate record or a double charge with no error in your logs. If an endpoint supports an idempotency key, use it, and only then consider retrying writes.
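Where an API does support idempotency keys, the usual shape is a unique value sent in a request header and reused for every retry of the same logical operation. The sketch below assumes a header called Idempotency-Key, which is a common convention but not universal, so check your API's documentation for the real name. The helper idempotent_headers is hypothetical.

```python
import uuid


def idempotent_headers(existing=None, key=None):
    """Return a copy of `existing` headers with an idempotency key added.

    Generate the key once per logical operation and pass it back in as
    `key` on every retry, so the server can recognise the duplicate.
    """
    headers = dict(existing or {})
    # Header name is an assumption; some APIs use a different one.
    headers["Idempotency-Key"] = key or str(uuid.uuid4())
    return headers


order_key = str(uuid.uuid4())
headers = idempotent_headers({"Accept": "application/json"}, key=order_key)
print(headers["Idempotency-Key"] == order_key)  # True

# Intended use (not executed here):
#   session.post(url, json=payload, headers=headers, timeout=(3.05, 27))
```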
One status code deserves special mention. When a server returns 429 or 503 with a Retry-After header, it is telling you exactly how long to wait. urllib3 honours that header by default, waiting the requested time instead of using its own backoff. That cooperation is often the difference between staying within a rate limit and getting blocked.
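You rarely need to parse Retry-After yourself, because urllib3 does it for you, but seeing the two formats helps when reading logs or writing a custom client. The header carries either a number of seconds or an HTTP date. This is a standard-library sketch, and the function name parse_retry_after is hypothetical.

```python
import email.utils
import time


def parse_retry_after(value, now=None):
    """Return the seconds to wait for a Retry-After value, or None if unparseable."""
    value = value.strip()
    if value.isdigit():
        # Form 1: delay in seconds, e.g. "120"
        return float(value)
    try:
        # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if now is None:
        now = time.time()
    # A date in the past means there is no need to wait.
    return max(0.0, when.timestamp() - now)


print(parse_retry_after("120"))           # 120.0
print(parse_retry_after("not a header"))  # None
```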
A reusable session factory
Rather than repeat the setup everywhere, we wrap it in one function. Drop this into a small module and import a configured session whenever you need one.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(
    retries=5,
    backoff_factor=0.5,
    status_forcelist=(429, 500, 502, 503, 504),
):
    """Return a requests Session with sensible retry and backoff defaults."""
    session = requests.Session()
    retry = Retry(
        total=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
        allowed_methods=frozenset(["GET", "HEAD", "OPTIONS"]),
        raise_on_status=False,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
Using it stays readable, and the only thing each call site has to remember is the timeout.
from http_client import make_session
session = make_session()
response = session.get("https://api.example.com/data", timeout=(3.05, 27))
response.raise_for_status()
data = response.json()
The function takes its retry count, backoff factor, and status list as arguments, so a noisy third-party API can get a more patient session while an internal service gets a stricter one. The shared defaults keep most call sites simple.
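If remembering the timeout at every call site feels fragile, one common pattern is an adapter subclass that fills in a default whenever the caller leaves it out. This is a sketch of that pattern rather than anything requests provides itself. The class name TimeoutHTTPAdapter and the DEFAULT_TIMEOUT constant are hypothetical.

```python
import requests
from requests.adapters import HTTPAdapter

DEFAULT_TIMEOUT = (3.05, 27)  # (connect, read)


class TimeoutHTTPAdapter(HTTPAdapter):
    """HTTPAdapter that supplies a default timeout when the call omits one."""

    def __init__(self, *args, timeout=DEFAULT_TIMEOUT, **kwargs):
        self.timeout = timeout
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # The session passes timeout=None when the call site did not set one.
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)


session = requests.Session()
adapter = TimeoutHTTPAdapter(timeout=(3.05, 10))
session.mount("https://", adapter)
session.mount("http://", adapter)
```

Because the subclass forwards its other arguments, it still accepts max_retries, so the same adapter can carry both the retry policy and the default timeout. The trade-off is that an explicit timeout=None no longer means wait forever, which is usually what you want anyway.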
Knowing when to stop
Retries are not infinite, and they should not be. After total retries, which means total + 1 attempts counting the original request, the policy gives up. What happens then depends on raise_on_status. With it set to False, as above, the final response comes back to your code even if it is a 503, and you decide what to do with a bad status using raise_for_status(). With it left at its default of True, an exhausted retry on a forced status raises an exception instead, which requests surfaces as requests.exceptions.RetryError.
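A call site that handles all of these endings might look like the sketch below. It is one possible arrangement, not the only one. The function name fetch and the (data, error) return shape are assumptions for illustration, and the session argument is expected to be something like the configured session built earlier.

```python
import requests


def fetch(session, url, timeout=(3.05, 27)):
    """Return (data, None) on success or (None, reason) once retries are spent."""
    try:
        response = session.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.exceptions.RetryError as exc:
        # raise_on_status=True and a forced status kept coming back.
        return None, f"retries exhausted: {exc}"
    except requests.exceptions.HTTPError as exc:
        # raise_on_status=False: the last bad response reached raise_for_status().
        return None, f"bad status: {exc.response.status_code}"
    except requests.exceptions.RequestException as exc:
        # Connection errors and timeouts that survived every retry.
        return None, f"network failure: {type(exc).__name__}"
    return response.json(), None
```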
Either way, the important habit is to treat a failed-after-retries call as a real outcome, not a surprise. The request layer has done its job. It tried, it backed off, it waited on Retry-After, and the failure that survived all of that is one worth handling deliberately.
Retries buy reliability, not certainty
A good retry policy converts most transient failures into eventual successes. It does not promise success. Your code still needs a plan for the call that fails every attempt, which is where deliberate error handling takes over.
That handoff is its own topic. Once a call has genuinely failed, the question becomes how to categorise the failure, read what the server actually said, and fail gracefully instead of crashing. Our companion guide, Handling API Errors in Python, picks up exactly there. And when you want to prove your retry and timeout logic behaves under failure without hitting a live server, the testing guide shows how to mock those conditions in a test suite.
When a dedicated library helps
The adapter approach is the right default for HTTP, because the retry happens at the transport layer and applies to every call through the session. But sometimes you want to retry something that is not a single HTTP request. A small block of code, a database write, a multi-step operation. For that, a decorator-based library is cleaner.
- tenacity wraps any function in configurable retry logic with its own backoff and stop conditions. Useful when the thing you are retrying is wider than one request.
- backoff offers a similar decorator with exponential backoff built in.
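- httpx is an alternative HTTP client that supports async requests if you need concurrency. Its transports accept a retries setting, but that covers only failed connection attempts, not status codes, so it is narrower than the Retry policy shown here.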
For ordinary API calls, the built-in HTTPAdapter and Retry are enough, and they keep your dependency list short. Reach for a dedicated retry library only when you need to wrap logic that is broader than a single HTTP call.
Frequently asked questions
Does Python requests retry failed requests automatically?
No. By default, requests makes a single attempt and does not retry anything, and it has no timeout either. Automatic retries come from mounting an HTTPAdapter configured with a urllib3 Retry object onto a Session. Once that is in place, every call through the session retries according to your policy without any loop in your own code.
What is a sensible default timeout for requests?
Set a short connect timeout and a longer read timeout, passed as a tuple such as timeout=(3.05, 27). A server that cannot accept the connection within a few seconds is usually down, so failing fast there is correct. The read value can be more generous, since a real query may take time to compute. The one rule that always holds is to set a timeout on every request, because the default is to wait forever.
Should I retry POST requests?
Usually not. A GET is safe to repeat because it changes nothing, but a POST may create or modify data, so a blind retry can double-submit an order or charge a card twice. By default urllib3 leaves POST out of the retryable methods. Only retry a write if the operation is genuinely safe to repeat or the endpoint supports an idempotency key.
Mastering APIs with Python
Timeouts, retries, and backoff are the baseline that separates code that works on your machine from code that survives a real network. In the full book, this reliability mindset runs through every project: real API clients built against live services, then made robust enough to trust in production. 30 chapters, six portfolio projects covering Flask, OAuth, SQLite, Postgres, Docker, CI/CD, and AWS.