7. Chapter review

You started this chapter with twenty-two chapters of synchronous requests work and the load-bearing claim that sync is still the right default. By the end of Section 1 you had the cost-profile framing for when async earns its complexity. By the end of Section 6 you had a working keystone that proved the cost ratio concretely: three concurrent fetches in roughly the wall-clock of the slowest single one, with partial-failure handling that keeps the aggregator alive when a source dies. This page rehearses the patterns, lays out the architectural-call quiz, and points at where the chapter sits in the larger production-systems arc.

What you have shipped

The Async News Aggregator from Section 6 is the concrete artifact: three concurrent fetches against NewsAPI, the Guardian, and Hacker News, with structured per-source response shapes, per-request timeouts, and a reducer that treats failures as empty contributions rather than fatal aborts. The same shape extends to any outbound aggregation -- swap the three news APIs for fifty product-detail lookups or twenty weather stations and the code beats stay the same. At fifty sources you would add the semaphore from Section 4 to bound the parallelism against a shared rate limit; at three independent providers the keystone leaves it off deliberately. The per-request timeout that protects the keystone from one slow source defining the wall-clock is the always-on discipline; the semaphore is the one you reach for when the request count crosses ten or so against a single provider.
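The shape described above can be sketched without touching the network. This is not the Section 6 code: `fake_fetch`, its `delay` and `fail` parameters, and every dict key other than `success` are hypothetical stand-ins, and `asyncio.sleep` takes the place of the HTTP round trip.

```python
import asyncio
import time


async def fake_fetch(source: str, delay: float, fail: bool = False) -> dict:
    """Hypothetical stand-in for one source's fetcher; it never raises."""
    try:
        await asyncio.sleep(delay)  # stands in for the HTTP round trip
        if fail:
            raise ConnectionError(f"{source} is down")
        return {"source": source, "success": True, "articles": [f"{source} headline"]}
    except Exception as exc:
        # A failure becomes a structured response, not an exception.
        return {"source": source, "success": False, "articles": [], "error": str(exc)}


def reduce_articles(responses: list) -> list:
    """Failed sources contribute an empty list instead of aborting the run."""
    return [article for r in responses for article in r["articles"]]


async def aggregate() -> tuple:
    start = time.perf_counter()
    responses = await asyncio.gather(
        fake_fetch("newsapi", 0.10),
        fake_fetch("guardian", 0.20, fail=True),
        fake_fetch("hackernews", 0.15),
    )
    return reduce_articles(responses), time.perf_counter() - start
```

Calling `asyncio.run(aggregate())` returns the two surviving headlines plus an elapsed time close to the slowest delay rather than the sum of all three.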

The chapter also taught the negative cases hard. A sync I/O call sitting inside async def blocks the event loop and silently collapses the concurrency (Section 3's blocking_trap.py demo); a forgotten await returns a coroutine object that never runs (Section 2's missing_await.py); CPU-bound work does not benefit from an event loop and wants processes instead, or threads where the work releases the GIL (Section 1). Async is not "the modern way." It is one tool for one shape of problem.
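The forgotten-await case is easy to reproduce in a few lines. The sketch below is an illustration written for this review, not the chapter's missing_await.py; `fetch_value` is a hypothetical coroutine.

```python
import asyncio
import inspect


async def fetch_value() -> int:
    await asyncio.sleep(0)
    return 42


async def main() -> tuple:
    forgotten = fetch_value()        # no await: a coroutine object, nothing has run yet
    is_coroutine = inspect.iscoroutine(forgotten)
    forgotten.close()                # tidy up the coroutine we are not going to run
    value = await fetch_value()      # with await: the body runs and returns its result
    return is_coroutine, value
```

The first call hands back an object, not a number, and nothing raises to flag it; only the awaited call produces the value.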

Sync → async at a glance

The mechanical swaps that turn the most common sync patterns into async ones:

Synchronous	Asynchronous
`def function():`	`async def function():`
`import requests`	`import httpx`
`requests.get(url)`	`await client.get(url)`
`time.sleep(n)`	`await asyncio.sleep(n)`
`with open(...) as f:`	`async with aiofiles.open(...) as f:`
`[func(x) for x in items]`	`async with TaskGroup() as g: ...`
`function()`	`asyncio.run(function())`

Chapter review quiz

Seven architectural-call questions. None of them are recall; each pressure-tests a tradeoff the chapter argued.

1. When would you reach for async over sync requests?

The default is sync. Async earns its complexity when the request count is greater than one, the calls are independent, and the wall-clock cost of waiting on them in series is the bottleneck. The canonical case is outbound aggregation across many endpoints (a dashboard from three or four sources, a fan-out across fifty). The negative cases stay clear: single-request scripts, sequential workflows where each step needs the previous answer, CPU-bound work, and request handlers serving one request at a time. "Always async" is a slogan, not an architectural answer.

2. You call requests.get() inside an async def and run five "concurrent" fetches via asyncio.gather. Why does the wall-clock look sequential?

Because requests.get() is synchronous I/O. When the first coroutine hits it, the event loop has nowhere to run -- the call is blocking the thread that owns the loop. The other coroutines are scheduled but cannot make progress until the blocking call returns. The fix is to use an async-aware client (httpx.AsyncClient) so the call yields control during the wait. The blocking_trap.py demo in Section 3 shows the 5x slowdown: five sequential waits dressed up as concurrent ones, with no exception to flag it.
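The same effect can be reproduced with no HTTP at all. In this sketch (not the chapter's blocking_trap.py), `time.sleep` stands in for the blocking `requests.get()` and `asyncio.sleep` for an awaited client call; the function names are hypothetical.

```python
import asyncio
import time


async def blocking_wait(delay: float) -> None:
    time.sleep(delay)            # sync call: holds the event loop's thread


async def yielding_wait(delay: float) -> None:
    await asyncio.sleep(delay)   # async call: yields to the loop while waiting


async def timed(worker, count: int = 5, delay: float = 0.05) -> float:
    """Run `count` workers through gather and return the wall-clock seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(count)))
    return time.perf_counter() - start
```

`asyncio.run(timed(blocking_wait))` takes about five times the single delay, while `asyncio.run(timed(yielding_wait))` takes about one; neither raises.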

3. When is asyncio.TaskGroup the right primitive, and when do you reach for asyncio.gather instead?

TaskGroup is the structured-concurrency primitive on Python 3.11+: tasks are owned by the async with block, exceptions surface as an ExceptionGroup, and a failure cancels the surviving siblings automatically. Reach for it whenever fail-fast with structured cancellation is the shape you want and the runtime supports it. asyncio.gather survives for two reasons: it works on Python 3.10 and older, and its return_exceptions=True mode collects results and exceptions into one list. Default gather has a gotcha worth naming explicitly: when one task raises, the exception propagates to the caller but the surviving tasks are not cancelled -- they keep running, and any further exceptions are silently discarded. That is the failure shape TaskGroup was introduced to fix. The keystone uses plain gather over three coroutines that already catch their own exceptions and return structured-response dicts; failures arrive as success: False values and no exception ever leaves a fetcher, so the missing cancellation never comes into play. That is the partial-failure shape that fits when you control the inner coroutines; return_exceptions=True is the right call when you do not.
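The default-gather gotcha is small enough to demonstrate directly. The coroutines below are hypothetical and use `asyncio.sleep` as their only work; the `finished` list exists purely to record whether the sibling ran to completion.

```python
import asyncio

finished = []


async def fails_fast() -> None:
    await asyncio.sleep(0.01)
    raise ValueError("source died")


async def slow_sibling() -> None:
    await asyncio.sleep(0.05)
    finished.append("sibling")   # only reached if the task was not cancelled


async def main() -> str:
    caught = ""
    try:
        await asyncio.gather(fails_fast(), slow_sibling())
    except ValueError as exc:
        caught = str(exc)
    # gather has already raised, but the sibling task is still running.
    await asyncio.sleep(0.1)
    return caught
```

After `asyncio.run(main())`, the caller has seen the ValueError and `finished` still contains the sibling's entry: the task outlived the call that started it.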

4. What does return_exceptions=True on gather buy you, and how does that differ from a TaskGroup with try/except?

gather(..., return_exceptions=True) changes the shape of the result list: every position is either a value or an exception object. Every task runs to completion regardless of failures. A TaskGroup with try/except inside each coroutine looks similar from the outside, but the semantics are different: TaskGroup's contract is fail-fast at the block boundary -- if an exception escapes any task despite the inner try/except, the surviving siblings get cancelled, and the block re-raises as an ExceptionGroup. gather(return_exceptions=True) is the escape hatch when you want collect-everything semantics without rewriting your inner coroutines; TaskGroup-with-internal-try is the structured-concurrency answer when you control the inner functions. Default gather (without return_exceptions) is a third shape and the trap: it raises but does not cancel siblings, leaving you with background coroutines whose later failures vanish.
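A minimal sketch of the collect-everything shape, assuming hypothetical `ok` and `boom` coroutines in place of real fetchers. The `split` helper is also an invention for the example; it shows the isinstance check that the mixed result list forces on the caller.

```python
import asyncio


async def ok(value: int) -> int:
    await asyncio.sleep(0.01)
    return value


async def boom() -> int:
    await asyncio.sleep(0.01)
    raise RuntimeError("upstream 500")


async def main() -> list:
    # Each position holds either the task's return value or its exception.
    return await asyncio.gather(ok(1), boom(), ok(3), return_exceptions=True)


def split(results: list) -> tuple:
    """Separate successful values from captured exception objects."""
    values = [r for r in results if not isinstance(r, BaseException)]
    errors = [r for r in results if isinstance(r, BaseException)]
    return values, errors
```

Result order matches argument order, so the exception sits in the middle position and both successful values survive.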

5. You are fanning out fifty calls to an API with a 100-requests-per-minute rate limit. Why a semaphore, and what cap?

A semaphore bounds concurrency: the number of requests in flight at once. The rate limit itself counts requests per minute, and fifty calls fit inside a 100-per-minute budget, but without a cap fifty coroutines fire all fifty calls in the same instant, and the API can return 429s even though you would have stayed inside the budget if you had paced them. asyncio.Semaphore(10) caps concurrent in-flight requests at ten; combined with the API's response latency this naturally throttles the request rate. For batches larger than the per-minute budget a concurrency cap alone is not enough, because the sustained rate is roughly the cap divided by the per-request latency. The reason to be polite to a rate-limited API is not just self-protection: bursts get you flagged at the receiver, and a flagged client gets aggressive rate limits applied to it long after the burst is over.
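The cap can be checked by counting in-flight calls. In this sketch `limited_call` and the `state` dict are hypothetical instrumentation, and `asyncio.sleep` stands in for the HTTP round trip.

```python
import asyncio


async def limited_call(sem: asyncio.Semaphore, state: dict) -> None:
    async with sem:                      # at most `cap` coroutines hold the semaphore
        state["in_flight"] += 1
        state["peak"] = max(state["peak"], state["in_flight"])
        await asyncio.sleep(0.01)        # stands in for the HTTP round trip
        state["in_flight"] -= 1


async def fan_out(total: int = 50, cap: int = 10) -> int:
    """Launch `total` calls and return the highest concurrency observed."""
    sem = asyncio.Semaphore(cap)
    state = {"in_flight": 0, "peak": 0}
    await asyncio.gather(*(limited_call(sem, state) for _ in range(total)))
    return state["peak"]
```

All fifty coroutines are created up front, but the observed peak never exceeds the cap; the remaining calls wait at the `async with` line until a slot frees up.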

6. Per-request timeouts, per-batch timeouts, and httpx's transport-level timeout -- which fires when?

httpx.Timeout(10.0) on the client applies a ten-second limit to each phase of every call (connect, read, write, and pool acquisition), so one slow endpoint times out without affecting the others. These are per-phase limits rather than a cap on total duration: the read timeout bounds the wait for each chunk of data, so a server that keeps trickling bytes can hold a request open for longer than ten seconds. The phases can also be set individually, for example httpx.Timeout(10.0, connect=5.0) -- a deliberately slow server that opens connections fast but reads slowly trips the read timeout, not the connect one. asyncio.wait_for around the awaited batch with a timeout of 30 caps the whole fan-out at thirty seconds; if it has not finished by then, the wrapper cancels the remaining tasks regardless of which were close to done. The right combination for most aggregators is per-request httpx timeouts on the client plus an asyncio.wait_for batch cap as a backstop.
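The layering of a per-request limit under a batch backstop can be sketched with the standard library alone. Here `asyncio.sleep` stands in for the HTTP call and an inner `asyncio.wait_for` stands in for the client's per-request timeout, which is an assumption made to keep the example offline; `fetch`, `batch`, and the numbers are hypothetical.

```python
import asyncio
from typing import Optional


async def fetch(name: str, delay: float, per_request: float = 0.1) -> dict:
    """Hypothetical fetcher; a per-request timeout becomes a failed response."""
    try:
        await asyncio.wait_for(asyncio.sleep(delay), timeout=per_request)
        return {"source": name, "success": True}
    except asyncio.TimeoutError:
        return {"source": name, "success": False}


async def batch(delays: dict, batch_cap: float) -> Optional[list]:
    work = asyncio.gather(*(fetch(n, d) for n, d in delays.items()))
    try:
        return await asyncio.wait_for(work, timeout=batch_cap)
    except asyncio.TimeoutError:
        return None  # backstop fired: the whole batch was cancelled
```

With a generous batch cap, the slow source fails alone and the fast one still reports success. With a batch cap shorter than the per-request limit, the backstop fires first and nothing is returned.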

7. The keystone uses sync sqlite3 for the database write. Why, given the whole chapter argues for async?

Because the database is not the bottleneck. The aggregator's wall-clock is dominated by the three concurrent HTTP fetches; a synchronous SQLite insert at the end adds milliseconds, not seconds. Switching to aiosqlite would let the database call yield while it waited, but there is nothing else to do during that wait -- the HTTP work is already done. The chapter's argument is "use async where the cost ratio justifies it," not "make every I/O call async on principle." The database call stays sync because the cost profile says it should, and the prose names that decision explicitly so the reader sees the architectural call rather than guessing at it. aiosqlite earns its place when the database is the bottleneck, not before.
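The ordering that makes the sync write harmless looks roughly like this. It is a sketch under stated assumptions, not the keystone's code: the `articles` table, its columns, `fake_fetch`, and the in-memory database are all hypothetical.

```python
import asyncio
import sqlite3


async def fake_fetch(source: str) -> dict:
    await asyncio.sleep(0.01)  # stands in for the HTTP fetch
    return {"source": source, "success": True, "title": f"{source} headline"}


def save(conn: sqlite3.Connection, responses: list) -> int:
    """Plain synchronous insert; runs after all fetches have completed."""
    rows = [(r["source"], r["title"]) for r in responses if r["success"]]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO articles (source, title) VALUES (?, ?)", rows)
    return len(rows)


async def run(conn: sqlite3.Connection) -> int:
    sources = ("newsapi", "guardian", "hackernews")
    responses = await asyncio.gather(*(fake_fetch(s) for s in sources))
    # No other coroutine is waiting on the loop here, so a brief block costs nothing.
    return save(conn, responses)


def main() -> int:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE articles (source TEXT, title TEXT)")
    try:
        asyncio.run(run(conn))
        return conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
    finally:
        conn.close()
```

The insert happens after `gather` has returned, which is the point of the answer above: by the time the loop is blocked, there is no concurrent work left for it to starve.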

Looking forward

You have built an aggregator that respects rate limits, degrades gracefully when a source dies, and hits the wall-clock target the chapter opened with. The next step in the production-systems arc is the database substrate: Chapter 24 walks through the move from SQLite to PostgreSQL, with the rationale being the same cost-profile thinking this chapter applied to the HTTP layer. SQLite is the right default until the concurrent-write contention or the dataset size says otherwise; PostgreSQL is the answer when it does. The async fan-out patterns from this chapter pair naturally with PostgreSQL's connection pooling, and the keystone's "the database is not the bottleneck" decision will be the one to revisit when the bottleneck moves.