3. Foundation: importing production patterns
In this section you copy three utility modules from earlier chapters into the project root: errors.py for error categorisation and retry, json_helpers.py for safe access into nested JSON, and validators.py for a validation pipeline that rejects malformed responses before they reach business logic. Production code reuses what already works, and these three modules already work: Chapters 9, 10, and 12 each shipped one. Every layer in the rest of this chapter imports from at least one of these files; skip this section and §4 onwards won't run.
Each file is a consolidated version, stitching together the relevant code from multiple sections of its source chapter. After each save, a one-line import check confirms the file is on disk and parses cleanly. If a check fails, fix that file before saving the next.
errors.py: error handling and retry (from Chapter 9)
Chapter 9 built error handling in three pieces: four-category classification, three-part user messages, and retry with exponential backoff plus jitter. Save them as a single module at the project root, errors.py:
"""
Error handling infrastructure from Chapter 9.
Provides categorization, message composition, and retry logic.
"""
import requests
import time
import random
from typing import Optional, Dict, Any, Tuple
from datetime import datetime
# ============================================================================
# ERROR CATEGORIZATION (Chapter 9, Section 3)
# ============================================================================
def categorize_error(exception: Exception, response: Optional[requests.Response] = None,
                     user_input: Optional[str] = None) -> str:
    """
    Map exceptions to four categories for systematic handling.
    Returns: 'user_input', 'transient', 'not_found', or 'unknown'
    From Chapter 9: Check order matters - user input first, then transient,
    then not found, finally unknown.
    """
    # 1. User input validation errors (check first)
    if user_input is not None:
        if not user_input or not user_input.strip():
            return 'user_input'
        if len(user_input) > 100:
            return 'user_input'
        if not any(c.isalpha() for c in user_input):
            return 'user_input'
    # 2. Network and timeout errors (transient - worth retrying)
    if isinstance(exception, (requests.exceptions.Timeout,
                              requests.exceptions.ConnectionError)):
        return 'transient'
    # 3. HTTP errors - examine status code
    if isinstance(exception, requests.exceptions.HTTPError):
        if response is None:
            return 'unknown'
        status_code = response.status_code
        # Server errors (transient - worth retrying)
        if status_code >= 500:
            return 'transient'
        # Rate limiting (transient - worth retrying with delay)
        if status_code == 429:
            return 'transient'
        # Not found (permanent - don't retry)
        if status_code == 404:
            return 'not_found'
        # Other client errors (likely user input issue)
        if 400 <= status_code < 500:
            return 'user_input'
    # 4. Parsing and data errors (often means "no results")
    if isinstance(exception, (KeyError, ValueError, TypeError)):
        return 'not_found'
    # Everything else is unknown
    return 'unknown'
# ============================================================================
# MESSAGE COMPOSITION (Chapter 9, Section 4)
# ============================================================================
MESSAGE_TEMPLATES = {
    'user_input': {
        'empty': {
            'what': 'Please enter a city name to get weather information.',
            'how': 'Type the name of any city or town.',
            'examples': 'Examples: London, Paris, Tokyo'
        },
        'too_long': {
            'what': 'City names are typically short (under 100 characters).',
            'how': 'Please enter just the city name.',
            'examples': 'Examples: Paris, San Francisco'
        },
        'default': {
            'what': "We couldn't process that city name.",
            'how': 'Please use only letters, spaces, and hyphens.',
            'examples': 'Examples: São Paulo, New York'
        }
    },
    'not_found': {
        'with_city': {
            'what': 'We couldn\'t find weather data for "{city_name}".',
            'how': 'Please check the spelling or try a nearby city.',
            'examples': 'Examples: London, Dublin, Manchester'
        },
        'default': {
            'what': "We couldn't find that location in our database.",
            'how': 'Please try a different city name or check spelling.',
            'examples': 'Examples: Tokyo, New York, Sydney'
        }
    },
    'transient': {
        'default': {
            'what': "We're having trouble connecting to the weather service.",
            'how': 'This is usually temporary - please try again in a moment.',
            'examples': 'If the problem continues, check your internet connection.'
        },
        'rate_limited': {
            'what': "We're receiving too many requests right now.",
            'how': 'Please wait {retry_after} seconds, then try again.',
            'examples': 'This is a temporary limit from the provider.'
        }
    },
    'unknown': {
        'default': {
            'what': 'We encountered an unexpected problem.',
            'how': 'Please try again, or try a different city.',
            'examples': 'If this continues, the service may be unavailable.'
        }
    }
}
def compose_error_message(category: str, context: Optional[Dict[str, Any]] = None) -> str:
    """
    Generate three-part user messages from templates.
    From Chapter 9: Three parts - what happened, what to do, examples.
    """
    if context is None:
        context = {}
    city_name = context.get('city_name', '')
    retry_after = context.get('retry_after', 0)
    is_rate_limited = context.get('is_rate_limited', False)
    templates = MESSAGE_TEMPLATES.get(category, MESSAGE_TEMPLATES['unknown'])
    # Select variant within category
    if category == 'user_input':
        if not city_name or not city_name.strip():
            template = templates['empty']
        elif len(city_name) > 100:
            template = templates['too_long']
        else:
            template = templates['default']
    elif category == 'not_found':
        if city_name:
            template = templates['with_city']
        else:
            template = templates['default']
    elif category == 'transient':
        if is_rate_limited:
            template = templates['rate_limited']
        else:
            template = templates['default']
    else:
        template = templates['default']
    # Format with context
    what = template['what'].format(city_name=city_name, retry_after=retry_after)
    how = template['how'].format(city_name=city_name, retry_after=retry_after)
    examples = template['examples'].format(city_name=city_name, retry_after=retry_after)
    return f"{what}\n{how}\n{examples}"
# ============================================================================
# SIMPLE LOGGING (Chapter 9, Section 6)
# ============================================================================
def log_error_simple(category: str, exception: Exception,
                     context: Optional[Dict[str, Any]] = None) -> None:
    """Simple console logging for debugging production issues."""
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    print(f"\n[ERROR LOG] {timestamp}")
    print(f"  Category: {category}")
    print(f"  Exception: {type(exception).__name__}: {str(exception)}")
    if context:
        print(f"  User input: {context.get('city_name', 'N/A')}")
    print()
# ============================================================================
# UNIFIED ERROR HANDLER (Chapter 9, Section 6)
# ============================================================================
def handle_error(exception: Exception, context: Optional[Dict[str, Any]] = None,
                 response: Optional[requests.Response] = None) -> None:
    """
    Unified error handling: categorize → log → compose → display.
    From Chapter 9: One call that orchestrates the complete error response.
    """
    if context is None:
        context = {}
    user_input = context.get('city_name', context.get('user_input'))
    # Categorize the error
    category = categorize_error(exception, response=response, user_input=user_input)
    # Log technical details
    log_error_simple(category, exception, context)
    # Add category-specific context for message composition.
    # Compare against None explicitly: a requests.Response is falsy for any
    # 4xx/5xx status, so a bare `and response` would never match a 429.
    if category == 'transient' and response is not None and response.status_code == 429:
        context['is_rate_limited'] = True
        retry_after = response.headers.get('Retry-After', '0')
        try:
            context['retry_after'] = int(retry_after)
        except ValueError:
            context['retry_after'] = 0
    # Compose and display user-friendly message
    message = compose_error_message(category, context)
    print("\n" + message + "\n")
# ============================================================================
# SMART RETRY LOGIC (Chapter 9, Section 5)
# ============================================================================
def retry_request(url: str, params: Dict[str, Any], max_attempts: int = 3,
                  timeout: float = 10.0, base_delay: float = 1.0,
                  max_delay: float = 60.0) -> Optional[requests.Response]:
    """
    Make HTTP request with exponential backoff + jitter retry.
    From Chapter 9: Only retries transient failures (timeouts, 5xx).
    Respects Retry-After header for 429 responses.
    """
    for attempt in range(max_attempts):
        try:
            response = requests.get(url, params=params, timeout=timeout)
            # Special handling for 429 (Rate Limited)
            if response.status_code == 429:
                retry_after = response.headers.get('Retry-After')
                if retry_after and attempt < max_attempts - 1:
                    try:
                        wait_time = min(float(retry_after), max_delay)
                        print(f"Rate limited. Waiting {wait_time:.0f} seconds...")
                        time.sleep(wait_time)
                        continue
                    except ValueError:
                        pass  # Invalid Retry-After, fall through
                return response
            # Raise for HTTP errors (will be caught below)
            response.raise_for_status()
            return response
        except (requests.exceptions.Timeout,
                requests.exceptions.ConnectionError) as e:
            if attempt == max_attempts - 1:
                return None
            # Exponential backoff with jitter (Chapter 9, Section 5.3)
            exponential_delay = min(base_delay * (2 ** attempt), max_delay)
            jitter = random.uniform(0, exponential_delay * 0.5)
            wait_time = exponential_delay + jitter
            print(f"Connection issue. Retrying in {wait_time:.1f} seconds...")
            time.sleep(wait_time)
        except requests.exceptions.HTTPError as e:
            # Only retry server errors (5xx). Compare against None explicitly:
            # a requests.Response is falsy for any 4xx/5xx status, so a bare
            # `if e.response and ...` would never take this branch.
            if e.response is not None and e.response.status_code >= 500:
                if attempt == max_attempts - 1:
                    return None
                exponential_delay = min(base_delay * (2 ** attempt), max_delay)
                jitter = random.uniform(0, exponential_delay * 0.5)
                wait_time = exponential_delay + jitter
                print(f"Server error {e.response.status_code}. Retrying in {wait_time:.1f}s...")
                time.sleep(wait_time)
            else:
                # 4xx errors - don't retry
                return e.response
        except Exception:
            # Unexpected: re-raise so the caller can categorise it as
            # 'unknown' rather than masking as transient/retries_exhausted.
            raise
    return None
The five functions split into runtime utilities and a higher-level convenience. retry_request() wraps every outbound HTTP call: it retries timeouts, connection errors, and 5xx responses with exponential backoff plus jitter, and it waits out a 429 that carries a Retry-After header for as long as the header asks (capped at max_delay). It returns the response on success, returns any other 4xx response without retrying, and returns None once the attempts are exhausted, so transient failures are absorbed before they reach a layer above; §4's API client uses it on every fetch. categorize_error() maps an exception (plus optional response and user input) to one of four categories (user_input, transient, not_found, unknown), and §4's API client puts that category onto every APIResult so upper layers branch on data, not on exception types. compose_error_message() produces the three-part what / how / examples text from a category and context, and handle_error() plus log_error_simple() package the categorise-then-log-then-compose-then-print pipeline into a single call for layers that just want to print and move on. The chapter's main flow uses categories as data (carried on APIResult and WorkflowResult) and lets §7's display layer do the user-facing formatting; handle_error() sits in the toolkit for the simpler call-and-print path.
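To make the retry timing concrete, here is a minimal sketch of the delay calculation on its own. The helper name backoff_delay() is hypothetical and not part of errors.py; the arithmetic is copied from retry_request() and the defaults match its signature.

```python
import random


def backoff_delay(attempt: int, base_delay: float = 1.0,
                  max_delay: float = 60.0) -> float:
    """Wait before the retry that follows 0-based `attempt` (hypothetical helper)."""
    # Exponential growth: base, 2x base, 4x base, ... capped at max_delay
    exponential_delay = min(base_delay * (2 ** attempt), max_delay)
    # Jitter adds between 0% and 50% on top so clients do not retry in lockstep
    jitter = random.uniform(0, exponential_delay * 0.5)
    return exponential_delay + jitter


max_attempts = 3
# The last attempt never sleeps, so only max_attempts - 1 waits can happen
for attempt in range(max_attempts - 1):
    floor = min(1.0 * (2 ** attempt), 60.0)
    print(f"after attempt {attempt + 1}: {floor:.1f}s to {floor * 1.5:.1f}s "
          f"(sampled {backoff_delay(attempt):.2f}s)")
```

With the defaults, the two possible waits fall in the ranges 1.0 to 1.5 seconds and 2.0 to 3.0 seconds. Because the cap is applied before the jitter is added, a single wait can reach 1.5 times max_delay.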
Confirm the file is on disk and parses cleanly:
$ python -c "from errors import retry_request; print('errors.py OK')"
errors.py OK
If you see errors.py OK, the module parsed and retry_request (the last function defined) is present. If you don't, the paste truncated mid-file or errors.py landed somewhere other than the project root; check both before saving the next file.
json_helpers.py: safe navigation through nested JSON (from Chapter 10)
Chapter 10 turned data["a"]["b"]["c"] from a crash-on-missing-field hazard into a single safe call. Save the consolidated module at the project root as json_helpers.py:
"""
JSON processing infrastructure from Chapter 10.
Provides safe navigation and response normalization.
"""
from typing import Any, List, Optional, Dict, Tuple
# Common collection wrapper keys across different APIs
COMMON_COLLECTION_KEYS = ["items", "results", "data", "content", "entries", "records"]
# ============================================================================
# SAFE NAVIGATION (Chapter 10, Section 4)
# ============================================================================
def safe_get(obj: Any, path: str, default: Any = None) -> Any:
    """
    Dot-path lookup: 'owner.login' → obj['owner']['login'] if present.
    From Chapter 10: Returns default if any part of path doesn't exist.
    Prevents crashes from missing nested fields.
    """
    cur = obj
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return default
        cur = cur[part]
    return cur
def try_fields(d: Dict[str, Any], names: List[str], default: Any = None) -> Any:
    """
    Return first present/non-empty field from list of candidates.
    From Chapter 10: Useful when different APIs use different field names
    for the same data (e.g., 'id' vs 'order_id').
    """
    for name in names:
        val = d.get(name)
        if val not in (None, ""):
            return val
    return default
# ============================================================================
# COLLECTION NORMALIZATION (Chapter 10, Section 4)
# ============================================================================
def normalize_collection(api_response: Any,
                         container_hints: Optional[List[str]] = None) -> List[Any]:
    """
    Return a list of items regardless of response shape.
    From Chapter 10: Handles direct arrays, wrapped collections, and single objects.
    - list → itself
    - dict+wrapper → wrapper list
    - dict (single) → [dict]
    - other → []
    """
    # Direct array → pass through
    if isinstance(api_response, list):
        return api_response
    # Not a dict → can't extract anything
    if not isinstance(api_response, dict):
        return []
    # Check for common wrapper keys
    keys = (container_hints or []) + COMMON_COLLECTION_KEYS
    for key in keys:
        val = api_response.get(key)
        if isinstance(val, list):
            return val
    # No wrapper found → treat dict as single item
    return [api_response]
def extract_items_and_meta(api_response: Any,
                           container_hints: Optional[List[str]] = None
                           ) -> Tuple[List[Any], Dict[str, Any]]:
    """
    Return (items, metadata) and normalize pagination signals.
    From Chapter 10: Preserves metadata like totals and pagination cursors
    while extracting the actual data array.
    """
    meta: Dict[str, Any] = {}
    # Direct list → no metadata
    if isinstance(api_response, list):
        return api_response, meta
    # Not a dict → can't extract anything
    if not isinstance(api_response, dict):
        return [], meta
    # Find the collection container
    keys = (container_hints or []) + COMMON_COLLECTION_KEYS
    container_key = None
    for key in keys:
        if key in api_response and isinstance(api_response[key], list):
            container_key = key
            break
    # Extract items and separate metadata
    if container_key:
        items = api_response[container_key]
        # Everything else is metadata
        meta = {k: v for k, v in api_response.items() if k != container_key}
    else:
        # Single object response
        items = [api_response]
        meta = {}
    # Normalize pagination signals into common format
    meta_obj = meta.get("meta") if isinstance(meta.get("meta"), dict) else {}
    # Look for cursor-style pagination
    next_token = (
        meta_obj.get("cursor") or
        meta.get("cursor") or
        None
    )
    # Look for URL-style pagination
    if not next_token:
        links = meta.get("links") if isinstance(meta.get("links"), dict) else {}
        next_token = meta.get("nextPage") or links.get("next") or None
    # Extract count and page information
    total = meta.get("total") or meta.get("total_count") or meta_obj.get("total") or None
    page_info = {
        "page": meta.get("page") or meta_obj.get("page"),
        "per_page": meta.get("per_page") or meta_obj.get("per_page")
    }
    # Add normalized fields to metadata
    meta_norm = {"next_token": next_token, "total": total, "page_info": page_info}
    return items, {**meta, **meta_norm}
def first_item(api_response: Any,
               container_hints: Optional[List[str]] = None) -> Optional[Dict[str, Any]]:
    """
    Get the first item (or None) across response variants.
    From Chapter 10: Convenience helper for single-resource endpoints.
    """
    items = normalize_collection(api_response, container_hints)
    return items[0] if items else None
The functions split between two scales of access. At the field level, safe_get() walks dot-paths through nested dicts and returns the default the moment any link is missing, and try_fields() picks the first non-empty value from a list of candidate keys (useful when one API calls a field id and another calls it order_id). At the response level, normalize_collection() returns a list of items whether the provider sent a bare array, a wrapped collection, or a single object, and first_item() returns the first of those items or None. extract_items_and_meta() does the same extraction but returns (items, meta), keeping the non-item fields and normalising pagination signals into next_token, total, and page_info, so the orchestrator never has to know which API returned the data.
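A short usage sketch shows the field-level helpers on a small payload. The two functions are repeated from json_helpers.py only so the snippet runs on its own; the repo dictionary and its field names are invented for illustration.

```python
from typing import Any, Dict, List


def safe_get(obj: Any, path: str, default: Any = None) -> Any:
    """Same logic as json_helpers.safe_get, repeated so the snippet runs alone."""
    cur = obj
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return default
        cur = cur[part]
    return cur


def try_fields(d: Dict[str, Any], names: List[str], default: Any = None) -> Any:
    """Same logic as json_helpers.try_fields."""
    for name in names:
        val = d.get(name)
        if val not in (None, ""):
            return val
    return default


# Hypothetical payload, invented for illustration
repo = {"name": "demo", "owner": {"login": "octocat"}, "license": None}

print(safe_get(repo, "owner.login"))             # nested key present
print(safe_get(repo, "owner.email", "n/a"))      # last key missing -> default
print(safe_get(repo, "license.key", "unknown"))  # parent is None, not a dict -> default
print(try_fields({"id": "", "order_id": 42}, ["id", "order_id"]))  # skips the empty id
```

One behaviour worth knowing: when the final key exists but holds None, safe_get() returns that None rather than the default, because the default only applies to a missing link in the path.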
Confirm the file is on disk and parses cleanly:
$ python -c "from json_helpers import first_item; print('json_helpers.py OK')"
json_helpers.py OK
If you see json_helpers.py OK, the module parsed and first_item (the last function defined) is present. If you don't, the paste truncated mid-file or json_helpers.py landed somewhere other than the project root.
validators.py: three-layer validation (from Chapter 12)
Chapter 12 split validation into three layers: structure (is the response shaped the way we expect?), content (do the fields hold the values we expect?), and business rules (does the data make sense in our domain?). The processors in §5 will compose all three. Save the consolidated module at the project root as validators.py:
"""
Validation infrastructure from Chapter 12.
Provides three-layer validation (structure → content → business rules).
"""
from typing import Tuple, Dict, Any, List, Optional
class ValidationError(Exception):
    """Custom exception for validation failures."""
    pass
# ============================================================================
# STRUCTURE VALIDATION (Chapter 12, Section 2, Layer 1)
# ============================================================================
def validate_structure(data: Any, required_sections: List[str]) -> Tuple[bool, Optional[str]]:
    """
    Validate response has expected top-level structure.
    From Chapter 12: Layer 1 catches shape errors before attempting access.
    Returns (is_valid, error_message).
    """
    # Must be a dictionary
    if not isinstance(data, dict):
        return False, "Response must be a dictionary"
    # Check required sections exist. The processor that owns each endpoint
    # validates the expected type for each section.
    for section in required_sections:
        if section not in data:
            return False, f"Missing required section: '{section}'"
    return True, None
# ============================================================================
# CONTENT VALIDATION (Chapter 12, Section 2, Layer 2)
# ============================================================================
def validate_range(value: Any, field_name: str,
                   min_val: Optional[float] = None,
                   max_val: Optional[float] = None) -> Tuple[bool, Optional[str]]:
    """
    Validate numeric value is within realistic range.
    From Chapter 12: Catches sensor errors and placeholder values.
    """
    try:
        num_value = float(value)
    except (ValueError, TypeError):
        return False, f"{field_name} must be numeric"
    if min_val is not None and num_value < min_val:
        return False, f"{field_name} {num_value} below minimum {min_val}"
    if max_val is not None and num_value > max_val:
        return False, f"{field_name} {num_value} above maximum {max_val}"
    return True, None
# ============================================================================
# HELPER: SAFE TYPE CONVERSION (Chapter 12, Section 4)
# ============================================================================
def safe_float(value: Any, field_name: str) -> float:
    """
    Safely convert value to float with descriptive error.
    From Chapter 12: Defensive programming for API responses.
    """
    try:
        return float(value)
    except (ValueError, TypeError):
        raise ValidationError(f"Cannot convert {field_name} '{value}' to float")
def safe_int(value: Any, field_name: str) -> int:
    """
    Safely convert value to int with descriptive error.
    From Chapter 12: Defensive programming for API responses.
    """
    try:
        return int(value)
    except (ValueError, TypeError):
        raise ValidationError(f"Cannot convert {field_name} '{value}' to int")
validate_structure() handles Layer 1: it checks that the response is a dict and that the top-level sections you expect are present. It deliberately does not require every section to be a dict; real APIs mix objects, arrays, strings, and numbers at the top level, so the endpoint-specific processor validates each section's expected type where it has the context to do so. validate_range() handles Layer 2: it answers "is this number within the realistic range?" Layer 3 (business rules) lives downstream in §5's processors, which chain these checks into per-response pipelines and add the domain-specific rules. The safe_float() and safe_int() helpers do the type-conversion side of Layer 2: they wrap conversion in a custom ValidationError so a non-numeric value surfaces with its field name attached.
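As a rough preview of how a processor might chain the first two layers, the sketch below stops at the first failing check. validate_current_weather(), the "current" section, the "temperature" field, and the -90 to 60 range are all assumptions made for illustration, not the processors §5 builds; the two validators are repeated from validators.py so the snippet runs on its own.

```python
from typing import Any, List, Optional, Tuple

Check = Tuple[bool, Optional[str]]


def validate_structure(data: Any, required_sections: List[str]) -> Check:
    """Same logic as validators.validate_structure."""
    if not isinstance(data, dict):
        return False, "Response must be a dictionary"
    for section in required_sections:
        if section not in data:
            return False, f"Missing required section: '{section}'"
    return True, None


def validate_range(value: Any, field_name: str,
                   min_val: Optional[float] = None,
                   max_val: Optional[float] = None) -> Check:
    """Same logic as validators.validate_range."""
    try:
        num_value = float(value)
    except (ValueError, TypeError):
        return False, f"{field_name} must be numeric"
    if min_val is not None and num_value < min_val:
        return False, f"{field_name} {num_value} below minimum {min_val}"
    if max_val is not None and num_value > max_val:
        return False, f"{field_name} {num_value} above maximum {max_val}"
    return True, None


def validate_current_weather(data: Any) -> Check:
    """Hypothetical pipeline: Layer 1, then the section type check, then Layer 2."""
    ok, error = validate_structure(data, ["current"])
    if not ok:
        return ok, error
    # The endpoint-specific code checks the section's type itself
    current = data["current"]
    if not isinstance(current, dict):
        return False, "'current' must be an object"
    # Range chosen for illustration only
    return validate_range(current.get("temperature"), "temperature", -90, 60)


print(validate_current_weather({"current": {"temperature": 21.5}}))
print(validate_current_weather({"current": {"temperature": 999}}))
print(validate_current_weather({"hourly": []}))
```

Returning the first failure keeps the error message specific to one layer, which is the property the three-layer split is meant to provide.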
Confirm the file is on disk and parses cleanly:
$ python -c "from validators import safe_int; print('validators.py OK')"
validators.py OK
If you see validators.py OK, the module parsed and safe_int (the last function defined) is present. If you don't, the paste truncated mid-file or validators.py landed somewhere other than the project root.
Why Chapter 11's aggregation patterns don't apply here
Chapter 11 solved a different shape of problem: hitting NewsAPI, the Guardian, and HackerNews in parallel, normalising three disparate response formats into one canonical model, and merging the results while letting any single source fail without taking the others down. That pattern earns its keep when you're aggregating the same kind of data from competing providers.
This chapter is the other shape: sequential calls to one provider, where the second call depends on the first. Geocoding has to succeed before weather can run, because the weather endpoint needs the coordinates the geocoding endpoint returns. There's no aggregation to do, no canonical model to design, no inter-source deduplication. The orchestrator just runs step one, decides whether step two is worth attempting, and packages whatever it got.
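The dependent two-step shape can be sketched in a few lines. Everything here is hypothetical: run_workflow(), the result dictionary, and the stub functions stand in for the orchestrator, WorkflowResult, and API calls that later sections build, and are not their actual definitions.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Coords = Tuple[float, float]


def run_workflow(
    city: str,
    geocode: Callable[[str], Optional[Coords]],
    fetch_weather: Callable[[float, float], Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Run geocoding, then weather; skip weather when geocoding fails."""
    result: Dict[str, Any] = {
        "city": city, "coords": None, "weather": None, "failed_step": None,
    }
    # Step one: without coordinates there is nothing to ask the weather endpoint
    coords = geocode(city)
    if coords is None:
        result["failed_step"] = "geocoding"
        return result
    result["coords"] = coords
    # Step two: consumes step one's output
    weather = fetch_weather(*coords)
    if weather is None:
        result["failed_step"] = "weather"
    result["weather"] = weather
    return result


# Stubs standing in for the two real API calls (values invented)
def fake_geocode(city: str) -> Optional[Coords]:
    return {"Dublin": (53.35, -6.26)}.get(city)


def fake_weather(lat: float, lon: float) -> Optional[Dict[str, Any]]:
    return {"temperature": 12.0}


print(run_workflow("Dublin", fake_geocode, fake_weather))
print(run_workflow("Atlantis", fake_geocode, fake_weather))
```

The second call never reaches fake_weather(), which is the whole difference from Chapter 11's parallel fan-out: a failure in step one ends the run instead of leaving other sources to carry on.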
Chapter 11's parallel patterns come back in Part III when you wire Spotify together with other music APIs. For this chapter, the toolkit is error handling, safe navigation, and validation. Nothing more.
Three files in place
Three files on disk, three import checks passed: errors.py, json_helpers.py, validators.py. Section 4 starts with the API client, which calls retry_request directly so every outbound HTTP request gets exponential-backoff retry without the client layer having to think about it.