3. Systematic error categorization

The three-part message pattern from the previous page is only useful if something upstream can tell it which message to produce. That something is a categorizer: a single function that takes any exception and returns one of four labels. With the label in hand, the rest of the application knows what to do — show a message, retry, suggest alternatives, log and bail. Without it, every error site grows its own decision tree, and none of them agree.

Why categorization scales

Without categorization: N different errors map to N different handlers. Adding a new API means adding more custom handling for each possible failure mode. With categorization: N different errors map to 4 categories, and those 4 categories each have one consistent handler. When you add a new API or encounter a new error type, you map it to an existing category rather than writing new handling code. The system stays maintainable as complexity grows.

The four error categories

Every error in your application maps to one of these four categories. Each category has a distinct handling strategy:

| Category | When to Use | Response Strategy | Examples |
| --- | --- | --- | --- |
| user_input | Invalid input before network call | Fail fast, provide examples, no retry | Empty string, too long, invalid characters |
| transient | Temporary failures that might resolve | Retry with exponential backoff + jitter | Timeouts, 429 rate limits, 500/502/503 |
| not_found | Resource doesn't exist | Suggest alternatives, no retry | City not found, 404 responses |
| unknown | Unexpected failures | Log details, show generic message | Unexpected exceptions, parsing errors |

This categorization isn't arbitrary - it's based on what action your application should take. User input errors need immediate feedback with no retry. Transient errors need retry logic. Not found errors need suggestions. Unknown errors need logging and investigation.
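One way to make "the category determines the action" concrete is a small policy table. The sketch below is illustrative only: `CategoryPolicy`, `POLICIES`, and `policy_for` are hypothetical names that do not appear elsewhere in this guide, and the flags simply restate the response strategies listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryPolicy:
    """What the application does for one error category (hypothetical)."""
    retry: bool                 # retry with backoff?
    show_examples: bool         # show examples of valid input?
    suggest_alternatives: bool  # offer similar resources?
    log_details: bool           # record full details for investigation?


# One policy per category, restating the strategies described above.
POLICIES = {
    "user_input": CategoryPolicy(False, True, False, False),
    "transient": CategoryPolicy(True, False, False, False),
    "not_found": CategoryPolicy(False, False, True, False),
    "unknown": CategoryPolicy(False, False, False, True),
}


def policy_for(category):
    """Look up the policy; unrecognized labels are treated as unknown."""
    return POLICIES.get(category, POLICIES["unknown"])
```

With a table like this, supporting a new error means deciding which of the four labels it gets; the handling code that reads `policy_for(category)` does not change.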

Why check order matters

The order you check for errors determines accuracy. Always check in this sequence:

Figure: The four categories, checked in order. Input first (user_input), then transient (retryable), then not_found (permanent), then unknown as the catch-all.

Why this order? Invalid input can surface later as exceptions that look like network or data errors. For example, an empty city name might produce a response with no usable results, so parsing it raises a KeyError or IndexError that looks like a missing city. The root cause is invalid input, and checking it first prevents misdiagnosis and wasted API requests.

check_order.py
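import requests

SEARCH_URL = "https://api.example.com/search"

# WRONG: Check network first
# (both variants are wrapped in functions so the return statements are valid)
def categorize_network_first(city_name):
    try:
        response = requests.get(SEARCH_URL, params={"q": city_name}, timeout=10)
        data = response.json()
        location = data["results"][0]  # KeyError/IndexError if city_name was empty!
    except (KeyError, IndexError):
        return "not_found", "city"  # WRONG! This is actually a user_input error
    return "ok", location

# RIGHT: Check input first
def categorize_input_first(city_name):
    if not city_name or not city_name.strip():
        return "user_input", "empty"

    try:
        response = requests.get(SEARCH_URL, params={"q": city_name}, timeout=10)
        data = response.json()
        location = data["results"][0]
    except (KeyError, IndexError):
        return "not_found", "city"  # NOW this is accurate
    return "ok", location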

Checking input first prevents wasted API requests, ensures accurate categorization, and fails fast with appropriate user guidance.

Implementation: the categorize function

Create a single function that maps any exception to one of the four categories. This function becomes the central decision point for all error handling:

error_categorizer.py
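import json

# Aliased so it does not shadow Python's built-in ConnectionError
from requests.exceptions import ConnectionError as RequestsConnectionError
from requests.exceptions import HTTPError, RequestException, Timeout

def categorize_error(exception, response=None):
    """
    Categorize any exception into one of four types.
    
    Returns: (category, error_type, context)
    """
    # User input errors (raised by validation, before network calls).
    # ValueError subclasses raised by requests itself (JSON decoding,
    # malformed URLs) are excluded so they are not blamed on the user.
    if isinstance(exception, ValueError) and not isinstance(
        exception, (RequestException, json.JSONDecodeError)
    ):
        text = str(exception).lower()
        if "empty" in text:
            return ("user_input", "empty", {})
        if "too long" in text:
            return ("user_input", "too_long", {})
        return ("user_input", "invalid_chars", {})
    
    # Transient errors that should be retried
    if isinstance(exception, Timeout):
        return ("transient", "timeout", {})
    
    if isinstance(exception, RequestsConnectionError):
        return ("transient", "connection", {})
    
    if isinstance(exception, HTTPError):
        # Prefer the response attached to the exception, then the one passed
        # in. Compare with None: a 4xx/5xx Response object is falsy.
        resp = exception.response if exception.response is not None else response
        status_code = resp.status_code if resp is not None else None
        
        # Rate limiting
        if status_code == 429:
            raw = resp.headers.get("Retry-After", "60")
            # Retry-After can also be an HTTP date; fall back to 60s then
            retry_seconds = int(raw) if raw.isdigit() else 60
            return ("transient", "rate_limit", {"retry_seconds": retry_seconds})
        
        # Server errors (temporary)
        if status_code in (500, 502, 503, 504):
            return ("transient", "server_error", {})
        
        # Not found
        if status_code == 404:
            return ("not_found", "city", {})
    
    # API returned no usable results (city not found): a missing "results"
    # key raises KeyError, an empty results list raises IndexError
    if isinstance(exception, KeyError) and exception.args[:1] == ("results",):
        return ("not_found", "city", {})
    if isinstance(exception, IndexError):
        return ("not_found", "city", {})
    
    # Everything else is unknown
    return ("unknown", "general", {})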

This function encapsulates all error categorization logic in one place. When you encounter a new error type, you just add a condition here - you don't need to modify message composition, retry logic, or logging code.

Mapping HTTP status codes

HTTP status codes map predictably to categories. Here is the mapping for the codes this section handles:

| Status Code | Meaning | Category | Should Retry? |
| --- | --- | --- | --- |
| 400 | Bad Request (invalid input) | user_input | ❌ No |
| 404 | Not Found (resource doesn't exist) | not_found | ❌ No |
| 429 | Too Many Requests (rate limit) | transient | ✅ Yes (after Retry-After) |
| 500 | Internal Server Error | transient | ✅ Yes |
| 502 | Bad Gateway | transient | ✅ Yes |
| 503 | Service Unavailable | transient | ✅ Yes |
| 504 | Gateway Timeout | transient | ✅ Yes |

Client errors (4xx) are typically permanent - retrying won't help. Server errors (5xx) are typically temporary - the service might recover. Rate limits (429) are special: they're temporary but require respecting the Retry-After header.
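The status-code mapping can also be written as a plain lookup, which keeps it easy to audit against the table. This is a minimal sketch: `STATUS_CATEGORIES`, `category_for_status`, and `should_retry` are hypothetical names, and sending unlisted codes to "unknown" is an assumption of this sketch rather than something stated above.

```python
# Explicit entries copied from the status code table.
STATUS_CATEGORIES = {
    400: "user_input",
    404: "not_found",
    429: "transient",
    500: "transient",
    502: "transient",
    503: "transient",
    504: "transient",
}


def category_for_status(status_code):
    """Return the category for an HTTP status code.

    Codes not listed in the table fall back to "unknown" (an assumption
    of this sketch), so they are logged rather than silently retried.
    """
    return STATUS_CATEGORIES.get(status_code, "unknown")


def should_retry(status_code):
    """Only transient failures are worth retrying."""
    return category_for_status(status_code) == "transient"
```

A lookup like this covers only the status-code part of categorization; exceptions that carry no HTTP response (timeouts, connection failures) still need their own checks.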

Special case: 429 rate limits

When APIs return 429 with a Retry-After header, honor it exactly. The API knows its capacity better than your application does. If the header says "wait 60 seconds", wait 60 seconds. Ignoring it can get you banned or throttled more aggressively. Only fall back to exponential backoff when the API doesn't provide explicit guidance.
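The rule above (use Retry-After when it is present, otherwise back off exponentially) might be sketched as follows. `parse_retry_after` and `retry_delay` are hypothetical helper names, and the `base` and `cap` defaults are placeholder values, not recommendations from this guide. The sketch also accounts for the fact that a Retry-After value can be either a number of seconds or an HTTP date.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Convert a Retry-After header value to seconds, or None if unusable."""
    if value is None:
        return None
    value = value.strip()
    # Form 1: a whole number of seconds, e.g. "60"
    try:
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based).

    Honors Retry-After when it can be parsed; otherwise uses
    exponential backoff with full jitter, capped at `cap` seconds.
    """
    explicit = parse_retry_after(retry_after)
    if explicit is not None:
        return explicit
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

For example, `retry_delay(3, retry_after="60")` returns 60.0 regardless of the attempt number, while `retry_delay(3)` returns a random value between 0 and 8 seconds.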

Validation before network calls

Always validate user input before making network requests. This prevents wasted API calls, provides faster feedback, and ensures accurate error categorization:

input_validation.py
import re

import requests

from error_categorizer import categorize_error
# compose_error_message is assumed to be the message builder
# defined elsewhere in this guide; it is not shown in this section.

def validate_city_name(city_name):
    """
    Validate city name before making API calls.
    Raises ValueError with descriptive message if invalid.
    """
    # Check for empty input
    if not city_name or not city_name.strip():
        raise ValueError("City name cannot be empty")
    
    # Strip once so every later check sees the value that will be sent
    city_name = city_name.strip()
    
    # Check length
    if len(city_name) > 100:
        raise ValueError("City name too long (maximum 100 characters)")
    
    # Check for valid characters (ASCII letters, spaces, hyphens, apostrophes)
    if not re.fullmatch(r"[a-zA-Z\s\-']+", city_name):
        raise ValueError("City name contains invalid characters")
    
    return city_name

def find_location(city_name):
    """Find location with validation."""
    # Validate FIRST, before network call
    try:
        validated_name = validate_city_name(city_name)
    except ValueError as e:
        # Categorize as user_input and return early
        category, error_type, context = categorize_error(e)
        message = compose_error_message(category, error_type, city_name=city_name)
        return None, message
    
    # NOW make the network request
    response = None  # stays None if the request itself fails
    try:
        response = requests.get(
            "https://api.example.com/search",
            params={"q": validated_name},  # let requests URL-encode the query
            timeout=10,
        )
        response.raise_for_status()
        data = response.json()
        return data["results"][0], None
    except Exception as e:
        category, error_type, context = categorize_error(e, response)
        message = compose_error_message(category, error_type, city_name=city_name)
        return None, message

This validation-first approach provides immediate feedback for user input errors, prevents wasted API requests (and associated costs), and ensures accurate error categorization.