5. Defensive programming for optional fields
Production APIs almost never guarantee every field shows up every time. Whether a missing field is a crisis or a non-event depends entirely on which field went missing: a missing primary key breaks every downstream lookup, a missing display name just renders "Unknown." This page turns that distinction into code with a classify-then-handle pattern that generalises to every API you integrate, not just the vendor orders we're building toward.
Classify fields by criticality
Treat "missing" as a business decision, not a technical surprise. Before you write any handler, classify each field you care about. The handling strategy then falls out of the classification -- you stop making decisions case-by-case and start following a policy.
| Class | Examples | Consequence if missing | Action |
|---|---|---|---|
| Required | Primary key, timestamp, price for billing | Structural or business failure. A missing price breaks billing entirely; a missing order ID means the record can't be referenced downstream. | Fail fast (raise or return error), log context, stop processing this record |
| Conditionally required | Discount value when promo exists | Policy violation. The promo code was applied but the discount amount is unknown; silently charging full price is a business error, not a technical one. | Validate the rule; reject or strip the invalid portion; continue only if policy allows |
| Recommended | Display name, avatar, summary text | Usability degradation only. A missing avatar shows a placeholder; a missing display name falls back to "Unknown." The transaction still completes. | Soft default ("Unknown", empty string), continue processing, optional warning log |
| Optional | Secondary attributes, notes, tags | Expected variation. The field just doesn't exist for this record type; absence carries no meaning. | Omit silently; keep shape stable with empty containers ([], {}) so downstream iteration still works |
The four rows aren't academic distinctions. Required means "this record is unusable without the field." Recommended means "the record is still useful; the user experience is slightly worse." Optional means "nothing changes if this is missing." Conditionally required is the tricky one -- it says "this field is required only when another field is present," and getting it wrong causes silent business errors instead of loud technical ones.
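One way to make the classification concrete is to write it down as data before writing any handler. The sketch below is illustrative only and not part of the course files: the `Criticality` enum, the `FIELD_POLICY` table, the field paths in it, and the choice to treat unlisted fields as optional are all assumptions.

```python
from enum import Enum


class Criticality(Enum):
    REQUIRED = "required"
    CONDITIONALLY_REQUIRED = "conditionally_required"
    RECOMMENDED = "recommended"
    OPTIONAL = "optional"


# Hypothetical policy table for an order payload; the paths are illustrative.
FIELD_POLICY = {
    "id": Criticality.REQUIRED,
    "discount.value": Criticality.CONDITIONALLY_REQUIRED,  # only when a promo exists
    "customer.name": Criticality.RECOMMENDED,
    "notes": Criticality.OPTIONAL,
}

# One action per class, paraphrasing the "Action" column of the table above.
ACTION = {
    Criticality.REQUIRED: "fail fast",
    Criticality.CONDITIONALLY_REQUIRED: "validate rule",
    Criticality.RECOMMENDED: "soft default",
    Criticality.OPTIONAL: "omit silently",
}


def action_for(path: str) -> str:
    """Look up the handling action; unlisted fields are assumed optional."""
    return ACTION[FIELD_POLICY.get(path, Criticality.OPTIONAL)]
```

With a table like this, `action_for("id")` returns `"fail fast"`, and adding a new field means adding one line of policy rather than making a fresh decision in the handler.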
Guard helpers: require, default, safe_get
Two tiny helpers make intent obvious at the call site. require() says "this must exist or we abort." default() says "use this fallback if the field is missing." The third tool in the toolkit is safe_get(), which you already built on the previous pages -- we just import it here. Save as field_policies.py:
from typing import Any, Dict

from safe_get import safe_get  # re-exported so callers can import everything here


class MissingRequired(Exception):
    """Raised when a required field is missing or empty."""
    pass


def require(d: Dict[str, Any], name: str) -> Any:
    """Return d[name] or raise a clear error."""
    if not isinstance(d, dict) or name not in d or d[name] in (None, ""):
        raise MissingRequired(f"Missing required field: {name}")
    return d[name]


def default(d: Dict[str, Any], name: str, fallback: Any) -> Any:
    """Return d.get(name), or fallback when the value is None or empty."""
    v = d.get(name)
    return fallback if v in (None, "") else v


if __name__ == "__main__":
    order = {"id": "ORD-123", "customer": {"email": "alice@example.com"}}
    order_id = require(order, "id")  # must exist
    email = safe_get(order, "customer.email", "Unknown")  # optional
    notes = default(order, "notes", "")  # optional string
    print(order_id)
    print(email)
    print(repr(notes))
Three of the four classes are covered by these helpers. require() handles the "required" row -- raise loudly. default() handles recommended and optional -- substitute a fallback silently. safe_get() is the path helper both lean on when the field lives inside nested structure.
Conditionally required is the exception. Its rule is case-specific ("this field is required only when that field is present"), so there's no canonical helper to reach for. You'll see one implemented on the normalizer page when handling Variant B's promo-to-discount conversion: if promo is present, discount.value becomes required for that record, and the rule lives inline in the normalizer rather than in a reusable utility.
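To give a feel for what such an inline rule might look like, here is a minimal sketch. It is not the normalizer page's implementation: the function name `discount_value_if_promo`, the exact payload shape (a top-level `promo` key and a `discount` dict with a `value` key), and the local `MissingRequired` stand-in are assumptions made so the example runs on its own.

```python
from typing import Any, Dict, Optional


class MissingRequired(Exception):
    """Local stand-in for the exception defined in field_policies.py."""


def discount_value_if_promo(order: Dict[str, Any]) -> Optional[float]:
    """Hypothetical rule: discount.value is required only when promo is present."""
    promo = order.get("promo")
    if promo in (None, ""):
        # No promo, so a missing discount is expected and carries no meaning.
        return None
    discount = order.get("discount")
    value = discount.get("value") if isinstance(discount, dict) else None
    if value in (None, ""):
        # Promo applied but amount unknown: a policy violation, so fail loudly.
        raise MissingRequired("Missing required field: discount.value (promo present)")
    # Assumes the value is numeric or a numeric string; anything else raises ValueError.
    return float(value)
```

The point of the sketch is the shape of the check: the presence of one field switches another field from optional to required, which is why the rule tends to live next to the data it describes rather than in a generic helper.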
The from safe_get import safe_get line at the top of field_policies.py isn't strictly necessary -- this file doesn't call safe_get() internally. It's there as a re-export, so callers downstream can write from field_policies import require, default, safe_get and grab everything they need from one module. If you've consolidated the helpers into api_helpers.py as suggested on the flexible-access page, drop the re-export and import directly from api_helpers instead.
Defaults and sentinels that keep shapes stable
Defaults aren't about faithfully representing missing data. They're about stopping the next function from crashing. If downstream code iterates order["items"], returning [] is safer than returning None -- the loop runs zero times and moves on. If the UI renders customer["email"], returning "Unknown" is safer than None because string operations such as .lower() or concatenation don't blow up. Pick defaults that keep the shape stable:
- Strings: "" or "Unknown" (UI-safe)
- Numbers: 0, but avoid it when 0 is a meaningful value (use None and format in the UI)
- Lists/dicts: [] or {} so iteration and lookup both stay safe
- Booleans: pick a policy default and document it -- usually False unless you have a specific reason
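The list above can be sketched as a small table of default factories. Everything named here is hypothetical (`SHAPE_DEFAULTS`, `with_stable_shape`, and the field names `items`, `metadata`, `gift_wrap`); the one design choice worth noting is the use of factories instead of literal values, so that two records never share the same list or dict object.

```python
from typing import Any, Callable, Dict

# Factories rather than literal values: each record gets its own [] or {}.
SHAPE_DEFAULTS: Dict[str, Callable[[], Any]] = {
    "email": lambda: "Unknown",   # string: UI-safe placeholder
    "total": lambda: None,        # number: None, because 0 would be a meaningful total
    "items": list,                # list: iteration runs zero times
    "metadata": dict,             # dict: lookups return nothing instead of failing
    "gift_wrap": lambda: False,   # boolean: documented policy default
}


def with_stable_shape(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of record where missing or None fields get a default."""
    out = dict(record)
    for name, factory in SHAPE_DEFAULTS.items():
        if out.get(name) is None:
            out[name] = factory()
    return out
```

Calling `with_stable_shape({"id": "ORD-1"})` yields a record where `items` is `[]` and `metadata` is `{}`, so a downstream `for item in record["items"]` needs no None check.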
Error policy: fail fast vs fail soft
Decide up front what happens when a required field is missing. Ad-hoc try/except scattered across the codebase is how missing-field bugs become silent business errors. Three policies cover most cases:
- Fail fast: required/structural fields raise or return an error object with context (endpoint, record id, payload hash). The caller decides whether to retry, quarantine, or alert.
- Fail soft: recommended and optional fields apply defaults and continue. A lightweight warning log is optional depending on whether the absence is interesting.
- Quarantine: when a whole record looks corrupt (schema mismatch, nonsensical types), log it and move it to a review bucket. Don't let one bad record poison the pipeline.
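As a rough illustration of how the three policies might sit side by side in one loop, consider the sketch below. It is an assumption-laden example, not the course's pipeline: `process_batch`, `payload_hash`, the `orders` logger name, and the flat `email` field are hypothetical, and "corrupt" is reduced to "not a dict" to keep it short.

```python
import hashlib
import json
import logging
from typing import Any, Dict, List

log = logging.getLogger("orders")


def payload_hash(record: Any) -> str:
    """Short fingerprint of a payload, used only as log/error context."""
    raw = json.dumps(record, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:12]


def process_batch(records: List[Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Route each record to 'ok', 'errors' (fail fast) or 'quarantine'."""
    out: Dict[str, List[Dict[str, Any]]] = {"ok": [], "errors": [], "quarantine": []}
    for rec in records:
        if not isinstance(rec, dict):
            # Quarantine: the record has the wrong type, so set it aside for review.
            log.error("quarantined payload hash=%s", payload_hash(rec))
            out["quarantine"].append({"reason": "not an object", "hash": payload_hash(rec)})
            continue
        if rec.get("id") in (None, ""):
            # Fail fast: required field missing, return an error object with context.
            out["errors"].append(
                {"error": "Missing required field: id", "hash": payload_hash(rec)}
            )
            continue
        email = rec.get("email")
        if email in (None, ""):
            # Fail soft: recommended field missing, apply the default and continue.
            log.warning("order %s has no email, using default", rec["id"])
            email = "Unknown"
        out["ok"].append({"id": rec["id"], "email": email})
    return out
```

One bad record ends up in `errors` or `quarantine` and the loop carries on, which is the property the three policies are meant to guarantee.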
Worked example: defensive order extraction
Here's a minimal order extractor that puts the classification into practice. id is required (fail fast if missing), customer.email is recommended (default to "Unknown"), total falls back to 0.0 and is coerced to float, and discount is optional (keep it None when absent). Save as extract_order_minimal.py:
from typing import Any, Dict

from field_policies import require, default, safe_get, MissingRequired


def extract_order_minimal(obj: Dict[str, Any]) -> Dict[str, Any]:
    try:
        # Required
        oid = require(obj, "id")
        # Recommended / Optional
        email = safe_get(obj, "customer.email", "Unknown")
        total = default(obj, "total", 0.0)
        discount = default(obj, "discount", None)
        return {
            "id": oid,
            "email": email,
            "total": float(total) if total is not None else None,
            "discount": discount,
        }
    except MissingRequired as e:
        return {"error": str(e), "_raw_id": obj.get("id")}


if __name__ == "__main__":
    good = {
        "id": "ORD-123",
        "customer": {"email": "alice@example.com"},
        "total": 129.5,
    }
    degraded = {
        "id": "ORD-124",
        "total": "0",
        "discount": {"type": "percent", "value": 10},
    }
    broken = {"customer": {"email": "nobody@example.com"}}
    for order in (good, degraded, broken):
        print(extract_order_minimal(order))
Run it from the project root:
python extract_order_minimal.py
{'id': 'ORD-123', 'email': 'alice@example.com', 'total': 129.5, 'discount': None}
{'id': 'ORD-124', 'email': 'Unknown', 'total': 0.0, 'discount': {'type': 'percent', 'value': 10}}
{'error': 'Missing required field: id', '_raw_id': None}
Three orders, three outcomes driven by the same policy. The good order produced the canonical shape. The degraded order was missing the recommended email but had everything required, so it came through with "Unknown" as the fallback and the optional discount preserved. The broken order was missing the required id, so require() raised MissingRequired, and extract_order_minimal() caught it and returned a structured error object -- loud, contextual, and recoverable by the calling code, which can quarantine the record or alert downstream.
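On the consuming side, code that receives these results only needs to check for the "error" key to decide what to do next. A minimal sketch, assuming the result shapes printed above; `split_results` is a hypothetical helper name, not something defined elsewhere in the course:

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_results(
    results: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Separate canonical orders from error objects (dicts with an "error" key)."""
    ok: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for res in results:
        (failed if "error" in res else ok).append(res)
    return ok, failed


# Two of the result shapes extract_order_minimal() printed above.
results = [
    {"id": "ORD-123", "email": "alice@example.com", "total": 129.5, "discount": None},
    {"error": "Missing required field: id", "_raw_id": None},
]
ok, failed = split_results(results)
```

Here `ok` holds the one canonical order and `failed` holds the error object, ready to be quarantined or reported.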
The defensive toolkit is complete: classify the field, pick the helper that matches, let defaults keep the shape stable, decide up front what "missing required" means. The only thing left is to assemble everything -- discovery, flexible access, deep navigation, field policies -- into one normalizer that turns both vendor order variants into the same canonical output. That's the normalizer page.