3. Flexible access patterns

Discovery maps the shape; this page writes the access code that doesn't care which shape arrives. One access layer handles direct arrays, wrapped collections, and single objects, preserves pagination metadata, tolerates alternative field names, and powers the vendor-orders normalizer you'll build at the end of the chapter.

Step 1: normalize any collection shape

The first utility handles the most common variation -- where the actual data lives. Some APIs return arrays directly. Others wrap arrays under keys like items, results, or data. Single-item endpoints often return an unwrapped object. Your code should not need to know which pattern each API uses.

The function below inspects the response type and structure, then returns a list regardless of the input shape. Direct arrays pass through unchanged. Wrapped collections get unwrapped. Single objects become one-item lists. After this, the rest of your code can always expect a list. Save it as normalize_collection.py:

normalize_collection.py
from typing import Any, List, Optional

COMMON_COLLECTION_KEYS = ["items", "results", "data", "content", "entries", "records"]

def normalize_collection(
    api_response: Any,
    container_hints: Optional[List[str]] = None,
) -> List[Any]:
    """
    Return a list of items regardless of response shape:
    - list          -> itself
    - dict+wrapper  -> wrapper list
    - dict (single) -> [dict]
    - other         -> []
    """
    # Direct array: pass through
    if isinstance(api_response, list):
        return api_response

    # Not a dict: can't extract anything
    if not isinstance(api_response, dict):
        return []

    # Check for common wrapper keys
    keys = (container_hints or []) + COMMON_COLLECTION_KEYS
    for key in keys:
        val = api_response.get(key)
        if isinstance(val, list):
            return val

    # No wrapper found: treat dict as single item
    return [api_response]

The container_hints parameter lets you handle domain-specific wrappers without modifying the function. If an API uses products or repositories instead of the common patterns, pass them as hints. You'll keep this function in your toolkit and reach for it again and again -- each new API you integrate is a chance to add another wrapper convention to your hints list.
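As a quick illustration of how hints interact with the built-in keys, the sketch below repeats a condensed copy of normalize_collection() so it runs on its own, then feeds it two made-up responses. The products wrapper and the sample payloads are assumptions for illustration, not taken from any particular API.

```python
from typing import Any, List, Optional

COMMON_COLLECTION_KEYS = ["items", "results", "data", "content", "entries", "records"]

def normalize_collection(
    api_response: Any,
    container_hints: Optional[List[str]] = None,
) -> List[Any]:
    """Condensed copy of the function above, repeated so this snippet runs alone."""
    if isinstance(api_response, list):
        return api_response
    if not isinstance(api_response, dict):
        return []
    for key in (container_hints or []) + COMMON_COLLECTION_KEYS:
        val = api_response.get(key)
        if isinstance(val, list):
            return val
    return [api_response]

# Hypothetical response that wraps its records under a domain-specific key
catalog = {"products": [{"sku": "A1"}, {"sku": "B2"}], "count": 2}

# Without a hint no wrapper is recognised, so the whole dict is treated as one item
without_hint = normalize_collection(catalog)
# With the hint the wrapped list is found
with_hint = normalize_collection(catalog, container_hints=["products"])

# Hints are checked before the common keys, so they win when both are present
mixed = {"data": [], "products": [{"sku": "C3"}]}
hinted_first = normalize_collection(mixed, container_hints=["products"])

print(len(without_hint))  # 1
print(len(with_hint))     # 2
print(hinted_first)       # [{'sku': 'C3'}]
```

The first result is the failure mode to watch for: when a wrapper is missed, you get a one-item list containing the whole envelope rather than an error.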

Step 2: extract items and preserve metadata

The normalizer above solves one problem but creates another: it discards everything except the items. Real APIs include valuable metadata -- pagination cursors, total counts, page information -- that downstream code often needs. APIs signal "more data available" in several ways:

  • Page numbers: ?page=2 (like book pages)
  • Cursors: opaque tokens like "cursor": "eyJwYWdlIjoyfQ=="
  • Next URLs: direct links like "/search?q=python&page=2"
  • Offset/limit: ?offset=20&limit=20 (skip 20, get next 20)

The enhanced helper below does two jobs. It normalizes the collection structure (like the previous function) and captures everything else as metadata. Along the way it recognises common cursor and next-URL signals and collapses them into a single next_token field, and it gathers page numbers into page_info, so downstream code doesn't need to know which keys an API uses. Save it as extract_items_and_meta.py:

extract_items_and_meta.py
from typing import Any, Dict, List, Tuple, Optional
from normalize_collection import COMMON_COLLECTION_KEYS

def extract_items_and_meta(
    api_response: Any,
    container_hints: Optional[List[str]] = None,
) -> Tuple[List[Any], Dict[str, Any]]:
    """
    Return (items, metadata) and normalize pagination signals to:
      meta.next_token  (cursor or next URL)
      meta.total       (total results if present)
      meta.page_info   (page/per_page if present)
    """
    meta: Dict[str, Any] = {}

    # Direct list: no metadata
    if isinstance(api_response, list):
        return api_response, meta

    # Not a dict: can't extract anything
    if not isinstance(api_response, dict):
        return [], meta

    # Find the collection container
    keys = (container_hints or []) + COMMON_COLLECTION_KEYS
    container_key = None
    for key in keys:
        if key in api_response and isinstance(api_response[key], list):
            container_key = key
            break

    # Extract items and separate metadata
    if container_key:
        items = api_response[container_key]
        # Everything else is metadata
        meta = {k: v for k, v in api_response.items() if k != container_key}
    else:
        # Single object response
        items = [api_response]
        meta = {}

    # Normalize pagination signals into common format
    meta_obj = meta.get("meta") if isinstance(meta.get("meta"), dict) else {}

    # Cursor-style pagination
    next_token = (
        meta_obj.get("cursor")
        or meta.get("cursor")
        or None
    )

    # URL-style pagination
    if not next_token:
        links = meta.get("links") if isinstance(meta.get("links"), dict) else {}
        next_token = meta.get("nextPage") or links.get("next") or None

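    # Count and page information: take the first value that is not None,
    # so a legitimate 0 (e.g. zero search results) is not mistaken for "missing"
    def first_present(*values: Any) -> Any:
        return next((v for v in values if v is not None), None)

    total = first_present(meta.get("total"), meta.get("total_count"), meta_obj.get("total"))
    page = first_present(meta.get("page"), meta_obj.get("page"))
    per_page = first_present(meta.get("per_page"), meta_obj.get("per_page"))
    has_page_info = page is not None or per_page is not None
    page_info = {"page": page, "per_page": per_page} if has_page_info else None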

    meta_norm = {"next_token": next_token, "total": total, "page_info": page_info}
    return items, {**meta, **meta_norm}

Whether an API uses cursors or next-URL links, callers of extract_items_and_meta() check a single next_token field for more data. Page-number APIs are handled differently: their position is reported under page_info, and the helper does not compute the next page for you. The original provider-specific keys stay in meta too, so you still have everything if you need to inspect it. One pagination convention this helper does not cover: HTTP Link headers (used by GitHub's search endpoint and others). Those signals live outside the JSON body, so the helper can't see them and next_token stays None; if you hit a header-paginated API, you'll need to parse response.headers["Link"] separately and feed the next URL back in.
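For header-paginated APIs, a minimal sketch of that separate parsing step might look like the following. It uses only the standard library and assumes a Link header in the usual <url>; rel="name" form; parse_link_header and next_url_from_headers are hypothetical helper names, the sample URLs are made up, and entries whose rel holds several space-separated values are not split.

```python
import re
from typing import Dict, Mapping, Optional

# Matches one entry of the form: <URL>; rel="name"
_LINK_PART = re.compile(r'<([^>]+)>\s*;[^,]*?\brel="?([^",;]+)"?')

def parse_link_header(header: Optional[str]) -> Dict[str, str]:
    """Map each rel value (next, last, ...) to its URL. Hypothetical helper."""
    if not header:
        return {}
    return {rel: url for url, rel in _LINK_PART.findall(header)}

def next_url_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the rel="next" URL from a headers mapping, or None if absent."""
    return parse_link_header(headers.get("Link")).get("next")

# Made-up header value in the shape header-paginated APIs typically send
sample_headers = {
    "Link": '<https://api.example.com/items?page=2>; rel="next", '
            '<https://api.example.com/items?page=9>; rel="last"'
}

print(next_url_from_headers(sample_headers))  # https://api.example.com/items?page=2
print(next_url_from_headers({}))              # None
```

With a real response you could fill the gap after extraction, for example by setting meta["next_token"] to next_url_from_headers(response.headers) when the body supplied none. The requests library also exposes already-parsed links as response.links, which may be simpler if you are using it anyway.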

Step 3: a convenience helper for single items

Many API calls fetch exactly one resource -- one user, one repository, one order. Rather than normalising to a list and immediately accessing [0], this helper does both steps safely. Save it next to the others:

first_item.py
from typing import Any, Dict, List, Optional
from normalize_collection import normalize_collection

def first_item(
    api_response: Any,
    container_hints: Optional[List[str]] = None,
) -> Optional[Dict[str, Any]]:
    """Get the first item (or None) across response variants."""
    items = normalize_collection(api_response, container_hints)
    return items[0] if items else None

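This is useful for detail endpoints where you expect exactly one result, and for search endpoints when you only want the top hit. Check for None before using the result: an empty collection, or a response that is neither a list nor a dict, yields None rather than raising.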

Seeing it work against the two GitHub endpoints

The single-repository endpoint returns an unwrapped object with many fields; the search endpoint wraps results in an items array with metadata. The same access code should handle both. Save this driver as test_extract.py:

test_extract.py
import requests
from extract_items_and_meta import extract_items_and_meta
from first_item import first_item

# Fetch both GitHub response types
single_repo = requests.get(
    "https://api.github.com/repos/octocat/Hello-World",
    timeout=10,
).json()

search_results = requests.get(
    "https://api.github.com/search/repositories?q=python&per_page=2",
    timeout=10,
).json()

print("=== Testing Universal Access Patterns ===\n")

# Single repository endpoint
items1, meta1 = extract_items_and_meta(single_repo)
print("Single repo endpoint:")
print(f"  Items returned: {len(items1)}")
print(f"  Pagination token: {meta1.get('next_token')}")
print(f"  Total count: {meta1.get('total')}")
print(f"  First item name: {items1[0].get('name')}\n")

# Search endpoint
items2, meta2 = extract_items_and_meta(search_results)
print("Search endpoint:")
print(f"  Items returned: {len(items2)}")
print(f"  Pagination token: {meta2.get('next_token')}")
print(f"  Total count: {meta2.get('total'):,}")
print(f"  First item name: {items2[0].get('name')}\n")

# Convenience helper
first = first_item(single_repo)
print("Using first_item() helper:")
print(f"  Repository: {first.get('name')} by {first.get('owner', {}).get('login')}")

Run it from the project root:

Terminal
python test_extract.py

Representative output. GitHub's search totals and ranking change over time, so your total count and first result may differ:

Terminal
=== Testing Universal Access Patterns ===

Single repo endpoint:
  Items returned: 1
  Pagination token: None
  Total count: None
  First item name: Hello-World

Search endpoint:
  Items returned: 2
  Pagination token: None
  Total count: 8,937,004
  First item name: public-apis

Using first_item() helper:
  Repository: Hello-World by octocat

The same extraction code worked against both responses. The single-repository endpoint was normalized to a one-item list; the search endpoint's items array was extracted, and its total_count came through under meta['total']. Downstream code sees a consistent interface regardless of API structure.
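One thing that consistent interface makes possible is a pagination loop that never looks at provider-specific keys. The sketch below is an illustration under assumptions: collect_all, fake_fetch, and tiny_extract are hypothetical names, the pages are made up, and tiny_extract stands in for extract_items_and_meta() so the snippet runs without the network.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Extractor = Callable[[Any], Tuple[List[Any], Dict[str, Any]]]

def collect_all(
    fetch_page: Callable[[Optional[str]], Any],
    extract: Extractor,
    max_pages: int = 50,
) -> List[Any]:
    """Follow meta['next_token'] until it runs out or max_pages is reached."""
    collected: List[Any] = []
    token: Optional[str] = None
    for _ in range(max_pages):
        items, meta = extract(fetch_page(token))
        collected.extend(items)
        token = meta.get("next_token")
        if not token:
            break
    return collected

# Offline demo: made-up pages keyed by the cursor that requests them
FAKE_PAGES = {
    None: {"items": [1, 2], "cursor": "p2"},
    "p2": {"items": [3, 4], "cursor": "p3"},
    "p3": {"items": [5]},
}

def fake_fetch(token: Optional[str]) -> Any:
    """Pretend network call: return the parsed body for a given cursor."""
    return FAKE_PAGES[token]

def tiny_extract(body: Any) -> Tuple[List[Any], Dict[str, Any]]:
    """Stand-in for extract_items_and_meta(), covering only this demo's shape."""
    return body.get("items", []), {"next_token": body.get("cursor")}

print(collect_all(fake_fetch, tiny_extract))  # [1, 2, 3, 4, 5]
```

In a real script you would pass extract_items_and_meta as the extract argument and a function that performs the HTTP request as fetch_page. The max_pages cap is a safety net against an API that never stops returning a token.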

Step 4: safe field access utilities

With containers normalized, two field-level challenges remain: navigating nested paths safely, and dealing with APIs that use different field names for the same concept. Two small helpers cover both. Save them as safe_get.py:

safe_get.py
from typing import Any, Dict, List

def safe_get(obj: Any, path: str, default=None):
    """
    Dot-path lookup: 'owner.login' -> obj['owner']['login'] if present.
    Returns default if any part of the path doesn't exist.
    """
    cur = obj
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return default
        cur = cur[part]
    return cur

def try_fields(d: Dict[str, Any], names: List[str], default=None):
    """
    Return the first present/non-empty field from a list of candidates.
    Useful when different APIs use different names for the same concept.
    """
    for name in names:
        val = d.get(name)
        if val not in (None, ""):
            return val
    return default

safe_get() handles nested navigation across dict keys without crashes. try_fields() handles field-name variations -- trying id then order_id, or total then amount. This version walks dict keys only; the navigation page extends it to understand periods[0]-style array indices as first-class path segments, so a single dot-path can drill through mixed dict-and-array nesting.
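To see both helpers on concrete data, here is a small offline sketch. The two functions are repeated without their docstrings so the snippet runs by itself; the vendor payloads and their field names are made up for illustration.

```python
from typing import Any, Dict, List

def safe_get(obj: Any, path: str, default=None):
    cur = obj
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return default
        cur = cur[part]
    return cur

def try_fields(d: Dict[str, Any], names: List[str], default=None):
    for name in names:
        val = d.get(name)
        if val not in (None, ""):
            return val
    return default

# Two made-up vendor payloads that name the same concepts differently
vendor_a = {"id": 7, "total": 0, "customer": {"contact": {"email": "a@example.com"}}}
vendor_b = {"order_id": "A-9", "amount": 12.5, "customer": None}

rows = []
for order in (vendor_a, vendor_b):
    order_id = try_fields(order, ["id", "order_id"])
    amount = try_fields(order, ["total", "amount"], default=0.0)
    email = safe_get(order, "customer.contact.email", "unknown")
    rows.append((order_id, amount, email))
    print(order_id, amount, email)

# A key that exists with a None value is returned as-is, not replaced by the default
present_but_none = safe_get({"owner": {"login": None}}, "owner.login", "Unknown")
print(present_but_none)  # None
```

Two behaviours are worth noticing. try_fields() skips only None and empty strings, so vendor A's total of 0 is kept rather than falling through to amount. safe_get() applies its default only when a path segment is missing, so a field that is present but null comes back as None; add an `or "Unknown"` at the call site if you want a fallback in that case too.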

Build a project utilities file

Don't copy-paste these functions into every script you write. Create a dedicated api_helpers.py in your project directory and collect them there as you build them. By the end of the chapter you'll have safe_get(), try_fields(), normalize_collection(), extract_items_and_meta(), and more -- all in one place. Whenever you need them:

main.py
from api_helpers import safe_get, extract_items_and_meta

# Use them anywhere in your main script
owner = safe_get(repo, "owner.login", "Unknown")

This is how professional developers actually work. Nobody memorises these patterns; they build a personal toolkit and reuse it. The api_helpers.py you leave this chapter with is infrastructure you carry into every future project.

From here on, treat the per-file "save as" labels on each code block (normalize_collection.py, safe_get.py, and so on) as a convenience for running one example in isolation -- the canonical home for all of these functions is the api_helpers.py you just started. When the later pages say "replace safe_get.py with this extended version" or "save field_policies.py," feel free to paste the new content into the appropriate section of api_helpers.py instead. If you do, drop the cross-file imports (such as from normalize_collection import COMMON_COLLECTION_KEYS), since everything then lives in the same module. The standalone filenames are about teaching one idea per file; the toolkit is one file.

Putting it together: a cross-endpoint repository extractor

One more example that combines every utility on this page: a unified extractor that works against both GitHub endpoints. Save as extract_repo_info.py:

extract_repo_info.py
import requests
from first_item import first_item
from extract_items_and_meta import extract_items_and_meta
from safe_get import safe_get

def extract_repo_info(api_response):
    """
    Return (repo_dict, meta) with consistent shape.
    Works with single-object responses or wrapped collections.
    """
    # First repository from any response shape
    repo = first_item(api_response)
    if not isinstance(repo, dict):
        return None, {}

    # Fields via safe navigation
    info = {
        "name": repo.get("name", "Unknown"),
        "owner": safe_get(repo, "owner.login", "Unknown"),
        "stars": repo.get("stargazers_count", 0),
        "description": repo.get("description") or "No description",
        "language": repo.get("language") or "Not specified",
        "url": repo.get("html_url", ""),
        "private": bool(repo.get("private", False)),
    }

    # Preserve pagination metadata
    _, meta = extract_items_and_meta(api_response)
    return info, meta

if __name__ == "__main__":
    print("=== Cross-API Repository Extraction ===\n")

    # Single repository endpoint
    single = requests.get(
        "https://api.github.com/repos/octocat/Hello-World",
        timeout=10,
    ).json()
    repo1, meta1 = extract_repo_info(single)
    print("Single endpoint:")
    print(f"  {repo1['name']} by {repo1['owner']}")
    print(f"  Stars: {repo1['stars']:,}")
    print(f"  Language: {repo1['language']}")
    print(f"  Pagination: {meta1.get('next_token')}\n")

    # Search endpoint
    search = requests.get(
        "https://api.github.com/search/repositories?q=python&per_page=1",
        timeout=10,
    ).json()
    repo2, meta2 = extract_repo_info(search)
    print("Search endpoint:")
    print(f"  {repo2['name']} by {repo2['owner']}")
    print(f"  Stars: {repo2['stars']:,}")
    print(f"  Language: {repo2['language']}")
    print(f"  Total results: {meta2.get('total'):,}")

Run it. Because the values come from GitHub's live API, the repository names, star counts, languages, and totals below are representative; yours will differ:

Terminal
python extract_repo_info.py

Terminal
=== Cross-API Repository Extraction ===

Single endpoint:
  Hello-World by octocat
  Stars: 3,126
  Language: Not specified
  Pagination: None

Search endpoint:
  public-apis by public-apis
  Stars: 294,142
  Language: Python
  Total results: 8,937,004

The same function handled both endpoints. It safely navigated nested fields like owner.login. It preserved pagination metadata. It filled sensible defaults for missing fields. Any caller of extract_repo_info() now receives a predictable dictionary regardless of which endpoint was hit.

Those are the wins you get from separating container normalisation, metadata extraction, and field access into independent utilities: downstream code stops caring which wrapper the API used, pagination info flows through instead of getting discarded, nested field navigation stops being a crash risk, and a new API with an unfamiliar wrapper works by adding one entry to container_hints.

Containers are normalized, metadata survives, top-level fields are reachable. The next challenge is what happens when the data you want lives five or six levels deep, behind a chain of dictionaries that may or may not be fully populated. That's where the safe-get helper earns its keep -- and where we extend it to handle array indices as first-class path segments.