4. Automating with JSON Schema

The manual pipeline from section 3 works and ships, but its roughly 60 lines of structural and content checks are near-identical if / try / return code that grows with every new field. JSON Schema replaces them with a declarative document that the jsonschema library reads and enforces. This page makes the swap; the section 3 business-rule validator stays intact for the hybrid approach in section 5.

What JSON Schema is

JSON Schema is a standard for describing the shape and constraints of JSON data. You write the rules once as a JSON document (required fields, types, numeric ranges, string patterns, nested structures), and a library reads it and validates data against it. The schema itself is declarative and human-readable; anyone on the team can scan it and know what a valid response looks like without hunting through Python source.

Here is the structural and content validation from section 3, expressed as a single JSON Schema:

weather_schema.json
{
  "type": "object",
  "required": ["current"],
  "properties": {
    "current": {
      "type": "object",
      "required": ["temperature_2m"],
      "properties": {
        "temperature_2m": {
          "type": "number",
          "minimum": -100,
          "maximum": 60
        },
        "relative_humidity_2m": {
          "type": "number",
          "minimum": 0,
          "maximum": 100
        },
        "wind_speed_10m": {
          "type": "number",
          "minimum": 0,
          "maximum": 200
        },
        "weather_code": {
          "type": "integer"
        },
        "apparent_temperature": {
          "type": "number"
        }
      }
    },
    "current_units": {
      "type": "object"
    }
  }
}

What the schema defines

Four things, covering both layer 1 and layer 2 of the three-layer pattern.

Structure: root is an object with a required current field, and current is itself an object with a required temperature_2m.

Types: every numeric field declares "type": "number" or "integer". The range keywords apply only to numeric values, so a non-numeric value is reported as a type error rather than a range error.

Constraints: temperature between -100 and 60, humidity between 0 and 100, wind speed between 0 and 200.

Required versus optional: temperature_2m is in the required array, the rest are optional and only validated if present.
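
The four points above can be checked directly against the library. The following is a minimal sketch, not part of the project code: it inlines a trimmed copy of the schema so it runs on its own, and it uses Draft7Validator and iter_errors from jsonschema to report which schema keyword each sample violates. The helper name failed_keywords and the sample payloads are hypothetical.

```python
from jsonschema import Draft7Validator

# Trimmed copy of weather_schema.json, inlined so the sketch is self-contained.
SCHEMA = {
    "type": "object",
    "required": ["current"],
    "properties": {
        "current": {
            "type": "object",
            "required": ["temperature_2m"],
            "properties": {
                "temperature_2m": {"type": "number", "minimum": -100, "maximum": 60},
                "relative_humidity_2m": {"type": "number", "minimum": 0, "maximum": 100},
            },
        }
    },
}


def failed_keywords(data):
    """Return the sorted schema keywords that `data` violates (empty list if valid)."""
    validator = Draft7Validator(SCHEMA)
    return sorted(error.validator for error in validator.iter_errors(data))


# Hypothetical payloads, one per point in the list above.
samples = {
    "optional field absent": {"current": {"temperature_2m": 22.5}},
    "required field missing": {"current": {"relative_humidity_2m": 65}},
    "wrong type": {"current": {"temperature_2m": "warm"}},
    "out of range": {"current": {"temperature_2m": 22.5, "relative_humidity_2m": 130}},
}

for label, data in samples.items():
    print(f"{label}: {failed_keywords(data) or 'valid'}")
```

The "wrong type" sample reports only the type keyword: the string never reaches the minimum and maximum checks, because those keywords ignore non-numeric values.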

Two small differences from the manual version are intentional. The schema treats missing current_units as valid rather than printing a warning, because that warning was best-effort logging rather than validation. It also checks that weather_code is an integer but does not enumerate every known code; unknown icon codes are an enhancement-tier concern handled gracefully at the application boundary in section 6.

The schema is roughly 35 lines of JSON. The equivalent hand-written validators from section 3 are closer to 60 lines of Python. The schema also doubles as documentation: onboarding a new developer to the Weather Dashboard is a single file read rather than a tour of three validator functions.

Running the schema in Python

The jsonschema library reads a schema dictionary and validates instances against it. Install it with pip install jsonschema, then the validator is short:

schema_validator.py
import json
from jsonschema import validate, ValidationError


def load_schema(path="weather_schema.json"):
    with open(path) as f:
        return json.load(f)


def validate_weather_with_schema(data, schema):
    """Validate weather data against a JSON Schema.
    Returns (is_valid, error_message) to match the manual pipeline."""
    try:
        validate(instance=data, schema=schema)
        return True, None
    except ValidationError as e:
        return False, e.message


if __name__ == "__main__":
    schema = load_schema()

    good_data = {
        "current": {
            "temperature_2m": 22.5,
            "relative_humidity_2m": 65,
            "wind_speed_10m": 12.3,
        }
    }
    bad_data = {"current": {"temperature_2m": 150}}

    print(validate_weather_with_schema(good_data, schema))
    print(validate_weather_with_schema(bad_data, schema))

Terminal
$ python schema_validator.py
(True, None)
(False, '150 is greater than the maximum of 60')

The function returns (is_valid, error_message) deliberately: it matches the signature of the manual validators from section 3, so the hybrid approach in section 5 can swap one for the other without rewriting the pipeline. The error message comes straight from the library: "150 is greater than the maximum of 60" is clear and specific, and it names the violated constraint. The schema is also library-agnostic: if you ever need to swap jsonschema for fastjsonschema or an OpenAPI-backed validator, the schema document travels with you.
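
To make the shared-signature point concrete, here is a minimal sketch of a pipeline runner that only cares about the (is_valid, error_message) contract. The names run_validators, check_structure, and check_temperature_range are hypothetical stand-ins written for this illustration; they are not the actual section 3 functions.

```python
def run_validators(data, validators):
    """Run each validator in order and return the first failure, if any.

    Every validator must accept `data` and return (is_valid, error_message).
    """
    for validator in validators:
        is_valid, error_message = validator(data)
        if not is_valid:
            return False, error_message
    return True, None


# Hypothetical stand-ins for two hand-written checks in the style of section 3.
def check_structure(data):
    if not isinstance(data, dict) or not isinstance(data.get("current"), dict):
        return False, "missing or malformed 'current' object"
    if "temperature_2m" not in data["current"]:
        return False, "missing 'temperature_2m'"
    return True, None


def check_temperature_range(data):
    temperature = data["current"]["temperature_2m"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        return False, "temperature_2m is not a number"
    if not -100 <= temperature <= 60:
        return False, f"temperature_2m out of range: {temperature}"
    return True, None


manual_steps = [check_structure, check_temperature_range]

print(run_validators({"current": {"temperature_2m": 22.5}}, manual_steps))
print(run_validators({"current": {"temperature_2m": 150}}, manual_steps))
```

Because the runner depends only on the return shape, the two manual steps could be replaced by a single entry such as lambda data: validate_weather_with_schema(data, schema) without touching run_validators.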

Where schemas shine and where they struggle

JSON Schema is excellent at mechanical validation and weak at domain logic. That is not a flaw; it is the scope of the standard. Knowing where that line falls tells you where to stop and write the rest in Python.

Validation type          Schema capability    Example
Type checking            Excellent            temperature must be a number
Range validation         Excellent            humidity between 0 and 100
Required fields          Excellent            temperature_2m is required
String patterns          Good                 email must match a regex
Cross-field logic        Limited / awkward    snow codes at warm temperatures
Complex business rules   Better in Python     apparent versus actual temperature

The mechanical rows (type, range, required, patterns) cover what JSON Schema is for. JSON Schema also has conditional keywords for some relationship rules, but once a rule needs domain judgement, arithmetic, or array-by-array comparison, Python is clearer. That split is the whole point of the hybrid approach in section 5: schemas automate the mechanical layers; hand-written code handles the domain.
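
For comparison, the two cross-field examples from the table take only a few lines in plain Python. This is a sketch under stated assumptions, not the section 3 validator: it assumes temperatures are in degrees Celsius, that weather_code follows the WMO code table (where 71, 73, 75, 77, 85, and 86 denote snow), and that the data has already passed the schema. The function name and both thresholds are hypothetical.

```python
# Assumption: weather_code follows the WMO code table; these values denote snow.
SNOW_CODES = {71, 73, 75, 77, 85, 86}

# Hypothetical thresholds chosen for illustration only (degrees Celsius).
WARM_THRESHOLD_C = 10.0
MAX_APPARENT_GAP_C = 20.0


def check_business_rules(data):
    """Cross-field checks that are awkward to express in JSON Schema.

    Returns (is_valid, error_message). Assumes `data` already passed the schema,
    so any field that is present has the right type.
    """
    current = data.get("current", {})
    temperature = current.get("temperature_2m")
    code = current.get("weather_code")
    apparent = current.get("apparent_temperature")

    if temperature is None:
        return True, None  # nothing to compare against

    # Rule 1: a snow code should not appear at clearly warm temperatures.
    if code in SNOW_CODES and temperature > WARM_THRESHOLD_C:
        return False, f"snow code {code} reported at {temperature} degrees C"

    # Rule 2: apparent temperature should stay within a plausible gap of actual.
    if apparent is not None and abs(apparent - temperature) > MAX_APPARENT_GAP_C:
        return False, (
            f"apparent temperature {apparent} differs from actual "
            f"{temperature} by more than {MAX_APPARENT_GAP_C} degrees"
        )

    return True, None


print(check_business_rules({"current": {"temperature_2m": 25, "weather_code": 73}}))
print(check_business_rules({"current": {"temperature_2m": -3, "weather_code": 73}}))
```

Both rules compare two fields and rely on a domain threshold, which is the kind of judgement that reads more clearly as code than as nested conditional keywords.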

The closing observation from section 3 was that manual validation is repetitive. JSON Schema removes the repetition and tightens type coverage on weather_code and apparent_temperature, which the manual content validator left to layer 3's try/except blocks. Validation still runs at the same boundary; only the implementation changes. Section 5 combines the schema you just wrote with the section 3 business-rule validator into one production-ready pipeline.