9. Chapter review

The Music Time Machine is now PostgreSQL-backed, locally tested under concurrent writes, deployed to Railway, and using raw_json JSONB plus full-text search for the queries that benefit from them. This page recaps the patterns and locks them in with a quiz.

The migration followed Chapter 24's workflow at application scale: audit, schema mapping, data move, application-code updates, verification, and deploy. Three patterns matter most:

  • Order migrations by dependency. Tracks move first because snapshots reference tracks.
  • Make scripts idempotent. ON CONFLICT lets a half-finished migration be rerun safely.
  • Prove the failure mode is gone. Concurrent-write tests verify the migration actually solved the file-lock problem.

Quiz

Select question to reveal the answer:
Why does SQLite produce lock errors with multiple Flask workers, and how does PostgreSQL solve this?

SQLite is embedded in the application process and coordinates writes through a single database file. Multiple deployed workers can collide when they try to write at the same time. PostgreSQL runs as a separate database server and coordinates concurrent clients with transactions, locks, and connection management. The app still has to commit and return connections correctly, but overlapping requests no longer fight over one local file.

What is JSONB, and why is it better than storing raw_json as SQLite text?

JSONB is PostgreSQL's binary JSON type. It validates JSON, stores it in a queryable representation, and supports indexes. SQLite text can preserve the provider payload, but to query inside it you usually load rows into Python and parse each string. PostgreSQL can answer questions such as raw_json->'album'->>'release_date' or raw_json @> '{"explicit": true}'::jsonb inside the database.

Why migrate tracks before snapshots?

Snapshots have a foreign key to tracks. If you insert snapshots first, PostgreSQL rejects rows whose track_id does not exist yet. Migrating tracks first respects the dependency graph and lets PostgreSQL enforce referential integrity during the move instead of after the fact.

Why use psycopg2's Json adapter when inserting raw_json?

The adapter serialises the Python object as JSON for PostgreSQL. That avoids the string-shaped-wrapper failure mode where a JSON string gets stored inside a JSONB value instead of an object. Verification catches this by checking jsonb_typeof(raw_json) = 'object' and by extracting nested fields.

What does ON CONFLICT do for migration safety?

It makes the migration idempotent. If tracks were already inserted before the script failed, rerunning the script updates or skips those rows instead of crashing on duplicate primary keys. The result of one run and several reruns is the same complete target database.

Why keep normal columns if raw_json has the full track object?

Normal columns are the stable application surface: track name, artist name, album name, duration, and URL. They make common queries simple and readable. raw_json preserves the complete provider payload for flexible future questions. The best schema uses both: normalise what you know you need, archive the rest.

What does EXPLAIN ANALYZE tell you?

It runs the query and shows PostgreSQL's execution plan: which indexes were used, how many rows were scanned, how long each step took, and the total execution time. Use it to confirm an index is actually helping rather than assuming that adding an index improved the route.

Looking ahead

Chapter 26 wraps the Music Time Machine in a REST API. The PostgreSQL foundation from this chapter makes that affordable: the connection pool handles concurrent API requests, full-text search powers lookup endpoints, and raw_json JSONB keeps provider-payload questions inside the database instead of pushing them into Python loops.