Chapter 19: Testing your application

1. Why testing matters

You've built a Flask app on top of SQLite: charts, playlists, OAuth, settings actions. Everything works when you click through it. This chapter covers the gap between "works when I demo it" and "I can prove it still works after the next change". The toolkit is pytest, in-memory SQLite fixtures, mocked Spotify calls, and Flask test client coverage of the routes Chapter 18 built.

Three chapters of code now sit on disk: Chapter 16's per-feature scripts, Chapter 17's Flask spine, Chapter 18's three feature pages. Every change you make from here on can break something. A new playlist source might silently regress the existing ones. A schema tweak might corrupt the catalogue-growth query. An OAuth refactor might break the disconnect flow's session-key cleanup. Without tests, you find out when a recruiter or hiring manager clicks something during a demo. With tests, you find out in the seconds before you commit.

This chapter is the one that turns a portfolio project into something you can put on a resume. "Tested with pytest, nearly 50 tests, runs in under five seconds, GitHub Actions on every push" is a different kind of artifact from an app that merely works. It signals that you understand maintainability, not just feature completion -- and that's what the next round of scrutiny (an interview, or code review at a new job) will pressure-test you on.

What you'll learn

Set up pytest with a clean project structure and shared fixtures in conftest.py
Apply the Arrange-Act-Assert pattern to pure functions like _build_playlist_query and normalise_track
Mock Spotify API calls with unittest.mock and pytest-mock so tests never depend on the network
Write integration tests against an in-memory SQLite database seeded with realistic snapshot fixtures
Use Flask's app.test_client() to exercise the eight routes Chapter 18 built, including AJAX, CSRF, and authenticated session manipulation
Freeze time with freezegun to make date-dependent tests deterministic
Measure coverage with pytest-cov and read reports without obsessing over the percentage
Run the suite on every push with GitHub Actions

What you'll build

tests/conftest.py — shared fixtures: in_memory_db, seeded_db, app_client, authenticated_client
tests/test_helpers.py — unit tests for _format_month, _build_playlist_query, normalise_track, retry_with_backoff
tests/test_database.py — integration tests for calculate_taste_stats, get_taste_chart_data, find_forgotten_gems
tests/test_spotify_client.py — mocked tests for SpotifyClient's OAuth methods
tests/test_routes.py — Flask test client coverage for /analytics, /playlists, /api/generate-playlist, and the five /settings/* routes
tests/test_auth.py — @require_auth decorator and OAuth state CSRF defence
tests/test_features.py — Chapter 16's monthly-snapshot writes (creation and within-day idempotency) with frozen time
pytest.ini + .gitignore + .github/workflows/tests.yml — configuration + CI

Carry-forward from earlier chapters

The application code stays exactly where Chapters 16, 17, and 18 left it. You're not refactoring; you're adding a tests/ folder alongside the existing files. Every module you've built becomes a test target:

Chapter 16 already showed you the basics. Its testing page introduced in-memory SQLite fixtures (:memory:) and unittest.mock.MagicMock / patch for the Spotipy client. This chapter grows that into a full pytest suite with shared fixtures, parameterised tests, and CI integration.
Chapter 17's helpers are unit-test gold. _format_month, get_db_connection, calculate_taste_stats, refresh_token_if_needed, the require_auth decorator -- all pure-or-near-pure, all easy to test against in-memory fixtures.
Chapter 18 hands you a route inventory. Eight Flask routes (Analytics, Playlist Manager form + AJAX, five Settings actions) with explicit edge cases: silent-coerce on invalid ?range=, playlist-source whitelist validation, two-step confirmation on destructive actions, X-CSRFToken header for AJAX. Chapter 18's review page commits this chapter to pressure-testing all of these.
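As a reminder of what the Chapter 16 mocking approach looks like, here is a minimal sketch using unittest.mock.MagicMock. The helper top_track_names is hypothetical and exists only for this illustration; the payload shape ("items" holding dicts with a "name" key) is an assumption about what the client returns, not code taken from the app.

```python
from unittest.mock import MagicMock


def top_track_names(client, limit=5):
    """Hypothetical helper: return the names of the user's top tracks."""
    response = client.current_user_top_tracks(limit=limit)
    return [item["name"] for item in response["items"]]


def test_top_track_names_uses_mocked_client():
    # Arrange: a stand-in client that returns a canned payload (assumed shape).
    fake_client = MagicMock()
    fake_client.current_user_top_tracks.return_value = {
        "items": [{"name": "Track A"}, {"name": "Track B"}]
    }

    # Act: call the helper exactly as production code would.
    names = top_track_names(fake_client, limit=2)

    # Assert: check both the result and the outgoing call.
    assert names == ["Track A", "Track B"]
    fake_client.current_user_top_tracks.assert_called_once_with(limit=2)
```

Because the client is passed in as an argument, the test never touches the network; the mock records the call so the test can also verify how the client was used.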

The testing pyramid

Not every test costs the same. The testing pyramid is the standard heuristic for where to spend effort: lots of fast unit tests at the base, fewer integration tests in the middle, and very few end-to-end tests at the top. Each layer catches different problems with different costs.

Unit tests run in milliseconds. They verify pure functions in isolation -- no database, no network, no Flask app. The Music Time Machine has plenty of these: _format_month, _build_playlist_query, normalise_track, retry_with_backoff. When a unit test fails, the cause is usually obvious from the test name.
Integration tests run in tens of milliseconds. They verify components working together against real (but ephemeral) dependencies: in-memory SQLite for database queries, Flask's test client for routes. Slower than unit tests, but they catch SQL mistakes, schema assumptions, and request/response bugs that unit tests can't.
End-to-end tests run in seconds and break easily. They simulate full user workflows across OAuth, Spotify, and the database. Powerful but expensive: real API access, real network, brittle. This chapter keeps them minimal -- a quick manual smoke test for "does the whole app still work?" rather than automated coverage.
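The difference between the two lower layers can be sketched in a few lines. The example below is illustrative only: format_month and count_tracks_per_month are hypothetical stand-ins (the real _format_month and the app's schema may differ), and the tracks table with an added_at text column is an assumption made for this sketch.

```python
import sqlite3

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_month(year_month):
    """Hypothetical pure helper: '2024-03' becomes 'March 2024'."""
    year, month = year_month.split("-")
    return f"{MONTHS[int(month) - 1]} {year}"


def count_tracks_per_month(conn):
    """Hypothetical query helper: (YYYY-MM, count) pairs, oldest first."""
    return conn.execute(
        "SELECT substr(added_at, 1, 7) AS ym, COUNT(*) "
        "FROM tracks GROUP BY ym ORDER BY ym"
    ).fetchall()


def test_format_month_unit():
    # Unit layer: no database, no network, just input and output.
    assert format_month("2024-03") == "March 2024"


def test_count_tracks_per_month_integration():
    # Integration layer: a real SQLite engine, but an ephemeral database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tracks (id INTEGER PRIMARY KEY, name TEXT, added_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO tracks (name, added_at) VALUES (?, ?)",
        [
            ("Track A", "2024-01-05"),
            ("Track B", "2024-01-20"),
            ("Track C", "2024-02-02"),
        ],
    )

    result = count_tracks_per_month(conn)
    conn.close()

    assert result == [("2024-01", 2), ("2024-02", 1)]
```

The unit test can only fail if the formatting logic is wrong. The integration test can also fail on a SQL typo or a schema mismatch, which is exactly the class of bug the middle layer exists to catch.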

The chapter focuses on the highest-return layers: unit and integration. Six of the eight sub-pages live there. The flask-routes sub-page is the integration-test centerpiece because Chapter 18's outbound commitment lands there. End-to-end stays out of scope.

How the chapter unfolds

You'll start by setting up pytest with a project structure that mirrors what Chapters 16-18 actually shipped (single-file app.py plus per-feature Chapter 16 scripts). Next come unit tests on the pure-function helpers, then mocking the Spotify boundary so tests don't need network access, then in-memory SQLite for database queries, then Flask's test client for the routes Chapter 18 built, and finally time-dependent logic and coverage measurement. The review page locks in what carries forward to Chapter 20's deployment.

By the end, you'll have a suite that runs on every push, catches regressions before they reach production, and gives you a real answer to "how do you know your code works?"