2. PostgreSQL schema mapping

Schema mapping is where most migrations earn their bugs. Type mismatches, missing constraints, and foreign-key ordering all need explicit attention before any data moves. Here, the important move is small but powerful: tracks.raw_json changes from SQLite TEXT to PostgreSQL JSONB.

Mapping tracks

The tracks table stores the stable entity data. Common fields stay as normal columns because the app reads them constantly. The complete Spotify track object lives in raw_json so future questions can reach into fields the original schema did not promote.

SQLite column	PostgreSQL column	Why
`track_id TEXT`	`track_id VARCHAR(64) PRIMARY KEY`	Spotify IDs are short strings; the primary key stays stable.
`name TEXT`	`name TEXT NOT NULL`	Track names can be long; no artificial length cap needed.
`artist_name TEXT`	`artist_name TEXT NOT NULL`	Denormalised for the single-user project and simple dashboard queries.
`album_name TEXT`	`album_name TEXT`	Useful display field, but not required for every imported row.
`duration_ms INTEGER`	`duration_ms INTEGER`	Same numeric type is sufficient.
`spotify_url TEXT`	`spotify_url TEXT`	External URLs vary in length.
`raw_json TEXT`	`raw_json JSONB`	The cached provider payload becomes queryable and indexable.

Mapping snapshots

Snapshots are events: a track appeared in a user's top tracks on a date, for a given Spotify time range, at a given rank. The composite primary key is the important contract because it keeps reruns idempotent.

SQLite column	PostgreSQL column	Why
`track_id TEXT`	`track_id VARCHAR(64) REFERENCES tracks(track_id)`	Every snapshot must point at an existing track.
`snapshot_date TEXT`	`snapshot_date DATE`	PostgreSQL can compare and index dates natively.
`time_range TEXT`	`time_range VARCHAR(32)`	Values such as `short_term`, `medium_term`, and `long_term`.
`rank INTEGER`	`rank INTEGER CHECK (rank > 0)`	Ranks start at 1; the database can enforce that.

The target schema

schema_postgres.sql

CREATE TABLE IF NOT EXISTS tracks (
    track_id VARCHAR(64) PRIMARY KEY,
    name TEXT NOT NULL,
    artist_name TEXT NOT NULL,
    album_name TEXT,
    duration_ms INTEGER,
    spotify_url TEXT,
    raw_json JSONB,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS snapshots (
    track_id VARCHAR(64) NOT NULL REFERENCES tracks(track_id) ON DELETE CASCADE,
    snapshot_date DATE NOT NULL,
    time_range VARCHAR(32) NOT NULL,
    rank INTEGER NOT NULL CHECK (rank > 0),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (track_id, snapshot_date, time_range)
);

CREATE INDEX IF NOT EXISTS idx_snapshots_date_range
    ON snapshots (snapshot_date, time_range);

CREATE INDEX IF NOT EXISTS idx_snapshots_track_date
    ON snapshots (track_id, snapshot_date);

CREATE INDEX IF NOT EXISTS idx_tracks_artist_name
    ON tracks (artist_name);

CREATE INDEX IF NOT EXISTS idx_tracks_raw_json
    ON tracks USING GIN (raw_json);

CREATE INDEX IF NOT EXISTS idx_tracks_release_year
    ON tracks ((LEFT(raw_json->'album'->>'release_date', 4)::integer));

Two JSONB indexes do different jobs. The GIN index helps containment queries such as raw_json @> '{"explicit": true}'::jsonb. The expression index helps numeric filters or sorts on a specific extracted field, such as LEFT(raw_json->'album'->>'release_date', 4)::integer.

Why raw_json stays alongside normal columns

Do not replace the normal columns with JSONB lookups everywhere. The normal columns are the app's stable surface: names, artist names, album names, duration, and URLs. raw_json is the flexible archive. Keeping both gives you fast common queries and room for future analysis without schema churn.

Next, in section 3, we create the local and Railway PostgreSQL databases and apply this schema.