7. Testing OAuth Flows

OAuth is the kind of code you write once, ship, and never touch again. Which is exactly why it's the code most worth testing. Six months later, a dependency bump changes a default in requests, or you refactor your session handling, or you migrate to a new OAuth provider. Something subtle breaks. Your login flow starts failing silently for 2% of users. You don't find out until someone tweets about it.

A good OAuth test suite pins down three things: your code sends the right payload to the token endpoint, your CSRF protection actually rejects mismatched state values, and your error handling doesn't leave partial state behind when the upstream call fails. All three are testable end-to-end with responses and the Flask test client, without running a real OAuth flow even once.

Add the callback handler

We'll test a standard authorization-code flow with CSRF protection. The /oauth/callback endpoint receives a code and state from GitHub, verifies the state matches what we stored at the start of the flow, exchanges the code for an access token, and stores the token in the session. The handler assumes app.py already imports requests and Flask's jsonify, redirect, request, session, and url_for, and that a dashboard view exists. Add this to your existing app.py:

app.py (OAuth additions)
@app.route("/oauth/callback")
def oauth_callback():
    code = request.args.get("code")
    state = request.args.get("state")

    expected_state = session.pop("oauth_state", None)
    if not state or state != expected_state:
        return jsonify({"error": "invalid state"}), 400

    response = requests.post(
        "https://github.com/login/oauth/access_token",
        data={
            "client_id": app.config["GITHUB_CLIENT_ID"],
            "client_secret": app.config["GITHUB_CLIENT_SECRET"],
            "code": code,
        },
        headers={"Accept": "application/json"},
        timeout=10,
    )

    if response.status_code != 200:
        return jsonify({"error": "token exchange failed"}), 502

    token = response.json().get("access_token")
    if not token:
        return jsonify({"error": "no token in response"}), 502

    session["github_token"] = token
    return redirect(url_for("dashboard"))

Four outcomes are possible: the state is missing or tampered with, GitHub returns a non-200, GitHub returns 200 but without a token, or everything works. We'll write a test for each of the three shapes that matter most.
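The callback relies on an earlier step having stored oauth_state in the session, and that step isn't shown in this chapter. As a rough sketch of what it might look like, the helpers below generate a state value, build the GitHub authorize URL, and mirror the callback's comparison. The function names are hypothetical, the URL omits optional parameters such as scope and redirect_uri, and the constant-time comparison with hmac.compare_digest is a substitution for the handler's plain != check, not something the handler above does.

```python
import hmac
import secrets
from urllib.parse import urlencode

# GitHub's authorization endpoint for OAuth apps.
AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def new_state() -> str:
    """Return an unguessable, URL-safe value to keep as session["oauth_state"]."""
    return secrets.token_urlsafe(32)


def build_authorize_url(client_id: str, state: str) -> str:
    """Build the URL the browser is redirected to at the start of the flow."""
    query = urlencode({"client_id": client_id, "state": state})
    return f"{AUTHORIZE_URL}?{query}"


def state_matches(received, expected) -> bool:
    """Mirror the callback's rule: both values present and identical."""
    if not received or not expected:
        return False
    # Encode to bytes so non-ASCII input cannot make compare_digest raise.
    return hmac.compare_digest(received.encode(), expected.encode())
```

In a Flask view you would store new_state() in session["oauth_state"] and redirect to build_authorize_url(...); the callback then pops and compares it, which is exactly what the tests below exercise.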

Before we can run these tests, the OAuth client config needs to be available during test runs. Update the client fixture in tests/conftest.py so the config block includes the GitHub credentials:

tests/conftest.py (config update)
flask_app.config.update(
    TESTING=True,
    SECRET_KEY="test-secret-not-for-production",
    WEATHER_API_KEY="test_key",
    GITHUB_CLIENT_ID="test_client_id",
    GITHUB_CLIENT_SECRET="test_client_secret",
)

Test 1: the happy path

The successful exchange. Mock GitHub's token endpoint, simulate a callback with matching state, verify the token ends up in the session and the user is redirected to the dashboard. Save this as tests/test_oauth.py:

tests/test_oauth.py
import responses

@responses.activate
def test_oauth_callback_exchanges_code_for_token(client):
    # Simulate the start of the OAuth flow: we stored a state token
    with client.session_transaction() as sess:
        sess["oauth_state"] = "matching_state_value"

    # GitHub will return a valid token when called
    responses.add(
        responses.POST,
        "https://github.com/login/oauth/access_token",
        json={"access_token": "gho_fake_token", "token_type": "bearer"},
        status=200,
    )

    # GitHub redirects the user back with code and matching state
    response = client.get(
        "/oauth/callback?code=abc123&state=matching_state_value"
    )

    # We were redirected to the dashboard
    assert response.status_code == 302
    assert "/dashboard" in response.headers["Location"]

    # The token is now in the session
    with client.session_transaction() as sess:
        assert sess["github_token"] == "gho_fake_token"
        assert "oauth_state" not in sess  # state was consumed

    # Our code sent the correct payload to GitHub
    assert len(responses.calls) == 1
    sent_body = responses.calls[0].request.body
    if isinstance(sent_body, bytes):  # requests form-encodes a dict to str
        sent_body = sent_body.decode()
    assert "code=abc123" in sent_body
    assert "client_id=test_client_id" in sent_body
    assert "client_secret=test_client_secret" in sent_body
    assert responses.calls[0].request.headers["Accept"] == "application/json"

Run it from the project root:

Terminal
$ pytest tests/test_oauth.py
tests/test_oauth.py .                                                   [100%]

============================== 1 passed in 0.06s ==============================

This test does a lot in one go, so it's worth walking through what it locks down. The redirect works, the redirect target is right, the token is stored, and the state was consumed from the session. (A common bug is forgetting to pop the state value, leaving stale state that breaks the next login attempt.) But the strongest assertions are in the final block, on responses.calls. Exactly one request went out, and it carried the right payload: the right code, the right client ID and secret, and the right Accept header.

Without that last block, the test would pass even if someone refactored the code to move client_secret out of the form body and into the URL, or dropped the Accept header and started getting URL-encoded responses back in production. (A swapped endpoint is already covered, because responses refuses any URL you haven't registered.) With it, the test pins down the wire contract. That's a test earning its keep.
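One caveat with substring checks on the body: "code=abc123" is also a substring of "code=abc1234", and they say nothing about unexpected extra fields. If you want the assertion to be exact, one option is to parse the form body and compare the whole dictionary. The helper name below is hypothetical, and the literal body stands in for responses.calls[0].request.body.

```python
from urllib.parse import parse_qs


def parse_form_body(body):
    """Decode a form-encoded request body into a dict of value lists.

    Accepts str or bytes, since the body's type depends on how the
    request was built.
    """
    if isinstance(body, bytes):
        body = body.decode()
    # keep_blank_values so an empty field shows up instead of vanishing
    return parse_qs(body, keep_blank_values=True)


# Stand-in for responses.calls[0].request.body in the happy-path test.
example_body = (
    "client_id=test_client_id&client_secret=test_client_secret&code=abc123"
)

# Exact comparison: wrong values, missing fields, and extra fields all fail.
assert parse_form_body(example_body) == {
    "client_id": ["test_client_id"],
    "client_secret": ["test_client_secret"],
    "code": ["abc123"],
}
```

parse_qs returns a list per key because form fields can repeat, which is why each expected value is wrapped in a list.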

Test 2: the CSRF check

OAuth's state parameter exists for one reason: to stop an attacker tricking a logged-in user's browser into completing an OAuth flow they didn't start. If your state check is broken, your application is vulnerable. This test proves it isn't. Add it to the bottom of tests/test_oauth.py:

tests/test_oauth.py (continued)
@responses.activate
def test_oauth_callback_rejects_mismatched_state(client):
    # We stored one state value
    with client.session_transaction() as sess:
        sess["oauth_state"] = "our_legitimate_state"

    # An attacker triggers the callback with a different state
    response = client.get(
        "/oauth/callback?code=abc123&state=attacker_controlled_state"
    )

    # Request is rejected
    assert response.status_code == 400
    assert response.get_json() == {"error": "invalid state"}

    # No token was fetched
    assert len(responses.calls) == 0

    # No token was stored
    with client.session_transaction() as sess:
        assert "github_token" not in sess

Three things this test locks down. The status code is 400, not 500 or a redirect to an error page you haven't built. No HTTP call was made to GitHub (the state check happens first, as it should). And the session remains clean, with no token stored.

Notice that @responses.activate is still applied even though no request should be made. That's deliberate. If a refactor accidentally moves the state check to after the token request, responses catches it (the unregistered endpoint raises ConnectionError, not a silent 200). Belt and braces.

Test 3: upstream failure

GitHub goes down. Or rate-limits you. Or responds with an error because your client ID was revoked. Your code needs to degrade gracefully and, critically, not leave your user's session in a half-authenticated state. Add this test to the bottom of tests/test_oauth.py:

tests/test_oauth.py (continued)
@responses.activate
def test_oauth_callback_handles_github_failure(client):
    with client.session_transaction() as sess:
        sess["oauth_state"] = "matching_state"

    responses.add(
        responses.POST,
        "https://github.com/login/oauth/access_token",
        status=500,
    )

    response = client.get("/oauth/callback?code=abc123&state=matching_state")

    # We returned a 502 Bad Gateway to the user
    assert response.status_code == 502
    assert response.get_json() == {"error": "token exchange failed"}

    # No partial state in the session
    with client.session_transaction() as sess:
        assert "github_token" not in sess

Run the whole file and you should now have three passing tests:

Terminal
$ pytest tests/test_oauth.py
tests/test_oauth.py ...                                                 [100%]

============================== 3 passed in 0.06s ==============================

The assertion on missing github_token is the critical one. If you ever see a test pass where the token is present after an upstream failure, stop and find out why. It means your code stored something useless, and the next authenticated request is going to fail in a much more confusing place.

What you haven't tested yet

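Three tests cover most of the value, but a serious production suite would add a few more: GitHub returning 200 with no access_token key (edge case, but it happens during permission changes), network timeouts during the POST, and a callback with code missing entirely. Each is one more responses.add() call and a handful of assertions, though the timeout and missing-code cases will probably push a change into the handler too: as written, it neither catches exceptions raised by requests.post nor checks that code is present before calling GitHub. The scaffolding you've already built makes adding the tests near-free.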

That's the compounding payoff of good test infrastructure. The first test costs an hour. The tenth test costs five minutes.

Next, we'll step back from individual test files and look at how the whole suite composes: fixtures that layer, shared conftest patterns, running in CI, and keeping the whole thing fast.