6. Working with binary data

Not every API returns JSON. Plenty return images, PDFs, audio clips, or zip archives, and binary data breaks if you reach for it the same way you'd reach for text. This section covers the two small rules that keep binary responses intact.

The first rule is to use the right accessor. JSON and HTML live on response.text, but anything binary lives on response.content. To see why it matters, save this at the project root as text_vs_binary.py:

text_vs_binary.py
import requests

# Get JSON (text data)
json_response = requests.get("https://httpbin.org/json", timeout=5)
print("JSON Response:")
print(f"  Type: {type(json_response.text)}")  # str
print(f"  First 50 chars: {json_response.text[:50]!r}")
print("  Can read as text: Yes")

print("\n" + "=" * 50 + "\n")

# Get image (binary data)
image_response = requests.get("https://httpbin.org/image/png", timeout=5)
print("Image Response:")
print(f"  Type: {type(image_response.content)}")  # bytes
print(f"  First 10 bytes: {image_response.content[:10]}")
print("  Can read as text: No (will look like gibberish)")

Run it:

Terminal
JSON Response:
  Type: <class 'str'>
  First 50 chars: '{\n  "slideshow": {\n    "author": "Yours Truly", \n'
  Can read as text: Yes

==================================================

Image Response:
  Type: <class 'bytes'>
  First 10 bytes: b'\x89PNG\r\n\x1a\n\x00\x00'
  Can read as text: No (will look like gibberish)

Text is human-readable strings; binary is raw bytes. Reach for .text or .json() on one, .content on the other. Mix them up and you'll get encoding errors or corrupted files.

The second rule: use "wb" when writing

Saving binary data to disk requires opening the file in binary-write mode, "wb". Text mode ("w") will not accept bytes at all: the naive f.write(response.content) raises TypeError: write() argument must be str, not bytes. To get bytes into a text-mode file you have to decode them to a string first, and that decode is what destroys the data here. Save this as save_image_wrong.py first:

save_image_wrong.py
import requests

response = requests.get("https://httpbin.org/image/png", timeout=5)

# DON'T DO THIS - regular write mode corrupts binary data
with open("image.png", "w", encoding="utf-8") as f:  # ❌ Text mode
    f.write(response.content.decode("utf-8", errors="replace"))

# Result: Corrupted file that won't open
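
To see why the decode step in save_image_wrong.py is lossy, here is a small offline sketch that needs no network. It assumes the file starts with the standard 8-byte PNG signature (the same bytes printed by text_vs_binary.py) and pushes them through the same decode, then back to bytes, roughly as a UTF-8 text-mode file does on write. The variable names are illustrative only.

```python
# The first 8 bytes of a PNG file (the PNG signature), matching the start of
# the "First 10 bytes" output shown earlier in this section.
png_signature = b"\x89PNG\r\n\x1a\n"

# Decode the way save_image_wrong.py does: any byte that is not valid UTF-8
# is replaced with U+FFFD, the Unicode replacement character.
as_text = png_signature.decode("utf-8", errors="replace")

# Encode back to bytes, which is roughly what a text-mode file opened with
# encoding="utf-8" does when the string is written out.
round_tripped = as_text.encode("utf-8")

print(png_signature)                   # b'\x89PNG\r\n\x1a\n'
print(round_tripped)                   # b'\xef\xbf\xbdPNG\r\n\x1a\n'
print(round_tripped == png_signature)  # False
```

The single byte 0x89 comes back as three different bytes, so the file no longer starts with a valid PNG signature. On Windows, text mode may also rewrite the newline bytes, which this sketch does not model.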

The fix is small: open the file with "wb" instead of "w" and write response.content directly, with no decode step. Save this as save_image.py:

save_image.py
import requests

response = requests.get("https://httpbin.org/image/png", timeout=5)

# DO THIS - binary write mode preserves exact bytes
with open("image.png", "wb") as f:  # ✅ Binary mode
    f.write(response.content)

# Result: Perfect image file
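
One quick way to sanity-check a saved PNG is to compare its first bytes against the PNG signature. The helper below is a hypothetical sketch, not part of requests; looks_like_png and PNG_SIGNATURE are names made up for this example, and it only checks the signature, not the rest of the file.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8 bytes that open a PNG file


def looks_like_png(data: bytes) -> bool:
    """Return True if data starts with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


# The first 10 bytes printed by text_vs_binary.py earlier in this section.
good = b"\x89PNG\r\n\x1a\n\x00\x00"

# The same bytes after a lossy decode and re-encode, as in save_image_wrong.py.
bad = good.decode("utf-8", errors="replace").encode("utf-8")

print(looks_like_png(good))  # True
print(looks_like_png(bad))   # False
```

After running save_image.py, the same check can be applied to the file on disk by opening image.png in "rb" mode, reading its first 8 bytes, and passing them to looks_like_png.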

The four file modes that matter here fit on a small reference card, so here it is for quick recall:

  • "w", text write, accepts only str, encodes it to bytes, and may translate "\n" to the platform's line ending.
  • "wb", binary write, writes exact bytes with no interpretation.
  • "r", text read, decodes bytes to str and fails if they are not valid in the file's encoding.
  • "rb", binary read, reads raw bytes as-is.
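
The four modes above can be exercised without any network access. The sketch below assumes the 8-byte PNG signature as stand-in data and writes to a temporary directory; the file names are hypothetical. It records which operations succeed and which raise.

```python
import os
import tempfile

data = b"\x89PNG\r\n\x1a\n"  # PNG signature, standing in for a real download

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")  # hypothetical file name

    # "wb" then "rb": the bytes come back exactly as written.
    with open(path, "wb") as f:
        f.write(data)
    with open(path, "rb") as f:
        binary_round_trip = f.read()

    # "w": a text-mode file refuses bytes outright.
    try:
        with open(path + ".txt", "w", encoding="utf-8") as f:
            f.write(data)
        text_write_error = None
    except TypeError as exc:
        text_write_error = type(exc).__name__

    # "r": text mode tries to decode the bytes and fails on 0x89,
    # which is not a valid first byte in UTF-8.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        text_read_error = None
    except UnicodeDecodeError as exc:
        text_read_error = type(exc).__name__

print(binary_round_trip == data)  # True
print(text_write_error)           # TypeError
print(text_read_error)            # UnicodeDecodeError
```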

The rule of thumb: binary mode (wb, rb) for images, PDFs, videos, and any non-text file. Text mode (w, r) only for actual text like CSVs or HTML. Stick to that and you'll avoid the corrupted downloads shown above. The next section brings everything in the chapter together into one real project.