Serialization and Deserialization: The Art of Flat-Packing Data

How backend systems turn meaning into bytes


I'm a passionate backend dev

In your application code, data is "alive."

If you have a User object, it’s a complex structure sitting in your computer's RAM. It has connections to other objects, specific memory addresses, and "methods" (things it can do).

But when you want to send that user over the internet or save it to a disk, you hit a wall. You can't send a memory address over a wire. You have to turn that "living" object into a flat, simple string of data.

The Wardrobe Analogy

Think of your data like a fully assembled IKEA Wardrobe standing in your bedroom.

  • Serialization is taking that wardrobe apart and laying the pieces flat in a box. Now it’s "flat-packed." It’s not a wardrobe anymore—it’s just a collection of wood and screws that are easy to ship.

  • Deserialization is what happens when the box arrives at the destination. The receiver follows the instructions to put the pieces back together into a standing wardrobe.

Serialization is not about formats. It is about representation.

It answers one simple question: How do we turn in-memory meaning into something that can travel?

Deserialization answers the reverse: How do we turn traveling bytes back into meaning?
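To make the round trip concrete, here is a minimal sketch in Python using the standard json module. The User class and its fields are made up for illustration, and JSON is just one possible representation.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class User:
    # Hypothetical fields, chosen only for illustration.
    id: int
    name: str

    def greet(self) -> str:
        # Behavior lives in code, not in the serialized bytes.
        return f"Hello, {self.name}!"


def serialize(user: User) -> bytes:
    # Flatten the object to a dict, then to JSON text, then to UTF-8 bytes.
    return json.dumps(asdict(user)).encode("utf-8")


def deserialize(payload: bytes) -> User:
    # Parse the bytes back into a dict and rebuild a User from it.
    data = json.loads(payload.decode("utf-8"))
    return User(id=data["id"], name=data["name"])


original = User(id=1, name="Ada")
wire = serialize(original)      # the flat-packed box
restored = deserialize(wire)    # a new wardrobe, built from the box
```

Notice that greet() never travels. The receiver gets it back only because its own code already defines User. The restored object is equal to the original, but it is a different object in memory.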

Why Do We Need a "Neutral" Format?

If you are writing in Java and you send a Java-specific object to a Python server, the Python server will have no idea what to do with it. It’s like trying to put together a wardrobe using instructions written in a language you don't speak.

To fix this, we use Neutral Formats. These are the "Universal Languages" of the backend:

  1. JSON (The Plain Text Box): It is plain text. It’s easy for humans to read, and nearly every programming language has a library for it. Downside: It’s a bit bulky, because field names and punctuation travel with every message.

  2. Binary formats such as Protobuf, Avro, or MessagePack (The Compact Box): The bytes look like gibberish to humans, but messages are usually smaller and faster for computers to process. Downside: Schema-based formats like Protobuf and Avro need a special "decoder ring" (a schema) to read them.
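A rough way to feel the trade-off is to encode the same record both ways. The sketch below uses Python's struct module as a stand-in for a schema-based binary format; it is not Protobuf or Avro, and the record and its layout are assumptions made up for this example.

```python
import json
import struct

# Hypothetical record: a user id (unsigned 32-bit) and an age (unsigned 8-bit).
record = {"id": 123456, "age": 42}

# Text encoding: field names travel with every message.
as_json = json.dumps(record).encode("utf-8")

# Binary encoding: "<IB" plays the role of the schema here.
# It means little-endian, one uint32, then one uint8.
# The field names and their order live in this string, not in the bytes.
SCHEMA = "<IB"
as_binary = struct.pack(SCHEMA, record["id"], record["age"])

# Reading the binary form back requires knowing the same schema.
user_id, age = struct.unpack(SCHEMA, as_binary)

print(len(as_json), len(as_binary))  # 25 bytes of JSON versus 5 bytes of binary
```

The binary box is smaller because it leaves the labels at home. That is also why nobody can open it without the decoder ring.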

What Gets Lost When Data Is Serialized

Serialization is lossy by nature.

You lose: object identity, references, execution context, implicit assumptions.

Two objects pointing to the same thing in memory may become two separate copies on the wire.

Nothing tells you they were ever connected.
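Here is a small Python sketch of that loss, using a JSON round trip. The users and the shared address are hypothetical data.

```python
import json

# Two users share one address object in memory (hypothetical data).
address = {"city": "Oslo"}
users = [
    {"name": "Ada", "address": address},
    {"name": "Linus", "address": address},
]

# In memory, both entries point at the very same dict.
shared_before = users[0]["address"] is users[1]["address"]

restored = json.loads(json.dumps(users))

# After the round trip the values are equal, but they are independent copies.
shared_after = restored[0]["address"] is restored[1]["address"]

# Changing one copy no longer affects the other.
restored[0]["address"]["city"] = "Bergen"
```

If the sharing matters, it has to be made explicit in the data, for example by giving the address an id and having each user refer to it.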

Why Deserialization Is Harder Than Serialization

Turning meaning into bytes is mechanical.

Turning bytes back into meaning requires assumptions.

You must assume: field names, types, ordering, optionality, defaults.

If any assumption is wrong, deserialization fails.
Or worse, it succeeds incorrectly. Silent corruption is the real danger.
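The "succeeds incorrectly" case is easy to reproduce. In this sketch a writer packs two integers with Python's struct module, and three readers decode the same bytes under different assumptions. The field names and layouts are invented for the example.

```python
import struct

# The writer packs (user_id, balance_cents) as two unsigned 32-bit integers.
payload = struct.pack("<II", 7, 150_000)

# Reader A shares the writer's assumptions and recovers the original meaning.
user_id, balance_cents = struct.unpack("<II", payload)

# Reader B assumes the opposite field order. Nothing fails: the bytes
# decode cleanly, but the values land in the wrong fields.
wrong_balance, wrong_user_id = struct.unpack("<II", payload)

# Reader C assumes a different layout altogether (four 16-bit integers).
# This also "succeeds", producing numbers that mean nothing.
garbage = struct.unpack("<HHHH", payload)
```

None of the three readers raises an error. Only Reader A ends up with data that means what the writer intended.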

The Two Golden Rules of Serialization

1. Don't Open "Mystery Boxes" (Security)

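Deserialization is dangerous because you are letting external data create objects inside your server's memory. The risk is highest with language-native formats such as Python's pickle or Java's built-in serialization, where the bytes themselves can name code to run while the object is being rebuilt. If an attacker sends a "poisoned" box, the act of "putting the wardrobe together" could trigger a command that deletes your files.

Rule: Never deserialize data from a source you don't trust with a format that can run code. Prefer a data-only format such as JSON, and validate what you parsed before you use it.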
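The sketch below shows both halves of the rule in Python. The first part uses a deliberately harmless payload to show how pickle lets the sender choose which function runs on the receiver. The second part is a hypothetical parse_user validator for a data-only format; the expected fields are assumptions for this example.

```python
import json
import pickle


class Poisoned:
    # pickle calls __reduce__ to learn how to rebuild an object. The sender
    # controls the answer, so the sender chooses which callable runs on load.
    def __reduce__(self):
        # Harmless stand-in: a real attacker would name something destructive.
        return (eval, ("6 * 7",))


malicious_bytes = pickle.dumps(Poisoned())
# Loading calls eval("6 * 7") on the receiver. The receiver does not even
# need to know the Poisoned class for this to happen.
result_of_attack = pickle.loads(malicious_bytes)


def parse_user(payload: bytes) -> dict:
    """Hypothetical validator: accept only the exact shape we expect."""
    # JSON can only produce dicts, lists, strings, numbers, booleans and None.
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != {"id", "name"}:
        raise ValueError("unexpected or missing fields")
    if not isinstance(data["id"], int) or isinstance(data["id"], bool):
        raise ValueError("id must be an integer")
    if not isinstance(data["name"], str) or not data["name"]:
        raise ValueError("name must be a non-empty string")
    return data
```

With pickle there is no "validate first" step, because the damage happens during loading. With JSON, parsing only produces plain data, and the validator decides whether that data is allowed any further into the system.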

2. Plan for the "Extra Screw" (Versioning)

What happens if you update your code to add a middle_name field, but your database is still sending "old boxes" without that field?

Serialized data represents the past. It captures: what was true, at a specific moment, under specific assumptions.

When you deserialize later, the system may have changed. New fields exist. Old fields are gone. Meaning has evolved.

Rule: Always use a format that allows your data to grow. Your "assembly instructions" should be smart enough to handle missing parts or extra pieces without crashing.
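One way to write such "smart assembly instructions" is sketched below in Python. The User class, the nickname field, and the user_from_json helper are hypothetical. The idea is simply to give new fields a default and to ignore fields the current code does not know about.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class User:
    id: int
    name: str
    # Added in a later version; old payloads will not contain it.
    middle_name: Optional[str] = None


def user_from_json(payload: str) -> User:
    data = json.loads(payload)
    known = {f.name for f in fields(User)}
    # Drop unknown keys (written by newer code) and let the dataclass
    # defaults fill in missing ones (written by older code).
    return User(**{k: v for k, v in data.items() if k in known})


# An "old box" without middle_name, and a "new box" with an extra field.
old_box = '{"id": 1, "name": "Ada"}'
new_box = '{"id": 2, "name": "Grace", "middle_name": "Brewster", "nickname": "Amazing Grace"}'

old_user = user_from_json(old_box)
new_user = user_from_json(new_box)
```

Fields without a default, like id and name here, still raise a TypeError when missing. That is a design choice: some parts are optional, and some mean the box is not a User at all.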

Formats Don’t Save You

JSON, Protobuf, Avro, MessagePack… They solve encoding. They do not solve meaning.

A well-encoded message with the wrong assumptions is still broken. The format doesn’t know: what fields matter, which defaults are safe, what can be ignored.

Only backend engineers decide that.

Serialization Forces Explicit Thinking

In memory, systems rely on: shared context, implicit guarantees, language rules. Serialized data removes all of that. What remains must be: explicit, intentional, documented by behavior.

This is why serialization exposes weak designs so quickly.

Summary

  • Serialization: Packing your "living" code objects into a "flat" format for travel.

  • Deserialization: Rebuilding those objects when they arrive.

  • The Choice: Use JSON when you want things to be easy to read and debug; use a binary format when message size and speed matter more than readability.

Thinking in Backend means realizing that data is constantly changing shape. It’s a "living" object on your server, but it’s a "flat-pack" box on the wire.

What Comes Next

In the next one, we’ll talk about Task Queues and Background Jobs.

This article is part of the Thinking in Backend series, where we learn backend engineering by understanding where meaning is lost, not just where code executes.