Serialization

In computer programming, serialization has two meanings:

One is to force one-at-a-time access to a shared resource for the purposes of concurrency control.
The other is to render an instance of an abstract data type as a byte stream.

This article is about the latter meaning.

Serialization is the process of taking the in-memory representation of a data structure or object and encoding it as a serial (hence the term) sequence of bytes. This encoded version can then be saved to disk, sent across a network connection, or otherwise communicated to a recipient.

The receiving program decodes the serialized data into an in-memory copy of the original, a step known as deserialization.
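The encode and decode steps described above can be sketched in Python with the standard struct module. The record layout here (a 32-bit identifier, a 64-bit float, and a length-prefixed UTF-8 name) and the function names serialize and deserialize are assumptions chosen for illustration, not a standard format.

```python
import struct

# Hypothetical fixed header: little-endian unsigned 32-bit id,
# 64-bit float value, unsigned 16-bit length of the name that follows.
HEADER_FORMAT = "<IdH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def serialize(record_id, value, name):
    """Encode the three fields into one flat sequence of bytes."""
    encoded_name = name.encode("utf-8")
    header = struct.pack(HEADER_FORMAT, record_id, value, len(encoded_name))
    return header + encoded_name

def deserialize(data):
    """Decode bytes produced by serialize() back into a tuple of fields."""
    record_id, value, name_length = struct.unpack(
        HEADER_FORMAT, data[:HEADER_SIZE]
    )
    name = data[HEADER_SIZE:HEADER_SIZE + name_length].decode("utf-8")
    return record_id, value, name

payload = serialize(7, 2.5, "sensor")   # bytes, ready for a file or socket
restored = deserialize(payload)         # an in-memory copy of the fields
```

The payload is an ordinary bytes object, so it could be written to a file or sent over a socket unchanged. The receiver needs to know only the layout, not the sender's in-memory representation.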

Serialization is used as a simple way to make objects persistent, to send messages, or even to distribute objects. There are other ways to accomplish these things, and serialization is often viewed as a quick and dirty method. There are cases, however, where it is perfectly adequate to the task at hand.

Serialization breaks the opacity ideal of an abstract data type by potentially exposing private implementation details. For this and other reasons, publishers of proprietary software often keep the details of their programs' serialization formats a trade secret. Some deliberately obfuscate or even encrypt the serialized data.
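As a rough illustration of how serialization can expose private details, the sketch below dumps an object's instance dictionary to JSON. The Account class and the naive_serialize helper are hypothetical, and real serializers differ in how much internal state they reveal.

```python
import json

class Account:
    """Hypothetical class whose balance is meant to be read only via a method."""

    def __init__(self, owner, balance):
        self.owner = owner
        self._balance = balance  # private by convention

    def balance(self):
        return self._balance

def naive_serialize(obj):
    # Dumping the instance dictionary copies every field, private or not,
    # into the output under its internal attribute name.
    return json.dumps(vars(obj), sort_keys=True).encode("utf-8")

data = naive_serialize(Account("alice", 100))
# data == b'{"_balance": 100, "owner": "alice"}'
```

Anyone who can read the byte stream learns both the internal field name and its value, which is the kind of leak that leads some vendors to obfuscate or encrypt their formats.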

On the other hand, interoperability requires that applications be able to understand each other's serializations of a given object. For this reason, remote method call architectures such as CORBA define their serialization formats in detail and often provide methods of checking the consistency of a serialized stream when converting it back into an object.
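One simple way to check the consistency of a serialized stream is to prefix it with its length and a checksum. The sketch below assumes a made-up framing (a 4-byte length and a 4-byte CRC-32 ahead of a JSON body) and is not the mechanism used by CORBA or any other particular architecture; pack_message and unpack_message are hypothetical names.

```python
import json
import struct
import zlib

# Assumed header: big-endian unsigned 32-bit body length and CRC-32.
HEADER_FORMAT = ">II"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def pack_message(obj):
    """Serialize obj to JSON and prepend the length and checksum."""
    body = json.dumps(obj, sort_keys=True).encode("utf-8")
    return struct.pack(HEADER_FORMAT, len(body), zlib.crc32(body)) + body

def unpack_message(data):
    """Verify the header against the body before decoding it."""
    if len(data) < HEADER_SIZE:
        raise ValueError("stream too short to contain a header")
    length, checksum = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    body = data[HEADER_SIZE:]
    if len(body) != length:
        raise ValueError("body length does not match header")
    if zlib.crc32(body) != checksum:
        raise ValueError("checksum mismatch, stream is corrupted")
    return json.loads(body.decode("utf-8"))

message = pack_message({"method": "ping", "args": [1, 2]})
decoded = unpack_message(message)
```

A truncated or altered stream raises ValueError instead of silently producing a damaged object. A checksum of this kind guards against accidental corruption only, not deliberate tampering.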

Since the late 1990s, XML has become a popular and widely supported means of serializing data as text. Because of its flexible syntax, XML can represent a wide variety of data structures.
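A minimal sketch of XML serialization using Python's standard xml.etree.ElementTree module follows. The person record and its element names (person, name, age, emails, email) are invented for illustration and do not follow any established schema.

```python
import xml.etree.ElementTree as ET

def to_xml(record):
    """Serialize a dictionary with a nested list into XML text."""
    root = ET.Element("person")
    ET.SubElement(root, "name").text = record["name"]
    ET.SubElement(root, "age").text = str(record["age"])
    emails = ET.SubElement(root, "emails")
    for address in record["emails"]:
        ET.SubElement(emails, "email").text = address
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Rebuild the dictionary from XML text produced by to_xml()."""
    root = ET.fromstring(text)
    return {
        "name": root.findtext("name"),
        "age": int(root.findtext("age")),
        "emails": [child.text for child in root.find("emails")],
    }

original = {"name": "Ada", "age": 36, "emails": ["ada@example.org"]}
xml_text = to_xml(original)
restored = from_xml(xml_text)
```

XML carries every value as text, so the reader has to convert the age back to an integer itself. The nested emails element shows how a list maps onto child elements.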