The four stages are

- (A) Preliminary pre-processing steps.
- (B) Organization by context.
- (C) Probability estimation.
- (D) Length-reducing code.

- With (A) we mean various pre-processing steps that may be appropriate before the final compression engine.

Lossy compression often follows the same pattern as lossless, but with one or more quantization steps somewhere in (A). Sometimes clever designers may defer `loss' until suggested by statistics detected in (C); an example of this would be modern zerotree image coding. - (B) Organization by context often means data reordering, for which a simple but good example is JPEG's "Zigzag" ordering. The purpose of this step is to improve the estimates found by the next step.
- (C) A probability estimate (or its heuristic equivalent) is formed for each token to be encoded. Often the estimation formula will depend on context found by (B) with separate 'bins' of state variables maintained for each conditioned class.
- (D) Finally, based on its estimated probability, each compressed file token is represented as bits in the compressed file. Ideally, a 12.5%-probable token should be encoded with three bits, but details become complicated.

Table of contents |

1.1 Preliminary pre-processing steps
2 Further examples1.2 Organization by context 1.3 Probability estimation 1.4 Entropy coding |

Preliminary compression steps is a catch-all including such ideas as compaction transforms (e.g. subband image coding), hierarchal decompositions, template matching, string searching, and miscellaneous tricks like the Burrows-Wheeler transform. We aren't concerned with this stage here, except to note that many such preprocessing steps fall nicely within the scope of "Organization by Context." In lossy systems, any loss is usually introduced in (A), so remaining steps are the same for both lossy and lossless.

These probabilities can be reconstructed by the decoder, so what is encoded into the compressed file is just the outcome token. Instead of using {0, 1} as the outcome token set when k=2, it is often convenient to use {MPS, LPS} -- the More and Less Probable Symbol.

Eventually this section should discuss some topics in probability estimation:

- Statistical Coding: Conditional probabilities
- Statistical Coding: The `Overtraining' problem
- Statistical Coding: Distributed estimation
- Axioms for Probability estimation: efficacy, ergodicity, insensitivity
- Static Probability estimation: offline, semi-adaptive
- Stationary Probability estimation: Bayesian
- Window-based Probability estimation: e.g. Lempel-Ziv
- Decaying-average Probability estimation: e.g. QM-coder

The encoded events are converted into the bits of the compressed data file. Let us first separate these systems into two types. K is relatively large. Huffman coding is popular and useful, especially when symbol probabilities were already estimated in an earlier "offline" effort. Adaptive Huffman coding is also in use.

K = 2. Golomb coding and its variants like Langdon coding are simple, effective and popular; these produce output bits only when LPS is encountered. Arithmetic coding is also effective, as are their close cousins, quasi-arithmetic codes and FSM codes. An advantage of arithmetic and related codes over Golomb coding is the automatic interleaving of compressed data from multiple contexts, though chips implementing interleaved Golomb-type codes have recently been introduced by Ricoh Co., Ltd.

Interemdiate values like K=3 might seem to pose a problem. In fact, practical systems may be slightly more complicated than indicated here. Also, in some systems, e.g. LZ text coding, stages (A)-(B) of the compression system may be designed to get good compression even with trivial (C)-(D).

Instead of a Huffman code, a simple often applicable coding device is MTF (Move To Front) followed by a simple length-reducing code such as the so-called Fibonacci code. That might be viewed as MTF is stage (B), the simple "Fibo code" as stage (D), with stage (C) disappearing entirely, if the Fibo code fits the model statistics well.

*See also: Data compression*