bzip2 is an open source data compression algorithm and program developed by Julian Seward. It compresses most files more effectively than the more traditional gzip or Zip, but it is generally slower and needs more memory, which makes it less suitable for limited-memory situations. Nonetheless, as Moore's Law steadily makes computer time cheaper, compression methods like bzip2 have become more popular. According to the author, bzip2 gets within ten to fifteen percent of the "best" class of compression algorithms currently known (PPM), while being roughly twice as fast at compression and six times faster at decompression.

bzip2 uses the Burrows-Wheeler transform to convert frequently recurring character sequences into strings of identical letters, which the later coding stages can compress well. The input is processed in blocks. In bzip2 the blocks are all the same size in plaintext; the size is selected by a command-line argument (-1 through -9, for 100 kB through 900 kB). Each block is marked in the compressed output by a bit sequence derived from the decimal representation of pi.
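The effect of the transform is easiest to see on a small input. The following is a minimal Python sketch, not bzip2's actual implementation: it builds every rotation of the input explicitly, which is far too slow for real block sizes, and it appends a "$" end marker so the transform can be inverted, whereas bzip2 itself records the position of the original row instead. The function names are made up for this illustration. The last helper uses Python's standard bz2 module to show the block marker that follows the 4-byte stream header.

```python
import bz2


def bwt(text: str, end_marker: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if end_marker in text:
        raise ValueError("end marker must not occur in the input")
    s = text + end_marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, end_marker: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original string is the row that ends with the end marker.
    original = next(row for row in table if row.endswith(end_marker))
    return original[:-1]


def first_block_marker(payload: bytes) -> str:
    """Hex digits of the 6 bytes after the 4-byte "BZh<level>" stream header.

    Assumes a non-empty payload, so that the stream contains at least one block.
    """
    return bz2.compress(payload)[4:10].hex()


print(bwt("banana"))                  # annb$aa
print(inverse_bwt(bwt("banana")))     # banana
print(first_block_marker(b"banana"))  # 314159265359
```

In the transformed string the repeated "an" pairs of "banana" have been regrouped into runs ("nn", "aa"), which is what makes the later stages effective. The marker printed on the last line spells out the leading digits of pi.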

Originally, bzip (the predecessor of bzip2) used arithmetic coding after the block sort; this was discontinued because of patent restrictions, and bzip2 uses Huffman coding instead.

On Linux, bzip2 can be used in combination with tar or independently of it: bzip2 file compresses a file (producing file.bz2) and bzip2 -d file.bz2 decompresses it.
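The same two modes of use, standalone and combined with tar, can also be tried from a script. The sketch below uses Python's standard bz2 and tarfile modules rather than the command-line program; the helper names and the in-memory file contents are invented for the example.

```python
import bz2
import io
import tarfile


def make_tar_bz2(files: dict) -> bytes:
    """Pack a {name: bytes} mapping into a bzip2-compressed tar archive in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_tar_bz2(blob: bytes) -> dict:
    """Unpack a bzip2-compressed tar archive into a {name: bytes} mapping."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:bz2") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


# Standalone use: compress and decompress a single byte string.
data = b"example data " * 1000
compressed = bz2.compress(data, compresslevel=9)  # 9 selects the largest block size
assert bz2.decompress(compressed) == data
assert len(compressed) < len(data)

# Combined use: several files in one tar archive, compressed as a whole.
blob = make_tar_bz2({"a.txt": b"hello", "b.txt": b"world"})
assert read_tar_bz2(blob) == {"a.txt": b"hello", "b.txt": b"world"}
```

Because tar only bundles files and bzip2 only compresses a single stream, the combined form compresses the whole archive at once rather than each file separately.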

See also: gzip, TAR.

External links