
Gzip

Gzip is short for GNU Zip, a GNU open-source replacement for the Unix 'compress' program.

Gzip is based on the 'deflate' algorithm, which is a combination of LZ77 and Huffman coding. 'Deflate' was developed in response to patents that covered LZW and other compression algorithms and limited the usability of 'compress' and other popular archivers.
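As a rough illustration of 'deflate' in practice, the sketch below uses Python's standard zlib module to produce and read back a raw 'deflate' stream. The helper names raw_deflate and raw_inflate and the sample input are made up for this example; it shows only that highly repetitive data, which LZ77-style matching can exploit, shrinks considerably, and it says nothing about how gzip itself is implemented.

```python
import zlib


def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress data into a raw 'deflate' stream (hypothetical helper)."""
    # A negative wbits value asks zlib for a bare 'deflate' stream,
    # with no zlib or gzip header and no trailing checksum.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def raw_inflate(blob: bytes) -> bytes:
    """Decompress a raw 'deflate' stream (hypothetical helper)."""
    return zlib.decompress(blob, -15)


# Repetitive input gives the LZ77 stage many earlier matches to refer back to.
repetitive = b"abcabcabc" * 1000
packed = raw_deflate(repetitive)
print(len(repetitive), "bytes before,", len(packed), "bytes after")
```

The compressed size depends on the zlib version and compression level, so no particular figure is quoted here.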

To make it easier to develop software that uses compression, the Zlib library was created. It supports the Gzip file format and 'deflate' compression. The library is widely used because of its small size, efficiency and versatility. Since the late 1990s there has been some movement from gzip to bzip2, which produces somewhat smaller files.

The Zlib compressed data format, the 'deflate' algorithm and the Gzip file format are specified in RFC 1950, RFC 1951 and RFC 1952 respectively.
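The relationship between the three formats can be sketched with Python's zlib module, which selects the container through its wbits parameter. The helper name compress_as and the sample payload are assumptions made for this example. The same 'deflate' data is wrapped either in a Zlib header and checksum, in nothing at all, or in a Gzip header and trailer.

```python
import gzip
import zlib


def compress_as(data: bytes, wbits: int) -> bytes:
    """Compress data with the container selected by wbits (hypothetical helper)."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return compressor.compress(data) + compressor.flush()


payload = b"hello hello hello hello"

zlib_stream = compress_as(payload, 15)   # Zlib format (RFC 1950)
raw_stream = compress_as(payload, -15)   # raw 'deflate' data (RFC 1951)
gzip_stream = compress_as(payload, 31)   # Gzip format (RFC 1952)

# A Gzip stream starts with the magic bytes 1f 8b.
print(gzip_stream[:2].hex())

# Each stream decompresses with the matching wbits value; the Gzip one
# can also be read by the standard gzip module.
assert zlib.decompress(zlib_stream, 15) == payload
assert zlib.decompress(raw_stream, -15) == payload
assert gzip.decompress(gzip_stream) == payload
```

The raw stream is the shortest of the three, because the other two add a header and a checksum around the compressed data.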

The usual file extension for gzipped files is .gz. Unix software is often distributed as files ending with .tar.gz, called tarballs. These are files first packaged with tar and then compressed with gzip. They can be decompressed with gzip -d file.tar.gz, which leaves file.tar to be unpacked with tar; GNU tar can do both steps at once with tar -xzf file.tar.gz. Software is increasingly distributed as .tar.bz2 archives instead, because bzip2 usually produces smaller files than gzip.
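The two-step structure of a tarball (tar packaging, then gzip compression) can be sketched with Python's standard tarfile module, working entirely in memory. The helper names make_tarball and read_tarball and the file names used are hypothetical; this is a minimal sketch, not a substitute for the tar and gzip programs.

```python
import io
import tarfile


def make_tarball(files: dict) -> bytes:
    """Pack {name: content} into an in-memory .tar.gz (hypothetical helper)."""
    buffer = io.BytesIO()
    # Mode "w:gz" writes a tar archive and compresses it with gzip.
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def read_tarball(blob: bytes) -> dict:
    """Return {name: content} for the regular files in a .tar.gz."""
    contents = {}
    # Mode "r:gz" decompresses the gzip layer and then reads the tar archive.
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                contents[member.name] = tar.extractfile(member).read()
    return contents


tarball = make_tarball({"README": b"example text\n", "src/main.c": b"int main(void) { return 0; }\n"})
print(sorted(read_tarball(tarball)))
```

Because the outer layer is an ordinary Gzip stream, the resulting bytes begin with the same magic number as any other .gz file.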

See also: bzip2
