Project Gutenberg

Project Gutenberg (PG) was launched by Michael Hart in 1971 to provide a library on the Internet of free electronic versions (sometimes called e-texts) of physically existing books. The texts provided are mostly in the public domain, either because they were never under copyright or because their copyrights have expired. There are also a few copyrighted texts that Gutenberg has made available with the authors' permission. The project was named after the 15th-century German printer Johannes Gutenberg, who propelled the movable-type printing press revolution.

General information

For the most part, Project Gutenberg concentrates on historically significant literature and reference works. The slogan of the project is "break down the bars of ignorance and illiteracy", chosen because the project hopes to continue the work of spreading public literacy and appreciation for our literary heritage that public libraries began in the early 20th century. All Gutenberg releases are available in plain ASCII text, and occasionally in other file formats as well. Because maximum availability is a goal, the project eschews prettier but bulkier and not-universally-compatible data formats such as PDF.
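As a rough illustration of what "plain ASCII" means in practice, the following Python sketch lists any characters in a text that fall outside the 7-bit ASCII range. The function name and sample string are invented for this example and are not part of any Project Gutenberg tooling.

```python
def non_ascii_characters(text):
    """Return (line_number, column, character) for each non-ASCII character."""
    found = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            # ASCII covers code points 0 through 127 only.
            if ord(char) > 127:
                found.append((line_number, column, char))
    return found


# Hypothetical sample: the accented letter on line 2 is not ASCII.
sample = "Plain text\nwith a caf\u00e9"
print(non_ascii_characters(sample))
```

An empty result means the text could be stored as plain ASCII without loss.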

All Project Gutenberg texts may be obtained and redistributed by readers for no fee: the only restriction placed on redistribution is that the unaltered text must contain the Project Gutenberg header. If the redistributed text has been modified, the file must not be labelled as a Gutenberg text.

As of 2003, the project has released over ten thousand electronic books, almost entirely produced by volunteers, and remains active. Anyone can become a proofreader by signing up at the Distributed Proofreaders site [1] and volunteering for pages one by one from a variety of texts. While most of the texts are in English, there are some in French, Latin, Old English, and a few in other languages. Proofreading does not actually demand knowledge of the language.

History

In 1971, Michael Hart was attending the University of Illinois. Hart obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab, as his best friend and his brother's best friend were two of the four operators of that particular machine. He was given an operator's account with a virtually unlimited amount of computer time; that access has since been variously estimated to have been worth $100,000 or $100,000,000. Hart spent the next hour and a half trying to think of something to do with the computer that would be worth that much money. This particular computer happened to be one of the 15 nodes on the computer network that would become the Internet. Hart believed that computers would one day be accessible to the general public and decided to make works of literature available for free in electronic form. He happened to have a copy of the United States Declaration of Independence in his backpack, and this became the first Project Gutenberg e-text.

By the time the University of Illinois stopped hosting Project Gutenberg in the mid-1990s, Hart was running it from Illinois Benedictine College. Later he came to a similar arrangement with Carnegie Mellon University, which agreed to administer Project Gutenberg's finances. It was not until 2000 that Project Gutenberg was formally organized as an independent legal entity, and it is now a non-profit corporation chartered in Mississippi with an IRS ruling that donations to it are tax-deductible.

Since the Project's early days, the time required to digitize a book has decreased dramatically. Books are generally not typed in, but are instead converted into text with the aid of optical character recognition (OCR) software. Despite these advances, books still need to be heavily proofread and edited before they can be added to the collection.

Other projects inspired by Project Gutenberg

Literature

Project Gutenberg of Australia is an official sister project of PG. While the primary Gutenberg site is bound by U.S. copyright law, PG Australia produces e-texts in accordance with Australian copyright law, which differs from U.S. law in defining when works enter the public domain. Thus, PG Australia is able to produce and host e-texts that would be illegal for Project Gutenberg in the United States, while some texts from the U.S. project cannot be hosted there. PG Australia also focuses on digitizing Australian material.

Aozora Bunko is a similar project in Japan, which focuses on digitizing non-copyrighted texts under Japan's copyright law and distributing them for free. Most of the texts provided are Japanese literature and translations from English literature.

Project Runeberg is a similar project for Nordic-language texts, begun in 1992.

Project Ben-Yehuda brings public domain Hebrew texts to the internet, and was inspired by Project Gutenberg. It was begun in 1999. A project by the National Yiddish Book Center in Amherst, Massachusetts is attempting to produce digital versions of its entire collection of Yiddish books.

In 2000, Charles Franks founded Distributed Proofreaders, which allows the proofreading of scanned texts to be distributed among many volunteers over the Internet. To make this possible, volunteers scan books and run optical character recognition software on them, then place the results on a website for volunteer "proofers" to check. With thousands of volunteers each working on one or more pages, a reasonably sized book can be proofed in several hours.
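The page-by-page division of labour described above can be sketched as a simple work queue. The Python below is only an illustrative model under assumed names (ProofingQueue, proof_book, and the volunteer names are all hypothetical); it does not reflect how the Distributed Proofreaders site is actually implemented.

```python
from collections import deque


class ProofingQueue:
    """Hypothetical queue of scanned pages awaiting a volunteer proofer."""

    def __init__(self, page_count):
        self.page_count = page_count
        self.pending = deque(range(1, page_count + 1))  # pages not yet handed out
        self.proofed_by = {}  # page number -> volunteer who checked it

    def check_out(self):
        """Hand out the next unassigned page, or None if none remain."""
        return self.pending.popleft() if self.pending else None

    def submit(self, page, volunteer):
        """Record that a volunteer has finished checking a page."""
        self.proofed_by[page] = volunteer

    def is_complete(self):
        """True once every page of the book has been proofed."""
        return len(self.proofed_by) == self.page_count


def proof_book(page_count, volunteers):
    """Simulate volunteers taking pages one at a time until the book is done."""
    queue = ProofingQueue(page_count)
    turn = 0
    while True:
        page = queue.check_out()
        if page is None:
            break
        # Volunteers take turns here; a real site would serve whoever asks next.
        queue.submit(page, volunteers[turn % len(volunteers)])
        turn += 1
    return queue


queue = proof_book(7, ["ana", "ben", "chi"])
print(queue.is_complete())
print(queue.proofed_by)
```

Because each page is an independent unit of work, adding volunteers shortens the time to finish a book without any one person having to read the whole text.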

The Million Book Project aims to digitize a million public domain books by 2005. To process such a large number of books in such a short time, the project generally skips the time-consuming transcription process and stores its books as compressed image files.

Music

The Mutopia project attempts to do for music what Project Gutenberg does for literary works.

Related projects

See list of digital library projects for a more comprehensive list of digital library efforts. See also open content.