Distributed Proofreaders (commonly abbreviated as DP or PGDP) is a web-based project that supports the development of e-texts for Project Gutenberg by allowing many people to work together in proofreading drafts of e-texts for errors. As of July 2024, the site had digitized 48,000 titles.[2][3][4][5]
History
Distributed Proofreaders was founded by Charles Franks in 2000 as an independent site to assist Project Gutenberg.[6] Distributed Proofreaders became an official Project Gutenberg site in 2002.
On 8 November 2002, Distributed Proofreaders was slashdotted,[7][8] and more than 4,000 new members joined in one day. This influx of new proofreaders and software developers helped to increase the quantity and quality of e-text production. In July 2015, the 30,000th e-text produced by Distributed Proofreaders was posted to Project Gutenberg. As of July 2015, DP-contributed e-texts comprised more than half of the works in Project Gutenberg.
Public domain works, typically books with expired copyright, are scanned by volunteers or sourced from digitization projects, and the images are run through optical character recognition (OCR) software. Because OCR software is far from perfect, the resulting text often contains many errors. To correct them, pages are made available to volunteers via the Internet; the original page image and the recognized text appear side by side.[9] This distributes the time-consuming error-correction work across many volunteers, akin to distributed computing.
Each page is proofread and formatted several times, and then a post-processor combines the pages and prepares the text for uploading to Project Gutenberg.
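The workflow described above can be illustrated with a minimal sketch. It is not DP's actual software: the names Page, proofread, run_rounds, and post_process are hypothetical, and each volunteer pass is modeled as a simple function that corrects a page's text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    """One scanned page; `text` starts as raw OCR output (hypothetical model)."""
    number: int
    text: str
    rounds_done: int = 0


def proofread(page: Page, corrector: Callable[[str], str]) -> None:
    """Apply one volunteer pass to a single page."""
    page.text = corrector(page.text)
    page.rounds_done += 1


def post_process(pages: List[Page]) -> str:
    """Combine the pages, in page order, into one text."""
    ordered = sorted(pages, key=lambda p: p.number)
    return "\n".join(p.text for p in ordered)


def run_rounds(pages: List[Page], correctors: List[Callable[[str], str]]) -> str:
    """Send every page through every round, then combine the result."""
    for corrector in correctors:
        for page in pages:
            proofread(page, corrector)
    return post_process(pages)


# Two pages with typical OCR misreadings, deliberately out of order.
pages = [Page(2, "lazy d0g."), Page(1, "Tbe quick fox")]
rounds = [
    lambda t: t.replace("Tbe", "The"),  # round 1 catches one error
    lambda t: t.replace("d0g", "dog"),  # round 2 catches another
]
book = run_rounds(pages, rounds)
print(book)
```

In this sketch, no single round has to catch every error; each page simply accumulates corrections across rounds before the pages are combined.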
Besides custom software created to support the project, DP also runs a forum and a wiki for project coordinators and participants.
Related projects
DP Europe
In January 2004, Distributed Proofreaders Europe started, hosted by Project Rastko, Serbia.[10] This site had the ability to process text in Unicode UTF-8 encoding. The books proofread centered on European culture, with a considerable proportion of non-English texts including Hebrew, Arabic, Urdu, and many others. As of October 2013, DP Europe had produced 787 e-texts, the last of these in November 2011.
The original DP is sometimes referred to as "DP International" by members of DP Europe. However, DP servers are located in the United States, and therefore works must be cleared by Project Gutenberg as being in the public domain according to U.S. copyright law before they can be proofread and eventually published at DP.
DP Canada
In December 2007, Distributed Proofreaders Canada launched to support the production of e-books for Project Gutenberg Canada and take advantage of shorter Canadian copyright terms. Although it was established by members of the original Distributed Proofreaders site, it is a separate entity. All its projects are posted to Faded Page, its book archive website. In addition, it supplies books to Project Gutenberg Canada (which launched on Canada Day 2007) and (where copyright laws are compatible) to the original Project Gutenberg.
In addition to preserving Canadiana, DP Canada is notable because it is the first major effort to take advantage of Canada's copyright laws, which may allow more works to be preserved. Unlike some other countries, Canada has a "life plus 50" copyright term. This means that works by authors who died more than fifty years ago may be preserved in Canada, whereas in other parts of the world those works may not be distributed because they are still under copyright.
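The "life plus 50" rule as stated above reduces to a simple year comparison. The following is a simplified sketch, not legal guidance: the function name and parameters are hypothetical, only the author's death year is considered, and the longer term used for comparison is an assumed value for illustration.

```python
def term_expired(death_year: int, current_year: int, term_years: int = 50) -> bool:
    """Return True if the author died more than `term_years` years ago.

    Simplification: ignores everything except the author's death year.
    """
    return current_year - death_year > term_years


# An author who died in 1950, checked in 2007 (the year DP Canada launched).
print(term_expired(1950, 2007))                  # life plus 50
print(term_expired(1950, 2007, term_years=70))   # assumed longer term elsewhere
```

Under these assumptions, the same work passes the check with the 50-year term but not with the longer one, which is the gap the paragraph above describes.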
On 9 March 2007, Distributed Proofreaders announced the completion of more than 10,000 titles. In celebration, a collection of fifteen titles was published:
Slave Narratives, Oklahoma (A Folk History of Slavery in the United States From Interviews with Former Slaves) by the U.S. Work Projects Administration (English)
Eighth annual report of the Bureau of ethnology. (1891 N 08 / 1886–1887) edited by John Wesley Powell (English)
R. Caldecott's First Collection of Pictures and Songs by Randolph Caldecott [Illustrator] (English)
Como atravessei Àfrica (Volume II) by Serpa Pinto (Portuguese)
Gentry, Craig; Ramzan, Zulfikar; Stubblebine, Stuart (February 28 – March 3, 2005). "Secure Distributed Human Computation". In Andrew S. Patrick; Moti Yung (eds.). Financial cryptography and data security: 9th International Conference. Lecture Notes in Computer Science. Vol. 3570. Roseau, The Commonwealth of Dominica: Springer. p. 329. doi:10.1145/1064009.1064026. ISBN 3-540-26656-9.