Image by Jason “Textfiles” Scott, via Wikimedia Commons
All books in the public domain are free. Most books in the public domain are, by definition, on the old side, and a great many aren’t easy to find in any case. But the books now being scanned and uploaded by libraries aren’t quite so old, and they’ll soon get much easier to find. They’ve fallen through a loophole because their copyright-holders never renewed their copyright, but until recently the technology wasn’t quite in place to reliably identify and digitally store them.
Now, though, as Vice’s Karl Bode writes, “a coalition of archivists, activists, and libraries are working overtime to make it easier to identify the many books that are secretly in the public domain, digitize them, and make them freely available online to everyone.” These were published between 1923 and 1964, and the goal of this digitization project is to upload all of these surprisingly out-of-copyright books to the Internet Archive, a glimpse of whose book-scanning operation appears above.
“Historically, it’s been fairly easy to tell whether a book published between 1923 and 1964 had its copyright renewed, because the renewal records were already digitized,” writes Bode. “But proving that a book hadn’t had its copyright renewed has historically been more difficult.” You can learn more about what it takes to do that from this blog post by New York Public Library Senior Product Manager Sean Redmond, who first crunched the numbers and estimated that 70 percent of the titles published over those 41 years may now be out of copyright: “around 480,000 public domain books, in other words.”
The first important stage is the conversion of copyright records into the XML format, a large part of which the New York Public Library has recently completed. Bode also mentions a software developer and science fiction author named Leonard Richardson who has written Python scripts to expedite the process (including a matching script to identify potentially non-renewed copyrights in the Internet Archive collection) and a bot that identifies newly discovered secretly public-domain books daily. Richardson himself underscores the necessity of volunteers to take on tasks like seeking out a copy of each such book, “scanning it, proofing it, then putting out HTML and plain-text editions.”
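To give a rough sense of what such a matching step involves, here is a minimal sketch in Python. It is not Richardson's script, nor the New York Public Library's data format: the record classes, field names, and sample entries below are all hypothetical, invented purely for illustration. The idea is simply to flag any registration that has no corresponding renewal record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    # Hypothetical fields; real copyright records are far messier.
    reg_number: str
    reg_date: str
    title: str


@dataclass(frozen=True)
class Renewal:
    # A renewal is assumed to point back at the original registration.
    orig_reg_number: str
    orig_reg_date: str


def normalize(number: str) -> str:
    """Drop spacing, punctuation, and case so 'A 10002' matches 'a10002'."""
    return "".join(ch for ch in number.upper() if ch.isalnum())


def find_unrenewed(registrations, renewals):
    """Return registrations with no renewal sharing their number and date."""
    renewed_keys = {
        (normalize(r.orig_reg_number), r.orig_reg_date) for r in renewals
    }
    return [
        reg
        for reg in registrations
        if (normalize(reg.reg_number), reg.reg_date) not in renewed_keys
    ]


# Made-up sample data, for illustration only.
registrations = [
    Registration("A 10001", "1930-05-01", "Example Title One"),
    Registration("A 10002", "1931-02-14", "Example Title Two"),
    Registration("A 10003", "1932-09-30", "Example Title Three"),
]
renewals = [Renewal("a10002", "1931-02-14")]

candidates = find_unrenewed(registrations, renewals)
for reg in candidates:
    print(reg.title)
```

A missing match here marks a book only as a candidate, which is why the results are described as “potentially” non-renewed: a typo or inconsistency in either record could hide a real renewal, so each hit would still need checking by a person.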
This work is now happening at American libraries and among volunteers from organizations like Project Gutenberg. The Internet Archive’s Jason Scott has also pitched in with his own resources, recently putting out a call for more help on the “very boring, VERY BORING (did I mention boring)” project of determining “which books are actually in the public domain to either surface them on @internetarchive or help make a hitlist.” Of course, many more obviously stimulating tasks exist even in the realm of digital archiving. But then, each secretly public-domain book identified, found, scanned, and uploaded brings humanity’s print and digital civilizations one step closer together. Whatever comes out of that union, it certainly won’t be boring.
Related Content:
11,000 Digitized Books From 1923 Are Now Available Online at the Internet Archive
British Library to Offer 65,000 Free eBooks
Download for Free 2.6 Million Images from Books Published Over Last 500 Years on Flickr
Free: You Can Now Read Classic Books by MIT Press on Archive.org
Based in Seoul, Colin Marshall writes and broadcasts on cities, language, and culture. His projects include the book The Stateless City: a Walk through 21st-Century Los Angeles and the video series The City in Cinema. Follow him on Twitter at @colinmarshall or on Facebook.