Libraries & Archivists Are Digitizing 480,000 Books Published in 20th Century That Are Secretly in the Public Domain

Image by Jason “Textfiles” Scott, via Wikimedia Commons

All books in the public domain are free. Most books in the public domain are, by definition, on the old side, and a great many aren’t easy to find in any case. But the books now being scanned and uploaded by libraries aren’t quite so old, and they’ll soon get much easier to find. They’ve fallen through a loophole because their copyright-holders never renewed their copyright, but until recently the technology wasn’t quite in place to reliably identify and digitally store them.

Now, though, as Vice’s Karl Bode writes, “a coalition of archivists, activists, and libraries are working overtime to make it easier to identify the many books that are secretly in the public domain, digitize them, and make them freely available online to everyone.” The books in question were published between 1923 and 1964, and the goal of this digitization project is to upload all of these surprisingly out-of-copyright books to the Internet Archive, a glimpse of whose book-scanning operation appears above.

“Historically, it’s been fairly easy to tell whether a book published between 1923 and 1964 had its copyright renewed, because the renewal records were already digitized,” writes Bode. “But proving that a book hadn’t had its copyright renewed has historically been more difficult.” You can learn more about what it takes to do that from this blog post by New York Public Library Senior Product Manager Sean Redmond, who first crunched the numbers and estimated that 70 percent of the titles published over those 41 years may now be out of copyright: “around 480,000 public domain books, in other words.”

The first important stage is the conversion of copyright records into the XML format, a large part of which the New York Public Library has recently completed. Bode also mentions a software developer and science fiction author named Leonard Richardson who has written Python scripts to expedite the process (including a matching script to identify potentially non-renewed copyrights in the Internet Archive collection) and a bot that identifies newly discovered secretly public-domain books daily. Richardson himself underscores the necessity of volunteers to take on tasks like seeking out a copy of each such book, “scanning it, proofing it, then putting out HTML and plain-text editions.”
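To give a rough sense of what such a matching step involves, here is a minimal Python sketch. It is not Richardson's actual script: the record fields (`title`, `author`, `year`), the function names, and the sample records are all made up for illustration, and it assumes that a registration counts as potentially non-renewed when no renewal record shares its normalized title and author. A real pipeline would likely also lean on registration numbers and dates.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def find_unrenewed(registrations, renewals):
    """Return registrations with no renewal sharing the same normalized title and author."""
    renewed = {(normalize(r["title"]), normalize(r["author"])) for r in renewals}
    return [
        reg
        for reg in registrations
        if (normalize(reg["title"]), normalize(reg["author"])) not in renewed
    ]


# Hypothetical records, invented purely for this example.
registrations = [
    {"title": "The Example Novel", "author": "Doe, Jane", "year": 1931},
    {"title": "Another Sample Book", "author": "Roe, Richard", "year": 1947},
]
renewals = [
    {"title": "The example novel.", "author": "Doe, Jane"},
]

candidates = find_unrenewed(registrations, renewals)
print([c["title"] for c in candidates])
```

Only the second made-up title comes back as a candidate, since the first has a renewal record that matches once case and punctuation are ignored. Anything flagged this way would still be a lead to check by hand, not proof that a book is in the public domain.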

This work is now happening at American libraries and among volunteers from organizations like Project Gutenberg. The Internet Archive’s Jason Scott has also pitched in with his own resources, recently putting out a call for more help on the “very boring, VERY BORING (did I mention boring)” project of determining “which books are actually in the public domain to either surface them on or help make a hitlist.” Of course, many more obviously stimulating tasks exist even in the realm of digital archiving. But then, each secretly public-domain book identified, found, scanned, and uploaded brings humanity’s print and digital civilizations one step closer together. Whatever comes out of that union, it certainly won’t be boring.

via Vice

Based in Seoul, Colin Marshall writes and broadcasts on cities, language, and culture. His projects include the book The Stateless City: a Walk through 21st-Century Los Angeles and the video series The City in Cinema. Follow him on Twitter at @colinmarshall or on Facebook.


Open Culture was founded by Dan Colman.