Crumbling Paper: An Email to Google Book Search on the Digitization of Old Newspapers

Above: an image from The American Newspaper Repository website, which is now housed at Duke University. There are a number of fascinating images there.

I just sent the letter below to Google Book Search at the email address listed on their blog… inside-book-search at google.com. I encourage others to do the same… feel free to cannibalize my letter.

Hello there,

I’ve received a great deal of enjoyment from browsing in Google Book Search… it is a fantastic reference tool, and I thank you for it.

I wanted to bring to your attention a vast area of history that is being lost which I believe Google may be in a unique position to preserve.

Newspapers are one of our best primary sources of world history. One would assume that such important documents would be well preserved, but this is unfortunately not the case. Newspapers have faced numerous challenges in archiving, and many original newspapers have already been lost.

Newspapers were initially stored in enormous bound volumes, which were challenging for libraries to keep for a variety of reasons. The volumes were much larger than a typical book, and they accumulated fast with each daily newspaper printed. In addition to taking a huge amount of physical space to store, their immense size made them difficult to browse and easy to damage. This was compounded by the fact that newspapers are printed on the cheapest paper available… the old volumes yellow, become brittle, and crumble to dust. Few libraries still have any bound volumes of newspapers in their collections.

The solution to these storage problems adopted by most libraries was to destroy their collections of old newspapers and replace them with microfiche copies. This was a catastrophic solution. Microfiche does not do a good job of preserving old newspapers for a variety of reasons. First of all, the image quality is terrible. Photographs, illustrations and comics become blotchy blurs, color is lost, and text becomes patchy and is often unreadable. Even if the microfiche had been a reasonable replacement, the life of microfiche is much, much shorter than that of newsprint.

Thus, the vast majority of old newspapers are already destroyed. It is imperative that what we have left is preserved. The best way to make sure that they are preserved for generations to come is to digitize what we have left (and to make sure that the originals remain intact as well).

Digitizing old newspapers presents many unique challenges:

-The pages are huge, much larger than a typical scanner bed.

-The pages are often yellow, brittle or even crumbling.

-They contain much more than text, such as historical photographs and the history of the early comic strip, a unique American art form. To do these justice, the pages need to be preserved at a higher resolution than something which merely contains text.

-The large pages, scanned at high resolutions and often in color, make for potentially very large files.

-The many columns of copy and images on a newspaper page presumably make them considerably more difficult to make searchable than a typical book.

As I said before, I think Google may be in a unique position to tackle these numerous challenges. You have already digitized numerous libraries, and made them searchable online. Providing old newspapers this way would be invaluable to numerous researchers. I believe saving old newspapers in this way could also be a cornerstone and crown jewel of your project, and a clear demonstration of why what you are doing is so unique, essential and valuable.

I imagine there would be an enormous market for viewing this information once it became easily searchable, and that it would thus be much more profitable for you than typical library materials. What better resource is there when researching history than the account of an event from the day that it occurred? Since the papers most in need of digitizing (the oldest ones) are well within the public domain, there should be no legal issues with putting the entirety of them on the web.

Here are two excellent starting points for digitizing newspapers:

The American Newspaper Repository at Duke University:
http://home.gwi.net/~dnb/newsrep.html
http://home.gwi.net/~dnb/former_newsrep.html

This collection includes, among many other things, a large run of the graphically striking New York World, which is by far the largest run of that paper still in existence.

The Cartoon Research Library at Ohio State University
http://cartoons.osu.edu/

This collection includes, among other things, The San Francisco Academy of Comic Art Collection, which is the largest collection of early comic strips in existence.

I greatly hope you will consider adding old newspapers to the many things you have made searchable through Google Book Search.

Thanks much for listening.

Best wishes,

Steven Stwalley
Webmaster, The International Cartoonist Conspiracy, cartoonistconspiracy.com
Read my blog at stwallskull.com
Read my webcomic at soapythechicken.com