Library digitizing more than books

Students and scholars may soon no longer need to sift through the Main Stacks to find the book or document they want.

In the past four years, the University Library has been involved in several digitization projects in partnership with other organizations.

More than 23,000 books have already been digitized through the library’s contract with the Internet Archive, a nonprofit digital library.

The library is digitizing books mainly because “we want to make primary source material more accessible to scholars and we also want to make sure people can find them and use them more efficiently,” said Beth Sandore, associate university librarian for information technology.

“Having the ability to search across the text of a book is a big advantage to digitization,” said Betsy Kruger, head of the digital content creation unit. “You decide whether you want to check out the book before you go to the library.”

Beyond its work with the Internet Archive, the University belongs to the Committee on Institutional Cooperation (CIC), a consortium of the Big Ten universities and the University of Chicago. The CIC has partnered with Google to digitize collections that will be available through Google Books.

According to a June 2007 press release, Google said it would work with the CIC libraries collectively to digitize up to 10 million volumes.

Kruger said the books already digitized, and those now being digitized, were identified as representing the strengths of the library’s collection.

“The library has one of the most extensive holdings of emblem books in the world, which were published in the 17th and 18th century,” Sandore said.

“They often contained some social, political or moral message. Digitizing them gives access to scholars around the world.”

All items being digitized by the University Library, the Internet Archive and the Google Books Library Project will be deposited in HathiTrust, a shared digital repository for the nation’s research libraries.

However, certain copyright issues have to be worked out before the University can have access to other items in HathiTrust. The library can give patrons access only to material it physically owns a copy of, Sandore said.

The library’s digitization efforts are not limited to books. They also cover other materials such as microfilm, photographs, newspapers and government documents.

In 2009, the CIC started a project to digitize U.S. federal documents held at its member libraries, said Mark Sandler, director of the CIC Center for Library Initiatives.

“We’re now moving on to the University of Illinois, which has worked carefully to identify a substantial number of volumes to send. Then, in five or six months, we’ll move on to another CIC university,” he added.

More than 100,000 government documents from the University of Minnesota, Penn State and the University of Michigan have already been digitized, Sandler said.

“There are all sorts of documents being digitized,” Associate University Librarian for Collections Tom Teper said, citing examples such as congressional hearings and reports.

The U.S. Government Printing Office has been distributing federal documents to depository libraries since 1861.

“The CIC universities hold somewhere between one and 1.5 million of these, covering the political and social legacy of the United States, including Congressional, Judicial and Executive Branch documents,” Sandler said.

The library has also digitized local newspapers, including copies of The Daily Illini, an Illini Media publication, from 1916 to 1945 and the now-defunct Urbana Daily Courier from 1903 to 1935.

“Normally the starting point for looking for events in a newspaper is the date,” said Mary Stuart, history and philosophy librarian at the University. “Digitizing allows us to search newspapers by topic.”

Newspapers offer good insight into the local history of Champaign-Urbana, Stuart added. For instance, she said newspaper records show that in 1927 a black University law student sued a Campustown diner that had refused to serve him.

Although digitizing has a cost, Kruger said it is very inexpensive overall.

“We have had sufficient funding to do approximately 5,000 books per year through the Internet Archive,” Kruger said.

“The costs of digitization include the processing of files, formatting and storage of the material.”

When the work is done through the Google Books Library Project, Google pays for scanning the books and for transporting them to the facility where they are scanned.

The University Library does pay for “the preparation, retrieving and creating an electronic record of the material being sent for scanning, and the re-shelving of the material,” Kruger said.

Sandore added that other items can be quite expensive to digitize, such as a map that might take one to two gigabytes of storage space and require a special wide-format scanner.

Two Scribe book scanners owned by the Internet Archive scan about 500 pages per hour on average, which equates to about 15 to 20 volumes digitized per day.

The scanners are designed to minimize the wear and tear on the books, but not all material can be scanned, because “we have identified some material as too fragile to scan and irreplaceable,” Sandore said.

By the end of the Google Books Library Project, the University Library hopes to have digitized between 800,000 and 1 million volumes.

“We are building the library of the future, by preserving materials for scholars for years and years,” Kruger said.