Other Campuses: Stanford, Google plan massive digitization project despite doubts

Apr 27, 2005

Last updated on May 11, 2016 at 08:55 p.m.

(U-WIRE) STANFORD, Calif. – Four months ago, Google announced its plans to partner with five major institutions – the University of Michigan, Harvard University, Stanford University, the New York Public Library and Oxford University – in an effort to make thousands of books digitally available to the public. The logistics of the project are still uncertain. Moreover, publishers are concerned about possible copyright violations. However, according to Google and its partners, digitizing the thousands of texts held in each institution’s collection will make reliable resources readily available to the public.

“The founders of Google had been trying to find out how to organize the world’s information and how to make it universally accessible,” said Adam Smith, product manager for Google Print, the division of Google that is working on the project.

Universities such as Stanford are eager to cooperate. The project allows for more “user-friendly access to the texts [Stanford] has invested in,” said Andrew Herkovic, foundation relations and strategic projects manager for Stanford’s libraries. “Digitizing texts allows us to have much superior discovery and access to the content we hold. There’s a whole new future by analyzing texts through automated means. Think ‘laser precision-guided research.’”

The “new future” may be a long way off. Considering Stanford’s enormous collection – more than 7.5 million books – the project will take years to complete, although Stanford hopes to have its entire collection digitized within a decade.

In order for Google to digitize Stanford’s books, Stanford must loan books in its collections to Google, which will transport the volumes to its Mountain View, Calif., headquarters. The books will be scanned and returned to Stanford’s shelves.

However, logistics of the project are up in the air, as Google and Stanford are caught up in the very early stages of planning.

“We really don’t know how many books Google will be processing at a time,” Herkovic said. “We don’t know if we can work out a process for getting the books into their hands, scanned, and back into our system in volumes that would suit their ambitions.”

– Jessica Yu

However, according to Herkovic, every effort will be made to avoid disruption of library services.

“We will manage [disruptions] as best as we can. The people here are committed to making sure this will not disrupt our service to the community,” said Herkovic. “You might be out of luck with a book for a few weeks, but we will ask Google for the book you are looking for, if it comes to that. We’re working very closely with the library to ensure that there is none or very little disruption over at the library.”

In addition to keeping library services running smoothly, the two partners are taking care not to infringe copyright law.

“Copyright issues are huge in the digital age,” said Herkovic. “It’s very complicated. There are no tidy ways to know: Do I have the right to copy this? We tend to err on the side of: If in doubt, pay the fees.”

Jonathan Zittrain, co-director of Harvard Law School’s Berkman Center for Internet and Society, said that Google and Stanford’s partnership does not violate copyright law.

“Under U.S. law, such digitization is what one would call a ‘prima facie’ or ‘at first glance’ infringement,” Zittrain wrote in an e-mail to The Daily. “One would then look to legal defenses and privileges, such as fair use, to suggest that on balance, digitizing books that are no longer in print for archival purposes is not, in fact, copyright infringement.”

According to Google, books considered part of the public domain will be available for full viewing online, while those under copyright or from publishers will only be available in small excerpts.

In the United States, books must have been copyrighted before 1923 to be considered part of the public domain. Outside the United States, where different copyright laws apply, books must have been copyrighted before the 1900s.

By showing only sentences from books that are not in the public domain, Zittrain said, Google appears to abide by the law.

“Showing only snippets of a book, rather than entire pages, seems to be classic fair use,” he wrote.

While some publishers have expressed their concern over the project, none have threatened a lawsuit against Google or any of the participating institutions.

“Surely there is a way – through some combination of compulsory licensing, fair use, micropayments, and privileges for educational use – to make millions of obscure but desired works instantly searchable and available to those with interest,” said Zittrain.