Heiko Recktenwald on Sat, 20 Oct 2007 14:26:31 +0200 (CEST)


<nettime> Books want to be free


Research libraries close their books to Google and Microsoft
By Katie Hafner
Friday, October 19, 2007

Several major research libraries have rebuffed offers from Google and
Microsoft to scan their books into computer databases, saying they
were put off by restrictions these companies wanted to place on the
new digital collections.

The research libraries, including a large consortium in the Boston
area, are instead signing on with the Open Content Alliance, a
nonprofit effort to make digital material as widely accessible as
possible. Libraries that agree to work with Google do so on Google's
terms, which involve access to the material only through the Google
search engine, as well as restrictions on how much of it can be

Google pays to scan the books and does not directly profit from the
resulting Web pages, although the additional material makes its search
offering more useful and thus more valuable. The libraries are free to
have their books scanned again by another organization.

There are obvious financial benefits to libraries of Google's
wide-ranging offer, first announced in 2004. Many prominent libraries
have accepted the offer — including the New York Public Library and
libraries at the University of Michigan, Harvard, Stanford and Oxford.
Google expects to scan 15 million books from those collections.

But the resistance from some libraries suggests that many in the
academic and nonprofit world are intent on pursuing a vision of the
Web as a global repository of knowledge that is free of business
interests or restrictions.

"There are two opposed pathways being mapped out," said Paul Duguid,
an adjunct professor at the School of Information at the University
of California at Berkeley. "One is shaped by commercial concerns,
the other by a commitment to openness, and which one will win is not
clear."

Last month, the Boston Library Consortium of 19 research and
academic libraries throughout New England announced a plan to work
with the Open Content Alliance to begin digitizing the libraries' 34
million volumes.

"We understand the commercial value of what Google is doing, but
we want to be able to distribute materials in a way where everyone
benefits from it," said Bernard Margolis, president of the Boston
Public Library, which has in its collection roughly 3,700 volumes from
the personal library of John Adams.

Margolis said his library had spoken with both Google and Microsoft,
and entirely rejected the idea of working with them.

Adam Smith,
project management director of Google Book Search, emphasized that
the company's deals with libraries were not exclusive, and said the
company welcomed other scanning projects.

Smith said Google was "excited" that the Open Content Alliance "has
signed more libraries, and we hope they sign many more."

Google
executives had hoped that the Library of Congress would be one of its
first major partners when it embarked on its scanning effort. It does
have a pilot program with the library to digitize some books.

But last January the Library of Congress announced a project with a
more open approach. With $2 million from the Sloan Foundation, the
library's first mass digitization effort will scan 136,000 books and
make them accessible to any search engine through the Open Content
Alliance. The library declined to comment on its future digitization
plans.

The Open Content Alliance is the brainchild of Brewster Kahle,
founder of the Internet Archive, which was created in 1996 with the
aim of preserving copies of Web sites and other material. The group
includes more than 80 libraries and research institutions and focuses
on works that are out of copyright.

"Google could be privatizing the library system by offering a large,
but private interface to millions of books," Kahle said. The Open
Content Alliance, he said, "is fundamentally different, coming from
a community project to build joint collections that can be used by
everyone in different ways."

Kahle's group focuses on out-of-copyright books, mostly those
published in 1922 or earlier. Google scans copyrighted works too, but
it does not allow users to read the full text of those books online,
and it allows publishers to opt out of the program.

Microsoft joined
the Open Content Alliance at its start in 2005, as did Yahoo, which
also has a book search project. That year, Google was also speaking
with Kahle about joining, but they did not reach an agreement.

A year after joining, Microsoft added a restriction that prohibits a
book it has digitized from being included in commercial search engines
other than Microsoft's.

"Unlike Google, there are no restrictions on the distribution of
these copies for academic purposes across institutions," said Jay
Girotto, group program manager for Microsoft's Live Book Search.
Institutions working with Microsoft, he said, included the University
of California, the New York Public Library, Cornell and the British
Library.

Some in the research field view the issue as a matter of

"You don't want any for-profit company having control of the
world's knowledge," said Doron Weber, a program director at the
Sloan Foundation, which has made several grants to libraries for

Weber said many institutions that have been approached by Google
have spoken to his organization about their reservations. "Many are
hedging their bets," he said, "taking Google money for now while
realizing this is, at best, a short-term bridge to a truly open
universal library of the future."

The University of Michigan, a Google
partner since 2004, does not seem to share this view. "We have not
felt particularly restricted by our agreement with Google," said Jack
Bernard, a lawyer at the university. "We have found Google very good
to work with."

The University of California, which started scanning books with
the Open Content Alliance, Microsoft and Yahoo in 2005, has added
Google as well. Robin Chandler, director of data acquisitions at the
California Digital Library, the electronic library for the University
of California library system, said working with everyone helps
increase the volume of the scanning.

But some have found Google to be an inflexible partner. Tom Garnett,
director of the Biodiversity Heritage Library, a group of 10 prominent
natural history and botanical libraries that have agreed to digitize
their collections, said he had had discussions with various people at
both Google and Microsoft. "Google had a very restrictive agreement,
and in all our discussions they were unwilling to yield," he said.

Garnett said the most striking example of this came when he asked the
Google representatives about a theoretical example.

"We asked, 'Suppose we allowed you to digitize all our literature, and
there was an ant researcher who wanted to peel off 10,000 pages of ant
literature and load it on his own server and perform advanced analysis
to correlate it with climatological data over the last 100 years,
using software he had developed to study trends in species research,'"
Garnett recalled.

He said the Google executives told him this would not be possible.
"They said, 'We'd be sympathetic but it doesn't fit in with our
model.'"

Smith of Google said this was not the case. "It's certainly
something we would work with libraries to do," he said.

The Boston Library Consortium's scanning project is self-funded,
with $845,000 for the next two years. The consortium pays 10 cents
a page to the Internet Archive, which has installed 10 scanners
at the Boston Public Library. Each scanned image will be stored at
the Internet Archive in San Francisco, and anyone can download the

On Wednesday the Open Content Alliance announced, together with the
Boston Public Library and the Woods Hole library, that it would start
scanning out-of-print but in-copyright works to be distributed through
a digital interlibrary loan system.

"God bless Google and Microsoft, and they'll do what they do," said
Weber of the Sloan Foundation. "But we need to do the right thing,
because we're in the privileged position of thinking about what's good
for the country and society over the long-term."

#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mail.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime@kein.org