Beyond Google Books: The Open Library and the HathiTrust Digital Library

Michael Hancher
University of Minnesota
mh@umn.edu

Call for proposals, Special Session, MLA 2012 Seattle

GOOGLE BOOKS is a familiar phenomenon, still growing in importance. Announced in December
2004, it has so far digitized more than fifteen million books, many of them fully available to the
public at no cost. The similar Live Search Books project, which Microsoft launched in December
2006, was closed in May 2008; by then it had digitized some 750,000 books, many of them freely
available. Few people remember Live Search Books, even though it was a Microsoft brand and
provided digital files of distinctively high quality.

Aside from Google Books, the two other principal repositories for digitized books today are the Open
Library and the HathiTrust Digital Library. Launched by the Internet Archive in 2006, the Open
Library provides free access, in a variety of viewing formats, to about one million books in the public
domain. Many of these books were contributed by Microsoft’s Live Search Books project; many
others are duplicates of Google Books, uploaded by volunteers (presumably to guarantee access
should Google withdraw it, as sometimes happens).

The HathiTrust Digital Library is a cooperative project of more than fifty major research libraries.
Launched in October 2008, it has gathered about eight million books from Google Books, from the
Internet Archive, and from independent scanning initiatives, about two million of which are in the
public domain; and it provides the user with a variety of search and access options.

Google Books, the Internet Archive Open Library, and HathiTrust are now the Big Three of the
digitized library industry in the United States. Google Books has attracted much public and scholarly
attention, not only for the undeniable revolution in learning that it has brought about, but also for its
copyright adventures and for the sometimes doubtful quality of its digitized images, underlying texts,
and metadata. The special qualities of the Open Library and the HathiTrust Digital Library have yet
to be assessed by scholars, who stand to benefit from those projects, as they have from Google
Books.

What are the merits and prospects of these two projects? How can they be improved? What role
should scholars play in their improvement? I invite members of the Modern Language Association
to propose papers for a panel, or topics for a roundtable discussion, for a possible Special Session
at the next meeting of the MLA, which will take place in Seattle on January 5–8, 2012.

Deadline: March 1, 2011. If there is sufficient interest, I will file a Special Session proposal with
the MLA in March, and look for a decision early in June.

The proposed Special Session would be a kind of sequel to “The Library of Google: Researching
Scanned Books,” which was part of the MLA meeting held in San Francisco in December
2008—soon after the HathiTrust began its work.