Moving into batch cataloging

The article “Cataloging Then, Now, and Tomorrow” cites three trends in cataloging: “the increasing reliance on vendor-supplied records and services, the explosion of electronic resources, and the growing interrelatedness of local library catalogs with systems outside the library.”

Well, I’m excited about getting to address the first two at work. I was able to slightly shift the focus of my role and am now one of two people responsible for managing the automated cataloging of vendor/publisher-supplied ebook data. After retrieving the data packages via FTP, we run shell scripts to modify them according to our needs, to load them into the ILS and to create holdings. There are some plans to support our consortium members with patron-driven acquisitions, and I’ll be involved in that project, too.
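To give a feel for what such a batch modification step does, here is a minimal sketch. We use shell scripts at work; Python is used here only for illustration. Records are simplified to plain dictionaries keyed by MARC tag (856 for the access URL, 852 for the location), and the proxy prefix, the holdings code and the sample records are all invented assumptions, not our actual setup.

```python
from copy import deepcopy

# Hypothetical proxy prefix; a real installation would use its own.
PROXY_PREFIX = "https://proxy.example.org/login?url="


def localize(record, holding_code):
    """Return a copy of the record with proxied URLs and a local holdings field."""
    out = deepcopy(record)
    # Prefix every access URL (tag 856) unless it is already proxied.
    out["856"] = [
        url if url.startswith(PROXY_PREFIX) else PROXY_PREFIX + url
        for url in out.get("856", [])
    ]
    # Add a local location/holdings marker (tag 852).
    out.setdefault("852", []).append(holding_code)
    return out


def localize_batch(records, holding_code):
    """Apply the same changes to every record without touching each one by hand."""
    return [localize(record, holding_code) for record in records]


# Invented sample batch standing in for a vendor data package.
vendor_batch = [
    {"245": ["Example ebook one"], "856": ["https://vendor.example.com/book/1"]},
    {"245": ["Example ebook two"], "856": ["https://vendor.example.com/book/2"]},
]
loaded = localize_batch(vendor_batch, "EBOOKS")
```

The point is less the specific changes than the pattern: one rule, applied uniformly to the whole package, with the original records left untouched so a run can be repeated.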

So I now have a foot in both worlds – the traditional cataloging world of one item (or one card) at a time, and the world of using “power tools” to manipulate large quantities of metadata without having to touch each record.

A meta catalog for digitized works

Imagine a user wants to read a public-domain book in electronic form. She faces the same situation users did before the advent of unified resource discovery systems: she has to go to various places on the web and run separate searches. Wouldn’t it be nice if there were a meta catalog for digitized works that brought together data from the likes of the Internet Archive, HathiTrust, Project Gutenberg, Europeana or Google Books? It could show which books were digitized by whom, whether they are downloadable, in what format, on which devices they can be read, and so on. Such a directory would also let users compare quality when the same work is available in different versions. Another benefit would be less duplication of effort. Having duplicate electronic versions is not necessarily bad, but aren’t time and money better spent on unique materials not digitized elsewhere? Local priorities could then be set on a more informed basis.
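The core of such a meta catalog is bringing records for the same work together across providers. A rough sketch of that idea follows. The matching rule (normalized title plus author) is a deliberately crude assumption, real matching would need identifiers and edition data, and the provider names and records are invented placeholders.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse punctuation so minor spelling variants match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def work_key(title, author):
    """Crude match key for a work: normalized title plus normalized author."""
    return (normalize(title), normalize(author))


def build_meta_catalog(records):
    """Group provider records by work, keeping source and formats for each copy."""
    catalog = defaultdict(list)
    for rec in records:
        key = work_key(rec["title"], rec["author"])
        catalog[key].append({"source": rec["source"], "formats": rec["formats"]})
    return dict(catalog)


def digitized_more_than_once(catalog):
    """Works held by more than one provider, i.e. candidates for duplicated effort."""
    return {
        key: copies
        for key, copies in catalog.items()
        if len({copy["source"] for copy in copies}) > 1
    }


# Invented sample records from two hypothetical providers.
records = [
    {"source": "Archive A", "title": "Moby-Dick; or, The Whale",
     "author": "Melville, Herman", "formats": ["pdf", "epub"]},
    {"source": "Library B", "title": "Moby Dick, or the Whale",
     "author": "Melville, Herman", "formats": ["pdf"]},
    {"source": "Library B", "title": "Walden",
     "author": "Thoreau, Henry David", "formats": ["txt"]},
]
catalog = build_meta_catalog(records)
duplicates = digitized_more_than_once(catalog)
```

With the grouped view, a user sees every available version of a work side by side, and a library planning its next digitization batch can check what already exists elsewhere.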

All of this occurred to me while reading an article about the eBooks-on-Demand (EOD) service discovery platform (from p. 229 here, in German). EOD is a joint initiative of over 30 libraries from 12 European countries that each run their own digitization activities. Together they offer a (paid) service that lets users order a public-domain book to be digitized and delivered as an ebook. Instead of relying on users discovering EOD books “by chance” in the respective libraries’ catalogs, the project built a VuFind search interface that lets users find books for digitization from all participating libraries in one central place and gives direct access to already digitized items. Records are ingested via OAI or FTP batch upload. For the future, the project team plans to enhance the search platform with links (via API queries of players like those I mentioned above) to works already digitized elsewhere. And this is where the idea of a central, overarching catalog for digitized public-domain works popped up. Existing portals such as the Zentrales Verzeichnis digitalisierter Drucke (ZVDD, central catalog of digitized printed works, which covers digital versions created in Germany) go in the right direction, but we definitely have to think more globally and on a larger scale.
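For readers who have not seen OAI ingestion up close, here is a small sketch of the parsing side. I am assuming OAI means OAI-PMH with Dublin Core metadata; the article does not spell out EOD’s actual pipeline, so this is not their implementation. The response below is an invented sample held in a string, where a real harvester would fetch it over HTTP and follow resumption tokens.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented ListRecords response with one live and one deleted record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:library.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Invented sample title</dc:title>
          <dc:date>1850</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted"><identifier>oai:library.example.org:2</identifier></header>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Extract identifier, title and date from each non-deleted record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        # Deleted records carry no metadata, so skip them.
        if header.get("status") == "deleted":
            continue
        records.append({
            "id": header.findtext("oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "date": rec.findtext(".//dc:date", namespaces=NS),
        })
    return records


harvested = parse_list_records(SAMPLE_RESPONSE)
```

Each participating library only has to expose its records this way once; the central platform can then pull updates on a schedule instead of waiting for manual uploads.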

New HathiTrust blog

HathiTrust has launched a new blog, Perspectives from HathiTrust. The first post is by John Wilkin, Executive Director. He describes the strategy HathiTrust follows to make its content discoverable. Integration into the broader bibliographic access landscape, i.e. making the digitized material findable in a number of environments, is a central mission. Major points:

  • The temporary catalog based on VuFind will be retired and replaced by a new OCLC WorldCat Local catalog (press release here). Including HathiTrust content in a database where many libraries manage their collections emphasizes the role it can play in collection management and analysis, not only for partner libraries but for the broader community. Of course, APIs and record distribution via OAI are also important for access to HathiTrust content.
  • Much of this content is already in Google Books and the Internet Archive, but HathiTrust wants to open more avenues for discoverability by incorporating its full-text indexes into the Summon discovery service (see press release). Availability in similar tools is likely to follow.
  • The standalone HathiTrust full-text search service will be enhanced with new features such as faceting and weighting of results.