Monthly Archives: October 2011

Indexes for ebooks

Some people wonder why, with full-text search available, an ebook might still need an index. If you happen to be one of them, go read “Missing Entry: Whither the eBook Index?” ;). This article is a great summary of the value of indexes (even or especially for books in electronic form) and gives examples (with nice illustrations!) of what enhanced indexes might look like. Indexes with enhanced functionality can be much more interactive and appealing to the user than plain lists of words with page numbers.

Just like subject cataloging, indexes offer a value that cannot be replaced by full-text search. They chart a structured map of the content, show paths into the information, expose relationships and go beyond pure search (which just pulls up instances of terms) in that content is analyzed and arranged meaningfully.

Experienced indexer Jan Wright points out in a fascinating podcast on ebook indexing that an index is a discovery feature just like other metadata. She says: “The more tools for getting into information readers are given, the happier they will be.”

The potential of what ebooks can be (beyond static representations of regular print books) has not been tapped yet – indexes are only one example. We’ll just have to wait for EPUB to recognize the importance of indexes and address them explicitly in its specification, and for publishers to incorporate smarter indexes into their products.

A meta catalog for digitized works

Imagine a user wants to read a public-domain book in electronic form. She’d be faced with the same situation as users before the advent of unified resource discovery systems – she has to go to various places on the web and do separate searches. Wouldn’t it be nice if there were a meta catalog for digitized works that brought together data from the likes of the Internet Archive, HathiTrust, Project Gutenberg, Europeana or Google Books? It could show which books were digitized by whom, whether they are downloadable, in what formats, on what devices they can be read, etc. Such a directory could also enable users to compare quality when the same work is available in different versions. Another benefit would be less duplication of effort. Having duplicate electronic versions is not necessarily bad, but are time and money not better spent on unique materials not digitized elsewhere? Local priorities could be determined on a more informed basis.
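To make the idea a bit more concrete, here is a minimal sketch of how such a meta catalog might group digitized versions from several sources under one work. Everything in it is an assumption for illustration: the `DigitizedCopy` fields, the crude author/title/year matching key, and the sample records are made up and do not reflect how any of the services named above actually expose their data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitizedCopy:
    """One digitized version of a work, as reported by one source (hypothetical schema)."""
    source: str         # name of the digitizing institution or portal
    title: str
    author: str
    year: int
    formats: tuple      # e.g. ("pdf", "epub")
    downloadable: bool


def work_key(item):
    """Crude matching key (an assumption): normalized author, title and year."""
    def normalize(text):
        return " ".join(text.lower().split())
    return (normalize(item.author), normalize(item.title), item.year)


def build_meta_catalog(items):
    """Group copies from all sources under the work they represent."""
    catalog = defaultdict(list)
    for item in items:
        catalog[work_key(item)].append(item)
    return dict(catalog)


def already_digitized(catalog, author, title, year):
    """Return the sources that already hold a digital version of the work."""
    probe = DigitizedCopy("", title, author, year, (), False)
    return sorted({c.source for c in catalog.get(work_key(probe), [])})


# Made-up sample records, for illustration only.
SAMPLE = [
    DigitizedCopy("Source A", "An Example  Title", "Doe, Jane", 1850, ("pdf",), True),
    DigitizedCopy("Source B", "an example title", "doe, jane", 1850, ("epub", "txt"), True),
    DigitizedCopy("Source C", "Another Work", "Roe, Richard", 1872, ("pdf",), False),
]

catalog = build_meta_catalog(SAMPLE)
print(already_digitized(catalog, "Doe, Jane", "An Example Title", 1850))
```

A library deciding what to digitize next could run a check like `already_digitized` before scanning. Real-world matching would of course need something sturdier than this key, such as shared identifiers or proper bibliographic deduplication.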

All of this occurred to me while reading an article about the eBooks-on-Demand (EOD) service discovery platform (from p. 229 here, in German). EOD is a joint initiative of over 30 libraries from 12 European countries that each run their own digitization activities. Together they offer a (paid) service that lets users order a public-domain book to be digitized and delivered as an ebook. Instead of relying on users discovering EOD books “by chance” in the respective libraries’ catalogs, a VuFind search interface was built that allows finding books for digitization from all participating libraries in one central place and gives direct access to already digitized items. Records are ingested via OAI or FTP batch upload. In the future, the project team plans to enhance the search platform to include links (via API queries of players like those I mentioned above) to works already digitized elsewhere. And this is where the idea of a central overarching catalog for digitized public-domain works popped up. Existing portals such as the Zentrales Verzeichnis digitalisierter Drucke (ZVDD, central catalog of digitized printed works, which covers digital versions created in Germany) go in the right direction, but we definitely have to think more globally and on a larger scale.
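For readers curious what ingesting records “via OAI” can look like, here is a rough sketch, assuming that OAI means the OAI-PMH harvesting protocol. It builds a `ListRecords` request URL and pulls record identifiers and the paging token out of a response. The base URL and the sample response are invented, nothing is fetched over the network, and this is not a description of how the EOD platform itself is implemented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces the other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    identifiers = [e.text for e in root.iter(OAI_NS + "identifier")]
    token = root.find(f".//{OAI_NS}resumptionToken")
    next_token = token.text if token is not None and token.text else None
    return identifiers, next_token


# Invented response page, trimmed to the parts the sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

BASE_URL = "https://example.org/oai"  # hypothetical repository endpoint

first_url = list_records_url(BASE_URL)
ids, next_token = parse_page(SAMPLE_RESPONSE)
next_url = list_records_url(BASE_URL, resumption_token=next_token)
print(first_url)
print(ids, next_token)
print(next_url)
```

A harvester would repeat the request with each new token until a page comes back without one, then hand the collected records to the search index.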