Monthly Archives: January 2011

Authority control and identity management

Kathleen’s comment on my last post made me realize that “identity management” and “authority control” are actually two sides of the same coin: two expressions (the first scoped to “information technology” and the latter to “library”, maybe) for essentially the same thing, although implemented differently. Library identity management has some flaws, the major one probably being that, for the most part, the identification system has not been brought up to speed for the digital age.

Moreover, I can think of a number of entities that would be better off with a more stringent IM / AC instead of string matching: serials (which in the German and Austrian cataloging community do have identifiers that are linked to from the records and are treated as kinds of authorities), publisher or place of publication (collocating different names of the same place, e.g. Wien, Vienna, Vienne, Beč, Виена etc.).
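
To make the place-name case concrete, here is a minimal sketch in Python of what identifier-based collocation could look like. Everything in it is an assumption for illustration: the `AuthorityRecord` class, the `build_index` and `resolve` helpers and the identifier `place:0001` are made up and do not come from any existing authority file or library system.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    """One entity with a stable identifier and all its known name forms."""
    identifier: str          # hypothetical identifier scheme
    preferred: str           # the form chosen for display
    variants: list = field(default_factory=list)


def build_index(records):
    """Map every known name form (case-insensitively) to its identifier."""
    index = {}
    for record in records:
        for label in [record.preferred, *record.variants]:
            index[label.casefold()] = record.identifier
    return index


def resolve(label, index):
    """Return the identifier for a name form, or None if it is unknown."""
    return index.get(label.strip().casefold())


# The variant names are the ones mentioned above; the identifier is invented.
vienna = AuthorityRecord("place:0001", "Wien", ["Vienna", "Vienne", "Beč", "Виена"])
index = build_index([vienna])
```

With such an index, `resolve("Vienna", index)` and `resolve("Виена", index)` point to the same identifier, so records can be collocated by entity rather than by string. A name that is not in the authority file simply resolves to `None` and would need a cataloger's attention.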

Librarians / Catalogers have gathered a lot of expertise in this particular area over the years; we just need to take it a few steps further to fit the online digital environment and build authority control / identity management platforms that can make a real contribution to all kinds of efforts (semantic technologies, digital libraries, digital humanities etc.). One such entity authority tool set that moves beyond traditional library authority practice is EATS, developed and used by the New Zealand Electronic Text Centre. Jamie Norrish anticipated my wanting to bring it up in his comment. 😉 I’ll mention it anyway: see his paper for more in-depth information about EATS (the paper is especially compelling because of its comparison between existing authority control mechanisms and EATS) and another paper he co-authored on the topic maps approach to authority management underlying EATS.

Extending our notion of authority and identity management beyond authors, titles and subjects to other entities in library data creates the opportunity to share links and identifiers with outside communities or across collections and increase search quality through consistency.


Consistency and identity management

Consistency is a strange thing. We are in dire need of it to give computers something reliable to work with, yet we are unlikely to achieve the necessary level of consistency in our data, for various reasons. First, we are human, and inconsistency can be said to be part of human nature; second, different catalogers enter data into the same pool, and they don’t all do things exactly the same way. We can (and as catalogers, should) strive for as much consistency as possible in our own work, but factors such as the ones just mentioned get in the way.

Current ILSs match strings for indexing, so it’s hard for them to tell whether “Oxford UP” and “Oxford Univ. Pr.” and “Oxford University Press” (I’ll spare you other ways to write this – which exist!) are the same or not. Users wanting to browse titles of a certain publisher are left to click through lists of variant names (typos and such included…). Or even worse, failing such an index, they have to search for all kinds of variations.

Why not cluster / merge these under one term? The technical possibilities are there (the freely available Google Refine, or topic maps, for that matter), and I’m sure this could be implemented in library systems. A simple list of values to choose from while cataloging would be another, although limited, option. Here software can help straighten out human errors or inconsistencies (which, let’s face it, will continue to exist) and users will benefit from a more time-saving and useful display. Identity management, anyone?
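
As a rough illustration of the clustering idea, here is a small Python sketch loosely modelled on key-collision clustering: each string is reduced to a normalized key, and strings with the same key end up in one group. The abbreviation table is my own hypothetical addition (a real one would have to be curated), and the function names `fingerprint` and `cluster` are made up for this example; this is not how any particular ILS or tool does it.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real one would be curated by catalogers.
ABBREVIATIONS = {
    "up": ["university", "press"],
    "univ": ["university"],
    "pr": ["press"],
}


def fingerprint(name):
    """Reduce a publisher string to a normalized key."""
    # Lowercase and keep only the word tokens, dropping punctuation.
    tokens = re.findall(r"\w+", name.lower())
    expanded = []
    for token in tokens:
        # Replace known abbreviations by their full forms.
        expanded.extend(ABBREVIATIONS.get(token, [token]))
    # Sort and deduplicate so that word order does not matter.
    return " ".join(sorted(set(expanded)))


def cluster(names):
    """Group variant strings that share the same fingerprint."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return dict(groups)


variants = ["Oxford UP", "Oxford Univ. Pr.", "Oxford University Press", "Cambridge UP"]
clusters = cluster(variants)
```

Here the three Oxford forms land in one cluster and “Cambridge UP” in another. Typos would not be caught by this key alone; they would need an additional fuzzy-matching step, and a human should still confirm a cluster before anything is merged.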

Moving Forward – blog

In one of his recent postings to NGC4Lib, Jonathan Rochkind mentioned the University of Wisconsin resource discovery blog Moving Forward. As a cataloging geek, I had to go and check it out ;). If you are keen to learn about the inner workings of a discovery system based on Solr and Blacklight (without too much technical detail, unless you want it), about indexing and searching and the interaction between back-end and front-end, this blog is for you. In particular I enjoy the clear and accessible language of the posts.

Just as an example, let me warmly recommend “Bibliographic Description? Bibliographic Interaction!”. Enabling users to combine terms across subject headings empowers them to pursue their own semantic interpretations of subjects – they don’t necessarily need to match the subject strings the cataloger came up with. To be honest, these possibilities of subject browsing are really impressive to me, never having seen such an implementation before. It goes to show that with cleverness and the available technology, some of the rigidity of MARC can be overcome and data can begin to “dance” – not clumsily but elegantly.