Consistency and identity management

Consistency is a strange thing. We badly need it to give computers something reliable to work with, yet we are unlikely to achieve the necessary level of consistency in our data, for at least two reasons. First, we are human, and inconsistency can be said to be part of human nature; second, different catalogers enter data into the same pool and don’t do things in exactly the same way. We can (and, as catalogers, should) strive for as much consistency as possible in our own work, but factors such as these get in the way.

Current ILSs build their indexes by matching strings, so they cannot easily tell whether “Oxford UP” and “Oxford Univ. Pr.” and “Oxford University Press” (I’ll spare you other ways to write this – which exist!) refer to the same publisher. Users wanting to browse the titles of a certain publisher are left to click through lists of variant names (typos and such included…). Or even worse, where no such index exists, they have to search for every variation themselves.

Why not cluster / merge these under one term? The technical possibilities are there (the freely available Google Refine, or topic maps, for that matter), and I’m sure this could be implemented in library systems. A simple list of values to choose from while cataloging would be another, although limited, option. Here software can help straighten out human errors or inconsistencies (which, let’s face it, will continue to exist), and users will benefit from a more time-saving and useful display. Identity management, anyone?
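
To make the clustering idea a little more concrete, here is a minimal sketch in Python of key-based clustering, loosely in the spirit of the "fingerprint" approach that tools like Google Refine offer. It is an illustration under stated assumptions, not how any particular ILS or Refine itself is implemented: the `ABBREVIATIONS` table and the function names `fingerprint` and `cluster` are made up for this example, and a real abbreviation list would have to be much longer and carefully maintained.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical, hand-maintained abbreviation table (an assumption for this
# sketch); a real one would need many more entries.
ABBREVIATIONS = {
    "up": "university press",
    "univ": "university",
    "pr": "press",
}


def fingerprint(name: str) -> str:
    """Reduce a publisher string to a normalized key used for clustering."""
    # Fold accents and case so that "Éditions" and "editions" compare equal.
    folded = unicodedata.normalize("NFKD", name)
    folded = "".join(ch for ch in folded if not unicodedata.combining(ch)).lower()
    # Replace punctuation with spaces, then split into tokens.
    tokens = re.sub(r"[^\w\s]", " ", folded).split()
    # Expand known abbreviations; one abbreviation may expand to several words.
    expanded = []
    for token in tokens:
        expanded.extend(ABBREVIATIONS.get(token, token).split())
    # Sort and de-duplicate so that word order and repetition do not matter.
    return " ".join(sorted(set(expanded)))


def cluster(names):
    """Group variant strings that share the same fingerprint."""
    clusters = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return dict(clusters)


variants = [
    "Oxford UP",
    "Oxford Univ. Pr.",
    "Oxford University Press",
    "Cambridge University Press",
]
clusters = cluster(variants)
```

With this input, the three Oxford variants end up under one key and the Cambridge entry under another. Note the limits: sorting the tokens throws away word order, and typos are not caught at all, so some form of fuzzy matching plus human review would still be needed on top.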


10 thoughts on “Consistency and identity management”

  1. Bryan Campbell

    Like you, I favor “[a] simple list of values to choose from while cataloging…” The requirement that the value be transcribed (even taken as seen according to RDA) is a hindrance rather than a help. The value has more value when controlled, in my view. I would venture to say the same for place of publication. You suggested the presence of technical possibilities. Have you seen OCLC’s WorldCat Publisher Pages prototype? Here is a page for a publisher well known to librarians: Libraries Unlimited, Inc. What do you think?

    1. Saskia Post author

      Thank you very much for the link to the WorldCat Publisher Pages. I didn’t know they existed, but had hoped there would be some tool that is more expressive than what we have in OPACs and that someone would be aware of it 😉 These Publisher Pages are fantastic; it’s a shame they are apparently not worked on or updated anymore! Even so, the question arises of how to incorporate them seamlessly into a discovery system alongside the respective library’s resources.

  2. annewelsh

    And AACR2 tells us to give publishers in the shortest recognisable form, except for university presses, which should be given in full –

    260 $aOxford : $bOxford University Press, $c2011

    RDA plans on asking us to write out publisher names in full, too.

  3. jamie

    Authority control needs to move beyond the question of which particular name/term is used, to a model where the identifier carries no information itself but has as many names and other pieces of information associated with it as needed, just as with Topic Maps’ subject identifiers.

    The matter of which name associated with a subject identifier should be used in any particular context has nothing to do with the identity of the thing identified. Rather, the context is the determining factor: language, script, user preferences, etc.

    This is (part of) the model used in the Entity Authority Tool Set, which I am currently rewriting to be a Topic Maps application. You can see it somewhat in action at — OUP is there, in an uninteresting record; the records for Saint Petersburg and Katherine Mansfield might give some idea of what the system can do.
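
    As a rough illustration of the model described above (an opaque identifier, many names, and the context deciding which name is shown), here is a small Python sketch. It is not the EATS data model or a Topic Maps implementation; the class names, the identifier value and the scoring rule are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Name:
    text: str
    language: str  # e.g. "en", "ru"
    script: str    # e.g. "Latn", "Cyrl"


@dataclass
class Entity:
    # Opaque identifier: it says nothing about the thing it identifies.
    identifier: str
    names: List[Name] = field(default_factory=list)

    def display_name(self, language: Optional[str] = None,
                     script: Optional[str] = None) -> str:
        """Pick the name that best fits the requested context."""
        if not self.names:
            # Nothing to show but the identifier itself.
            return self.identifier

        def score(name: Name) -> int:
            # One point for each part of the context that the name matches.
            return int(name.language == language) + int(name.script == script)

        # max() returns the first best match, so the first name acts as default.
        return max(self.names, key=score).text


# Hypothetical record; the identifier value is invented for this sketch.
saint_petersburg = Entity(
    identifier="entity/0001",
    names=[
        Name("Saint Petersburg", "en", "Latn"),
        Name("Санкт-Петербург", "ru", "Cyrl"),
        Name("Sankt-Peterburg", "ru", "Latn"),
    ],
)
```

    Asking this record for a Russian name in Cyrillic script returns “Санкт-Петербург”, while a context that matches nothing falls back to the first name listed. The identity of the entity stays the same whichever name is displayed.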

    1. Saskia Post author

      Thanks again for pointing to this excellent work. A question that just occurred to me: since the code is open source, do you know of any use cases of EATS in libraries?

      1. jamie

        I’m afraid I don’t know of any library installations, and while that’s not to say there aren’t any, I’d be very surprised to find even one. Since EATS only does one part of what a library system needs to do, it’s always going to be tough for it to compete (as it were) against more integrated systems that do much or all of what is required.

        I must admit that I have barely done anything to make EATS known outside of my personal contacts, and while the documentation has been improving, the whole system is likely far too forbidding for anyone to set up without serious incentive.

