Ambiguity and identity

Patrick Durusau suggests that properties help identify a subject representative in topic maps. For example, he says that

… All naming conventions are subject to ambiguity and “expanded” naming conventions, such as a list of properties in a topic map, may make the ambiguity a bit more manageable.

That depends on a presumption that if more information is added and a user advised of it, the risk of ambiguity will be reduced. …

In another post, he writes:

… Topic maps provide the ability to offer more complete descriptions of subjects in hopes of being understood by others.

With the ability to add descriptions of subjects from others, offering users a variety of descriptions of the same subject. …

I have some doubts whether this description approach, as opposed to the naming approach (subject indicator and subject locator) now in place in topic maps, will work. IMO, adding more descriptions (by different people with different models of the world, opinions, values, perceptions …) increases the risk of introducing divergent viewpoints and thus runs counter to the original intent, which is to determine identity and reduce ambiguity. Adding more information is likely to make description and reference more ambiguous, as Pat Hayes and Harry Halpin have argued in their paper “In defense of ambiguity”. Common sense might tell you that the more you define something and pin it down, the more precise it becomes. But more descriptions do not necessarily bring more precision: each one is another point on which describers can disagree, so subject identification becomes more prone to ambiguity.
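To make the contrast concrete, here is a minimal Python sketch. It is my own illustration, not anything Durusau or Hayes and Halpin propose: the `Topic` class, the `property_agreement` measure and the example.org identifier are all assumptions made up for this example. The naming approach asks only whether two topics share a subject identifier; the description approach has to compare property lists.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A hypothetical, much simplified topic: identifiers plus free-form properties."""
    subject_identifiers: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def same_by_identifier(a: Topic, b: Topic) -> bool:
    # Naming approach: two topics stand for the same subject
    # if they share at least one subject identifier.
    return bool(a.subject_identifiers & b.subject_identifiers)


def property_agreement(a: Topic, b: Topic) -> float:
    # Description approach (an assumed measure, for illustration only):
    # the share of property keys, out of all keys used by either topic,
    # on which both topics carry the same value.
    keys = set(a.properties) | set(b.properties)
    if not keys:
        return 0.0
    agreeing = sum(
        1
        for key in keys
        if key in a.properties
        and key in b.properties
        and a.properties[key] == b.properties[key]
    )
    return agreeing / len(keys)


# Two descriptions of the same opera, written by people with different views.
tosca_a = Topic(
    {"http://example.org/id/tosca"},
    {"composer": "Puccini", "genre": "opera", "premiere": "1900"},
)
tosca_b = Topic(
    {"http://example.org/id/tosca"},
    {"composer": "Puccini", "genre": "melodrama", "mood": "dark"},
)

print(same_by_identifier(tosca_a, tosca_b))   # True
print(property_agreement(tosca_a, tosca_b))   # 0.25
```

The shared identifier settles the question at once, while the two descriptions agree on only one of four properties. Under this measure every further property that one describer adds and the other does not share lowers the score, and someone still has to pick the threshold at which two descriptions count as the same subject.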

Some more questions about identity and disambiguation:

  • How do you decide that a description has added a dimension to a concept that makes that concept so different that it becomes “another” concept? Who draws the line (someone else might disagree)? Can we formulate criteria for this process, or is it arbitrary?
  • How can we handle time- and context-sensitive descriptions – properties that change over time or according to context? Does a property that changes still define identity, and does identity then change too? Maybe there are “minor” and “major” properties: a change in a “major” property would entail a change of identity, whereas a change in a “minor”, less typical property would leave identity untouched. But how can either of these “property classes” be determined?

I wonder whether we fall back into the Aristotelian system by stressing description, attributes and properties. Or perhaps description and naming strategies in subject identification can coexist and complement each other.

Topic maps and ILS

“Topic maps and the ILS: an undelivered promise” (Library Hi Tech, 26 (2008), 1, pp. 12 – 18) is a great, accessible way for librarians to explore possible applications for topic maps in a library setting. The authors are Suellen Stringer-Hye and Edward Iglesias (who wrote some thoughtful comments on “Data, not records” on the ITIG ACRL-NEC blog).

The main merits of the paper are the demonstration of potential use cases of topic maps in libraries and the comparison of topic maps to discovery applications. Examining the assets and advantages that distinguish topic maps from these tools, the authors point to the power of topic maps: through associations, “each item or topic carries with it information about its context”, for example.

As mentioned in the paper, vendors of library software have not (yet), despite internal use of topic maps, included the technology in ILS development. Why not? What would it take for them to actively promote topic maps? And what about open source software? Of course this cuts both ways – there is no specific demand from libraries either. Maybe librarians need a clearer understanding of the benefits of topic maps compared to the fashionable discovery systems.

A discovery tool only goes so far; topic maps go further.

Associative index model

The paper “An associative index model for the results list based on Vannevar Bush’s selection concept” by Charles Cole, Charles-Antoine Julien and John E. Leide of McGill University, Montreal, which appears in the latest issue of Information Research, contends that algorithmically created methods of refining results lists in online catalogs are not well suited to meet the users’ information needs. The authors draw on Vannevar Bush (whose seminal text is “As we may think”, 1945) and Charles Cutter to develop an associative index model.

Based on an understanding of cognitive processes during a search, the model establishes a second collocation set, triggered by the user’s associative thinking while perusing the first, system-derived results list. This second set is considered to better match the user’s actual information need. In my view it is only an externalization and formalization of thought processes that are at work more or less consciously anyway. These touch on epistemological questions: how do we look for information? How do we formulate a query, i.e. use natural language to reflect our information need? How do we process and organize the findings? How is association involved in search and selection?

Some questions remain open. For instance, why didn’t the authors draw on the FRBR user tasks instead of creating their own with slightly different meanings, and how would relevance feedback relate to their approach? I wonder what role topic maps could play in an associative retrieval tool: they could enable users to identify subjects in their own words, i.e. from their thought associations, and feed the improvements users suggest back into the system. The result would be a dynamic system that gets “smarter” through user input, which complements computational algorithms.

The Recovery Act and topic maps

“Improving Federal Spending Transparency: Lessons Drawn from” by Raymond Yee, Eric C. Kansa and Erik Wilde of the UC Berkeley School of Information “explores the effectiveness of accountability measures deployed for the American Recovery and Reinvestment Act of 2009 (‘Recovery Act’ or ‘ARRA’).” Although data has been released as part of Open Government initiatives, the authors point out a lack of transparency due to data silos, highly distributed information sources and lack of controlled access points, among other reasons.

According to the authors, ARRA data resembles a jigsaw puzzle – the legislation is complex and there are many players and sources of data. In my view topic maps could help with a number of problems cited in the paper. They could build a bridge between several budgetary disclosure systems. They could also expose the structure behind ARRA and make explicit the relationships between legislation and the wishes of Congress, implementation by the Treasury Dept., allocation of money to different accounts, and spending patterns (including agencies and recipients). Links could go back and forth, connecting data from across agencies (e.g. spending data -> program documentation -> legislation authorizing funding for that program). Obviously, this requires machine-processable and unambiguous identifiers as well as controlled vocabularies for the various entities, and that seems to be a weakness in the data so far.
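The back-and-forth linking can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the identifiers, the association type names and the helper functions `build_index` and `find_path` are invented, and none of them come from the paper or from actual ARRA data.

```python
from collections import defaultdict, deque

# Each association is (topic, association type, topic).
# All identifiers and type names are made up for illustration.
ASSOCIATIONS = [
    ("spending:award-001", "funded-under", "program:example-program"),
    ("program:example-program", "documented-by", "doc:program-guidance"),
    ("program:example-program", "authorized-by", "law:arra-2009"),
    ("agency:example-agency", "administers", "program:example-program"),
]


def build_index(associations):
    """Index associations in both directions so links can be followed either way."""
    index = defaultdict(set)
    for left, _assoc_type, right in associations:
        index[left].add(right)
        index[right].add(left)
    return index


def find_path(index, start, goal):
    """Breadth-first search; returns one shortest chain of topics, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in sorted(index[path[-1]]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


index = build_index(ASSOCIATIONS)
print(find_path(index, "spending:award-001", "law:arra-2009"))
print(find_path(index, "law:arra-2009", "agency:example-agency"))
```

Because the index stores every association in both directions, the same data answers “which law authorized this award?” and “which agencies act under this law?”. All of it depends on the identifiers being stable and unambiguous across the source systems.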

The authors also call for an account of the data sources, which can be “first-class citizens” in topic maps, i.e. topics in their own right that can be talked about. Moreover, they stress the importance of efficient information retrieval systems – if you can’t find the information, what use is access to data? Budgetary metadata of high quality is critical to findability and useful display.

Classification would also be conducive to discovery, keeping in mind that “… classification is not necessarily an objective process. It is shaped by the assumptions and goals of people and organizations. These worldviews and goals often see disagreement and evolve over time.” Topic maps have mechanisms to reflect changes in terminology without discarding older terms, and different views of the world can coexist and be indicated by scope.
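How scope lets several names coexist can be sketched as follows. The `Name` and `Topic` classes and the theme labels “formal” and “colloquial” are assumptions of mine for illustration; the validity rule used here is that a scoped name applies in a context that includes all of its themes, and an unscoped name applies everywhere.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Name:
    value: str
    scope: frozenset = frozenset()  # empty scope: valid in every context


@dataclass
class Topic:
    names: list = field(default_factory=list)

    def names_in(self, context):
        """Return the names whose scope themes are all part of the given context."""
        context = frozenset(context)
        return [name.value for name in self.names if name.scope <= context]


# One topic, three names; no name is discarded when another is added.
act = Topic([
    Name("American Recovery and Reinvestment Act of 2009", frozenset({"formal"})),
    Name("Recovery Act", frozenset({"colloquial"})),
    Name("ARRA"),
])

print(act.names_in({"colloquial"}))  # ['Recovery Act', 'ARRA']
print(act.names_in({"formal"}))      # ['American Recovery and Reinvestment Act of 2009', 'ARRA']
```

An older term would simply stay on the topic as one more scoped name, so a search using last decade’s vocabulary still finds the subject.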

Access to data doesn’t automatically imply transparency and findability. The increasing number of Open Government efforts (so far primarily in the U.K. and the U.S.) looks like a great opportunity for topic maps.