After reading some very informed blog posts about the Google Books Ngram viewer, I’ll throw in my 2 cents too.
The question with visualizations is: what do they actually tell us, and in what way are they helpful? They are well suited for providing an overview of large amounts of data, and their functionality will surely be enhanced in the future. However, we should always be aware of the imperfections arising from OCR or metadata glitches and of the limits of such a visualization.
By way of example, the Ngram viewer doesn’t account for changes in semantics of words (or for disambiguation, for that matter). Terms are likely to have different meanings at different points in time. Nor does it tell you more about the cultural context which may be crucial for understanding the distribution of highs and lows in the graph. Counting occurrences is not enough to reflect semantic drifts or contextual information.
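To make the point concrete, here is a minimal sketch of the kind of counting that sits behind such a graph. The toy corpus and the function name `relative_frequency` are made up for illustration; this is not how Google computes its figures, only the general idea of occurrences per year divided by total words per year.

```python
from collections import Counter

# Hypothetical toy corpus of (publication year, text) pairs.
# Invented for illustration; this is not real Ngram data.
CORPUS = [
    (1900, "the mouse ran across the barn floor"),
    (1900, "a mouse was caught in the trap"),
    (2000, "click the mouse to open the file"),
    (2000, "the mouse and keyboard are on the desk"),
]


def relative_frequency(corpus, term):
    """Return {year: occurrences of term / total tokens in that year}."""
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(term)
    return {year: hits[year] / totals[year] for year in sorted(totals)}


freq = relative_frequency(CORPUS, "mouse")
for year, value in freq.items():
    print(year, round(value, 3))
```

The curve for "mouse" looks almost flat across both years, although the word refers to an animal in the 1900 texts and to a pointing device in the 2000 texts. The count alone cannot show that shift.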
All in all, here’s another example of “computers only process symbols” – it’s up to people to create meaning from the results. On the other hand, we wouldn’t be able to explore the abundance of data without such analytical tools.
Just a quick note to direct your attention to AustLit, a project I ran across in a post by Jim Weinheimer. You have to be a subscriber to access the full database but some examples can be explored here.
I was curious to know what the underlying structure was and what strategy of collocation they use for the FRBR views and found these two pages:
Data models and technology.
It turns out that AustLit draws on the topic maps “topic” and “association” constructs to model the relationships in FRBR. It’s interesting to actually see this in action.
While I am skeptical about the representation of ontologies and human semantics in computer systems, topic maps concepts seem to lend themselves very well to expressing bibliographic relations and collating bibliographic information.
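As a rough illustration of how topics and associations can carry FRBR relations, here is a small sketch. It is not AustLit's actual data model: the class names, the association types "realized-through" and "embodied-in", and the sample records are all assumptions, and topic map roles are simplified to a plain source and target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    id: str
    type: str  # assumed FRBR entity type: "work", "expression", "manifestation"
    name: str


@dataclass(frozen=True)
class Association:
    type: str  # assumed relation name, e.g. "realized-through"
    source: Topic
    target: Topic


def related(topic, assoc_type, associations):
    """Return the targets of all associations of one type starting at topic."""
    return [a.target for a in associations
            if a.type == assoc_type and a.source == topic]


def manifestations_of(work, associations):
    """Collate every manifestation reachable from a work via its expressions."""
    result = []
    for expression in related(work, "realized-through", associations):
        result.extend(related(expression, "embodied-in", associations))
    return result


# Placeholder records, invented for illustration only.
work = Topic("w1", "work", "Example Novel")
text_en = Topic("e1", "expression", "English text")
text_de = Topic("e2", "expression", "German translation")
hardback = Topic("m1", "manifestation", "Hardback edition")
ebook = Topic("m2", "manifestation", "E-book edition")
paperback_de = Topic("m3", "manifestation", "German paperback")

ASSOCIATIONS = [
    Association("realized-through", work, text_en),
    Association("realized-through", work, text_de),
    Association("embodied-in", text_en, hardback),
    Association("embodied-in", text_en, ebook),
    Association("embodied-in", text_de, paperback_de),
]

for m in manifestations_of(work, ASSOCIATIONS):
    print(m.name)
```

Collation then amounts to walking the associations from a work topic outwards, which is what a FRBR view does when it groups all editions and translations under one work.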
The FRBR display is a choice librarians make in the hope of helping users navigate the bibliographic and resource universe – one example of more “elegant” ways of organization.
At this year’s Tools of Change for Publishing (TOC) Frankfurt, Jeff Jarvis spoke to and about the publishing trade, but he could also have spoken about libraries. His keynote was entitled “The end of the parenthesis” and is well worth watching:
We are emerging from 500 years of text-based culture and going through a transition like the one Gutenberg brought. What does this mean for media and our view of the world?
To me the following points are particularly noteworthy and transferable to the library world:
- After Gutenberg’s invention, people didn’t know what to make of books; they were scared by them. The current situation is a bit like that – we still try to fit the old (print) into the new (digital environment), and we haven’t yet fully arrived at and embraced the possibilities of the digital era. Nobody really knows where it is going to lead us, so invention and innovation are key.
- Content is everywhere – on Twitter, on blogs, in addition to the traditional content providers. Analyzing Twitter data allows you to make predictions based on what people are talking about.
- Add value – with content everywhere (digitized/born digital, full-text searchable books online), content itself is no longer the unique selling point (neither for libraries nor for publishers, really). So the question becomes: what kind of value can we add to that content, how can we enhance it, what tools can we offer our users to better organize information, put it into context and glean knowledge?
- Give users “elegant organization” – libraries can play a role in helping users do what they did before, but better and in a more “elegant” way; what could that mean for the library catalog, for example?
- Beta – the beta status implies that something is imperfect and unfinished, which, according to Jarvis, is a “statement of humanity and humility”. Why not demonstrate that libraries are human and humble by, for example, releasing a catalog relaunch as beta and not as a finished product? It could allow users to take part in determining which way it will go; it’s an “invitation to collaboration”.
- One factor that stands in the way of leveraging the beta status in libraries is “Perfection as standard” – is it still useful to keep up this approach? Of course we want to provide reliable information, but can libraries come up with a new, less static “business model”? I think “perfection as standard” prevents us from doing experimental and innovative things.