Structured data and unstructured data appear to sit at opposite ends of a data management continuum. Structured data lives in a relational database, in tables with controlled rows and columns where each piece of information fits neatly and is identifiable and addressable. Unstructured data lives, for example, in textual documents that may be marked up in XML but whose information is not forced into a strict table structure.
In “Using XSLT’s SQL Extension with Encyclopedia Virginia”, his article in the recent issue of the Code4Lib Journal, Matthew Gibson shows a way to bridge the gulf between these two worlds: an SQL extension to XSLT lets him use the contents of a relational database in an XML context.
I particularly liked his description of what the relational database does well for certain tasks and information needs, such as those of his own project, the Encyclopedia Virginia:
- “version control over every piece of content that goes into the encyclopedia
- one-to-many relationship management of, for instance, one author and/or editor to many articles, one chronological event referenced by many articles, and one media object shared by many articles
- most importantly, more efficient and scalable performance in looking up and retrieving data.”
Some pieces of information need the efficiency and consistency that come from referencing unique keys in a relational database, and that is harder to achieve in a pure XML environment.
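To make the one-to-many point concrete, here is a minimal sketch of the general idea: look records up by key in a relational database and emit the result as XML. It is not Gibson's method. It uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules in place of the XSLT SQL extension, and the `author` and `article` tables, their columns, the sample rows, and the function names are all hypothetical, invented only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db():
    """Create an in-memory database with a hypothetical author/article schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE author (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE article (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER NOT NULL REFERENCES author(id)
        );
        -- Made-up sample data: one author, many articles.
        INSERT INTO author (id, name) VALUES (1, 'Example Author');
        INSERT INTO article (id, title, author_id) VALUES
            (10, 'First Example Entry', 1),
            (11, 'Second Example Entry', 1);
        """
    )
    return conn


def author_to_xml(conn, author_id):
    """Look up one author by primary key and return an XML element
    that lists every article pointing at that key."""
    row = conn.execute(
        "SELECT name FROM author WHERE id = ?", (author_id,)
    ).fetchone()
    if row is None:
        raise KeyError(author_id)

    author = ET.Element("author", id=str(author_id))
    ET.SubElement(author, "name").text = row[0]

    articles = ET.SubElement(author, "articles")
    query = "SELECT id, title FROM article WHERE author_id = ? ORDER BY id"
    for article_id, title in conn.execute(query, (author_id,)):
        ET.SubElement(articles, "article", id=str(article_id)).text = title
    return author


if __name__ == "__main__":
    conn = build_demo_db()
    print(ET.tostring(author_to_xml(conn, 1), encoding="unicode"))
```

The author's name is stored once and every article refers to it by key, so correcting the name in one row would change it in every generated XML fragment. That is the kind of consistency the list above attributes to the relational side.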
Both XML and relational databases have strengths and weaknesses; whether a hybrid approach like the one described in the article is right for you depends on your data and workflow requirements.