“Damn You and Your Ontologies!” Some Thoughts on Folksonomies and DITA

Slide from CMS/DITA 2010 Keynote Panel Presentation

I attended several interesting presentations while I was at the CMS/DITA 2010 Conference in Santa Clara, but one conversation I had while down there stands out from the rest, especially in terms of its implications.

I met with a friend and former work colleague who now works for Google down in The Valley. I asked him whether Google has anyone who does nothing but provide metadata to describe their internal content. He responded by jokingly uttering the quote that is the main title of this piece. There may be some metadata buried in their own doc materials, but there is an expected reliance on search-fu abilities to find information.

This get-together came at the close of the first day of the conference, with another statement from that morning’s keynote panel rattling around in my head. The panel keynote was good, but one key phrase from it struck me as being fundamentally wrong-headed: that the new DITA 1.2 specification had hooks in it that could tie into “better and more robust taxonomies”. I think we need to look less into building better top-down hierarchies of information and more into bottom-up, folksonomic ways to let people find and tag the information they actually search for and use.

Sample Taxonomy: a "Tree of Life" (from the Old Dinosaur Gallery at the Royal Ontario Museum)

One of the defining characteristics of any taxonomy is that it is a hierarchical arrangement of information, typically devised by experts. Taxonomies are manifest in things like the way that books are arranged on library shelves, in museums (think of the “tree of life” that used to be evident in older natural history museums), in online catalogs of goods and, to a lesser extent, even in the Table of Contents of any technical publication.

Taxonomies have been the way we have traditionally arranged information. But would better and more improved taxonomies do away with the need for search? I for one don’t see Google disappearing after the construction of the perfect ontology.

Last week, after a talk he gave on the evolution of social media, I asked blogger Euan Semple whether he thought there was a future for taxonomies in an increasingly socially architected online world. His reply was a simple “no”. He illustrated his point with an anecdote: when the lint collector on his dryer began to fail, he traversed the hierarchies of the Hoover web site (which lacks a search function) searching for information on what to do. Giving up, he then put out a tweet asking people in his network what to do. He then discovered that there is a core of informal “dryer geeks” who knew exactly what to do, and told him how to fix his problem. Twitter 1, Hoover 0.

I don’t think this is the whole story, however, and I think that the Tech Writing community (or the firms that employ them) shouldn’t just abandon tech docs in the hopes that end users will end up creating their own. No matter how good a mashup of information might be, there still has to be something for it to connect to. Somebody somewhere has read the manual at some point. The more technical the domain (aerospace, electrical engineering, pharmaceuticals), the less likely it appears that the information will be available in informal formats.

So what’s all this got to do with DITA? I think that one of its advantages is its topic-based approach to conveying information, compartmentalizing all you need to know about a task, concept or reference material in a single “atomic” unit of information. I suspect that if we can find ways to tie folksonomies and “the power of the crowd” with DITA topics, people can more readily search for the specific info they want, and tag it so that other people can more easily find it.

An alternative to devising more/better taxonomies is to allow the user to tag material in a way that they find useful. I suspect that the new subjectdef element in the 1.2 DITA specification might be a good taxonomical hook upon which users can lay their folksonomic hats. While designed for adding taxonomical information, I suspect it can be adapted for more folksonomic purposes — assuming that there is some mechanism in place for two-way communication between the user and tech writer (which in many circumstances simply isn’t the case).
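To make the idea a little more concrete, here is a minimal sketch of how reader-supplied tags might be rolled up into a DITA 1.2 subject scheme. It is only an illustration under several assumptions: the function names `normalize` and `build_subject_scheme`, the `min_count` threshold and the `reader-tags` parent key are all hypothetical, the tags are assumed to have been collected already by some feedback mechanism, and the key clean-up is deliberately simplistic (it only handles case and whitespace, not every character that DITA disallows in keys).

```python
from collections import Counter
import xml.etree.ElementTree as ET


def normalize(tag):
    """Lower-case a free-form tag and join its words with hyphens."""
    return "-".join(tag.strip().lower().split())


def build_subject_scheme(user_tags, min_count=2):
    """Build a subjectScheme element with one subjectdef per popular tag.

    Tags used fewer than min_count times are left out, on the assumption
    that a one-off tag is not yet a shared way of describing the content.
    """
    counts = Counter(normalize(t) for t in user_tags if t.strip())
    scheme = ET.Element("subjectScheme")
    # Hypothetical parent subject grouping everything readers contributed.
    parent = ET.SubElement(scheme, "subjectdef", keys="reader-tags")
    for key, count in sorted(counts.items()):
        if count >= min_count:
            ET.SubElement(parent, "subjectdef", keys=key)
    return scheme


# Example input: raw tags as readers might have typed them.
tags = ["Lint Filter", "lint filter", "dryer", "Dryer", "belt"]
print(ET.tostring(build_subject_scheme(tags), encoding="unicode"))
```

The threshold is the interesting design choice here: it lets the crowd propose terms while leaving the writer free to decide how much agreement is needed before a tag becomes part of the published scheme.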

This is not the only way to tackle this problem, and there may be better ways of doing this without the need to tie things back directly to elements within the DITA specification, but as a technical communicator I do think we need to find ways to make information more findable and useful for the end-user, and I doubt that improved taxonomies are the solution. Nor do I believe that users are going to write their own content when a better alternative is available. I think JoAnn Hackos set the right tone near the end of the keynote talk when she asked the open question “How do we help improve the ability of our information consumers to find what they need?” DITA or no, as Technical Communicators we need to figure this one out. I suspect a bottom-up, folksonomic approach to making information more findable (and mashable) is the way to go.


"DITAWriter" is Keith Schengili-Roberts. I work for IXIASOFT as a DITA Specialist/Information Architect. And I like to write about DITA and the technical writing community. To get ahold of me you can email me at: keith@ditawriter.com.


3 thoughts on ““Damn You and Your Ontologies!” Some Thoughts on Folksonomies and DITA”

  1. I made a similar observation during Erik Hennum’s talk, “Taming the herd: Get a handle on content by managing metadata”. The taxonomy is one person’s way of thinking of the content (or even a group’s) but doesn’t necessarily represent how the consumer thinks of it. Indexing has the same inherent problem in that it needs to anticipate how the subjects come to someone’s mind. The key to becoming more accurate with a taxonomy is being able to capture how folks look for information they need to find.

    What we really need is a way to capture what a person selects after searching, or the tags they use on content, and reflect that information back into the metadata. While it may not be perfect, it will broaden the taxonomy (with the folksonomy) in such a way that the content becomes easier to find for a larger number of people.

    I won’t profess to know how to accomplish this goal, but we may need the content to be more self-aware, so that when it’s viewed, it reports how it came to be viewed to some central database. Sure it sounds like Big Brother is watching, but it’s probably the only way to become more accurate with our metadata.

  2. I think that we should be wary of writing off methods of information retrievability that have worked (and continue to work) for vast amounts of information. Search is not a fix-all. It leaves much to be desired in terms of providing users precisely what they are looking for. It is better at helping users find lots of relevant information. Receiving a bunch of relevant search results is fine for Web searches much of the time, but it flies in the face of what we’re trying to achieve with DITA and minimalist writing — give the user what they need and only what they need.

    Top-down metadata methods (such as ontologies and thesauri) help to improve the precision of information searches, but metadata is expensive to produce and it reflects the perspective of the content from a limited number of writers and SMEs. If you work for an organization that can’t afford to generate lots of metadata on its own, then a folksonomy might be the only alternative. However, folksonomies are also problematic for DITA and minimalism because they do not really enhance precision with information results. Furthermore, folksonomies are dependent on having a sizable community of users who are at least somewhat vested in the content they are using. If only a small percentage of users in a user community apply tags to content, then you will effectively have an even worse problem than you would have had with a top-down metadata model. You would have a small collection of users generating the metadata for your content.

    It really depends on the capabilities of your organization and on your user community as to the type of search and metadata model that is best. A combination of search, top-down metadata, and folksonomies is often ideal because it gives users lots of different avenues to finding information.

  3. Just to clarify, and as I said at the time, I was re-telling a story originally told by David Weinberger. Also, why should what I said mean that “the Tech Writing community (or the firms that employ them) shouldn’t just abandon tech docs in the hopes that end users will end up creating their own.”? We weren’t talking about the origination of trustworthy information but about the ability to navigate to it.

    Isn’t this misquote and misinterpretation an example of just what we are talking about? If this had been a more conventional setting I wouldn’t have been alerted that you had written this and wouldn’t have had the opportunity to correct it!
