Some Thoughts on DITA-based Technical Writing Metrics

I liked a recent post on technical communications metrics on the Mark Lewis blog, and while I don’t agree with the conclusion (any metric based on word counts and a subjective evaluation of quality is problematic IMHO), it did get me thinking about documentation metrics in general. I have done some work in this area and have some ideas to share. This will be the first of several articles covering DITA production metrics. First I want to talk about the types of doc metrics that are out there: they are often lumped together as if they were a single thing, but in fact they serve distinct goals and purposes.

The Types of Tech Doc Metrics
Good material on effective tech docs metrics is few and far between, but what exists seems to fall into four distinct categories:
• Return On Investment (ROI) Predictive Metrics
• Cost-effectiveness Metrics
• Production Metrics
• Quality Metrics

R.O.I. Predictive Metrics
The first category measures the anticipated efficiencies and/or cost reductions resulting from switching from a legacy documentation toolset to DITA XML. In my experience it is most often used to justify the purchase of a Content Management System, usually based on the amount of content reuse anticipated and on lowered localization costs, if applicable. Put very simply: if the estimated dollar R.O.I. is greater than the cost of a CMS (along with attendant costs), get the CMS. This set of metrics has been well-covered by Mark Lewis and Mark Lewis, who go into detail on how to create credible numbers that will stand up when a tech docs manager goes cap in hand to upper management and asks for a Component Content Management System (CCMS).
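To make that rule of thumb concrete, here is a minimal Python sketch. It assumes the purchase is justified when the estimated savings exceed the total cost of the CMS over the same period. The function name, its parameters, and the dollar figures are all invented for illustration and do not come from any published ROI model.

```python
def cms_purchase_justified(reuse_savings, localization_savings,
                           cms_cost, attendant_costs=0):
    """Return (net_benefit, justified) for a proposed CMS purchase.

    All figures are estimates for the same period, in the same currency.
    This is a deliberately simple model: savings minus total cost.
    """
    estimated_return = reuse_savings + localization_savings
    total_cost = cms_cost + attendant_costs
    net_benefit = estimated_return - total_cost
    return net_benefit, net_benefit > 0


# Made-up figures, for illustration only.
net, justified = cms_purchase_justified(
    reuse_savings=120_000,
    localization_savings=80_000,
    cms_cost=150_000,
    attendant_costs=30_000,
)
print(net, justified)  # prints: 20000 True
```

A real business case would spread costs and savings over several years and account for training and migration time, but the comparison at its core is the same.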

Cost-effectiveness Metrics
Cost-effectiveness metrics try to measure the value of tech docs by weighing how much users rely on them against how much it would cost to deliver that same information by other means, such as via technical support. While it is easy to measure how often HTML-based Help is accessed by going through web logs, or how useful end users rate the material to be, or how often it is shared using social media, it is harder to put a dollar value on what it would cost if the same material had to be delivered over the phone by a technical support representative. Call center people do not simply read tech docs over the phone, and they also cover things not handled by tech docs (such as QA work, support for languages the tech docs are not translated into, and the fact that some customers just need an actual person to talk to).

Production Metrics
Production metrics are used to determine the quantity and costs of producing content. This set of metrics – the one I am the most familiar with and will talk about here in future posts – is used by managers to measure the efficiency of their team’s content production. They can also be used to further the ROI argument with real numbers, which is why it is important to get some sort of baseline measurements using the legacy toolchain before making the switch.

I think DITA, and XML-based systems in general, are well suited to this type of metric, especially when combined with the data-mining capabilities that CMSes provide via their search mechanisms. Instead of having to make guesses about predicted savings or the “value” of content, you have real numbers to work with that you can mine for useful information.

There is general agreement on what not to measure in this area, such as the number of hours needed to produce a page of content, or the number of documents released per writer. With the first, you have to ask what a “page” is in XML, especially when you are looking at multiple output types (double-spaced pages, anyone?). With the second, there is an apples-to-oranges problem: the documents a technical writer produces over a year may vary greatly in size, ranging from single-sheet application notes, to end-user manuals, to API documentation, all of which take a different amount of time to research and write.

A CCMS filled with DITA XML topics is a gold mine of information for Documentation Managers looking to understand how their team works and the costs involved. More on this in future articles.
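As a small taste of the kind of mining this makes possible, the following Python sketch counts topics by type (the root element name) and tallies words per topic. It is only an illustration: the sample topics are tiny invented strings standing in for files exported from a CCMS, and `topic_stats` is a hypothetical helper, not part of any CMS or DITA toolkit.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny in-memory stand-ins for exported DITA topics (invented content).
SAMPLE_TOPICS = {
    "install.dita": (
        "<task id='install'><title>Install the widget</title>"
        "<taskbody><steps><step><cmd>Run the installer.</cmd></step></steps>"
        "</taskbody></task>"
    ),
    "overview.dita": (
        "<concept id='overview'><title>Widget overview</title>"
        "<conbody><p>The widget converts input to output.</p></conbody>"
        "</concept>"
    ),
    "options.dita": (
        "<reference id='options'><title>Options</title>"
        "<refbody><section><p>Use -v for verbose output.</p></section>"
        "</refbody></reference>"
    ),
    "upgrade.dita": (
        "<task id='upgrade'><title>Upgrade the widget</title>"
        "<taskbody><steps><step><cmd>Back up your settings first.</cmd></step>"
        "</steps></taskbody></task>"
    ),
}


def topic_stats(topics):
    """Return (topics per type, word count per topic) for a dict of XML strings."""
    type_counts = Counter()
    word_counts = {}
    for name, xml_text in topics.items():
        root = ET.fromstring(xml_text)
        # The root element name (task, concept, reference) is the topic type.
        type_counts[root.tag] += 1
        # Join all text nodes and count whitespace-separated words.
        text = " ".join(root.itertext())
        word_counts[name] = len(text.split())
    return type_counts, word_counts


type_counts, word_counts = topic_stats(SAMPLE_TOPICS)
print(dict(type_counts))
print(word_counts)
```

In practice you would read the topics from disk or query the CMS directly, and real topics with DOCTYPE declarations or custom entities may need a more forgiving parser than the one used here.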

Quality Metrics
This is the hardest metric to measure objectively; existing measures usually include some subjective variable of quality, such as determining how “hard” it is to write the content, assigning a grade to reviewed material, or using the rank of the technical writer(s) (Junior, Intermediate, Senior or Staff) against the material produced.

I’d advocate something more concrete and less susceptible to subjective judgment: look at things like the average sentence length and the average word length in a document. A long-recognized Best Practice when crafting topics is to use minimalist writing techniques. Over time this ought to be measurable, not just in the reduced size of a document, but also in the size of the words and sentences being used.
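Here is a rough Python sketch of how those two numbers could be pulled from a single topic. It makes several simplifying assumptions: only text inside `<p>` elements is counted, sentences are split naively on end punctuation, and the sample topic is invented. The `readability_stats` function is a hypothetical helper, so treat it as a starting point and not a finished measure.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample topic, for illustration only.
SAMPLE_TOPIC = (
    "<concept id='c1'><title>Backups</title><conbody>"
    "<p>Back up your data. Store the copy offsite.</p>"
    "<p>Test the backup often.</p>"
    "</conbody></concept>"
)


def readability_stats(topic_xml):
    """Return (average words per sentence, average characters per word)."""
    root = ET.fromstring(topic_xml)
    sentences = []
    for p in root.iter("p"):
        # Flatten the paragraph, including inline markup, to plain text.
        text = " ".join("".join(p.itertext()).split())
        # Naive split: break after ., ! or ? when followed by whitespace.
        sentences += [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    words = [w for s in sentences for w in re.findall(r"[A-Za-z0-9'-]+", s)]
    if not sentences or not words:
        return 0.0, 0.0
    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)
    return avg_sentence_len, avg_word_len


avg_sentence, avg_word = readability_stats(SAMPLE_TOPIC)
print(avg_sentence, avg_word)  # prints: 4.0 4.25
```

Run across every topic in a deliverable at each release, figures like these would show whether the writing is actually trending toward shorter words and sentences.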

Expect more on DITA metrics in future articles here!


"DITAWriter" is Keith Schengili-Roberts. I work for IXIASOFT as a DITA Specialist/Information Architect, and I like to write about DITA and the technical writing community.