Thursday, April 15, 2010

Thoughts on Building a Useful Taxonomy for Transactional Documents

One of the most important tasks in developing contract standards is to construct a taxonomy of all legal agreements. It is a daunting task, and it probably does not have a single right answer. But you have to start somewhere.

I have been fortunate to work with one of the best classifiers: Dan Dabney of Thomson Reuters. His paper, A Brief Practical Introduction to Taxonomies, is a must-read.

1. Taxonomy Structure

Dan describes the three main determinants of a taxonomy: "The broad, structural issues that seem to attract attention are these: (1) how many lines should the classification have; (2) how deep or flat should the classification be; and (3) what should be the classifications at the top level or two." To summarize Dan's position: the organization should present a clearly articulated, editorialized viewpoint that captures the essence of the particular field of study. The structure is then determined by the level of specificity desired (i.e., the number of lines in the taxonomy). Each topic page should typically contain about 10 sub-items. This rule of 10 then sets the number of levels at "the common logarithm of the number of lines it contains". In other words, a 100-line taxonomy needs two levels, and a 1,000-line taxonomy needs three.
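
The rule of 10 is easy to check with a little arithmetic. The sketch below is only an illustration and assumes every topic page has the same number of sub-items; the function name `levels_needed` and the `fanout` parameter are my own labels, not Dan's. It uses integer multiplication rather than a floating-point logarithm so that exact powers of 10 come out cleanly.

```python
def levels_needed(lines: int, fanout: int = 10) -> int:
    """Return the smallest number of levels such that fanout**levels >= lines."""
    if lines < 1:
        raise ValueError("lines must be at least 1")
    levels = 0
    capacity = 1  # how many lines a taxonomy with `levels` levels can hold
    while capacity < lines:
        capacity *= fanout
        levels += 1
    return levels


# Try a few taxonomy sizes (the sizes are arbitrary examples).
for n in (100, 1_000, 5_000):
    print(n, "lines ->", levels_needed(n), "levels")
```

A taxonomy whose size falls between two powers of 10 rounds up, so a 5,000-line taxonomy would need a fourth level under this assumption.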

2. Taxonomy Organization or Theme(s)

While the number of levels can be determined by objective guidelines, the organizational theme presents far more subjective challenges. A clear articulation of the organization requires a definition of the core concepts or characteristics. But there are typically many overlapping themes that cannot be separated. To quote from Dan again: "Nearly any legal idea has several aspects to it—the legal theory involved, the jurisdiction, the nature of the parties, the procedural posture, and so forth. In a browsable taxonomy the ideas are ordered in such a way that the most basic or general ideas are the principles of classification for the higher levels, and the more particular ideas determine classification for the lower levels. Setting up the high levels of a taxonomy is an exercise in deciding what is important." To some degree, you have to accept that the categories will not be mutually exclusive. In the case of legal contracts, for example, some of the competing organizational themes may include:

  • Type of contract (unilateral, multilateral)
  • Nature of the bargain (sale, exchange, license)
  • Nature of the asset, right or interest

3. Taxonomy Levels (Document Anatomy)

As a general rule, a taxonomy can be developed both from the top down (a deductive approach) and from the bottom up (an inductive approach). Whittaker and Breininger offer sound advice in Taxonomy Development for Knowledge Management: "First, develop the upper levels of structure into the major categories. Try not to have more than ten large subject areas; if you have more than that, it will make it difficult to navigate through the hierarchy. One structure might be to organize around major domains (products, human resources, geographies, for instance)."

(a) Top Level: Organizing Theme. Beyond the theory of taxonomy development, a few practical tips help. First, draw from existing classifications.

Second, look to the next level in the taxonomy, use technology to help see the patterns, and group the lower-level items by their common characteristics. I do this by creating a list of all agreement types and then running a word frequency analysis on the agreement names. I've been keeping a running list of agreement types. It is by no means complete, and it probably has duplicate entries and concepts. However, should there be an interest in collaboratively developing a taxonomy, I have made the list available here. And, for those who want to try the word frequency approach, there are many macros available on the web.
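
For readers who would rather not use a macro, the word frequency approach can also be sketched in a few lines of Python. This is a minimal illustration, not the method I actually used: the sample agreement names and the stop-word list are made up for the example, and a real run would read the full list of agreement types instead.

```python
import re
from collections import Counter

# Hypothetical sample; a real list would hold every agreement name collected.
agreement_names = [
    "Asset Purchase Agreement",
    "Stock Purchase Agreement",
    "Software License Agreement",
    "Trademark License Agreement",
    "Employment Agreement",
    "Lease Agreement",
]

# Words that appear in nearly every name and carry no classifying signal.
# This list is an assumption; tune it to the data.
STOP_WORDS = {"agreement", "contract", "and", "of", "the", "for"}


def word_frequencies(names):
    """Count how often each non-stop word appears across the agreement names."""
    counts = Counter()
    for name in names:
        words = re.findall(r"[a-z]+", name.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


freq = word_frequencies(agreement_names)
for word, count in freq.most_common(5):
    print(f"{word}: {count}")
```

Words that recur across many names, such as "purchase" and "license" in this sample, are candidates for grouping the agreement types at the next level up.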

(b) Contract Types

The second level of the taxonomy can be an organized list of agreements. This list might capture key variations, such as:

  • nature of the parties (individual, trust, corporation, partnership etc.);
  • nature of the asset or consideration (cash, real estate, stock, intellectual property);
  • nature of the transaction (purchase, exchange or license).

(c) Clause Library

The third level lists the clause elements of each contract type. Here variations might capture:

  • Party weighting or bias, e.g. an employer-weighted severance clause
  • Geographic or jurisdictional variation
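
One way to picture the three levels described above is as nested records: an organizing theme holds contract types, and each contract type holds clauses that carry their own variations. The sketch below is only an assumption about how such a structure might look; the class names, the field names, and the sample theme, clauses and jurisdiction are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Clause:
    name: str
    bias: str = "neutral"        # party weighting, e.g. "employer" or "employee"
    jurisdiction: str = "any"    # geographic or jurisdictional variation


@dataclass
class ContractType:
    name: str
    clauses: List[Clause] = field(default_factory=list)


@dataclass
class Theme:
    name: str
    contract_types: List[ContractType] = field(default_factory=list)


# Hypothetical sample data: one theme, one contract type, three clause variants.
employment = ContractType(
    "Employment Agreement",
    [
        Clause("Severance", bias="employer"),
        Clause("Severance", bias="employee"),
        Clause("Non-compete", jurisdiction="California"),
    ],
)
taxonomy = [Theme("Employment", [employment])]


def find_clauses(themes, clause_name, bias=None):
    """Return (theme, contract type, clause) triples matching a name and optional bias."""
    matches = []
    for theme in themes:
        for ctype in theme.contract_types:
            for clause in ctype.clauses:
                if clause.name == clause_name and (bias is None or clause.bias == bias):
                    matches.append((theme.name, ctype.name, clause))
    return matches


hits = find_clauses(taxonomy, "Severance", bias="employer")
for theme_name, type_name, clause in hits:
    print(theme_name, ">", type_name, ">", clause.name, f"({clause.bias}-weighted)")
```

Keeping bias and jurisdiction as fields on the clause, rather than as extra levels in the hierarchy, is one possible design choice; it keeps the tree at three levels while still letting the variations be searched.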


The agreement type and clause levels of the taxonomy can be based on empirical observations: what types of contracts exist, and what do they contain? The top-level organization presents a much greater challenge and is perhaps best realized through discussion that identifies an emerging consensus.


