Draft OntologySummit2007 Framework Assessment Criteria (12KZ)
Initial draft by KenBaclawski / 2007.06.25 (12L0)
This document is an attempt to provide criteria for assessing the framework dimensions of an ontology. The criteria given here were based on the OntologySummit2007_Communique, the evaluations done during the OntologySummit2007, and other sources. All dimensions use a scale of 1 to 5 for the sake of uniformity. (12L1)
- Expressiveness is a property of the knowledge
representation language which describes the extent
and ease with which the KRL can describe increasingly
complex semantics, cf. propositional logic,
description logic(s), first order logic, sorted
logics, modal logics, ... (12L2)
- Level 1: Glossary. Catalogs, controlled vocabularies, dictionaries and glossaries are all included. Tag ontologies are also on this level. (12L3)
- Level 2: Thesaurus. Terms are related by synonymy and antonymy, and they may be organized in a "broader than"/"narrower than" hierarchy. Levels 1 and 2 are informal. (12L6)
- Level 3: Class and Structure Hierarchies. Terms represent formal classes. This is the lowest level that can be considered formal (in the sense of being defined using mathematical structures). XML schemas and relational database schemas are on this level, but an ER diagram would be on level 4. (12L9)
- Level 4: Properties. This level includes frame-based languages, RDF and ER diagrams. (12LC)
- Level 5: Logic. OWL and FOL ontologies are all on this level. (12LF)
- The levels were obtained by combining pairs of levels in the ontology continuum of DeborahMcGuinness. (12LI)
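The expressiveness levels above can be read as a simple lookup from the kind of artifact to a level. The following is a minimal sketch of that reading, not part of the original criteria: the names `EXPRESSIVENESS_LEVELS` and `expressiveness_level` are hypothetical, and the table holds only the examples named in the level descriptions.

```python
# Hypothetical lookup table built only from the examples named in the levels above.
EXPRESSIVENESS_LEVELS = {
    1: ("Glossary", {"catalog", "controlled vocabulary", "dictionary",
                     "glossary", "tag ontology"}),
    2: ("Thesaurus", {"thesaurus"}),
    3: ("Class and Structure Hierarchies", {"xml schema",
                                            "relational database schema"}),
    4: ("Properties", {"frame-based language", "rdf", "er diagram"}),
    5: ("Logic", {"owl", "fol"}),
}


def expressiveness_level(kind):
    """Return the expressiveness level (1 to 5) for a named kind of artifact."""
    key = kind.strip().lower()
    for level, (_name, kinds) in EXPRESSIVENESS_LEVELS.items():
        if key in kinds:
            return level
    # Anything not listed in the criteria needs a human judgment.
    raise ValueError(f"no level listed for {kind!r}")
```

For example, `expressiveness_level("ER diagram")` gives 4, while an XML schema is placed one level lower, matching the distinction drawn in Level 3.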
- Structure is a property of the ontology, which
records how elaborate (or well organized)
the semantics encoded by the ontology are.
It may be the same as the expressiveness of
the KRL in which the ontology is encoded, or it may
be less than the expressiveness of the knowledge
representation language. Thus a simple taxonomy,
e.g., a tree, may be encoded in RDF/S, a description
logic language such as OWL-DL, or first order logic,
e.g., Common Logic. Viewed from a graph-theoretic
perspective, the level of structure might be
a simple set of terms (glossary), a
tree structure (taxonomy), a directed acyclic
graph, e.g., a partial order (faceted classification
schemes), or an arbitrary directed graph (e.g., RDF). (12LJ)
- Level 1: Informal; unstructured. For example, folksonomies. (12LK)
- Level 2: Low structure. For example, dictionaries, glossaries. (12LL)
- Level 3: Medium structure. For example, taxonomies based on broader/narrower rather than subclass. (12LM)
- Level 4: High structure. For example, faceted classification schemes. (12LN)
- Level 5: Formal structure. For example, directed graphs. (12LO)
- This dimension differs from expressiveness in that it measures how well organized the encoded semantics are. The levels are therefore the same as the expressiveness levels, except that they are shifted by one level to better fit the description in the communique. The evaluation for this dimension cannot be higher than the corresponding expressiveness level. The reason for having a second dimension is to deal with unstructured ontologies that are specified in a highly expressive language (e.g., a folksonomy specified using OWL). (12LP)
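The graph-theoretic view of structure, together with the rule that structure cannot exceed expressiveness, can be sketched in code. This is only an illustration under stated assumptions: the function names and the `SHAPE_TO_LEVEL` mapping (set of terms = 2, tree = 3, directed acyclic graph = 4, arbitrary directed graph = 5) are my reading of the levels above, a forest in which every term has at most one parent is treated as a tree, and Level 1 (unstructured, e.g., a folksonomy) cannot be detected from graph shape alone and would have to be assigned by hand.

```python
from collections import defaultdict, deque

# Assumed mapping from graph shape to structure level (not stated in the criteria).
SHAPE_TO_LEVEL = {"set": 2, "tree": 3, "dag": 4, "digraph": 5}


def classify_shape(nodes, edges):
    """Classify a directed graph given as (parent, child) edges.

    Returns "set", "tree", "dag" or "digraph".
    """
    edges = set(edges)
    nodes = set(nodes) | {n for edge in edges for n in edge}
    children = defaultdict(set)
    indegree = {n: 0 for n in nodes}
    for parent, child in edges:
        children[parent].add(child)
        indegree[child] += 1
    if not edges:
        return "set"  # terms with no links between them

    # Kahn's algorithm: if some node is never released, there is a cycle.
    remaining = dict(indegree)
    queue = deque(n for n, d in remaining.items() if d == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    if visited < len(nodes):
        return "digraph"
    # Acyclic: a tree (or forest) if no term has more than one parent.
    return "tree" if max(indegree.values()) <= 1 else "dag"


def structure_level(nodes, edges, expressiveness):
    """Structure level, capped by the expressiveness level of the language."""
    return min(SHAPE_TO_LEVEL[classify_shape(nodes, edges)], expressiveness)
```

A simple taxonomy encoded in OWL would be evaluated as `structure_level(terms, links, 5)` and still come out at 3, which is the situation the second dimension is meant to capture.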
- The granularity dimension concerns the level of
detail at which the ontology is specified.
A crude measure of granularity would
be the number of concepts (nodes) and the number
of relation instances (links or edges in graph
representations). However, this fails to
recognize that some ontologies may have larger
scopes (domains) than others. A coarse-grained
ontology might be suitable for use as an upper
ontology or a broad subject index, while a
fine-grained ontology (such as SNOMED CT with
300K concepts) may be better suited for
encoding medical diagnoses. (12LQ)
- Level 1: Very coarse; limited. Broad subject index with around 10 or 20 classifications. (12LR)
- Level 2: Coarse. For example, an upper ontology with about 100 classes. (12LS)
- Level 3: Medium. Ontologies with 1K to 10K classes. (12LT)
- Level 4: Fine. Ontologies with 10K to 100K classes. (12LU)
- Level 5: Very fine. Ontologies with 100K or more classes. (12LV)
- This dimension can be measured in many ways. The size and density may be the most useful: (12LW)
- The number of nodes, links and/or axioms (shown above). This measures the size of the ontology. (12LX)
- Relative size with respect to the scope or domain. This takes into account the size of the domain. However, it is not clear how one measures the size of the domain. (12LY)
- The average density (e.g., average number of axioms per term). For the less formal ontologies it is the average density of connections at each node or term. Thus a term catalog has density 0, a glossary or tag ontology has density 1, a simple taxonomy has density 2, and so on. It differs from expressiveness in that it measures the extent to which features are actually used rather than whether features are available. (12LZ)
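The size and density measures above can be sketched as two small functions. This is a rough illustration rather than a definitive scoring rule. The level descriptions leave gaps (between roughly 20 and 100 classes, and between 100 and 1K), so the cut points at 50 and 1,000 are assumptions, as is placing exactly 10K classes in Level 4 rather than Level 3. The names `granularity_level` and `average_density` are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of Levels 2 to 5. The first two cut points are assumptions
# that fill gaps in the level descriptions; 10K and 100K come from the text.
LEVEL_CUTOFFS = [50, 1_000, 10_000, 100_000]


def granularity_level(num_classes):
    """Map a class count to a granularity level from 1 to 5."""
    if num_classes < 0:
        raise ValueError("class count cannot be negative")
    return bisect_right(LEVEL_CUTOFFS, num_classes) + 1


def average_density(num_terms, num_links):
    """Average number of links (or axioms) per term."""
    if num_terms <= 0:
        raise ValueError("an ontology needs at least one term")
    return num_links / num_terms
```

With these cut points, SNOMED CT at 300K concepts gives `granularity_level(300_000) == 5`, and a term catalog with no links gives a density of 0, as in the description of density above.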
- Intended use is the dimension which records the
original purpose(s) of the ontology. These may include
semantically informed search, data semantics specification
for databases or data entry, data integration across
multiple data sources, agent communication languages,
controlled vocabularies for recording medical diagnoses,
etc. (12M0)
- Level 1: Multiple intended uses (12M1)
- Level 2: Two intended uses (12M2)
- Level 3: Classification; search; retrieval (12M3)
- Level 4: Interoperability; integration (12M4)
- Level 5: Mathematics; system specification (12M5)
- There is no meaning to the ordering of the levels. I ordered them in this way so that there is some correlation with the other dimensions. (12M6)
- Comments: (--PeterYim / 2007.06.26-04:19PDT) (12MZ)
- (a) Ken, can you elaborate on your L-1 & L-2, why 'Multiple intended uses' and 'Two intended uses', instead of, say, 'Single intended use:______' and 'Multiple intended uses'? (12N0)
- (b) rather than 'Level(s)', may I suggest using the label 'Type(s)' instead (for this dimension, especially, but possibly for others, or even all dimensions too!) (12N1)
- Response: (--KenBaclawski / 2007.06.27-20:53:42EDT) (12N2)
- (a) The problem is that some other dimensions are linear while this dimension does not seem at first to have such an interpretation. However, perhaps it does. When the intended use is for a system specification, then the ontology is highly focused on that application. I set this as the highest level (or should it be the lowest??) Interoperability and integration are less focused and broader uses, but they still have a focus on a class of applications. So I set them at the next lower level. Search and retrieval are even less focused, so I set them at the next level. Finally, if the intended use includes several of these, then the purpose of the ontology is still less focused. So these were assigned to the lowest levels. (12N3)
- (b) The dimensions do seem to split between three that have a linear structure (expressiveness, structure and granularity) and four that do not seem to fit very well as linear dimensions. Attempting to fit them all on the same scale is a bit procrustean, but any other uniform mechanism would also have this problem. At least the dimensions that do work as linear dimensions should be kept that way. (12N4)
- Automated reasoning is a dimension which records
the extent to which it is anticipated that an ontology
will be used by automated reasoning software, e.g.,
for question answering, etc. If so, then one would
expect that the ontology would likely be encoded
using some form of logic, e.g., first order logic. (12M7)
- Level 1: No reasoning. (12M8)
- Level 2: Ad hoc reasoning. (12M9)
- Level 3: Some reasoning. (12MA)
- Level 4: Complex reasoning but not necessarily logical or rule-based. Reasoning is encoded with queries or procedures. (12MB)
- Level 5: Logical or rule-based reasoning. (12MC)
- A variety of automated reasoning strategies are currently being employed. The levels differ in the degree of complexity and sophistication. (12MD)
- Prescriptive vs. Descriptive is a dimension which characterizes whether the intent of the ontology developer is simply to describe contemporary semantic usage without much regard as to the scientific correctness of the encoded knowledge (e.g., a whale might, in common parlance, be described as a large fish). Examples of such descriptive ontologies include folksonomies and most linguistic ontologies. Alternatively, an ontology may be intended as a normative prescriptive document whose correctness is of considerable concern, e.g., a whale is a mammal, not a fish. Other prescriptive ontologies include medical diagnostic terminologies, legal or regulatory ontologies, accounting ontologies, mathematical or engineering ontologies, etc. (12ME)
- The governance dimension addresses how
decisions concerning the structure and (especially) content
of an ontology are made. There was agreement at
the summit that ontologies with legal or regulatory implications will need to defer to
existing legal, regulatory, and professional organizations
concerning the natural language definitions
of entities and semantic relationships.
Ontology development should be viewed
as an effort to organize and formalize concept definitions
and relationships which are conventionally defined by
existing institutions, not as an attempt to replace
existing definitions with de novo definitions generated
by autonomous computer scientists. As a corollary,
it was observed that it is necessary to record the
provenance of every definition, etc. incorporated into
an ontology, e.g., the controlling legislation, regulation,
standard, etc. from which a definition is taken. (12ML)
- Level 1: Casual. No organization controls any aspect of the ontology. (12MM)
- Level 2: (12MN)
- Level 3: Controlled. While terms in the ontology (syntax) are controlled, the semantics is not. A controlled vocabulary would be at this level. (12MO)
- Level 4: (12MP)
- Level 5: Normative. Both the ontology and its semantics are tightly controlled. (12MQ)
-- This page is maintained by: KenBaclawski. Please post any comments about the content as a subtopic within each framework dimension. (12MR)