
[ontolog-forum] Some Comments on Descriptive vs. Prescriptive Ontologies

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Thomas Johnston <tmj44p@xxxxxxx>
Date: Sat, 14 Mar 2015 14:01:10 -0700
Message-id: <1426366870.18295.YahooMailNeo@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
To: Ontolog Discussion Group
From: Tom Johnston (new member)

I would like to comment on the current discussion about SMEs and ontologies.

I will mention my experiences with SMEs in meetings whose purpose was to clarify requirements for creating new databases and applications, or for extending the scope and functionality of existing ones. The development of ontologies other than those whose purpose is to facilitate the semantic interoperability of databases is outside the scope of these comments.

  1. To begin with, some terminology.

SME: subject matter expert.
A member of the business community (as opposed to someone in the IT department) who engages with the objects and events which will be represented in a new or an extended database, and with the processes that populate that database with instances of those objects or events.

Example: the objects customers and invoices, the events of issuing invoices and processing payments.

(Note: in the upper-level ontology I developed in my recent book “Bitemporal Data: Theory and Practice” (Morgan Kaufmann, 2014), objects and events divide the world between them; they are exhaustive of what there is, and nothing is both an object and an event. Objects come into existence, cease to exist and, while they exist, change from one state to a successive state by participating in events. I consider this a formalization of the upper-level folk ontology that is common to all relational databases. I relate this ontology to the mathematics of relational databases, and associate it with a basic referential semantics, in Chapters 3-6 of BDTP.)

BA: business analyst.
Usually a member of the IT department, but always someone familiar with the current infrastructure of databases and applications, and someone who “speaks the language” of data modelers, DBAs and programmers. It is the job of the BA to transform the initial statement of requirements that led to the creation of a specific IT project into a target set of requirements for a new or extended database managed by a new or extended set of applications, such that the SMEs agree that the resulting database/applications meet their requirements.

During the JAD sessions (see below), the initial statement of requirements will be transformed into a different set of requirements, one that is not simply the initial requirements stated in greater detail. The initial set of objects, events and transformations will be similarly transformed as the BA helps the SMEs recognize (a) ambiguities inherent in their original statement; (b) generalizations of their requirements that will do what they require and additional useful things besides; (c) restrictions on their requirements, where the current state of technology at the enterprise would make satisfying them unacceptably expensive; and (d) a sorting of initial requirements into do-now and do-later categories, based on dependencies among the requirements and on the need to keep the project on time and under budget (so that both the BA and the SMEs, whose names are most directly attached to the project, will look good to their bosses when the whole thing eventually moves into production).

JAD: joint application development (a somewhat outdated term).
A series of meetings between the BA and the SMEs in which the initial requirements for a project are transformed into the final statement of requirements.

  2. Next, a comment on SMEs.

It is this: SMEs generally do not know what they are talking about. To repeat: SMEs generally do not know what they are talking about.

I have reached this conclusion at the end of a career in which I have worked for, contracted with, or consulted for twenty-four different enterprises, over half of them Fortune 500 companies. During the latter half of that career, I worked as a business analyst and data modeler. In that role, I have led hundreds of hours of JAD sessions, met with SMEs outside of JAD sessions countless times, and exchanged volumes of emails with SMEs, all designed to transform their requirements into something that data modelers, DBAs and programmers could start to work on.

In that time, I have worked with:

Telecom engineers who could not define what a circuit is.
SMEs in all enterprises who could not define what a customer is.
SMEs in a manufacturing company who could not distinguish between work in progress and finished goods.

For anyone familiar with Plato's Socratic dialogues (the early and middle period dialogues), I can make my point like this: SMEs (Gorgias, Meno, Protagoras, etc.) are the interlocutors of Socrates (the BA) in those dialogues. Those SMEs are the ones who profess to know something about knowledge, justice, courage, etc. Socrates engages each of them in a dialogue which always ends with Socrates demonstrating, usually by eliciting a contradiction from his interlocutor, that the SME actually doesn't know what he claims to know.

But there is one difference between Socrates and today's BAs. Socrates is content (pleased, in fact, despite his protestations to the contrary) to show that his interlocutors don't know what they claim to know. Today's BA, however, cannot afford that luxury. Today's BA must somehow guide her SMEs from ignorance to knowledge, from vague, ambiguous, incomplete or otherwise inchoate initial statements of what they want to a final statement which will mediate between them and the developers who will implement their requirements.

One conclusion from all this is that the (ontologically-adept) BA must take a very active role in eliciting and clarifying definitions of the objects and events of concern to the enterprise. Her role must not be limited to tidying up around the edges of what the SMEs initially come up with as a requirements statement. She must not use a light touch. She must challenge her SMEs as aggressively as Socrates challenged the self-proclaimed experts he engaged with.

Is there any additional guidance I can suggest, other than these very general comments?

There is. I would like to suggest that before we begin eliciting ontological commitments from SMEs, we should clarify (a) what we are defining, and (b) what a definition is.

  3. What are we defining when we ask SMEs for definitions?

Let's take Customer as an example. In any enterprise, in any JAD session, with any group of SMEs, when we ask “What is a customer?” (the same “What is X?” question form as Aristotle's most basic ontological question, ti esti?), surely we must be asking for something besides a dictionary definition.

We don't need SMEs to formulate general definitions, whether they are do-it-yourself dictionary definitions, or definitions of nodes in a taxonomy whose ancestor nodes, in a direct line up to the root node, have already been defined. We are asking our SMEs what a customer of our enterprise is, that is, what a customer of our enterprise in fact is, not what the SMEs think a customer of our enterprise ideally should be. And when SMEs tell us what they think a customer of our enterprise actually is, we can be assured that their definition will be, at best, a gloss on the real definition, and a gloss that will be incomplete, vague, ambiguous, and replete with other semantic anomalies.

We should all be familiar with the distinction between Aristotelian definitions and Wittgensteinian definitions, the former often referred to as definitions by genus and specific difference, and the latter often referred to as “family resemblance” definitions. John Sowa's “knowledge soup” is replete with Wittgensteinian definitions. But what we want are Aristotelian definitions. And so, what are they? And why do we want them and not Wittgensteinian definitions?

The “Why” is the easier question. The reason we want Aristotelian definitions is that they are the kind of definitions needed to support the Semantic Web with formal ontologies. They are the kind of definitions needed to enable machine-mediated semantic interoperability across different databases.

In short, we want software that is querying customer data in two different databases to “understand” the differences involved, to understand what counts as a customer in one of those databases but not in the other, and vice versa. By this I do not mean the differences in the ordinary-language definitions provided by SMEs, the definitions I have called “glosses” on the real definitions. I mean the differences in the definitions of the sets known as relational tables of customer data in those databases.

Family resemblance definitions will not do, at least not until such definitions are formalized in some kind of fuzzy logic. So what are these Aristotelian definitions, these definitions by genus and specific difference, that we need to formulate instead?

I will assume that the notions of set and set member, as used in naive set theory, are understood. In those terms, an Aristotelian genus is a set, and an Aristotelian specific difference is a rule (a set membership criterion), which picks out the members of a subset of that set.

In any relationship of a set and its immediate superset, the immediate superset defines a universe of discourse from which the members of the set are chosen by means of that rule. For example, the set Customer will have (whether represented as such in a database or not) as an immediate superset the set Party, which we can think of as being the set of all those individuals or organizations with which our organization engages in some way.

This immediately excludes from the universe of discourse for Customer such things as dogs, cars, and also any persons or groups not able to enter into a legal agreement (which a customer relationship is). Now, to define what a customer of our enterprise is, all we need to do is to state the rule which picks out a subset from that universe of discourse.
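
As an illustration only, the genus-and-difference form of definition can be sketched in a few lines of Python. Everything named here is hypothetical: the Party record, its two fields, and the rule in is_customer are stand-ins for whatever an enterprise's policy manuals and code actually say, not a description of any real enterprise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Party:
    """A member of the genus (the universe of discourse for Customer)."""
    name: str
    has_signed_agreement: bool  # hypothetical policy condition
    credit_approved: bool       # hypothetical policy condition


def is_customer(p: Party) -> bool:
    """The specific difference: a set membership criterion (assumed rule)."""
    return p.has_signed_agreement and p.credit_approved


# The genus: every party our organization engages with in some way.
parties = {
    Party("Acme Corp", True, True),
    Party("J. Smith", True, False),
    Party("Beta LLC", False, False),
}

# The defined set: the subset of the genus picked out by the rule.
customers = {p for p in parties if is_customer(p)}
```

The point of the sketch is only that the definition has two parts, a superset and a rule, and that the defined set is always a subset of the superset.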

How do we discover this rule and, for that matter, this universe of discourse? Do we ask the SMEs? Well, the best of them may be able to point us in the right general direction. But there is a two-tiered source of precise information. The first tier is the enterprise's policy manuals, which define what must be the case before we will accept a person or organization as a customer. The second tier is the actual code which implements those policy statements.

To accept a person or organization as a customer is to add a row to the enterprise's Customer table representing that person or organization. If we wish to support the software-mediated (without human supplementation) interpretation of the similarities and differences between two sets of customer data stored in the databases of two different organizations, then this is what it must be. This is what a customer of our enterprise is – a subtype of a Party with whom we have entered into a customer relationship, a relationship subject to conditions stated in our policy manuals and implemented in our code.

But relational tables are a special kind of set. They are time-varying sets. Over time, new members of the set may be added, as new persons and organizations enter into a customer relationship with us. Over time, existing members of the set may be removed, as they cease to satisfy the criteria for remaining customers of ours.

Nonetheless, relational tables are well-defined sets, defined on a universe of discourse represented by their immediate supersets, and picked out of that universe by means of criteria for becoming members of those sets, for remaining members of those sets, and for ceasing to remain members of those sets. This is the formal expression, in relational databases, of the Aristotelian definition of ontological types, in which the rows in those tables represent instances of the types represented by their tables.
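
A minimal sketch of such a time-varying set, again in Python and again under stated assumptions: the Membership record and the customers_as_of function are hypothetical names, and the half-open interval convention (a party is a member from its start date up to, but not including, its end date) is a choice made for the example, not something stated above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Membership:
    party: str
    start: date                 # day the party met the criterion for becoming a customer
    end: Optional[date] = None  # day the party ceased to meet it; None means still a member


def customers_as_of(rows: Iterable[Membership], when: date) -> Set[str]:
    """Return the members of the Customer set on the given day."""
    return {
        r.party
        for r in rows
        if r.start <= when and (r.end is None or when < r.end)
    }


rows = [
    Membership("Acme Corp", date(2010, 1, 1)),
    Membership("J. Smith", date(2012, 6, 1), date(2014, 6, 1)),
]
```

Asking for the set on different days gives different answers, but on any given day the set is well defined: with the sample rows, J. Smith is a member at the start of 2013 and no longer a member at the start of 2015.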

Finding these definitions, which clearly can be done, is doing something a lot more concrete than talking to a group of SMEs with the objective of obtaining consensus definitions of such key terms as “customer”. It is against these real and set-theoretically precise definitions that the verbal definitions of SMEs are no more than glosses. And it is these real and set-theoretically precise definitions, not those verbal consensus definitions, that are the only kind of definitions that will be accessible to software which mediates the differences among different definitions associated with different databases, thus realizing the semantic interoperability promise of the Semantic Web.
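
To make the idea of software mediating between two definitions concrete, here is a small Python sketch. It assumes that both enterprises' membership rules can be evaluated over a shared universe of parties, which is a simplifying assumption; the two rules, the Party fields, and the function compare_definitions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Set


@dataclass(frozen=True)
class Party:
    name: str
    has_signed_agreement: bool
    has_made_purchase: bool


def customer_rule_a(p: Party) -> bool:
    """Enterprise A (assumed): a customer is a party with a signed agreement."""
    return p.has_signed_agreement


def customer_rule_b(p: Party) -> bool:
    """Enterprise B (assumed): a customer is a party that has made a purchase."""
    return p.has_made_purchase


def compare_definitions(
    universe: Iterable[Party],
    rule_a: Callable[[Party], bool],
    rule_b: Callable[[Party], bool],
) -> Dict[str, Set[str]]:
    """Report which parties count as customers under both rules or only one."""
    parties = list(universe)
    in_a = {p.name for p in parties if rule_a(p)}
    in_b = {p.name for p in parties if rule_b(p)}
    return {"both": in_a & in_b, "only_a": in_a - in_b, "only_b": in_b - in_a}


universe = [
    Party("Acme Corp", True, True),
    Party("J. Smith", True, False),
    Party("Walk-in Buyer", False, True),
]
report = compare_definitions(universe, customer_rule_a, customer_rule_b)
```

The comparison works only because each definition is an executable membership rule. A verbal consensus definition offers the software nothing to evaluate.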

So we have steered away from the dragon of Wittgensteinian definitions, and reached the safe fortress of Aristotelian definitions. To wit: the category Customer (of enterprise X) is represented by a relational table (hopefully named Customer, or something like it). A relational table is a set. A set is a collection of set members drawn from a universe of discourse such that the members of the set satisfy a specific set membership criterion. That membership criterion is expressed in policy manuals, and in the rules expressed in code that determine whether or not someone will be added to the Customer table.

Often, the universe of discourse is not defined, in which case it effectively defaults to “everything there is”. If, following Aristotle, we called that universe of discourse ousia, then the taxonomy we will construct over these sets/types/kinds/concepts will be very, very flat and very, very wide. But if that is the fact of the matter, for that enterprise, at that point in time, then so be it. The ontology implicated in the databases of that enterprise will in fact be very, very flat and very, very wide.

To summarize: I have been arguing that semantic interoperability is facilitated by a formal ontology (or at least taxonomy) which is a descriptive ontology, not a prescriptive ontology. And in most discussions of ontologies, especially discussions related to ontologies to facilitate semantic interoperability, I think the distinction between a descriptive ontology and a prescriptive one has been lost. I propose that we recover it, and begin to treat the ways and means of developing descriptive ontologies as a subject worthy of serious attention.

Prescriptive ontologies come into play, in my view, when our objective is to construct higher-level ontologies, for example industry-level ontologies. For these higher-level ontologies to play the role of facilitating semantic interoperability across those industries, each enterprise subscribing to the industry-level ontology must realize that its responsibility is not simply to pay lip service to the industry ontology. It is to begin the difficult work of adjusting its de facto ontologies, including the set membership rules for the sets represented as tables in its databases, so that those lower-level ontological categories (the ones corresponding one-to-one with its database tables) are consistent extensions of those higher-level ontologies.
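
One possible reading of “consistent extension”, offered as an assumption rather than a definition, is that every row in an enterprise's table must also satisfy the membership rule of the corresponding industry-level category. Under that reading, the adjustment work starts with finding the rows that do not. The rule, the column name, and the function below are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def industry_customer_rule(row: Row) -> bool:
    """Hypothetical industry-level criterion for Customer."""
    return bool(row["has_signed_agreement"])


def inconsistent_rows(table: List[Row], rule: Callable[[Row], bool]) -> List[str]:
    """Names of rows in the enterprise table that fail the industry-level rule."""
    return [str(row["name"]) for row in table if not rule(row)]


# The enterprise's de facto Customer table (sample data for illustration).
customer_table = [
    {"name": "Acme Corp", "has_signed_agreement": True},
    {"name": "Walk-in Buyer", "has_signed_agreement": False},
]

offenders = inconsistent_rows(customer_table, industry_customer_rule)
```

An empty result would mean the enterprise's Customer set is a subset of the industry-level category under this rule. A non-empty result identifies where either the enterprise's rule or its data would have to change.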

This is the basic, boots-on-the-ground work that is required to make prescriptive ontologies a reality. But the foundation from which we must begin is what ontological commitments are in fact, right now, in place in individual databases. The prescriptive work of integrating these de facto low-level ontologies, however, is not simply a bottom-up process of supertyping the types we begin with. It is a process of working with a well-developed upper-level ontology as well as a set of de facto low-level ontologies, combining top-down guidance towards an ideal goal with real-world realizations of ontological categories that have been proven, over time, to actually work.

Perhaps this is something of a Manifesto: a description of a research and development program guided by strong theoretical commitments and also by a commitment to objects and processes that are time-tested in the real world. I don't like the term “Manifesto”, simply because of its creaky 19th century feel. But I am proposing that we clearly distinguish descriptive from prescriptive ontologies, clearly recognize the importance of descriptive ontologies, and begin to formalize them in the manner described above.


