ontolog-forum

Re: [ontolog-forum] Simplifying the language and tools for teaching and

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Matthew West" <dr.matthew.west@xxxxxxxxx>
Date: Fri, 4 Jan 2013 10:40:08 -0000
Message-id: <50e6b187.ec5ab40a.253f.ffffbe1f@xxxxxxxxxxxxx>
Dear John and William,    (01)

> Before saying anything further about the details, I want to emphasize
> that Matthew West's HQDM ontology is actually quite good of its kind.
> He worked at Shell in using the EXPRESS notations to develop an upper
> level ontology that was successfully used in a wide range of
> applications.
> For a brief summary of EXPRESS, see
> 
>     http://en.wikipedia.org/wiki/EXPRESS_(data_modeling_language)
> 
> EXPRESS is a text notation.  The graphic subset (EXPRESS-G) represents
> type-subtype and Entity-Relationship diagrams.  To support full FOL,
> the text form adds a WHERE-clause that can state arbitrary constraints.
> 
> In principle, this is a very good combination.  The WHERE-clause in
> EXPRESS is far more readable than the UML Object Constraint Language.
> For ontology, this combination of diagrams + text is more flexible and
> readable than OWL 2.  I recommend it for OWL 3.
> 
> I apologize for extracting an excerpt from a note by Matthew and
> posting it in a comment to this forum without giving enough context.
> But William's reactions illustrate the point that a better choice of
> terminology would be desirable.
> 
> WF
> > these examples seem to be so contrived and unnatural that I wonder
> > about the enterprise they are related to.
> 
> Following is another excerpt from a note by Matthew:
> 
> MW
> > I am more than 4 years out of Shell now, and the Downstream Data Model
> > was certainly Shell Confidential. However, my book "Developing High
> > Quality Data Models"
> > http://store.elsevier.com/product.jsp?isbn=9780123751065
> > provides a further development of some parts of that work. The data
> > model (but not the explanation) from that is also available on the
> > web at:
> > http://homepages.rya-online.net/matthew-west/hqdm_framework/
> 
> Click on the last link and then click on HQDM Framework.  That will
> show 229 links for entity types.  Click on any of them for more detail,
> which is stated in the EXPRESS text notation.    (02)

MW: And if you click on the graphic symbol before the name, you will get an
EXPRESS-G diagram.
> 
> HQDM
> > kind_of_activity
> > A class_of_activity all of whose members are of the same kind.
> 
> JFS
> > Much simpler:
> > kind_of_activity:  a one-place relation that is true of every
> > activity of the same kind.
> 
> WF
> > I agree with JS, vis-a-vis predicates vs. 'classes'
> > But what seems to me to be most fundamentally wrong in this
> > discussion is the notion that there is a good reason to define
> > 'kind of activity' separately from 'kind of stone' or 'kind of hope'.
> 
> Yes.  The word 'class' is useless baggage that creates more confusion
> than it's worth.    (03)

MW: Not at all. You need to have a way of dealing with different levels of
classification. It is helpful to have a regular way to do this in an
ontology. You can use different conventions than in ISO 15926 or HQDM, but
if you use none, you are likely to create something very confusing.    (04)

MW: The problem is that in ordinary English we often use words ambiguously.
For example, you can write:
Singing is an activity(1).
And you can write:
Matthew West singing Baa Baa Black Sheep to his grandson on Wednesday 2nd
January 2013 is an activity(2).    (05)

MW: The problem is that activity here has two different senses: in one
sense (activity(2)) it refers to something happening, in the other
(activity(1)) it refers to a class of things happening. Interestingly,
Singing is an instance of activity(1), but not a subtype of it, and a
subtype of activity(2), but not an instance of it. So it is practically
important to distinguish between these two senses, and to make the
distinction as easy as possible to teach, and as easy as possible for
superusers and domain experts to understand and apply.    (06)

MW: In ISO 15926 we broadly adopted the convention that subtypes of
possible_individual (spatio_temporal_extent in HQDM) are named activity
(activity(2)), physical object, etc., and are paired with
class_of_spatio_temporal_extent, class_of_activity (activity(1)), etc. This
was not an exercise in elegance, but an exercise in avoiding ambiguity,
which turns out to be much more important.    (07)

MW: Alternatively, we might have gone for activity instance, and activity.
What was important was that we made a choice, and stuck to it.
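
As a rough illustration of the two senses, here is a minimal Python sketch. It assumes a Python metaclass is an acceptable stand-in for class_of_activity; the names ClassOfActivity, Activity and Singing are hypothetical and are not taken from the ISO 15926 or HQDM schemas.

```python
# Hypothetical names for illustration; not the ISO 15926 or HQDM schema.

class ClassOfActivity(type):
    """activity(1): each instance is a class whose members are activities."""


class Activity(metaclass=ClassOfActivity):
    """activity(2): each instance is one particular thing happening."""

    def __init__(self, description: str) -> None:
        self.description = description


class Singing(Activity):
    """A kind of activity, declared as a subtype of Activity."""


performance = Singing("Matthew West singing to his grandson, 2 January 2013")

# Singing is an instance of activity(1), but not a subtype of it.
singing_is_instance_of_class_of_activity = isinstance(Singing, ClassOfActivity)
singing_is_subtype_of_class_of_activity = issubclass(Singing, ClassOfActivity)

# Singing is a subtype of activity(2), but not an instance of it.
singing_is_subtype_of_activity = issubclass(Singing, Activity)
singing_is_instance_of_activity = isinstance(Singing, Activity)

# The particular performance is an instance of activity(2).
performance_is_activity = isinstance(performance, Activity)
```

The first and third checks come out true and the second and fourth false, which is the pattern described for Singing above. Python keeps the two levels apart through its type/instance machinery; a data model has to do it with a naming convention instead.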
> 
> MW
> > I might not choose class if I had my time again, but in ISO 15926
> > that is history now, and changing it would be more confusing than
> > leaving it the same.
> 
> Unfortunately, the people who defined ISO 15926 made a poor choice.
> But the EXPRESS notation does not require the word 'class'. Avoiding
> that word should not violate ISO 15926.    (08)

MW: Obviously I disagree. I think you completely misunderstand the way the
word is being used. What would you suggest in its place? Just as we had to
make a choice about how to name different levels of abstraction, there is no
choice but to pick some word from class/type/set/category/kind/sort/??? to
distinguish them. But perhaps you think there is only ever a need for one
level of abstraction by classification? If I had my time again I would
probably choose set rather than class, but to be honest it is not that
important. The only people who care about these words are people with
philosophical baggage who want to reserve these words for particular
philosophical purposes. The users do not have this philosophical baggage, so
considerations like which word has the fewest letters come to the fore
(you're going to write it a lot of times).    (09)

MW: In ISO 15926 we found that we routinely needed 3 levels of
classification, and occasionally 4 levels of classification. If you were
using a logic-based approach you would be able to do the same thing
generally with one less level.    (010)

MW: So let's look at where these levels come from.
The base is spatio-temporal extent and its subtypes. However, we do not
attempt to provide all the subtypes of spatio-temporal extent. It would just
not be practical. We have identified over 50,000 (so far) relevant to the
oil industry, I estimate several million once you widen the scope, and there
is no realistic chance of identifying all of them. The question then is how
to accommodate future expansion in a consistent way. You could just say you
have to update the data model each time you come across a new subtype, but
that is in principle quite expensive: it suggests new tables and new code to
support them. An alternative is to accommodate the new subtypes as data,
which can simply be added (in commercial systems this is called Master
and Reference Data).    (011)
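
A minimal sketch of holding subtypes as data, assuming a simple in-memory store in place of a real database. The record types, the field member_of and the helper add_class_of_activity are hypothetical names chosen for illustration, not part of HQDM or ISO 15926.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassOfActivity:
    """One record of reference data: a subtype of activity held as data."""
    name: str


@dataclass(frozen=True)
class Activity:
    """One record about a particular activity and the class it belongs to."""
    description: str
    member_of: ClassOfActivity


# Reference data store (hypothetical): new subtypes become new records here,
# so the two record types above never have to change.
reference_data = {}


def add_class_of_activity(name: str) -> ClassOfActivity:
    """Register a subtype of activity, reusing the record if it already exists."""
    return reference_data.setdefault(name, ClassOfActivity(name))


singing = add_class_of_activity("singing")
pumping = add_class_of_activity("pumping")

performance = Activity(
    "Matthew West singing to his grandson, 2 January 2013",
    member_of=singing,
)
```

Adding the 50,001st subtype is then one more call to add_class_of_activity, with no new table and no new code.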

MW: To do this, you can use the idea of a powerset. The powerset is the set
of all subsets of a set. So class_of_activity can be thought of as the
powerset of activity. Now I have a data structure where I can store the
subtypes of activity without having to add them as subtypes to the data
model. Generally, the subtypes of spatio-temporal extent need some
organization, and this is done with the powerset of
class_of_spatio_temporal_extent, which logically enough is
class_of_class_of_spatio_temporal_extent.    (012)
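
The powerset idea can be sketched with plain Python sets. This is a toy illustration under the assumption that activity has only three members; the member names and the grouping kinds_in_this_example are invented for the example.

```python
from itertools import chain, combinations


def powerset(items):
    """Return the set of all subsets of items, each as a frozenset."""
    members = list(items)
    subsets = chain.from_iterable(
        combinations(members, size) for size in range(len(members) + 1)
    )
    return {frozenset(subset) for subset in subsets}


# Toy extension of activity: three particular activities (invented names).
activity = frozenset({"singing_on_2_jan", "singing_on_3_jan", "pumping_on_2_jan"})

# class_of_activity is the powerset of activity: every subset is a member.
class_of_activity = powerset(activity)

singing = frozenset({"singing_on_2_jan", "singing_on_3_jan"})
pumping = frozenset({"pumping_on_2_jan"})

# singing is a member of class_of_activity and a subset of activity.
singing_is_member = singing in class_of_activity
singing_is_subset = singing <= activity

# One level up: a grouping of classes is a member of the powerset of
# class_of_activity, i.e. of class_of_class_of_activity.
class_of_class_of_activity = powerset(class_of_activity)
kinds_in_this_example = frozenset({singing, pumping})
grouping_is_member = kinds_in_this_example in class_of_class_of_activity
```

With three activities, class_of_activity has 8 members and class_of_class_of_activity has 256. A real system would not enumerate the powerset; the entity type only provides a place to record the subsets that matter.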

MW: The question then is where to make the divide between the data model and
the data stored in it. For both ISO 15926 and HQDM this division is made at
the paradigm level. So the data model contains what is necessary to the
overall paradigm at an ontological level, with the domain level ontology
being held as data. One advantage of this is that the data model is
suitable for developing a family of ontologies that will be compatible (with
a little care): it imposes the things that need to be shared to achieve
compatibility, but does not make choices at the domain level.
> 
> MW
> > The principal purpose of an upper ontology is that you relate the
> > terms of domain experts and SMEs to that upper ontology together,
> > thereby bringing together similar concepts, and distinguishing
> > different uses of the same terms. You can then also apply templates
> > from the upper ontology to the domain terms and improve the
> > consistency of the ontology at the domain level.
> 
> I agree that the mid-level terms used by the Shell engineers must be
> supported by the upper ontology. That requirement is non-negotiable.
> 
> But the Shell engineers don't use the word 'class'.  I don't know how
> ISO 15926, the EXPRESS and EXPRESS-G notations, and the terminology of
> the Shell engineers are interrelated.  But I suspect that there should
> be some way to simplify the definitions to eliminate that word.    (013)

MW: No, you just misunderstand what the ontology is doing and how it is
structured. Generally, engineers would not see the data model, they would
only be dealing with data held within it, and generally only looking at
supertypes and subtypes and properties relevant to them. The purpose of the
data model is to enable those things.    (014)

Regards    (015)

Matthew West                            
Information  Junction
Tel: +44 1489 880185
Mobile: +44 750 3385279
Skype: dr.matthew.west
matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx
http://www.informationjunction.co.uk/
http://www.matthew-west.org.uk/    (016)

This email originates from Information Junction Ltd. Registered in England
and Wales No. 6632177.
Registered office: 2 Brookside, Meadow Way, Letchworth Garden City,
Hertfordshire, SG6 3JE.    (017)




_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J    (018)
