
Re: [ontolog-forum] Difference between XML and OWL

To: "John F. Sowa" <sowa@xxxxxxxxxxx>
Cc: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Fri, 24 Oct 2008 19:20:06 -0400
Message-id: <49025826.4070208@xxxxxxxx>
John F. Sowa wrote:    (01)

> EB> But DL's modern academic claim to fame is the computationally
>  > bounded behavior of tableaux reasoners.
> That may be useful for writing PhD dissertations, but UML is far
> more important for commercial applications.     (02)

I did say "academic" claim to fame.  But what motivated the choice of 
DLs in the SemWeb community was the predictability.  The importance of 
OWL to commercial applications is the cachet of W3C, and the presence of 
many supporting tools (of varying quality).    (03)

You are right about UML, but XML Schema is even more important in many 
commercial applications.  And the great advantage of OWL is that it is 
vastly better than XML Schema as an information modeling language.  And 
as a consequence of bearing the magic W3C stamp of approval, and being 
of academic repute, OWL is gradually supplanting XML Schema as an 
information modeling language in the (otherwise illiterate) XML 
communities.  And that can only be good.    (04)

> A glaring omission of
> the SemWeb effort is that they did not build on UML and RDBs.    (05)

No comment.  One pejorative view is that W3C (a kind of reverse 
synecdoche) couldn't own those.  At the same time, it must be observed 
that the UML world in 2000 was dominated by people who could only think 
in Java and C++.  And we should be thankful that the SemWeb wasn't built 
on that kind of thinking.    (06)

> An ontology must support *all* the languages and tools for an entire
> enterprise or even an entire industry.    (07)

Here we disagree completely.
Who is going to tie that bell around the cat's neck?    (08)

I see many worthwhile ontologies being built to address a well-defined 
knowledge domain, which has application across some swath of industry, 
but is not by any means aimed at all the languages and tools for that 
industry, or even all the languages and tools for that domain.    (09)

I also see many worthwhile ontologies being built to address a 
well-defined communications problem, which has application across some 
swath of industry, or just some segment of a specific enterprise.  These 
do not cover more than a small area of the enterprises that are 
involved, and yet they can have multimillion dollar/yen/euro impacts on 
every player involved.    (010)

These domains have to be small and well-defined in order to get 
agreement on the content of the ontology.  The larger the scope and the 
less clear the objective, the longer it takes to get one fold and one 
shepherd.    (011)

A cross-ocean 'trade lane' -- one parts supplier to one major 
manufacturer (an example of direct interest to us) -- can involve over 
20 players, including both manufacturers, customs organizations, ocean 
carriers, port authorities, logistics service providers, warehousing, 
rail freight, insurers, etc., and their various software providers.  It 
takes a lot of "joint work" to get a useable ontology for that small 
segment of industrial supply.  So we are reluctant to expand the scope 
any more than the manufacturing participants demand.    (012)

> Knowledge capture from a domain expert and the methods for using
> that knowledge for a particular problem are related, but distinct.
> The former is ontology development, and the latter is technically
> called "programming."    (013)

Obviously we have different ontologies for this.  To me, capturing 
knowledge from domain experts is "problem space analysis", and using 
that knowledge to devise a solution to a set of problems is called 
"design".  They are both elements of "engineering", sometimes "systems 
engineering", sometimes "software engineering", depending on the nature 
of the solution space.  One possible form of software engineering is the 
direct encoding of some part of the captured knowledge in a form to be 
used by some class of automated reasoning tool, and that is "knowledge 
engineering", and it may bridge the analysis and design activities. 
"ontology development" is a subset of, or a synonym for, "knowledge 
engineering", depending on your religion.  "programming", to me, is 
rendering the design for a solution into a 'platform-specific' 
implementation of it.  And we can argue about the level of abstraction 
at which design becomes programming.    (014)

This is why it takes a lot of work to build an ontology with 20 
interested contributors.  We all start with different ones.    (015)

> Although OWL is supposed to be a scrambled acronym for Web Ontology
> Language, the numerous "work arounds" necessary to make it useful
> have caused it to be more of a programming language than a logic.
> More properly, it should be called a "logic programming language"
> that happens to use a different subset of logic than Prolog.    (016)

This is your ontology, and your criteria for satisfaction of the terms 
you use for the classifier relations.  Others will disagree.    (017)

>   But many of the UML diagrams
> are sufficiently general that they can be used with a wide range of
> languages and logics.  The type hierarchies, the E-R diagrams, and
> the activity diagrams are very general, and they can be useful
> notations for aspects of an ontology.    (018)

Absolutely, and I know a lot of modelers who do just that.  But the 
devil is in the details.  If you are challenged by a UML lawyer about 
the correctness of your use of some UML feature, you may find that the 
standard effectively says it means a particular RDB or Java construct, 
thinly disguised by some half-hearted attempt at formulating an 
abstraction.  That is what I meant about having to extend the 
interpretation of some UML concepts to a more abstract plane.    (019)

> But one of my major complaints about OWL is that it is far more
> narrow and more specialized than the UML diagrams.  I would prefer
> to use UML diagrams for ontology development than OWL.    (020)

Well, you can't express a classifier definition in a UML diagram; you 
can only write the definition in the documentation or possibly in some 
OCL "rules" that might really be axioms.  (OCL is yet another screwy 
logic language that is neither fish nor fowl.)  In OWL, you can express 
SOME classifier definitions directly, and that can have real value in 
clearly conveying intent.    (021)
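
To illustrate what "expressing a classifier definition directly" can buy, 
here is a minimal sketch in Python, not OWL itself.  The class names 
(Carrier, Vessel, OceanCarrier), the property "operates", and all the 
helper functions are hypothetical, invented only for this example.  It 
treats a definition as a necessary-and-sufficient membership condition 
that a tool can evaluate, roughly "OceanCarrier = Carrier and operates 
some Vessel".  Unlike a DL reasoner, it checks only asserted facts 
(closed world), so it shows the idea and not OWL semantics.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    name: str
    types: set = field(default_factory=set)    # asserted class names
    props: dict = field(default_factory=dict)  # property name -> list of Individuals


def named(cls):
    """Condition: the individual is asserted to be a member of `cls`."""
    return lambda ind: cls in ind.types


def some_values_from(prop, cls):
    """Condition: at least one `prop` value is asserted to be a `cls`."""
    return lambda ind: any(cls in v.types for v in ind.props.get(prop, []))


def intersection(*conds):
    """Condition: every component condition holds."""
    return lambda ind: all(c(ind) for c in conds)


# Hypothetical definition: OceanCarrier = Carrier and (operates some Vessel)
DEFINITIONS = {
    "OceanCarrier": intersection(named("Carrier"),
                                 some_values_from("operates", "Vessel")),
}


def classify(ind):
    """Return asserted types plus every defined class whose condition holds."""
    inferred = {name for name, cond in DEFINITIONS.items() if cond(ind)}
    return ind.types | inferred


ship = Individual("ship1", {"Vessel"})
acme = Individual("acme", {"Carrier"}, {"operates": [ship]})
railco = Individual("railco", {"Carrier"})
```

With these assumed facts, classify(acme) includes "OceanCarrier" and 
classify(railco) does not.  In a UML diagram the same condition would 
have to live in the documentation or in an OCL constraint.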

> EB> You will not be pleased to know that I think of the Zachman
>  > Framework as a useful insight into "systems analysis" IN ITS TIME,
>  > but the discipline has clearly moved on...
> I am pleased, because I agree.     (022)

We do find strange points of agreement.    (023)

> I wrote that article with Zachman
> in order to demonstrate how any such framework can be mapped to
> logic.  My claim is that all such diagrams are declarative ways of
> highlighting certain relations, which can be mapped to a general
> notation for logic, such as CL (or the CG subset in that article).    (024)

I was aware that that was your primary concern in the paper.  And I 
remember thinking at the time that those points were clearly made.    (025)

> My definition of 'logic programming' is tailoring a general
> logical representation to optimize it for a particular
> inference engine.  That is an important kind of work, most
> of it is currently done by hand, but more of it can and
> should be done by automated and semi-automated means.    (026)

Then we agree on the significance of the activity.  I understand 'logic 
programming' to refer, as it usually does in academic circles, to 
capturing knowledge in languages specifically for Prolog(-like) and/or 
Rete(-like) engines.    (027)

> JFS>> The Semantic Web should encompass at least as much as Zachman
>  >> considered more than 20 years ago.
> EB> I have no idea what that means.
> In short, it means doing everything related to semantics.    (028)

Wow!  Even the craziest W3C people wouldn't go that far.
They do realize that the impossible takes a little longer.    (029)

> EB> We are talking about knowing what kind of reasoning we can do
>  > in a problem space, with all kinds of constraints, yes.
> That is logic programming.  Prolog is an excellent language for
> that purpose.  OWL is a half-vast language compared to Prolog.    (030)

As you say, there are two parts to everything.    (031)

> EB> What if the program was correct?  That is what bounding
>  > computational complexity is about -- knowing how long it will
>  > take a correct program to execute all the loops it must execute
>  > to get the intended result.
> You have swallowed one of the "talking points" that the academics
> who are trying to sell their dissertations have been bandying about
> for the past 20 years.  Programmers have been doing such things
> very well since the 1950s. (In fact, they did it much better in
> the '50s than today, largely because they had such small machines.)    (032)

For the record, the discipline predates "programmers" as we know them, 
but not by much.  The "programmers" of the 1930s and 1940s used 
electronic calculators.  And Ops research as a discipline emerged in the 
mid-1940s, necessity being the mother of invention.    (033)

> EB> I worked on operations research software that regularly generated
>  > enormous search spaces and pruned them and generated another set in
>  > its search for near optimal solutions.  And you learn to do bounding
>  > estimators, lest the search space explosions exceed the computational
>  > resources.  Seen solely from the outside, it is my impression that
>  > FOL reasoners have similar behaviors, and explosion is a very
>  > practical consideration in getting solutions.
> That is absolutely true.  But the brute-force technique of reducing
> the expressivity of the language does nothing to solve such problems.
> The most it can do is to make it impossible to state the problems.    (034)

Not quite.  What it can do, which we both discussed earlier, is force 
the engineer to cast the problem in a solvable way.  It is not 
'natural', and it may be extremely difficult to figure out what problem 
was actually solved as a consequence.  But that approach has been the 
bread-and-butter of mathematicians for a long time.  (Évariste Galois 
springs to mind.)  And it is the position description for "applied 
mathematician".    (035)

> That is like solving the problem of throwing out the baby with the
> bathwater by making it impossible to have babies.    (036)

:-)    (037)

> EB> Perhaps, but my point was that, primarily because of OWL, there is
>  > a lot more government and industry interest in the use of reasoning
>  > technologies than there was 10 years ago.
> That is one of my greatest fears.  The AI field has been plagued
> by boom-and-bust cycles for the past half century.  The booms are
> created by people who hype their software in order to get funding.
> That may get them some funds for a few years.  But when the
> promises fail to materialize, the bubble bursts, and the whole
> field gets a bad reputation.  For the SemWeb, there is vastly
> more hype than the technology can possibly satisfy.    (038)

Agreed.  But there are some good investments that are getting real 
results, because the field as a whole has matured.    (039)

Like you, many toolsmiths accept the OWL ontology, convert the captured 
knowledge into a form their tool can really work with, ADD the knowledge 
that couldn't be captured properly in OWL, and provide something of value.    (040)

No one really expects software to work all the time, except those who 
fly airplanes and drive cars.  If we did, we wouldn't be using Windows.
It only takes some really significant successes to sell large industry 
groups on the real value of a technology.  That was the story with RDBs, 
and it can be the story with knowledge engineering technologies, now as 
never before.    (041)

> EB> Excuse my ignorance, but how does a language run circles around
>  > another language?  Do you mean your Prolog engine, running your
>  > interpretation of OWL ontologies into Prolog, runs circles around
>  > every tableaux reasoner that accepts OWL?
> What we do is to translate any RDF or OWL we get into Prolog clauses
> and use the Prolog inference engine to do the reasoning.  The results
> were one to two orders of magnitude faster than the native OWL
> reasoners.      (042)

To be honest, I have never met a "native OWL reasoner", but presumably 
you mean tools like Protege and the Clark-Parsia tool and the Manchester 
tools.  (I have a mental block against DL tool names.)  This is a tall 
claim, and I assume you are working from observed or reported 
performances for the same ontologies.    (043)

(Bear in mind that this exploder has people who are overseeing contracts 
that are using, or even essentially based on, some of those tools.  So 
we can throw brickbats at languages and methodologies, with the 
inevitable YRMV, but making claims about tool performance is a much more 
sensitive issue.  Maybe we should just drop this.)    (044)

> As another data point, Arun Majumdar had a 4-week consulting project
> with a major telecom that had spent several person years of effort
> in trying to implement an RDF+OWL solution to a problem that had very
> tight timing constraints.  And they could not meet the requirements.
> What Arun did was to use CGIF as the representation instead of RDF
> and OWL.  He implemented a CGIF inference engine in Java that was
> adequate for the task.      (045)

Read: supports a particular subset of the expressiveness extremely well, 
by not worrying about the performance impact (or even support perhaps) 
of features or combinations that won't be used.  In effect, an efficient 
tool for a subset language.    (046)

We are currently engaged in development of an "IKL reasoner" with 
similar caveats.    (047)

I'm only guessing that this is the case, John, and I will be happy to be 
corrected in my reading between the lines.  But this is another version 
of constraining the language, and it is a common academic ploy.  Build 
an efficient tool for the parts of the language you use, or for the 
style of usage, and pay lip service to the rest.    (048)

> One of the main reasons why Prolog and CGIF are so much faster is
> that they are about 10 times more compact.  The RDF gang insist that
> you can use data compression to reduce the space, but inside the
> CPU, you have to process the expanded form.  If you use a notation
> that is 10 times smaller, you're likely to run 10 times faster.    (049)

If you operate directly on RDF triples inside your engine, then you can 
compress them to your internal representation of triples, but you still 
get gazillions of triples, yes.  But your Prolog rendition is hardly the 
only conversion of RDF triples to a more useful "native form".  Purist 
processing of an RDF triple store is to reasoning with OWL ontologies as 
the Turing machine was to the IBM 704.  But unlike the Turing machine, 
it is popular in academic circles, because a lot of children read the 
literature and implement verbatim.  They only need to get their degrees 
with toy ontologies, and they don't get extra credit for devising an 
efficient internal form.    (050)
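
As one illustration of what a "more useful native form" might look like, 
here is a small Python sketch.  It is an assumption-laden toy, not a 
description of any existing tool: the class TripleIndex, its methods, 
and the "ex:" terms are all hypothetical.  It interns each term as a 
small integer and indexes triples by (subject, predicate), so that a 
lookup does not scan the whole triple list.

```python
from collections import defaultdict


class TripleIndex:
    """Toy internal form for triples: interned terms plus an (s, p) index."""

    def __init__(self):
        self._ids = {}                   # term string -> integer id
        self._terms = []                 # integer id -> term string
        self._by_sp = defaultdict(set)   # (s_id, p_id) -> set of o_id

    def _intern(self, term):
        """Map a term to a stable integer id, allocating one if it is new."""
        if term not in self._ids:
            self._ids[term] = len(self._terms)
            self._terms.append(term)
        return self._ids[term]

    def add(self, s, p, o):
        """Store one triple in the (subject, predicate) index."""
        key = (self._intern(s), self._intern(p))
        self._by_sp[key].add(self._intern(o))

    def objects(self, s, p):
        """Return all objects for a subject and predicate, without scanning."""
        s_id = self._ids.get(s)
        p_id = self._ids.get(p)
        if s_id is None or p_id is None:
            return set()
        # .get() avoids creating empty entries in the defaultdict
        return {self._terms[o] for o in self._by_sp.get((s_id, p_id), ())}


idx = TripleIndex()
idx.add("ex:acme", "rdf:type", "ex:Carrier")
idx.add("ex:acme", "ex:operates", "ex:ship1")
idx.add("ex:acme", "ex:operates", "ex:ship2")
```

Here idx.objects("ex:acme", "ex:operates") returns both ships from a 
single dictionary lookup.  A real engine would need more indexes and a 
reasoning layer on top; the point is only that the verbatim triple list 
need not be the form the engine works on.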

> Those are indeed important issues, but the "talking points" that
> you mentioned above and many others that have been bandied about
> are either false or misleading.  That is one of the main reasons
> why I wrote the article "Fads and Fallacies About Logic":
>    http://www.jfsowa.com/pubs/fflogic.pdf    (051)

I looked at this a while back, and I thought it quite good.    (052)

Please understand that my concern is not to dismiss the OWL work 
out-of-hand.  I regard it as successful in some regards, inadequate in 
others, and largely untried.  You see it as weighed in the balance and 
found wanting.    (053)

> I mentioned Zachman because he had a broad perspective that
> shows the many different ways of looking at the same problems.
> The main point about his 30 different views is that they're
> a better approximation to infinity than the 6 diagrams of UML,
> or the one-trick pony of OWL.  But I certainly wouldn't claim
> that Z's 30 diagram types are adequate to model the infinity
> of perspectives that are possible.    (054)

UML is now up to something like 18, but only a few of them can be said 
to match any of Zachman's 30.  That was the gist of my concern -- that 
his perspectives are regrouped in the commonly used perspectives of 
enterprise modelers and software modelers these days.    (055)

It has been fun, John.  I think we have now driven away any possible 
interest in this discussion.  ;-)    (056)

-Ed    (057)

Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694    (058)

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."    (059)

