
Re: [ontolog-forum] Difference between XML and OWL

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Ed Barkmeyer <edbark@xxxxxxxx>
Date: Wed, 22 Oct 2008 12:47:17 -0400
Message-id: <48FF5915.9020206@xxxxxxxx>
John F. Sowa wrote:    (01)

> Your comments raise three kinds of issues:  theoretical points
> about computational complexity, compiler-based methods for
> dealing with them, and methodologies for the humans who use
> the KR languages.  All three of them are very important, but
> the idea that all three should be addressed by the brute force
> method of restricting the KR language is hopelessly misguided.    (02)

Well, John, then DARPA and others have poured a lot of money into 
hopelessly misguided research in the area of knowledge engineering, and 
a very large group of would-be ontologists is doing hopelessly 
misguided things that have nonetheless proved valuable.  Let's start 
with Horn clauses and rule engines and work our way through description 
logics to FOL reasoners.  It seems to me they all restrict the KR 
language.    (03)
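
Just to put a number on what one of those restrictions buys: satisfiability 
of propositional Horn clauses can be decided by plain forward chaining, in 
polynomial time even in the naive version below (the Dowling-Gallier 
bookkeeping gets it down to linear), whereas the unrestricted propositional 
case is NP-complete.  The little Python sketch is mine, not any rule 
engine's API, and is only meant to show the shape of the argument.

def horn_satisfiable(clauses):
    """Decide satisfiability of a set of propositional Horn clauses.

    Each clause is (body, head): body is a set of atoms, head is an atom
    or None (None encodes a purely negative clause).  Forward chaining
    computes the least model of the definite clauses; the set is
    unsatisfiable iff some negative clause has its whole body derived.
    """
    facts = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= facts:
                if head is None:
                    return False          # a negative clause is violated
                if head not in facts:
                    facts.add(head)
                    changed = True
    return True

# p;  p -> q;  p & q -> r;  and the negative clause ~q v ~r
clauses = [(set(), "p"), ({"p"}, "q"), ({"p", "q"}, "r"), ({"q", "r"}, None)]
print(horn_satisfiable(clauses))   # False: the clause set is unsatisfiable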

> MH> For a particular problem, if you know ahead of time which
>  > complexity class it lives in, you can gear the algorithm
>  > to that.
> 
> I agree.  But that time should be just before the algorithm
> is invoked, not the time when the knowledge engineer or the
> domain expert expresses the knowledge in some formalism.    (04)

Well, the algorithm has to be written before it is invoked, which means 
someone has to define a problem class so that s/he can write the 
algorithm.  The issue is whether the knowledge engineer has to recognize 
the problem class and select the tools and languages that can solve the 
problem, or whether s/he can use some general-purpose reasoning tool that 
looks at the statement of the problem (written in the general-purpose 
language), determines the class of problem from the characteristics of 
that particular phraseology, and selects a suitable specialized algorithm 
to solve it.  This is the thrust of John's later comment:    (05)

>   But Mitch and others
> have ignored the R & D on knowledge compilers.  A good compiler
> can and very easily does translate the operators to a form
> suitable for a particular algorithm.
> 
> Just look at the technology used in the theorem provers for
> the Thousands of Problems for Theorem Provers (TPTP).  All the
> best theorem provers use a collection of multiple special-case
> algorithms for specialized subproblems.  When they get a new
> problem, syntactic checks and transformations determine which
> algorithm(s) to invoke.    (06)

I agree that this is a subject of current research.  I really don't know 
how effective these tools are, by comparison with educated knowledge 
engineers, or journeyman knowledge engineers whose hands are somewhat 
tied by an imposed constraint on the language.    (07)
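
To make the "syntactic checks and transformations" idea concrete, here is 
a toy sketch of the kind of triage such a tool performs.  The fragment 
labels and the dispatch comments are mine (illustrative only, not any real 
prover's interface); the point is simply that the routing is decided by the 
shape of what was written down, not by the person who wrote it.

def classify(clauses):
    """Crude syntactic triage of a propositional CNF clause set.

    clauses: list of lists of literals; a literal is an atom name,
    optionally prefixed with "-" for negation.  The label returned is
    what a dispatcher would use to pick a decision procedure.
    """
    if all(sum(1 for lit in c if not lit.startswith("-")) <= 1 for c in clauses):
        return "horn"         # at most one positive literal: forward chaining suffices
    if all(len(c) <= 2 for c in clauses):
        return "2-cnf"        # polynomial-time via the implication-graph construction
    return "general-cnf"      # hand it to a full SAT (or FOL) procedure

print(classify([["p"], ["-p", "q"]]))                 # horn
print(classify([["p", "q"], ["-p", "-q"]]))           # 2-cnf
print(classify([["p", "q", "r"], ["-p", "q", "r"]]))  # general-cnf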

> PH> ... many purely empirical studies of large sets of problems
>  > have shown again and again that there are real 'trade-off cliffs'
>  > in practical problem sets which clearly partition the problem
>  > space into easy vs. near-worst-case-hard (the latter usually
>  > in a 'boundary zone' between two 'easy' areas.)
> 
> I very strongly agree.  That is why it is absurd to force
> a domain expert with no knowledge of the complexity issues
> to pick a KR language that is bound to a specific algorithm.    (08)

With all due respect, I think Pat's point was that it takes a lot of 
research to find the cliffs in some of these problem spaces.  For 
example, in Description Logic land, it took more than 10 years of 
research to determine maximal combinations of specialized extensions 
that were still computationally bounded.  And that work is still going on.    (09)

With respect to knowledge of complexity, therefore, I think Huxley's 
comment applies:  "Who is so rich in that commodity as to be out of danger?"    (010)
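
For anyone who has not bumped into one of those cliffs: the classic 
empirical example is random 3-SAT, where solver effort spikes near a 
clause-to-variable ratio of roughly 4.3 and falls off on either side.  The 
toy experiment below (a bare-bones DPLL of my own, deliberately tiny sizes) 
is only meant to show the shape of the curve, not to be a benchmark.

import random, time

def dpll(clauses):
    """Plain DPLL satisfiability check; clauses are sets of nonzero ints
    (DIMACS-style literals, -n is the negation of n)."""
    clauses = [set(c) for c in clauses]
    while True:                               # unit propagation
        units = [next(iter(c)) for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        new = []
        for c in clauses:
            if lit in c:
                continue                      # clause satisfied
            if -lit in c:
                c = c - {-lit}                # literal falsified
                if not c:
                    return False              # empty clause: conflict
            new.append(c)
        clauses = new
    if not clauses:
        return True
    lit = next(iter(clauses[0]))              # branch on some literal
    return dpll(clauses + [{lit}]) or dpll(clauses + [{-lit}])

def random_3sat(num_vars, num_clauses, rng):
    return [{v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, num_vars + 1), 3)}
            for _ in range(num_clauses)]

rng = random.Random(0)
n = 40
for ratio in (2.0, 4.3, 6.0):
    t0 = time.perf_counter()
    for _ in range(10):
        dpll(random_3sat(n, int(ratio * n), rng))
    print(f"ratio {ratio}: {time.perf_counter() - t0:.2f}s for 10 instances")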

> MH> Some algorithms are 'adaptive' in that they are faster for
>  > simpler inputs (less expressive; using fewer of the fancy
>  > operators), but some algorithms are not, and no matter what
>  > the input (for a given size) it runs in the same time (e.g.
>  > tableau algorithms vs automata based ones for description
>  > logics).  There is no guarantee that a set of problems
>  > has an adaptive algorithm.
> 
> I agree.  But to quote Don Knuth (and others), "Premature
> optimization is the root of all evil."  An ontology must be
> usable for an open-ended number of different kinds of problems.    (011)

At this time in history, an ontology that is really usable for the 
problems we intend to use it for is a great leap forward.  An ontology 
isn't Truth; it is an organized collection of some set of accepted 
knowledge that is designed to be adequate for some purposes.  If it 
happens to be usable, or consistently extensible, for additional 
purposes and kinds of problems, that is nice, but it is not a 
requirement.    (012)

Knuth's barb is cute, but it presumes that you can define "mature 
optimization".  Optimization is a critical part of good engineering 
design, but "maturity" is in the engineer and the product.    (013)

> The idea of designing a KR language around an algorithm forces
> a knowledge engineer (or worse, a domain expert) to choose a
> solution before anyone knows what problem(s) it will be used for.    (014)

See above.  SAIL, Prolog, KIF, CycL, OWL/DL, and the beat goes on.  We 
have been forced to do this for 35 years.  Some are more restrictive 
than others.  Why do we engineer FOL work-arounds for modalities? 
Because we have FOL hammers.    (015)
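
For anyone who has not watched the hammer at work: the usual work-around 
is the standard translation into first-order logic over explicitly 
quantified possible worlds, with R as the Kripke accessibility relation 
and w, v ranging over worlds:

\[
\begin{aligned}
ST_w(p) &= P(w)\\
ST_w(\lnot\varphi) &= \lnot ST_w(\varphi)\\
ST_w(\varphi \land \psi) &= ST_w(\varphi) \land ST_w(\psi)\\
ST_w(\Box\varphi) &= \forall v\,(R(w,v) \rightarrow ST_v(\varphi))\\
ST_w(\Diamond\varphi) &= \exists v\,(R(w,v) \land ST_v(\varphi))
\end{aligned}
\]

Perfectly serviceable -- and exactly the kind of contortion we accept 
because FOL is the hammer we have.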

> MH> Increasing expressiveness increases the complexity of solving
>  > the whole class.
> 
> There is zero evidence to support that claim.     (016)

Well, John, the class in question wasn't identified.  There are a dozen 
or so worthwhile papers that support that claim for description logics, 
and there is a much older set that deals with that issue in modal 
logics.    (017)

> Just look at the
> practical results from theorem provers, from systems such as Cyc,
> and from experience with knowledge compilers.    (018)

But CycL itself is a strange combination of small FOL restrictions and 
canned extensions.  It was designed to do just what you describe: 
recognize and solve certain kinds of subproblems rapidly.    (019)

> EB> ... the OWL/DL language forces a discipline on the knowledge
>  > engineer that the more powerful and less complex KIF/CLIF
>  > language doesn't.  The knowledge engineer is forced to cast
>  > the problem at hand into the carefully honed elements of the
>  > OWL/DL language.
> 
> Such discipline can be extremely valuable.  But it should be
> supported by the methodology for knowledge acquisition and
> knowledge engineering.      (020)

Strongly agree.    (021)

> For example, look at the UML diagrams
> and tools, which use a variety of different notations that
> eventually map to programming languages that are Turing
> complete -- i.e., it is undecidable whether or not they will
> get hung up in a loop.
> 
> Nobody has ever suggested that we must restrict the expressive
> power of those languages.  Instead, we use tools, methodologies,
> patterns, and compilers for software design and development.    (022)

That assumes that "Turing complete" means "unbounded in expressiveness". 
First of all, none of them is really Turing complete -- there are 
practical bounds on the length of the information strings, and those 
bounds have a great deal to do with the development of certain kinds of 
algorithms for real analysis and data management.  And second, there are 
restrictions on the expressive power of those languages.  Some are 
designed in, such as the inability to treat a datum as an instance of 
two classifiers; and some are imposed precisely because the restriction 
enables a particular compilation algorithm.    (023)

It is all about building an adequate software engineering tool within 
practical limitations that affect performance.    (024)

> EB> In Mitch's terms, this means that the knowledge engineer is
>  > forced to determine the category of problem and express an OWL
>  > ontology that can deal with that category of problem.
> 
> That is absolutely the worst conceivable approach.  Any piece
> of knowledge can be used in an open-ended number of ways, and
> the kn. engineer has no idea how it might be used in any
> particular application.     (025)

Then I don't want that engineer working on my project.  Knowledge has 
open-ended usage, but knowledge is "engineered" for the purpose of 
solving known categories of problems.  Whilst doing the engineering, we 
strive to avoid creating unnecessary restrictions on use.  But we do 
come to tradeoffs where we have to make choices that create 
restrictions.    (026)

> EB> In my experience, people who are constrained to use a language
>  > like OWL/DL to express their ontology can, in many cases, find
>  > a way to express all or nearly all of their intent.
> 
> Yes, I have seen the kinds of things that OWL programmers write.
> They go through endless contortions to express statements that
> are trivial to express in CL.     (027)

Exactly.  And this was my later point.  It is a lot easier for the 
humans to read and understand the CLIF in those cases.    (028)

> And for specific problems those
> CL statements can be compiled to vastly more efficient code than
> the excruciatingly contorted, hopelessly unreadable OWL versions.    (029)

I'm sure that is true for some problems.  And for some problems, it goes 
the other way.  The apparently simple CLIF sends the reasoner off into a 
huge search space, while the carefully crafted OWL bounds that search 
very tightly.    (030)
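
A toy illustration of the first direction (my own sketch, nothing to do 
with any particular reasoner's internals): a single innocent-looking axiom 
such as "every Person has a parent who is a Person" sends a naive 
model-builder off inventing individuals forever, and it is a deliberate 
restriction on the expansion -- roughly the "blocking" rule of the DL 
tableau algorithms -- that keeps the search finite.

def build_model(max_steps, blocking):
    """Toy model construction for the single axiom
           Person(x) -> exists y. (hasParent(x, y) and Person(y)).
    Returns the number of individuals invented before the process stops or
    hits max_steps.  With `blocking`, a fresh individual whose label set
    merely repeats a tree-ancestor's is not expanded further (a simplified
    form of what DL tableau reasoners do); without it, naive chaining
    invents parents forever."""
    labels = {"a0": {"Person"}}        # one asserted Person to start from
    created_from = {}                  # fresh individual -> who it was created for
    frontier = ["a0"]
    created = 1
    while frontier and created < max_steps:
        x = frontier.pop()
        ancestors, y = [], x           # collect the label sets of x's tree ancestors
        while y in created_from:
            y = created_from[y]
            ancestors.append(labels[y])
        if blocking and labels[x] in ancestors:
            continue                   # blocked: an ancestor already shows this pattern
        new = f"a{created}"            # invent the required parent
        labels[new] = {"Person"}
        created_from[new] = x
        frontier.append(new)
        created += 1
    return created

print(build_model(10_000, blocking=False))  # 10000: would go on forever
print(build_model(10_000, blocking=True))   # 2: the expansion is cut off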

> EB> Humans have an easier time following a straightforward
>  > presentation of the problem, even if the axioms become more
>  > complex.  The simpler but less constrained language is easier
>  > for humans to follow, but harder for the reasoner to work with.
> 
> False. As an example, look at the medical "English", which I
> translated to controlled English and then to CLIF and CGIF.
> The original statement is in slide 15 of the following talk:
> 
>     http://www.jfsowa.com/talks/cl_sowa.pdf
> 
> My translation to controlled English is in slide 21, the CLIF
> is in slide 22, and the CGIF is in slide 23.  The person who
> translated it to OWL couldn't get it on one slide.  Instead,
> he scrolled through page after page after page of hopelessly
> unreadable -- I don't have a good word for it, and my usual
> comment is "garbage".    (031)

John, we are in violent agreement.  I think of CLIF and CGIF as "a 
straightforward presentation of the problem."  Any reasonably 
interesting ontology is easier to write in CLIF and easier to understand 
in CLIF than in OWL.  All OWL practitioners also use some kind of 
drawing aid, analogous to CGIF, to make it easier to "grok" segments of 
the ontology.  And when you have to do work-arounds in OWL, even the 
drawings are ugly.    (032)

> EB> But if you take the CLIF ontology that the KIF knowledge
>  > engineer prefers to build, it will take substantially longer.
>  > Because "thinking in KIF" will often result in axioms that
>  > force the reasoner "to do all sorts of expensive, time
>  > consuming things."
> 
> There is zero evidence for that claim.     (033)

I should have said "it will _often_ take substantially longer..."    (034)

There is not a lot of _published_ evidence for that, but there is a lot 
of experience.  Have you never heard knowledge engineers talk about 
"avoiding disjunctions" or "adding helper axioms" or "steering the 
reasoner"?  Outside of DL land, I have never met an AI student who 
doesn't understand those concepts, and the need for them.  That kind of 
"additional engineering" comes from watching the reasoner go off into 
never-never land with the vanilla ontology.  So you create intermediate 
relations to split or merge disjunctions.  Or you add axioms (lemmas) 
that could have been deduced from the originals, but that help the 
reasoner find shorter search paths.  Or you change the "controls" to 
make it exhaust shorter paths before generating yet more simple 
intermediate goals.    (035)
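
Here is the "helper axiom" effect in miniature, with a deliberately dumb 
backward chainer of my own (no memoization, no real control strategy) over 
a toy chain of rules.  The lemma p0 <- p3 follows from the originals, but 
stating it explicitly lets the prover reach the supporting fact in fewer 
goal expansions.

def prove(goal, rules, facts, depth, stats):
    """Naive depth-first backward chainer over propositional Horn rules.

    rules: list of (head, [body atoms]); facts: set of atoms.  Counts the
    goals expanded in stats["calls"].  Deliberately dumb, to show how a
    lemma changes the amount of search, not to be a real reasoner."""
    stats["calls"] += 1
    if goal in facts:
        return True
    if depth == 0:
        return False
    for head, body in rules:
        if head == goal and all(prove(b, rules, facts, depth - 1, stats)
                                for b in body):
            return True
    return False

# A chain p0 <- p1 <- ... <- p6, each step preceded by a dead-end alternative.
rules = []
for i in range(6):
    rules.append((f"p{i}", [f"q{i}"]))     # dead end, tried first
    rules.append((f"p{i}", [f"p{i+1}"]))   # the step that actually works
facts = {"p6"}

# The lemma goes in front, the way a reasoner that prefers short or cached
# clauses would use it.  Prints "True 13" without the lemma, "True 8" with it.
for extra in ([], [("p0", ["p3"])]):
    stats = {"calls": 0}
    print(prove("p0", extra + rules, facts, depth=8, stats=stats), stats["calls"])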

It is all about tuning the reasoning process to a class of problem, 
after you realize that writing down the problem the way you understood 
it did not produce acceptable performance from the reasoner(s).    (036)

-Ed    (037)

-- 
Edward J. Barkmeyer                        Email: edbark@xxxxxxxx
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8263                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8263                FAX: +1 301-975-4694    (038)

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."    (039)

