ontolog-forum
Re: [ontolog-forum] English number of words/concepts that cannot be composed of others

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Patrick Cassidy" <pat@xxxxxxxxx>
Date: Mon, 5 May 2014 15:05:21 -0400
Message-id: <00f601cf6894$f6736580$e35a3080$@micra.com>
Good question from Ed B:

>So, your goal of "accurate semantic interoperability among databases and
>applications" is in fact to be attained by having effective communication
>among the human authors of the databases and applications.  And this very
>practical engineering goal is thus dependent on an adequate solution to a
>very complex linguistic and philosophical problem.  Given your acceptance
>of the maxim that "people don't understand each other perfectly", how
>"accurate" can the "semantic interoperability" be?

Short Answer: a lot more accurate than we have now.  Because:

(1) The understanding of the meanings of the ontology elements will be
better than that typically obtained from people sharing definitions in a
controlled vocabulary, because the ontology also has many logical
restrictions, evident in the text or in a viewer such as Protégé, that
severely constrain the possible interpretations of the elements.  Adding
well-known specific examples (e.g. instances of classes) can also help.  One
can add more detail by including pictures or diagrams, but I haven't reached
that stage yet.

(2) Because the ontology is in a logical form suitable for reasoning,
local users can test the application of the foundation ontology to their own
local problems by creating the FO-based logical specifications for their
elements and verifying that their local application works as intended when
the logic executes.

(3) But beyond the logical constraints and the opportunity to test
interpretations (without any communication from the individuals who created
the ontology), there should be (and is, in CYC and COSMO) a deeply
ingrained habit, when creating new ontology elements, to carefully describe
(linguistically) the meanings of the elements in the comments, trying to
anticipate possible ambiguities or misinterpretations, and including, where
ambiguity may exist, positive examples and counterexamples to avoid
misinterpretation.  This is not typical for databases or controlled
vocabularies or even ontologies.  In fact I have seen many ontologies where
the descriptions of the meanings are very sparse, and often non-existent.

It is also true in many (most? all?) databases that there are very sketchy
descriptions of the meanings of the elements, and often none at all.  Worse
yet, some of the elements may have cryptic names such as "MAB" without any
description of what they mean, forcing one to try to figure out from context
what the name was supposed to stand for.  This was one really big problem in
one project where I was trying to map database elements to an ontology,
especially when the creator of the database was no longer available.

When one builds an ontology quickly, having a limited time to create a whole
bunch of domain entities, the definition may be left out.  That even happens
occasionally in CYC.  If the context of the element (topic, parents,
children, relations) suffices, and the element name is sufficiently
unambiguous, this may not be too bad.  In such cases I still recommend trying
hard to anticipate ambiguity and at least put in some documentation, even if
(to the creator) the meaning of the term seems self-evident.

(4) By trying to focus on the necessary semantic primitives, one keeps the
ontology to the minimum size that will accomplish the task.  This makes it
easier to learn and easier to use.

"Easier to learn" is a critical point.  If one wants to use an interlingua
to translate among many local terminologies, it will have to have the same
level of expressivity as a human language.  So it will be as hard to learn
as the basic (non-technical) set of words in a non-native language.  That's
the bad news.  The good news is that, for any given enterprise that wants to
communicate accurately with others using such an ontology, only one member of
the enterprise has to become "bilingual" in their local terminology and
that of the Foundation Ontology.

(5) (Maybe not a direct response) - thus far I haven't seen any suggestions
for alternative means to general semantic interoperability that appear any
more likely to achieve the goal of accuracy.  I am always eager to learn of
new possibilities.  Please, specifics, not pointers to papers or projects
that may have peripheral relevance to the issue, if any at all.
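
The "bilingual" arrangement under point (4) can be sketched roughly as
follows.  This is only an illustration, assuming each enterprise keeps a
single table from its local terms to FO elements; the enterprise names, local
terms, and "fo:" identifiers are hypothetical and are not taken from COSMO or
CYC.

```python
# Hypothetical tables: local term -> Foundation Ontology (FO) element.
# Each enterprise maintains exactly one such table.
LOCAL_TO_FO = {
    "sales_db": {"cust": "fo:Customer", "inv": "fo:Invoice"},
    "billing_db": {"client": "fo:Customer", "bill": "fo:Invoice"},
    "support_db": {"account_holder": "fo:Customer"},
}


def translate(term, source, target, mappings=LOCAL_TO_FO):
    """Translate a local term from source to target by going through the FO."""
    fo_element = mappings[source].get(term)
    if fo_element is None:
        return None  # the source term has not been mapped to the FO
    # Look for a target-side term that is mapped to the same FO element.
    for local_term, element in mappings[target].items():
        if element == fo_element:
            return local_term
    return None  # the target has no term for this FO element


def mapping_counts(n):
    """Tables needed for n enterprises: via the FO (n) vs. one per unordered pair."""
    return n, n * (n - 1) // 2
```

For example, translate("cust", "sales_db", "billing_db") goes from "cust" to
"fo:Customer" and back out to "client".  With ten enterprises, the hub needs
ten tables where direct pairwise mapping would need 45.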

Even though learning a common foundation ontology takes some effort,
people will do it if they have sufficient desire to communicate.  Go to an
international scientific conference and notice that even those whose native
language is not English learn that language so that they can communicate
with the larger community.  Learning how to use the Foundation Ontology
should be to some extent easier, because the vocabulary is more limited.

Of course, those who don't have sufficient motivation to make such an
effort won't bother and won't interoperate.

But people will only make such an effort if there already exists a
widespread common FO.  There's the rub.  There has to first be a
pump-priming project to build enough data based on the FO to motivate people
to make the effort.  I expect that will happen eventually, because I do not
anticipate any alternative and I do believe there is sufficient reason and
benefit to broad semantic interoperability.  But I have no idea when it will
finally catch on.

Pat


Patrick Cassidy
MICRA Inc.
cassidy@xxxxxxxxx
1-908-561-3416

 >-----Original Message-----
 >From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-
 >bounces@xxxxxxxxxxxxxxxx] On Behalf Of Barkmeyer, Edward J
 >Sent: Monday, May 05, 2014 1:01 PM
 >To: [ontolog-forum]
 >Subject: Re: [ontolog-forum] English number of words/concepts that cannot
 >be composed of others
 >
 >Pat,
 >
 >So, your goal of "accurate semantic interoperability among databases and
 >applications" is in fact to be attained by having effective communication
 >among the human authors of the databases and applications.  And this very
 >practical engineering goal is thus dependent on an adequate solution to a
 >very complex linguistic and philosophical problem.  Given your acceptance
 >of the maxim that "people don't understand each other perfectly", how
 >"accurate" can the "semantic interoperability" be?  And at what point does
 >your minimal subset of natural language become an improvement on the
 >jargon-laden "natural" language these people normally use in speaking with
 >one another?
 >
 >In the engineering world there has been a moderately effective effort in
 >this area called 'Simplified Technical English' (see
 >http://www.asd-ste100.org/), which involves two sets of ideas:
 > - a standard vocabulary for words frequently used in stating engineering
 >requirements (with the rule:  if you mean this, use this word); and
 > - a set of guidelines for forming clear simple sentences that convey
 >requirements.
 >It is expected that users of STE will augment the base vocabulary with
 >common and less common terms for the technical things and properties in
 >their particular domain of interest.  Using STE is known to achieve
 >effective but imperfect communication between engineers, even in different
 >disciplines, that work on a common project, like Airbus.
 >
 >The important idea in STE is exactly the opposite of what is taught in
 >writing classes:  all sentences are short and plain, you say nothing
 >needless, and you use the same word for the same thing all the time, every
 >time.  It creates a natural reduction in irrelevant vocabulary, while
 >maintaining the possibility that the vocabulary of a given domain of work
 >is rich enough to distinguish the concepts in the domain, and that set may
 >be quite rich.
 >
 >In STE, a definition typically has the form:
 >An X has the following properties:
 >  The X is a Y.
 >  The X always has a Z.
 >  The (value of) Property1 of the X is greater than N1 and less than N2.
 >  ...
 >That is, the definition is a set of simple axioms, as distinct from the
 >form:
 >  An X is a Y that has a Z and whose Property1 ... and ...
 >STE doesn't provide any vocabulary for X, Y, Z, Property1, etc.  It does
 >provide 'is a' and 'has a' and 'always' and 'is greater than', and so on.
 >
 >All of this may, of course, be a solution to a different problem from the
 >one you envisage, but I see no reason why it would not be useful in
 >communicating the meaning of database content or application behaviors.
 >They are just different engineering activities from building an aircraft
 >or a motor vehicle.
 >
 >-Ed
 >
 >> -----Original Message-----
 >> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-
 >> bounces@xxxxxxxxxxxxxxxx] On Behalf Of Patrick Cassidy
 >> Sent: Saturday, May 03, 2014 10:17 PM
 >> To: '[ontolog-forum] '
 >> Subject: Re: [ontolog-forum] English number of words/concepts that
 >> cannot be composed of others
 >>
 >> To answer the comments of John S and Matthew W:
 >>
 >>   First, I need to mention (again) that the goal of the work on a
 >> primitives- based ontology is twofold:
 >>      To provide a basis for accurate semantic interoperability among
 >> databases and applications
 >>      To provide a knowledge representation that can at least in part
 >> support computerized human-level language understanding (better than
 >> alternatives)
 >>
 >>   These are very practical  *engineering* goals, which are unlikely to
 >> benefit much from broad general theorization.  More specifically:
 >>
 >>   The objections of John and Matthew seem to be based on the
 >> misunderstanding that I am claiming the existence of some finite set
 >> of primitives that will precisely define all words that anyone would
 >> ever want to use or invent.  Of course that is improbable in the
 >> extreme.  This misunderstanding perfectly illustrates one of the
 >> reasons that the NL goal appears to me not to suffer fatally from the
 >> problems described in Kilgarriff's paper (which I read more than once)
 >> - that is, if one wants to reproduce human-level language
 >> understanding one has to remember that people don't understand each
 >> other perfectly - even those who are well acquainted with their native
 >> language, and particularly when emotions are engaged and debating
 >> points are being made.   The computer doesn't have to be perfect, just
 >> as good as people.
 >>
 >>   To illustrate, consider one point from John's reference slide-set:
 >> [JS] (goal3.pdf):
 >> >> No finite set of words can have a fixed, precise set of mappings to
 >> >> a dynamically changing world
 >>
 >>    Duh.  Of course not, but a finite set of well-specified ontology
 >> elements **can** have a "fixed, precise set of mappings" to a finite
 >> set of databases, the meanings of whose elements are determined by the
 >> operations and goals of their applications.  For natural language, we
 >> expect inaccuracies, even among people using language.  Any idiot can
 >> say things that no genius could understand.
 >>
 >>   Of course, people stretch meanings of words in new situations, but
 >> unless all communicating parties are aware of the circumstances, that
 >> is what will often lead to misunderstanding.  We have to explain new
 >> uses to other people, as well as to our computers, by providing
 >> clarifying definitions.
 >>
 >>
 >>
 >> [MW]
 >> >2. A concept is primitive unless it is the intersection of 2 or more
 >> >other concepts.
 >> >
 >> This is a formal-logic definition of 'primitive'.  For the COSMO, most
 >> concepts are specified by necessary conditions (not necessary and
 >> sufficient), and I am not overly concerned to distinguish those that
 >> are truly "primitive" from those that might actually be constructed
 >> from others in the ontology.  I am concerned that, at the first
 >> iteration, I will have a basic set of concepts sufficient to specify
 >> the meanings of domain concepts used in multiple different
 >> applications, to a level sufficient to support those applications.
 >>
 >> This is a very pragmatic engineering goal.  In an engineering task the
 >> aim is to devise an artifact that will accomplish some function.  But
 >> such artifacts can rarely if ever be proven by theoretical arguments
 >> to have the proper attributes.  Reality is complex, and when artifacts
 >> are actually used, unanticipated problems reveal inadequacies (remember
 >> the Obamacare website?  Hardly a rare case of bugs in an artifact).
 >> Proof of concept for an engineering task can only be accomplished by
 >> building the artifact and testing it.  That is what the COSMO project
 >> is intended to do - test the notion of a primitives-based ontology to
 >> support (initially) semantic interoperability among databases - and
 >> eventually language understanding.
 >>
 >> If anyone has a (hopefully simple) test case for database
 >> interoperability, that would be a useful contribution to the
 >> discussion.  Theorizing about the flexibility of human use of words
 >> does not help - the issue is well known and does not answer the
 >> practical questions involved in this project.  The initial base of
 >> 'defining' concepts in the ontology  will be supplemented as required
 >> for new applications.  The issues involved in supplementation have
 >> also been thoroughly considered.  How much supplementation will be
 >> necessary as time goes on can only be determined in practice.  I will
 >> be very disappointed if, after a hundred applications have been mapped
 >> with the ontology, *every* new application still requires some new
 >> primitives to be added.  At that point, John may be entitled to gloat.
 >>
 >> Pat
 >>
 >> Patrick Cassidy
 >> MICRA Inc.
 >> cassidy@xxxxxxxxx
 >> 1-908-561-3416
 >>
 >>
 >>  >-----Original Message-----
 >>  >From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
 >>  >[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Matthew West
 >>  >Sent: Saturday, May 03, 2014 4:23 PM
 >>  >To: '[ontolog-forum] '
 >>  >Subject: Re: [ontolog-forum] English number of words/concepts that
 >>  >cannot be composed of others
 >>  >
 >>  >Dear All,
 >>  >Whilst agreeing with John, I'd like to approach this from the other
 >>  >end of the telescope. First some basics:
 >>  >1. For each concept there is a set of objects that it describes.
 >>  >2. A concept is primitive unless it is the intersection of 2 or more
 >>  >other concepts.
 >>  >
 >>  >The proposition is that there is some moderate set of primitive
 >>  >concepts such that there is no useful concept that cannot be defined
 >>  >as the intersection of that set of primitive concepts.
 >>  >
 >>  >Now consider the number of objects there are that we might want to
 >>  >describe.
 >>  >Consider our universe, the galaxies, planetary systems, stars,
 >>  >planets, moons, other bodies, parts of these, life, molecules, atoms,
 >>  >subatomic particles. Then there are all the other possible universes,
 >>  >with unicorns and so on.
 >>  >
 >>  >So you need to prove that there is a number of primitive concepts, n,
 >>  >where n is (a lot) less than infinity, such that it is not possible
 >>  >to come up with some useful set of all these objects that is not the
 >>  >intersection of some of those n concepts.
 >>  >
 >>  >It does not seem credible to me that there is any such number, but I
 >>  >look forward to seeing any proof that such a number exists.
 >>  >
 >>  >Regards
 >>  >
 >>  >Matthew West
 >>  >Information  Junction
 >>  >Mobile: +44 750 3385279
 >>  >Skype: dr.matthew.west
 >>  >matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx
 >>  >http://www.informationjunction.co.uk/
 >>  >https://www.matthew-west.org.uk/
 >>  >This email originates from Information Junction Ltd. Registered in
 >>  >England and Wales No. 6632177.
 >>  >Registered office: 8 Ennismore Close, Letchworth Garden City,
 >>  >Hertfordshire, SG6 2SU.
 >>  >
 >>  >
 >>  >
 >>  >-----Original Message-----
 >>  >From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
 >>  >[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F Sowa
 >>  >Sent: 03 May 2014 19:54
 >>  >To: ontolog-forum@xxxxxxxxxxxxxxxx
 >>  >Subject: Re: [ontolog-forum] English number of words/concepts that
 >>  >cannot be composed of others
 >>  >
 >>  >Gregg, Tom, Pat C, and John B,
 >>  >
 >>  >GR
 >>  >> Have you looked at Natural Semantic Metalanguage?
 >>  >> http://en.wikipedia.org/wiki/Natural_semantic_metalanguage
 >>  >
 >>  >Yes.  I cited Anna Wierzbicka's _Lingua Mentalis_ in my 1984 book, and
 >>  >I've followed her other books over the years.  I call her primitives
 >>  >'accordion words' -- because you can stretch them and squish them to
 >>  >fit anything you please.
 >>  >
 >>  >They're useful.  To quote my favorite philosopher, C. S. Peirce:
 >>  >> It is easy to speak with precision upon a general theme.  Only, one
 >>  >> must commonly surrender all ambition to be certain.  It is equally
 >>  >> easy to be certain. One has only to be sufficiently vague.  It is not
 >>  >> so difficult to be pretty precise and fairly certain at once about a
 >>  >> very narrow subject. (CP 4.237)
 >>  >
 >>  >Again, I recommend that every reader of this list *study* the paper
 >>  >"I don't believe in word senses."  Adam K and Sue A are *professionals*
 >>  >in lexicography and computational linguistics.  They know the
 >>  >difference between accordion words and precise definitions.  Both can
 >>  >be useful for different purposes, but it's important to know the
 >>  >difference.
 >>  >
 >>  >TK
 >>  >> I am looking for (I'm going to call it) 'fundamental concepts' and I
 >>  >> am making the assumption that there is some basic agreed level of
 >>  >> definition of these concepts so we don't end up in Physics and
 >>  >> Chemistry.
 >>  >
 >>  >Brief answer:
 >>  >
 >>  >  1. There is no "basic agreed level" whatsoever -- NONE!
 >>  >
 >>  >  2. The top level of an ontology *must* be vague and underspecified.
 >>  >     It can be useful, but the real knowledge is in the lower levels.
 >>  >
 >>  >  3. Please remember that Cyc started out with the assumption that a
 >>  >     formal ontology of the knowledge of a high-school graduate could
 >>  >     be specified in 10 years.  After 30 years and over $100 million of
 >>  >     investment, Doug Lenat has emphasized that all the real knowledge
 >>  >     is in the detailed low levels.  The top level is very vague and
 >>  >     underspecified.  It cannot support any kind of detailed reasoning.
 >>  >
 >>  >TK
 >>  >> My criteria for 'fundamental concept' is that it cannot be replaced
 >>  >> by a semantic net-let that crosses the agreed level.
 >>  >
 >>  >If that's your definition, then you're talking about the empty set.
 >>  >There is no concept or thought of any kind that cannot be analyzed at
 >>  >a deeper level.
 >>  >
 >>  >> So John S, to take your examples...
 >>  >
 >>  >I was just trying to give one-line examples.  In any case, the terms
 >>  >in your analyses are accordion words.  Please study that paper by
 >>  >Adam K.
 >>  >
 >>  >JB
 >>  >> as Lakoff shows us in "Women, Fire, and Dangerous Things" the
 >>  >> universals are different for different linguistic environments...
 >>  >> But it still comes down to what type of tasks are facing. The "core"
 >>  >> concepts for farming are very different from those needed in the
 >>  >> office.
 >>  >
 >>  >Yes!  But I would avoid using the word 'core' because it gives the
 >>  >mistaken impression that some kind of core is possible.  But even for
 >>  >farming and offices, the basic terms are accordion words.  Note how we
 >>  >use the abbreviation 'cc' in our emails.  In office-speak, it used to
 >>  >mean 'carbon copy'.
 >>  >When was the last time you saw a carbon copy?
 >>  >
 >>  >PC
 >>  >> according to Guo, the number of senses used **in the definitions**
 >>  >> average to less than 2.
 >>  >
 >>  >If so, Guo doesn't know how to define words or to count definitions.
 >>  >I suspect he was using those terms as accordion words.  If you stretch
 >>  >and squeeze them enough, you can adapt them to almost anything.
 >>  >
 >>  >But with every stretch and squeeze, you blur an immense amount of info.
 >>  >Please tell Guo to study Adam K's paper.  Also study the publications
 >>  >about *microsenses* by Alan Cruse.  A microsense is any intermediate
 >>  >point as you stretch and squeeze your accordion.
 >>  >
 >>  >PC
 >>  >> If anyone knows of such a study, I would very much like to get a
 >>  >> pointer.
 >>  >
 >>  >I've given you many, many pointers over the years.  And I beg you to
 >>  >study them until you reach enlightenment.  For starters, please reread
 >>  >http://www.jfsowa.com/talks/goal3.pdf and *follow* every URL to every
 >>  >reference in it.
 >>  >
 >>  >Those other goalX.pdf files are also surveys.  You have to dig into
 >>  >the references until you get the point.  Anything that looks like or
 >>  >smells like a primitive is probably an accordion word.
 >>  >
 >>  >John
 >>  >
 >>  >
 >>  >_________________________________________________________________
 >>  >Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
 >>  >Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
 >>  >Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
 >>  >Shared Files: http://ontolog.cim3.net/file/
 >>  >Community Wiki: http://ontolog.cim3.net/wiki/
 >>  >To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J