ontolog-forum

Re: [ontolog-forum] What goes into a Lexicon?

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Amanda Vizedom <amanda.vizedom@xxxxxxxxx>
Date: Sun, 26 Feb 2012 16:34:21 -0500
Message-id: <CAEmngXuWtY4MeVuy_77idTWeTyTTxLm-22rR2p+AMmJ6Qyr_kQ@xxxxxxxxxxxxxx>

Among the things beaten into the heads of ontologists-in-training (at least, if the training is any good) is that the name of the concept means nothing to the machine. The name has nothing to do with defining the concept, within the context of the computational artifact. The same is true of any text definition that may accompany it as a specialized comment. These are used by humans communicating with each other about the computational artifact, but those humans must always bear in mind that names and text definitions are partial, misleading, and, most importantly, opaque to the machine. That is, a concept in an ontology is defined by the formal assertions in which it is involved -- the ones that use the built-in formal semantics of the ontology language and other concepts built from those. Other things that may be attached to the concept may be used in other, non-inference computation (e.g., labels to help text analyzers spot possible references in unstructured data, code handles to help a federation engine find related data in external sources), but as far as the ontology core goes, only the formal assertions define a concept.

You're right, though, that the practice of picking one of the associated expressions and making it the concept's name has an irresistible pull on the humans. Even experienced human ontologists fall into traps involving reading into, or expecting things of, a concept based on its name, without checking the axioms. Of all the projects I've worked on, only one faced this head-on and took serious measures to keep the concept/language distinction clean. In this project, concepts did not have natural language names. The name of a concept was a hexadecimal identifier. The concept would have "use for" labels, including preferred labels, in one or more languages, but none of these were the concept name.

It's probably not an accident that this was the broadest ontology I've worked on, and that it was used for semantic indexing, search, and retrieval at web scale, cross-domain and within specialized verticals, and had multi- and cross-lingual function.

And, before somebody wonders how one could possibly work with and edit a massive ontology like that without readable names, the answer is, more easily than ever. You could view the concepts by ID only, or with the parenthetical inclusion of a preferred label in your language of choice. The presence of the ID first and the label in parens w/lang specified maintained awareness that the label is just one imprecise handle, and that you need to look at axioms, including placement in the overall ontology, to know what a concept is.
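
To make that view concrete, here is a minimal sketch in Python of how an ID-first display with an optional parenthetical label might look. The class name, the example identifier, the sample labels, and the convention that the first label listed for a language is the preferred one are all assumptions for illustration; nothing here reflects the actual tools used on that project.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A concept whose only name is an opaque ID; labels are attachments."""
    uid: str  # hypothetical hexadecimal identifier
    # language code -> "use for" labels; first entry is treated as preferred (assumption)
    labels: Dict[str, List[str]] = field(default_factory=dict)

    def preferred_label(self, lang: str) -> Optional[str]:
        """Return the preferred label for a language, or None if there is none."""
        entries = self.labels.get(lang)
        return entries[0] if entries else None

    def display(self, lang: Optional[str] = None) -> str:
        """Render the ID alone, or the ID followed by '(label@lang)'."""
        if lang is None:
            return self.uid
        label = self.preferred_label(lang)
        return f"{self.uid} ({label}@{lang})" if label else self.uid


# Illustrative data only.
concept = Concept(
    uid="0x3FA2C1",
    labels={"en": ["bank", "financial institution"], "fr": ["banque"]},
)

print(concept.display())      # ID only
print(concept.display("en"))  # ID with preferred English label in parentheses
print(concept.display("de"))  # no German label, so falls back to the ID
```

Keeping the ID first and the language tag visible in the rendered string is the design point: the label stays visibly secondary, one handle among several rather than the concept's name.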

I wish that practice were more widespread, and there were good, open tools to support it (that project used home-rolled tools, as most large successful ones seem to). Based on that experience, I would choose unreadable UIDs as concept names over the typical naming practice every time.

Best,
Amanda

On Feb 26, 2012 3:54 PM, "Rich Cooper" <rich@xxxxxxxxxxxxxxxxxxxxxx> wrote:

Dear Amanda and David,

 

Amanda wrote:

What probably causes the confusion is that the nodes in an ontology are not "terms" in this sense; they are NOT bits of language, which may have many meanings, but rather these nodes are *concepts*, abstracted from the various ways they might be expressed, in any language or jargon or context or by anyone. It *is* a rule of ontology that each abstract *concept* must have one formal definition/meaning; that's what makes it a *specific abstract concept*, and what makes it computable as part of an ontology. But there may be any number of ways of expressing this concept in language, symbols, etc., and any particular bit of language may be associated with any number of different concepts. In an ontology, what it looks like for "terms", in the used-language sense, to have multiple meanings is that those terms are associated with multiple abstract concepts, where each of those concepts has a single, formal definition/meaning.

 

It is an abstraction, IMHO, to call a concept by ANY linguistic term.  That is, the concept has such depth of meaning when you look at how it interlocks with other concepts in the lattice that the phrases people use to describe the concept are misleading and even wrong at times – most times it seems. 

 

If concepts had some form of meaningless index, like a social security number, or other social construction that did not use English words, I could believe that the concept is different from the various terms used to describe it.  But that is not the practice used on this forum to date.  Concepts have always been described here by English terms, not by asemantic indexes. 

 

Given an index value, it could be wikified to show various English terms describing the concept for reference purposes.  Then programmers could click on an index, get a pop up page of full description, and even search the set of indexes using Google like phrases to find a list of concept indexes which might be relevant.  If that were done, I could believe that the index designates a set of abstract, language free concepts. 

 

But current practice is to refer to a concept with a word or phrase that captures only a tiny portion of the real semantics of that concept.  Therefore David’s point of subjectivity blinding the philosophy of a concept set is very appropriate to the ways in which concepts are actually used in software development. 

 

When concepts are named with words or phrases, they are at least as ambiguous as the words or phrases. 

 

HTH,

-Rich

 

Sincerely,

Rich Cooper

EnglishLogicKernel.com

Rich AT EnglishLogicKernel DOT com

9 4 9 \ 5 2 5 - 5 7 1 2


From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Amanda Vizedom
Sent: Sunday, February 26, 2012 12:26 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] What goes into a Lexicon?

 

David,

Actually, I think that you *do* have it wrong. At least, when you say:

"since as far as I know it's a hard & fast ontological rule that requires a term to have a single definition/meaning... an extremely unrealistic constraint for this sort of ugly real world challenge. [If I've got this wrong, please set me straight.] "

Up to that point, your message seemed to be about lexicons, terminologies, and other bits of language. And of course, as you point out, these terminologies vary greatly from one context to another, even as used by a single person at different points in time.

And you are right that some people have tried, using various modeling and standardization methods, to fix one meaning/definition for a term, where "term" = bit of language, word, phrase, abbreviation, expression, the kind of thing you would find in a lexicon or controlled vocabulary, and then tried and failed to use the result to represent content across heterogeneous sources. Or tried and failed to impose this, top-down, on all data sources, users, systems, code, etc. Even within an enterprise with the supposed ability to enforce such uniformity, it fails. That approach simply doesn't fit with the realities of how people work, how meaning is bound up with context, how terminology evolves with use and how that local evolution is part of the development of expertise and efficiency.

With respect to all of that, IMHO, you're absolutely right.

Where things go wrong is in the bit I quoted above. It's absolutely NOT a hard & fast rule of ontology that each term have one definition/meaning, if by "term" you still mean what you meant in those previous paragraphs: a bit of language, word, phrase, abbreviation, expression, the kind of thing you would find in a lexicon or controlled vocabulary. In an ontology, each one of those things (e.g., each word, abbreviation, phrase...) can be associated with many different meanings. You might (or might not) even capture some relationship between those term-to-meaning associations and some context factors such as source, business process, localization, etc., if this is important for your usage. But even then, there is no restriction to one meaning per (language) "term" per context.

What probably causes the confusion is that the nodes in an ontology are not "terms" in this sense; they are NOT bits of language, which may have many meanings, but rather these nodes are *concepts*, abstracted from the various ways they might be expressed, in any language or jargon or context or by anyone. It *is* a rule of ontology that each abstract *concept* must have one formal definition/meaning; that's what makes it a *specific abstract concept*, and what makes it computable as part of an ontology. But there may be any number of ways of expressing this concept in language, symbols, etc., and any particular bit of language may be associated with any number of different concepts. In an ontology, what it looks like for "terms", in the used-language sense, to have multiple meanings is that those terms are associated with multiple abstract concepts, where each of those concepts has a single, formal definition/meaning.

I hope that makes the matter a bit clearer. Where ontology is successfully used for interoperability, in environments where multiple meanings per used-language "term" are typical and assumed, the ontology can help by capturing and providing mappings between the polysemous used-language "terms" (including data values, field names, and unstructured or semi-structured text) and whichever, and however many, single-meaning abstract concepts those used-language terms are used for.  So the used-language "terms" get to keep their many meanings; it's the abstract and formally defined *concepts* that must have just one.
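
As a rough illustration of that many-to-many arrangement, the following Python sketch keeps exactly one definition per concept while letting any used-language term point at several concepts. The concept IDs, the sample terms, and the use of plain strings in place of real formal axioms are all assumptions made for the example.

```python
from collections import defaultdict

# One definition per concept ID. Plain strings stand in for formal axioms (assumption).
CONCEPT_DEFINITIONS = {
    "C1": "financial institution that accepts deposits",
    "C2": "sloping land alongside a river",
    "C3": "unique identifier assigned to a person",
}

# Associations between used-language terms and concepts; both directions may be many.
TERM_LINKS = [
    ("bank", "C1"),
    ("bank", "C2"),
    ("banque", "C1"),
    ("soc_sec_no", "C3"),
    ("SSN", "C3"),
]


def build_indexes(links):
    """Build term -> concepts and concept -> terms lookups from (term, concept) pairs."""
    term_to_concepts = defaultdict(set)
    concept_to_terms = defaultdict(set)
    for term, concept_id in links:
        if concept_id not in CONCEPT_DEFINITIONS:
            raise KeyError(f"unknown concept: {concept_id}")
        term_to_concepts[term].add(concept_id)
        concept_to_terms[concept_id].add(term)
    return term_to_concepts, concept_to_terms


term_to_concepts, concept_to_terms = build_indexes(TERM_LINKS)

# The term "bank" keeps both of its meanings; each concept still has one definition.
for concept_id in sorted(term_to_concepts["bank"]):
    print(concept_id, "->", CONCEPT_DEFINITIONS[concept_id])
```

The single-meaning constraint lives only in `CONCEPT_DEFINITIONS`, where each key maps to one value; the term side is deliberately left unconstrained.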

Again, I hope that clarifies things a bit. It's made more confusing by the fact that the linguistic expression "term" is used for multiple things. In some uses, "term" is used to mean a bit of used-language; in some uses, "term" is used to mean concept in an ontology. But despite that bit of typical, confusing polysemy, the fact is that ontologically, the bits of used-language can be associated with many meanings; it's the abstract concepts that have to have just one (though they can have many, and overlapping, used-language expressions).

Best,
Amanda

On Feb 25, 2012 9:41 PM, "David Eddy" <deddy@xxxxxxxxxxxxx> wrote:

Rich -

On Feb 25, 2012, at 6:27 PM, Rich Cooper wrote:

> Ontology
> designers that produce a well documented, highly
> learnable and usable ontology (i.e., something
> simple and down in the details of a domain) could
> provide a satisfying brick to many of those first
> time developments.


I am speaking in the context of the legacy software systems that
enable our lives.


The language/lexicon/terminology/slang/whatever already exists in the
applications.  Unfortunately it's pretty much been put together with
a single ended one-time pad... & that guy(s) has left the building.

The problem is, unless you have the SME sitting at your side, or lots
& lots of time, the terminology is very difficult to grok.  And when
you move to the next assignment, the terminology/lexicon is very
likely to be different, so you have to forget what you just spent 6
months learning.

I would likely argue that this language collection has not been
accumulated with the idea of an organized ontology in mind.

Imposing an organized ontology on this disorganized language
collection probably isn't going to be of much help.

But something that quickly shows or records or suggests that in a
particular context "no" actually means "id" (e.g. soc_sec_no....
social security "number" is not a number, it's an index... a very
different beast)... now that would be useful & likely to be embraced
by the grunts—application owners, analysts, programmers—in the trenches.
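
One way such a context-sensitive record might look, as a small Python
sketch. The glossary entries, the "payroll" context name, and the
underscore-splitting rule are hypothetical, chosen only to mirror the
soc_sec_no example above.

```python
# (context, token) -> meaning. A context of None holds the default reading.
GLOSSARY = {
    ("payroll", "no"): "id",   # in this hypothetical context, "no" is an index
    (None, "no"): "number",
    (None, "soc"): "social",
    (None, "sec"): "security",
}


def gloss(name, context=None):
    """Expand each underscore-separated token, preferring the context-specific meaning."""
    expanded = []
    for token in name.lower().split("_"):
        default = GLOSSARY.get((None, token), token)  # unknown tokens pass through
        expanded.append(GLOSSARY.get((context, token), default))
    return " ".join(expanded)


print(gloss("soc_sec_no"))             # default reading
print(gloss("soc_sec_no", "payroll"))  # context-specific reading
```

A lookup table like this imposes no ontology on the legacy names; it
only records what a token has been observed to mean in a given context.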


How ontologies could add value, I don't have a clue, since as far as
I know it's a hard & fast ontological rule that requires a term to
have a single definition/meaning... an extremely unrealistic
constraint for this sort of ugly real world challenge.  [If I've got
this wrong, please set me straight.]

___________________
David Eddy
deddy@xxxxxxxxxxxxx


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J



 
