Re: [ontolog-forum] Context and Inter-annotator agreement

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: John F Sowa <sowa@xxxxxxxxxxx>
Date: Mon, 05 Aug 2013 14:52:45 -0400
Message-id: <51FFF47D.1020008@xxxxxxxxxxx>
Pat,

You are confusing two totally different kinds of information:

  1. The huge, very complex, tightly interconnected network of everything
     that people have in their heads.  That may be called background
     knowledge or accumulated experience or whatever.

  2. The classifications and definitions that lexicographers put
     in their dictionaries.

People who have done a lot of work in analyzing words and dictionaries
(such as you, for example) may have a lot of the info in their heads
organized along lines that are similar to #2.  But those people are
a tiny minority of the human race.  That kind of information is useful
for many purposes -- but it is *not* necessary, sufficient, or
adequate for normal language understanding and generation.

>> The essential point is that people do *not* require prior definitions of word
>> senses in order to understand the words in a conversation or a document.

> That is sometimes true, but rarely.  People can only learn a **very
> small** number of words at a time

You're confusing the very artificial word senses and definitions with
the background knowledge and experience.  I suggest that you do a
global change of "word senses" to "background knowledge".

For an excellent review of the evidence from neuroscience by a
professional linguist who has spent his entire career in talking
with and working with neuroscientists, I recommend the lecture
notes by Sydney Lamb, which I cite in my goal.pdf slides.
Those slides also have many citations to other resources,
which I strongly recommend.  For example, please read

Cruse, D. Alan (2000) Aspects of the micro-structure of word meanings,
in Ravin & Leacock (2000) pp. 30-51.

Cruse, D. Alan (2002) Microsenses, default specificity and the
semantics-pragmatics boundary, Axiomathes 1, 1-20.

> people can't disambiguate multiple sequential unknown words
> in their head, on first reading.

People *never* disambiguate words.  "Disambiguation" is an artificial
term coined by linguists and computational linguists to describe
their theories and the programs that implement them.

What people actually do is *understand* texts by relating the patterns
of words in the new text or discourse to the patterns of their
experience -- both verbal and nonverbal.

The only people who have word senses in their heads are professional
lexicographers, people who solve crossword puzzles, and people like
you who have done extensive work with dictionaries.

> Do you have in mind some measure for how many word meanings can be
> inferred accurately from context?

For the number of word senses that people normally infer from context,
the answer is ZERO.  When they understand language, they relate the
actual words to their network of interconnected knowledge.

> You need **a lot** of context (and often prior knowledge) to clearly
> grasp the meaning of a new word

More precisely, you need a lot of background knowledge -- either
from previous experience or from the text you're reading -- in order
to understand any new *subject*.  The amount of background knowledge
required depends on the novelty of the subject, not the number of
new words.

> one of the challenges in developing a primitives-based foundation ontology
> is to determine, from constructing logical specifications of word meaning,
> just how many senses of the basic words are required to account for all
> of the needed primitive senses used in definitions

If you want to do that kind of work, that's your choice.  But we have
been getting very good results at VivoMind with only the kinds of
lexical resources I mentioned in my previous note -- supplemented with
very detailed domain-dependent ontologies for specific applications.

> One interesting thing about the Longman defining vocabulary is
> that, though you can quibble with how precise any given Longman definition
> is, if you decide to create a more precise definition for your own purpose,
> you can still do it  ***using the same defining vocabulary***.

But we have never found any reason to write such definitions.
Our domain-dependent ontologies consist of a hierarchy that fits into
the more general one (somewhat along the lines of Cyc's microtheories)
together with details stated in the terms of the subject matter.

For the legacy re-engineering example, that background knowledge came
from COBOL and SQL -- and most of it was derived automatically by
translating the COBOL and SQL programs to conceptual graphs (CGs).
For the DoE example, it came from chemistry.  For oil & gas
exploration, it came from geology.

For the chemistry and geology examples, that background knowledge
could be stated in controlled English and mapped to conceptual graphs.
For the oil and gas work, the system built up much of its own background
knowledge by analyzing a textbook on geology.  It used the resulting
CGs to interpret the research reports.

John