Re: [ontolog-forum] Context and Inter-annotator agreement

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: John F Sowa <sowa@xxxxxxxxxxx>
Date: Sat, 03 Aug 2013 10:54:23 -0400
Message-id: <51FD199F.6040801@xxxxxxxxxxx>
Pat, Michael, Ed, and William,    (01)

I strongly believe in the need for good lexical resources.
For examples of the resources VivoMind uses, see the web site
that Arun Majumdar maintains and the references cited there:    (02)

    http://lingo.stanford.edu/vso/    (03)

I also believe that no single resource can be necessary, sufficient,
or even adequate for the task of language understanding.  The title
of the following paper is "Two paradigms are better than one and
multiple paradigms are even better":    (04)

    http://www.jfsowa.com/pubs/paradigm.pdf    (05)

For a brief summary of the insufficient attempts over the past few
millennia, see the list (Slide 10) at the end of this note.    (06)

>>> Those meanings that can be reliably distinguished (>98%) by motivated
>>> (rewarded for accuracy) human annotators.    (07)

>> There are no such meanings -- except in very special cases.    (08)

> I think that human performance with real informative text is typically
> above that level, when one is trying to be accurate and not sloppy or hurried.    (09)

Yes.  But there is a *huge* difference between using language precisely
for human to human communication and the artificial exercise of trying
to select a word sense from some list.    (010)
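
To make a figure such as ">98%" concrete, here is a minimal sketch of how
agreement between two annotators on a sense-selection task might be measured.
It computes raw agreement and Cohen's kappa, which corrects for chance
agreement. The sense labels and the two annotation lists are hypothetical
illustrations, not data from SENSEVAL or any other project.

```python
from collections import Counter

def observed_agreement(a, b):
    """Fraction of items on which two annotators chose the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability that both pick the same label independently.
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical sense labels for ten occurrences of the word "bank".
ann1 = ["river", "money", "money", "river", "money",
        "money", "river", "money", "money", "money"]
ann2 = ["river", "money", "money", "money", "money",
        "money", "river", "money", "river", "money"]

print(observed_agreement(ann1, ann2))  # 0.8
print(round(cohen_kappa(ann1, ann2), 3))  # 0.524
```

Raw agreement of 80% shrinks to a kappa of about 0.52 once chance is taken
into account, which is one reason a bare percentage says little about how
reliably the senses were distinguished.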

> one has to start with some inventory of senses, but the most detailed
> inventory yet used for such tests by NL researchers is WordNet, which
> is not a good standard for such testing.    (011)

No.  An inventory of word senses is neither necessary nor sufficient
for NLP.  WordNet is widely used because it's free, but see Slide 10.    (012)

Also look at the Japanese EDR project (search Google for references).
The Japanese government poured billions of yen into developing a "concept
dictionary" with 400,000 concepts with mappings to both English and
Japanese.  CSLI at Stanford had a copy of it, and I asked people
there whether anybody was using it.  The answer I got is that nobody
had found anything useful to do with it.    (013)

Use your favorite search engine to look for references to the SENSEVAL
projects.  If you want more info, subscribe to Corpora List and ask
people there what they think about these issues.  Adam Kilgarriff,
by the way, was one of the organizers of the SENSEVAL projects.    (014)

>> Unfortunately, there is no finite "set of senses" that can be used to
>> achieve "human-level interpretation of a broad range of texts."    (015)

> That is a bold claim... my observations suggest that no remotely applicable
> test has yet been conducted to see if such a claim is even plausible.    (016)

Researchers on machine translation tried to develop an Interlingua
of concepts (word senses) that would be useful for MT.  They failed
in the same way as EDR:  none of them produced useful results that
justified a continuation of the R & D funding.    (017)

> Until we develop a logic-based word sense inventory intended for
> broad use I don't see how the maximum agreement could be tested.    (018)

If you want to make any claims about the value of a large inventory
of logic-based concepts, you have to explain how your method would
differ from Cyc.  If you want something freely available, explain
what you would do that is different from OpenCyc.    (019)

By the way, I spoke with Ron Kaplan from PARC, then PowerSet, and
later Microsoft.  They had a license to use *all* of Cyc including
all the logic and the tools.  But Ron said that they just used the
hierarchy.  They did *not* use the axioms.  In any case, most of
the PARC/PowerSet people have left Microsoft -- that includes Ron K.
That's not a point in its favor.    (020)

>> Fundamental principle:  People think in *words*, not in *word senses*.    (021)

> Really?  I sure don’t.  Without the textual content to disambiguate
> words,  communication would be extremely error-prone.
> Where does that notion come from?    (022)

The technical term 'disambiguate' is used by some linguists to describe
a stage in some programs that attempt to understand language.    (023)

At age 3, Laura understood and generated language far better than
any of those programs.  She didn't "disambiguate" words; she just
*used* words in meaningful patterns.  See the article by John Limber
cited in slide 8 of http://www.jfsowa.com/talks/goal.pdf .    (024)

> In order to understand a sentence, it is not enough to pick the right word
> sense. You have to understand the sense and how it can modify other senses.
> You need context, background knowledge and the ability for abstraction and
> generalization. You need to be able to follow the line of thinking of the
> author.    (025)

That's fairly close to what the VivoMind software does.  It does
*not* have a stage that could be called "word sense disambiguation".
Instead, the system uses a large inventory of graph patterns and
a kind of associative memory for matching its inventory to the
patterns that occur in the text.  See the paradigm.pdf article.    (026)

Those patterns can come from multiple sources. The words are
organized in a hierarchy of types and subtypes, but they are
*not* labeled with a fixed set of word senses.  Instead, new
patterns are generated dynamically from various sources,
and they are added to the inventory for future use.    (027)

Interesting point:  If the VivoMind software is used to reread
the same document on a second pass, it generates a different and
usually better interpretation (an interconnected graph of everything
derived from the document).  That's closer to what people do.    (028)

> Or take my own sentence:
>> When you put words together, you often create completely new senses
>> that cannot be grasped by looking at individual word senses only.
> How do you get from "put" and "together" to "put together" ? There are
> many senses for "looking" in Wordnet but I cannot find the right one.
> It is not used literally here.    (029)

That's a good example.  By using the original words from the text as
labels on the graphs, the VivoMind software can use patterns from the
literal use of the word 'look' to interpret metaphorical uses.    (030)

> For example, if I say "Michael eats new technologies for breakfast",
> you could understand this even if you have never heard this metaphor
> before.  The first time you heard such a thing you would have no doubt
> as to its meaning.  This is not poetry, but is its kin, as is all speech.    (031)

That is what the VivoMind software would do with that sentence.  It
would use the common pattern for Eat to interpret it, but it would
note that technologies are not a common kind of food.  If it couldn't
find a better pattern in its inventory, it would use the one it found.
But it would also evaluate the pattern match as less than perfect.    (032)
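
As a rough illustration of that idea, and not a description of the actual
VivoMind implementation, the following sketch scores how well the fillers
of a sentence fit a pattern for Eat. The type hierarchy, the pattern, the
role names, and the scoring rule (fraction of roles whose filler has the
expected type) are all assumptions made up for this example.

```python
# Hypothetical type hierarchy: child type -> parent type.
PARENT = {
    "Person": "Animate",
    "Animal": "Animate",
    "Animate": "Entity",
    "Food": "Entity",
    "Cereal": "Food",
    "Technology": "Entity",
}

def is_subtype(t, ancestor):
    """True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = PARENT.get(t)
    return False

# Hypothetical pattern for Eat: role -> expected type of the filler.
EAT_PATTERN = {"agent": "Animate", "patient": "Food"}

def match_score(pattern, fillers):
    """Fraction of pattern roles whose filler satisfies the expected type."""
    satisfied = sum(
        1 for role, expected in pattern.items()
        if role in fillers and is_subtype(fillers[role], expected)
    )
    return satisfied / len(pattern)

literal = {"agent": "Person", "patient": "Cereal"}       # "Michael eats cereal"
metaphor = {"agent": "Person", "patient": "Technology"}  # "Michael eats new technologies"

print(match_score(EAT_PATTERN, literal))   # 1.0
print(match_score(EAT_PATTERN, metaphor))  # 0.5
```

The metaphorical sentence still matches the Eat pattern, but with a lower
score, so a system could keep the interpretation while flagging it as
less than perfect.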

> meanings are often gradients that are noticed only when the distance
> between two points is great enough, creating what some call a 'different'
> sense, and can be blended in a variety of ways.    (033)

Yes.  A semantic distance measure is essential for evaluating the
pattern matching that occurs during language processing.  The question
of what distance is "great enough" is highly idiosyncratic.    (034)
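
One simple candidate for such a measure, shown here only as a sketch, is
the number of edges between two types in a hierarchy, counted through
their lowest common ancestor. The hierarchy below is hypothetical, and
path length is just one of many possible distance measures; nothing here
is claimed to be the measure VivoMind uses.

```python
# Hypothetical type hierarchy: child type -> parent type.
PARENT = {
    "Food": "Entity",
    "Cereal": "Food",
    "Bread": "Food",
    "Technology": "Entity",
}

def ancestors(t):
    """Return [t, parent, grandparent, ...] up to the root."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT.get(t)
    return chain

def path_distance(a, b):
    """Edges on the path from a to b through their lowest common ancestor."""
    up_a = ancestors(a)
    steps_from_b = {t: i for i, t in enumerate(ancestors(b))}
    for i, t in enumerate(up_a):
        if t in steps_from_b:
            return i + steps_from_b[t]
    raise ValueError("types share no common ancestor")

print(path_distance("Cereal", "Cereal"))      # 0
print(path_distance("Cereal", "Bread"))       # 2
print(path_distance("Cereal", "Technology"))  # 3
```

Where to draw the line between "same sense" and "different sense" on
such a scale is a threshold the application has to choose; the measure
itself does not supply it.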

> It is this that makes language actually a game that people play.
> Communicating with people means getting the hang of this game.    (035)

That was the main point of Wittgenstein's later philosophy.  I believe
it's fundamental to understanding language of any kind -- natural or
artificial, by humans or by computers.    (036)

> The purpose of natural language dictionaries is to inform humans who
> presumably have some familiarity with the language.  So, primitive terms
> are "defined" by providing synonyms and circular circumlocutions.  The
> idea is that the human reader will recognize enough of that verbiage
> to grasp the intended concept, by being familiar with the concept
> itself, presumably in other terms.    (037)

I agree.  And I would note that the so-called "word senses" represent
some lexicographers' choices in grouping the citations from which a
dictionary is derived.  That organization is helpful for some purposes,
but there is no evidence for a fixed, universal set.    (038)

______________________________________________________________________    (039)

From Slide 10 of http://www.jfsowa.com/talks/kdptut.pdf    (040)


Many projects, many useful theories, but no consensus:    (042)

● 4th century BC: Aristotle’s categories and syllogisms.
● 12th to 16th c AD: Scholastic logic, ontology, and semiotics.
● 17th c: Universal language schemes by Descartes, Mersenne,
Pascal, Leibniz, Newton, Wilkins. L’Académie française.
● 18th c: More schemes. Satire of the Grand Academy of Lagado by
Jonathan Swift. Kant’s categories.
● 19th c: Ontology by Hegel, Bolzano. Roget’s Thesaurus. Boolean
algebra. Modern science, philosophy of science, early computers.
● Late 19th and early 20th c: FOL. Set theory. Ontology by Peirce,
Brentano, Meinong, Husserl, Leśniewski, Russell, Whitehead.
● 1970s: Databases, knowledge bases, and terminologies.
● 1980s: Cyc, WordNet, Japanese Electronic Dictionary Research.
● 1990s: Many research projects. Shared Reusable Knowledge
Base (SRKB), ISO Conceptual Schema, Semantic Web.
● 21st c: Many useful terminologies, but no universal ontology.    (043)