
Re: [ontolog-forum] Context and Inter-annotator agreement

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Patrick Cassidy" <pat@xxxxxxxxx>
Date: Sun, 4 Aug 2013 00:28:42 -0400
Message-id: <18fc01ce90cb$1a27a650$4e76f2f0$@micra.com>
A few comments in reply to some remarks of John Sowa:    (01)

In this and a number of other notes, JFS seems to be saying that the failure of 
some groups to achieve a goal means that no amount of effort along a related 
but different path can succeed:    (02)

[JFS]
> Also look at the Japanese EDR project (use Google for ref's).  The
> Japanese gov't poured billions of yen into developing a "concept
> dictionary" with 400,000 concepts with mappings to both English and
> Japanese.  CSLI at Stanford had a copy of it, and I asked people there
> whether anybody was using it.  The answer I got is that nobody had
> found anything useful to do with it.
>
> Use your favorite search engine to look for references to the SENSEVAL
> projects.  If you want more info, subscribe to Corpora List and ask
> people there what they think about these issues.  Adam Kilgarriff, by
> the way, was one of the organizers of the SENSEVAL projects.
<snip>
>
> Researchers on machine translation tried to develop an Interlingua of
> concepts (word senses) that would be useful for MT.  They failed in
> the same way as EDR:  none of them produced useful results that
> justified a continuation of the R & D funding.
>
<snip>
>
> If you want to make any claims about the value of a large inventory of
> logic-based concepts, you have to explain how your method would differ
> from Cyc.  If you want something freely available, explain what you
> would do that is different from OpenCyc.
>
> By the way, I spoke with Ron Kaplan from PARC, then PowerSet, and
> later Microsoft.  They had a license to use *all* of Cyc including all
> the logic and the tools.  But Ron said that they just used the
> hierarchy.  They did *not* use the axioms.  In any case, most of the
> PARC/PowerSet people have left Microsoft -- that includes Ron K.
> That's not a point in its favor.    (03)

I am aware of all of those projects, and have spoken directly to some of the 
principals, including Doug Lenat and Ron K, and can say that none of those 
efforts has even tried the approach I suggested, for various reasons.  Mostly, 
none of those groups had the time or inclination to pursue the long-term goal 
of human language understanding even at the most elementary level, because 
their funded goals required that they directly pursue more immediately 
practical results for processing texts (or, in the case of Cyc, databases) in a 
manner that would not itself provide human-level understanding, but could 
present a "good enough" set of possible interpretations that a human would then 
evaluate.  For NL, the statistical approach made possible by massive amounts of 
text, including some annotated text, proved to be at least as good as, or 
better than, the more time-consuming syntactic and conceptual paths, by the 
measures used.  So the statistical approach has become vastly better funded 
than the ontological/analytical one.  There is no way to know whether the same 
amount of effort devoted to an ontological approach coordinated with the 
development of a primitives-based foundation ontology would or would not have 
succeeded better or faster.  I think it would have.  But in any case it is 
incorrect to say that the approach I have suggested for developing a 
primitives-based foundation ontology for NLP has already been tried.  It 
hasn't.    (04)

This is not to say that a statistical approach is misguided.  On the contrary, 
I think it closely mimics the early stages of language understanding in humans, 
but fails at the secondary analytical stage, which determines whether the "most 
probable" interpretations make sense in the context of the communication.   
Those who have read "Thinking, Fast and Slow" might consider the statistical 
and the syntactic/logical (ontological) methods of processing language as 
analogous to "System 1" and "System 2", respectively, in Kahneman's 
problem-solving domain.  As Kahneman found, the fast system (particularly 
suited to the parallel, connectionist nature of the brain) is error-prone, 
needing supplementation from a slower, more logical "system" in the brain to 
get usable results when accuracy is important.  So research on a fast 
statistical approach is entirely appropriate, but the current strong emphasis 
on the statistical approach is, I believe, retarding progress by failing to 
develop even the most basic resources needed for the analytical, System-2 
function.    (05)
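
As a toy sketch of that division of labor in Python (all senses, types, and
numbers here are invented for illustration; this is not any real system's
code): a fast statistical stage ranks candidate senses, and a slower
analytical stage rejects candidates that violate a contextual constraint.

    # Toy "System 1 / System 2" pipeline.  Everything here is invented.

    # System 1: fast statistical ranking of senses for "bank"
    # (made-up prior probabilities).
    candidates = [
        ("bank/financial-institution", 0.70),
        ("bank/river-edge",            0.25),
        ("bank/aircraft-maneuver",     0.05),
    ]

    # System 2: slow analytical check against a contextual constraint,
    # e.g. the sentence also mentions fishing, so a water feature is
    # required (the constraint is hand-coded for this sketch).
    sense_types = {
        "bank/financial-institution": "organization",
        "bank/river-edge":            "water-feature",
        "bank/aircraft-maneuver":     "event",
    }
    context_requires = "water-feature"

    viable = [(s, p) for s, p in candidates
              if sense_types[s] == context_requires]

    # System 1 alone would pick the most probable sense; System 2
    # overrules it when the context rules that sense out.
    best = max(viable or candidates, key=lambda sp: sp[1])
    print(best)   # ('bank/river-edge', 0.25)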

Among the prominent American NL researchers, I think Hovy was the most 
determined to try the analytical approach, but eventually (I am not sure of all 
the details) he wound up using mostly WordNet as the primary lexical resource 
(better coverage, I think, for the broad range of texts he had to analyze).  
When it became obvious that WordNet was not satisfactory, he reorganized it by 
aggregating some senses, which necessarily improved inter-annotator agreement 
by reducing the number of senses that had to be considered.  But that left many 
of the problems in WordNet unaddressed.    (06)
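
To see why aggregation necessarily raises agreement, consider a minimal Python
illustration (invented labels, not OntoNotes data): Cohen's kappa between two
annotators jumps once two fine-grained senses that they cannot reliably
separate are merged into one coarse sense.

    from collections import Counter

    def cohen_kappa(a, b):
        # Cohen's kappa for two annotators' sense labels:
        # (observed agreement - chance agreement) / (1 - chance agreement)
        n = len(a)
        p_o = sum(x == y for x, y in zip(a, b)) / n
        ca, cb = Counter(a), Counter(b)
        p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)
        return (p_o - p_e) / (1 - p_e)

    # Hypothetical annotations: two fine-grained senses (s1a, s1b) that
    # the annotators confuse, plus a clearly distinct sense s2.
    ann1 = ["s1a", "s1b", "s1a", "s2", "s1b", "s2"]
    ann2 = ["s1b", "s1a", "s1a", "s2", "s1a", "s2"]

    merge = {"s1a": "s1", "s1b": "s1", "s2": "s2"}
    print(cohen_kappa(ann1, ann2))                   # 0.25, fine-grained
    print(cohen_kappa([merge[x] for x in ann1],
                      [merge[x] for x in ann2]))     # 1.0, after merging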

My humble hypothesis is that a resource better than WordNet for NLP can and 
should be developed, and the time-consuming task of annotating enough text to 
allow it to work with the statistical approach should go together with that 
project.  Exactly how to achieve that can be debated, but the big problem is 
that no such project other than Hovy's aggregation project "OntoNotes" (which 
reduces discrimination of meanings without making them more accurate) seems to 
be in progress.  More progress may yet occur in NLP just by piling on more 
statistical data, but that will likely approach an asymptote that can be passed 
only by getting a better semantic dictionary.  That is why I am focusing on 
that particular project.  Maybe my specific approach is not optimal, but some 
effort in that direction is, IMHO, vastly better than none.    (07)


Pat    (08)

Patrick Cassidy
MICRA Inc.
cassidy@xxxxxxxxx
1-908-561-3416    (09)


-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx 
[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F Sowa
Sent: Saturday, August 03, 2013 10:54 AM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] Context and Inter-annotator agreement    (010)

Pat, Michael, Ed, and William,    (011)

I strongly believe in the need for good lexical resources.
For examples of the resources VivoMind uses, see the web site that Arun 
Majumdar maintains and the references cited there:    (012)

    http://lingo.stanford.edu/vso/    (013)

I also believe that no single resource can be necessary, sufficient, or even 
adequate for the task of language understanding.  The title of the following 
paper is "Two paradigms are better than one and multiple paradigms are even 
better":    (014)

    http://www.jfsowa.com/pubs/paradigm.pdf    (015)

For a brief summary of the insufficient attempts over the past few millennia, 
see the list (Slide 10) at the end of this note.    (016)

PC
>>> Those meanings that can be reliably distinguished (>98%) by 
>>> motivated (rewarded for accuracy) human annotators.    (017)

JFS
>> There are no such meanings -- except in very special cases.    (018)

PC
> I think that human performance with real informative text is typically 
> above that level, when one is trying to be accurate and not sloppy or hurried.    (019)

Yes.  But there is a *huge* difference between using language precisely for 
human to human communication and the artificial exercise of trying to select a 
word sense from some list.    (020)

PC
> one has to start with some inventory of senses, but the most detailed 
> inventory yet used for such tests by NL researchers is WordNet, which 
> is not a good standard for such testing.    (021)

No.  An inventory of word senses is neither necessary nor sufficient for NLP.  
WordNet is widely used because it's free, but see Slide 10.    (022)

Also look at the Japanese EDR project (use Google for ref's).  The Japanese 
gov't poured billions of yen into developing a "concept dictionary" with 
400,000 concepts with mappings to both English and Japanese.  CSLI at Stanford 
had a copy of it, and I asked people there whether anybody was using it.  The 
answer I got is that nobody had found anything useful to do with it.    (023)

Use your favorite search engine to look for references to the SENSEVAL 
projects.  If you want more info, subscribe to Corpora List and ask people 
there what they think about these issues.  Adam Kilgarriff, by the way, was one 
of the organizers of the SENSEVAL projects.    (024)

JFS
>> Unfortunately, there is no finite "set of senses" that can be used to 
>> achieve "human-level interpretation of a broad range of texts."    (025)

PC
> That is a bold claim... my observations suggest that no remotely
> applicable test has yet been conducted to see if such a claim is even
> plausible.    (026)

Researchers on machine translation tried to develop an Interlingua of concepts 
(word senses) that would be useful for MT.  They failed in the same way as EDR: 
 none of them produced useful results that justified a continuation of the R & 
D funding.    (027)

PC
> Until we develop a logic-based word sense inventory intended for broad 
> use I don't see how the maximum agreement could be tested.    (028)

If you want to make any claims about the value of a large inventory of 
logic-based concepts, you have to explain how your method would differ from 
Cyc.  If you want something freely available, explain what you would do that is 
different from OpenCyc.    (029)

By the way, I spoke with Ron Kaplan from PARC, then PowerSet, and later 
Microsoft.  They had a license to use *all* of Cyc including all the logic and 
the tools.  But Ron said that they just used the hierarchy.  They did *not* use 
the axioms.  In any case, most of the PARC/PowerSet people have left Microsoft 
-- that includes Ron K.
That's not a point in its favor.    (030)

JFS
> Fundamental principle:  People think in *words*, not in *word senses*.    (031)

PC
> Really?  I sure don't.  Without the textual content to disambiguate
> words, communication would be extremely error-prone.
> Where does that notion come from?    (032)

The technical term 'disambiguate' is used by some linguists to describe a stage 
in some programs that attempt to understand language.    (033)

At age 3, Laura understood and generated language far better than any of those 
programs.  She didn't "disambiguate" words; she just *used* words in meaningful 
patterns.  See the article by John Limber cited in slide 8 of 
http://www.jfsowa.com/talks/goal.pdf .    (034)

MB
> In order to understand a sentence, it is not enough to pick the right
> word sense.  You have to understand the sense and how it can modify
> other senses.  You need context, background knowledge and the ability
> for abstraction and generalization.  You need to be able to follow
> the line of thinking of the author.    (035)

That's fairly close to what the VivoMind software does.  It does
*not* have a stage that could be called "word sense disambiguation".
Instead, the system uses a large inventory of graph patterns and a kind of 
associative memory for matching its inventory to the patterns that occur in the 
text.  See the paradigm.pdf article.    (036)

Those patterns can come from multiple sources. The words are organized in a 
hierarchy of types and subtypes, but they are
*not* labeled with a fixed set of word senses.  Instead, new patterns are 
generated dynamically from various sources, and they are added to the inventory 
for future use.    (037)

Interesting point:  If the VivoMind software is used to reread the same 
document on a second pass, it generates a different and usually better 
interpretation (an interconnected graph of everything derived from the 
document).  That's closer to what people do.    (038)
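
As a toy illustration of those last two points (emphatically not VivoMind's
code, which is not public), a pattern inventory that grows as a text is read
will match the same text better on a second pass:

    # Toy sketch only: patterns derived on pass 1 join the inventory,
    # so pass 2 over the same "document" matches better.  The sets and
    # scoring here are invented for illustration.
    inventory = [{"a", "b"}]                    # hypothetical seed patterns
    document  = [{"a", "b", "c"}, {"c", "d"}]   # sentences as feature sets

    def best_overlap(sentence, inv):
        # score each sentence by its best overlap with any known pattern
        return max(len(sentence & p) / len(sentence) for p in inv)

    for pass_no in (1, 2):
        print(pass_no, [best_overlap(s, inventory) for s in document])
        inventory.extend(document)   # newly derived patterns are kept
    # pass 1: [0.67, 0.0] -- the second sentence matches nothing yet
    # pass 2: [1.0, 1.0]  -- patterns derived on pass 1 now cover both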

MB
> Or take my own sentence:
>
>> When you put words together, you often create completely new senses 
>> that cannot be grasped by looking at individual word senses only.
>
> How do you get from "put" and "together" to "put together"?  There
> are many senses for "looking" in WordNet but I cannot find the right
> one.  It is not used literally here.    (039)

That's a good example.  By using the original words from the text as labels on 
the graphs, the VivoMind software can use patterns from the literal use of the 
word 'look' to interpret metaphorical uses.    (040)

WF
> For example, if I say "Michael eats new technologies for breakfast", 
> you could understand this even if you have never heard this metaphor 
> before.  The first time you heard such a thing you would have no doubt 
> as to its meaning.  This is not poetry, but is its kin, as is all speech.    (041)

That is what the VivoMind software would do with that sentence.  It would use 
the common pattern for Eat to interpret it, but it would note that technologies 
are not a common kind of food.  If it couldn't find a better pattern in its 
inventory, it would use the one it found.
But it would also evaluate the pattern match as less than perfect.    (042)
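
To give a flavor of that in code (again a toy Python sketch with an invented
type hierarchy, relations, and scoring, not VivoMind's algorithm), the Eat
pattern matches the sentence, but the FOOD expectation fails and lowers the
score:

    # Toy sketch only: NOT VivoMind's algorithm.  Types are invented.
    TYPES = {  # tiny made-up type assignments: word -> supertypes
        "Michael":      {"PERSON", "ENTITY"},
        "technologies": {"ARTIFACT", "ENTITY"},
        "breakfast":    {"MEALTIME", "ENTITY"},
    }

    def edge_fits(pattern_edge, text_edge):
        # A pattern edge fits a text edge if relation and verb match
        # and the text argument's types include the expected type.
        rel_p, verb_p, type_p = pattern_edge
        rel_t, verb_t, word_t = text_edge
        return (rel_p == rel_t and verb_p == verb_t
                and type_p in TYPES.get(word_t, set()))

    def match_score(pattern, text_graph):
        # Fraction of pattern edges satisfied somewhere in the graph.
        hits = sum(any(edge_fits(p, t) for t in text_graph)
                   for p in pattern)
        return hits / len(pattern)

    # "Michael eats new technologies for breakfast" as relation triples.
    text = {("agent", "eat", "Michael"),
            ("theme", "eat", "technologies"),
            ("time",  "eat", "breakfast")}

    eat_pattern = {("agent", "eat", "PERSON"),    # satisfied by Michael
                   ("theme", "eat", "FOOD"),      # violated: not a FOOD
                   ("time",  "eat", "MEALTIME")}  # satisfied by breakfast

    print(match_score(eat_pattern, text))  # 0.67 -- usable but imperfect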

WF
> meanings are often gradients that are noticed only when the distance 
> between two points is great enough, creating what some call a 'different'
> sense, and can be blended in a variety of ways.    (043)

Yes.  A semantic distance measure is essential for evaluating the pattern 
matching that occurs during language processing.  The question of what distance 
is "great enough" is highly idiosyncratic.    (044)

WF
> It is this that makes language actually a game that people play.
> Communicating with people means getting the hang of this game.    (045)

That was the main point of Wittgenstein's later philosophy.  I believe it's 
fundamental to understanding language of any kind -- natural or artificial, by 
humans or by computers.    (046)

EJB
> The purpose of natural language dictionaries is to inform humans who 
> presumably have some familiarity with the language.  So, primitive 
> terms are "defined" by providing synonyms and circular 
> circumlocutions.  The idea is that the human reader will recognize 
> enough of that verbiage to grasp the intended concept, by being 
> familiar with the concept itself, presumably in other terms.    (047)

I agree.  And I would note that the so-called "word senses" represent some 
lexicographers' choices in grouping the citations from which a dictionary is 
derived.  That organization is helpful for some purposes, but there is no 
evidence for a fixed, universal set.    (048)

John
______________________________________________________________________    (049)

 From Slide 10 of http://www.jfsowa.com/talks/kdptut.pdf    (050)

                PROSPECTS FOR A UNIVERSAL ONTOLOGY    (051)

Many projects, many useful theories, but no consensus:    (052)

● 4th century BC: Aristotle’s categories and syllogisms.
● 12th to 16th c AD: Scholastic logic, ontology, and semiotics.
● 17th c: Universal language schemes by Descartes, Mersenne, Pascal, Leibniz, 
Newton, Wilkins. L’Académie française.
● 18th c: More schemes. Satire of the Grand Academy of Lagado by Jonathan 
Swift. Kant’s categories.
● 19th c: Ontology by Hegel, Bolzano. Roget’s Thesaurus. Boolean algebra. 
Modern science, philosophy of science, early computers.
● Late 19th and early 20th c: FOL. Set theory. Ontology by Peirce, Brentano, 
Meinong, Husserl, Leśniewski, Russell, Whitehead.
● 1970s: Databases, knowledge bases, and terminologies.
● 1980s: Cyc, WordNet, Japanese Electronic Dictionary Research.
● 1990s: Many research projects. Shared Reusable Knowledge Base (SRKB), ISO 
Conceptual Schema, Semantic Web.
● 21st c: Many useful terminologies, but no universal ontology.    (053)
