Hi John,
I would like to invite you to come and visit SRI and give a talk on
this topic. Would you like to do that?
On Aug 14, 2010, at 12:25 PM, "John F. Sowa" <sowa@xxxxxxxxxxx> wrote:
>> I have only had time to follow up on the reading suggestions sent
>> up to Tuesday, and it will take a few days to read through the
>> various suggestions since then.
> I'll suggest a few more.
>> I had suspected that the answer would be semiotics, but the
>> Wikipedia article reduced semiotics to the usual categories of
>> syntax, semantics and pragmatics, which rather misses the point...
>> Most useful was the pointer to the Stanford Encyclopedia of
>> Philosophy entry on Peirce's Theory of Signs.
> I agree that the Stanford article is better than the Wikipedia
> article, but Peirce himself had subdivided the subject into
> grammar, logic proper, and rhetoric, which Charles Morris renamed
> syntax, semantics, and pragmatics.
> However, Peirce didn't *reduce* semiotics (or, as he sometimes
> spelled it, semeiotic) to those three subjects as they are
> commonly taught today. He had a much broader conception of
> each of the three.
> In any case, it's always important to check anything from any
> source (including and especially any encyclopedia). But the
> Wikipedia also has a more detailed analysis of Peirce's
> classification of signs in the following article (which should
> also be checked against Peirce's own words before accepting):
>> What was more surprising is that nobody mentioned Natural
>> Language Processing, or systems such as CYC, where I had
>> thought knowledge is used to disambiguate sentences.
> Unfortunately, much of the energy of NLP from the 1980s has
> been dissipated in statistical methods. Those methods have
> proved to be very useful for many purposes. But statistics,
> by itself, can never produce semantics.
> The so-called "latent semantics" may be useful for information
> retrieval, but it is not real semantics. It cannot explain
> what a sentence (or a document) means, nor do further reasoning
> about what it finds.
> As a survey of the state of the art of NLP for information
> extraction, I recommend the following article:
> Information Extraction, by Jerry Hobbs and Ellen Riloff
> This is Chapter 21 in _Handbook on Natural Language Processing_
> published in 2010, so it is reasonably up to date, and Jerry
> Hobbs has been active in NLP since the 1970s. But the systems
> that they reviewed and compared in that article all use templates
> and statistical methods for IE.
> Their concluding paragraph notes that those methods have reached
> a barrier of about 60% accuracy (as measured by the geometric
> mean of recall and precision):
>> Good named entity recognition systems typically recognize about
>> 90% of the entities of interest in a text, and this is near human
>> performance. To recognize an event and its arguments requires
>> recognizing about four entities, and 0.9^4 is about 60%. If this
>> is the reason for the 60% barrier, it is not clear what we can
>> do to overcome it, short of solving the general natural language
>> problem in a way that exploits the implicit relations among the
>> elements of a text.
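The compounding argument in that paragraph is easy to check numerically. The sketch below (plain Python) reproduces the arithmetic; the recall/precision pair at the end is purely illustrative, chosen only to show how a geometric mean is computed:

```python
# Compounding entity-recognition accuracy across an event's arguments.
# If each of the ~4 entities in an event is recognized with probability
# p = 0.9, the chance of getting the whole event right is p**4.
p_entity = 0.9
n_args = 4
p_event = p_entity ** n_args
print(f"p(event) = {p_event:.4f}")  # 0.9^4 = 0.6561, roughly the 60% barrier

# The barrier is stated as a geometric mean of recall and precision;
# this recall/precision pair is illustrative, not from the article.
recall, precision = 0.55, 0.70
geo_mean = (recall * precision) ** 0.5
print(f"geometric mean = {geo_mean:.3f}")
```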
> In other words, they will have to go back to the more traditional
> symbolic methods of knowledge representation -- but they'll have
> to find more successful ways of dealing with performance issues.
> In 1999, I presented the following article in a Summer School
> on Information Extraction:
> Relating Templates to Language and Logic
> This is essentially the method that we have been implementing
> at VivoMind: use conceptual graphs instead of templates and
> rely on graph matching for IE. But in 1999, we did not have
> a way to do the graph matching with sufficient speed to be practical.
> Since then, the high-speed analogy engine that Arun Majumdar
> designed has reversed the performance gap between symbolic
> and statistical methods. Instead of analyzing large volumes
> of text and summarizing the results in statistics, we translate
> the texts to conceptual graphs, encode those graphs in a compact
> form, and index them. When we want to find matching graphs,
> we can find them in logarithmic time. That's fast enough.
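VivoMind's encoder and analogy engine are not public, so the following is only a toy sketch of the general idea described above: encode each graph in a canonical compact form, keep the encodings sorted, and look up matches by binary search, which takes logarithmic time in the number of stored graphs. All names and the triple representation here are hypothetical.

```python
import bisect

def encode(graph):
    """Toy canonical encoding: sort the (relation, arg1, arg2) triples
    and join them into one string.  A real conceptual-graph encoder
    would be far more elaborate; this merely fixes a canonical order
    so that equivalent graphs map to identical strings."""
    return "|".join(",".join(t) for t in sorted(graph))

class GraphIndex:
    """Keeps canonical encodings in a sorted list, so an exact-match
    lookup is a binary search: O(log n) in the number of graphs."""
    def __init__(self):
        self._codes = []

    def add(self, graph):
        code = encode(graph)
        i = bisect.bisect_left(self._codes, code)
        if i == len(self._codes) or self._codes[i] != code:
            self._codes.insert(i, code)

    def contains(self, graph):
        code = encode(graph)
        i = bisect.bisect_left(self._codes, code)
        return i < len(self._codes) and self._codes[i] == code

# Usage: two triple sets that differ only in order yield the same code.
idx = GraphIndex()
idx.add([("agent", "eat", "cat"), ("theme", "eat", "mouse")])
print(idx.contains([("theme", "eat", "mouse"), ("agent", "eat", "cat")]))  # True
```

Exact-match lookup is only the simplest case; approximate structure matching (as in analogy finding) would need a more sophisticated index, but the logarithmic-retrieval idea is the same.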
> Following is a recent talk, which I presented at MJCAI:
> Future directions for semantic systems
> We have not tested our system on the MUC documents, but we were
> involved in a comparison with a dozen other systems on documents
> by the US Dept. of Energy. It involved reading documents,
> extracting certain information from them, and displaying the
> information in tables. The score was the number of correct
> entries in the tables.
> All but two of the systems failed to exceed the 60% barrier.
> One got 73% correct, and we got 96% correct. The methods we
> used are summarized in the futures.pdf slides. The following
> slides (and the readings on the final slide) present a bit more:
> Statistical methods are useful as a supplement for many purposes.
> But you can't do semantics without using symbolic methods.
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx