In reply to a comment of William Frank:
[WF] > Surely, one can produce a static lexicon with all the special phrases that one has encountered in a speech corpus: put together, cat house, house cat, ...
> And surely, this is a worthwhile endeavor.
> But, that is not how people are able to understand these phrases. The first time they hear 'cat house' or 'put together', as well
> as other phrases and meanings that are not yet in your lexicon, clever people will understand them, from their context and the meanings of the parts.
Given sufficient context, yes, people can do that, and that is what I said.
[WF] > So, a computer that is able to communicate the way people do, instead of according to some reductionist theory of communications, will also do it this same way, with a self-extending and revising lexicon, not a static one.
Yes, given *sufficient* context and proper programming, the computer will also be able to do that. **BUT**
(1) That does not mean that a prior inventory of senses is somehow contradictory. In fact, it is essential for determining the context from the meanings of previously understood words. Too many neologisms in a communication render it unintelligible.
(2) There will be neologisms for which the context cannot provide an accurate guess at the meaning.
(3) In any case, adding the new words, phrases, or senses to the lexical inventory will speed up processing of the next occurrence and may make possible guesses at additional neologisms whose context includes the previous neologism, now part of the standard vocabulary.
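Points (1) to (3) can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not a proposal for a real system: the `Lexicon` class, its word-overlap scoring, and the `guess` callback are all hypothetical names invented here. Known senses are tried first, a guess from context is attempted only for unknown phrases, and a successful guess is stored for the next occurrence.

```python
from dataclasses import dataclass, field


@dataclass
class Lexicon:
    """Toy sense inventory (hypothetical): phrase -> list of sense glosses."""
    senses: dict = field(default_factory=dict)

    def interpret(self, phrase, context, guess):
        """Return (sense, was_known) for `phrase`.

        `context` is a set of surrounding words. Known senses are ranked by
        naive word overlap between the gloss and the context (point 1).
        Unknown phrases are handed to `guess`, which may return None when
        the context is insufficient (point 2). A successful guess is added
        to the inventory so the next occurrence is a plain lookup (point 3).
        """
        known = self.senses.get(phrase)
        if known:
            best = max(known, key=lambda s: len(set(s.split()) & context))
            return best, True
        guessed = guess(phrase, context)
        if guessed is not None:
            self.senses.setdefault(phrase, []).append(guessed)
        return guessed, False
```

The overlap scoring stands in for whatever disambiguation procedure is actually used; the point of the sketch is only that the static inventory and the self-extending step sit in the same lookup path rather than competing with each other.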
So I do not see a contradiction between us and I agree with all of your comments except this one:
[WF] > But, that is not how people are able to understand these phrases
No, in the vast majority of cases, particularly in cooperative communication, that is exactly how people understand most phrases, by using the context to disambiguate among *known* senses. Communication would be extremely cumbersome and slow and error-prone otherwise.
True neologisms occur every day, but they are a very tiny fraction of the senses used in normal cooperative communication (the type used for business, professional work, and everyday living). When people are socializing interactively, using new senses can be a fun and harmless thing to do, because disambiguation can occur rapidly by query when the clever phrases of the speaker cannot be decoded by the listener. That is a totally different situation from the one that I am concerned with, in which written communication is common and clarifying queries are impossible. I think I have mentioned that I feel it is important to try first to approach human-level language understanding in the case of cooperative communication, because the procedures used and perfected there will also be needed for more nonliteral uses of language, although they will need supplementation for the more general language uses.
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of William Frank
Sent: Sunday, August 04, 2013 9:20 AM
Subject: Re: [ontolog-forum] Context and Inter-annotator agreement
Surely, one can produce a static lexicon with all the special phrases that one has encountered in a speech corpus: put together, cat house, house cat, ...
And surely, this is a worthwhile endeavor.
But, that is not how people are able to understand these phrases. The first time they hear 'cat house' or 'put together', as well as other phrases and meanings that are not yet in your lexicon, clever people will understand them, from their context and the meanings of the parts. So, a computer that is able to communicate the way people do, instead of according to some reductionist theory of communications, will also do it this same way, with a self-extending and revising lexicon, not a static one.
On Sat, Aug 3, 2013 at 7:54 PM, Patrick Cassidy <pat@xxxxxxxxx> wrote:
I only have time right now to present a short answer to one of your
questions. More later.
> Or take my own sentence:
>> When you put words together, you often create completely new senses
>> that cannot be grasped by looking at individual word senses only.
> How do you get from "put" and "together" to "put together" ? There are
> many senses for "looking" in Wordnet but I cannot find the right one. It is
> not used literally here.
What you need to look for (and 'look at') is the compound word 'look at'
(a phrase or idiom, if you like). In WordNet 'look at' has two definitions:
1. (17) consider, take, deal, look at -- (take into consideration for
exemplifying purposes; "Take the case of China"; "Consider the following
2. (6) view, consider, look at -- (look at carefully; study mentally; "view
Word combinations will often have meanings that are related to but not
identical to any of the individual words. In the phrases "cat house" and
"house cat" each word has a different sense.
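A minimal sketch of looking up the compound before the individual words, assuming the input is already tokenized and reduced to base forms ("looking" to "look"). The `TOY_LEXICON` table and the `segment` function are hypothetical and written for this message; they do not call WordNet. The two glosses for 'look at' are copied from the entries quoted above.

```python
# Hypothetical toy lexicon. Only the 'look at' glosses come from the
# WordNet entries quoted above; the other entries are placeholders.
TOY_LEXICON = {
    "look at": [
        "take into consideration for exemplifying purposes",
        "look at carefully; study mentally",
    ],
    "put together": ["placeholder gloss"],
    "look": ["placeholder gloss"],
    "put": ["placeholder gloss"],
    "together": ["placeholder gloss"],
}


def segment(tokens, lexicon, max_len=3):
    """Greedy longest-match segmentation of `tokens` into lexicon entries.

    At each position the longest phrase (up to `max_len` words) found in
    the lexicon is taken; a single word is kept even if it is unknown.
    """
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon or n == 1:
                out.append(candidate)
                i += n
                break
    return out


print(segment(["look", "at", "individual", "word", "sense"], TOY_LEXICON))
print(segment(["put", "words", "together"], TOY_LEXICON))
```

The first call keeps "look at" as one unit. The second call splits "put words together" into three separate words, because a contiguous match cannot see the discontinuous phrase "put ... together"; handling that case would need more than this sketch offers.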
- - - -
As for "potential murderers":
> Or take the sentence "Soldiers are potential murderers". The sense of
> "murderers" is modified by "potential" to be something completely different.
In the compound word "potential murderer" both "potential" and "murderer"
have exactly the meanings that they have individually. But the combination
has a meaning that is related to each word in a systematic way. The word
"potential" is one of a well-studied set of modifying words like "phony",
"counterfeit", "toy", "pretended", etc., one of whose meanings is to negate
the idea that the entity referred to is a legitimate instance of the
category thus modified.
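The systematic relation described above can be sketched as a simple rule. This is an illustration only: the function `interpret_compound`, the set name, and the record layout are hypothetical, and treating the listed modifiers as a closed set is an assumption made for brevity.

```python
# Modifiers named in the paragraph above. Treating them as a closed set
# is an assumption of this sketch.
NON_INSTANCE_MODIFIERS = {"potential", "phony", "counterfeit", "toy", "pretended"}


def interpret_compound(modifier, head):
    """Describe the compound 'modifier head' as a small record.

    Both words keep their individual senses. The modifier only decides
    whether the referent counts as a legitimate instance of the category
    named by `head`.
    """
    return {
        "phrase": f"{modifier} {head}",
        "category": head,
        "is_instance": modifier not in NON_INSTANCE_MODIFIERS,
    }


print(interpret_compound("potential", "murderer"))
print(interpret_compound("convicted", "murderer"))
```

Under this rule "potential murderer" is related to the category "murderer" without being counted as an instance of it, while "convicted murderer" is an ordinary instance.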
Consider again the problematic cases you mentioned and see if they
actually illustrate the point you want to make.
[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Michael
Sent: Friday, August 02, 2013 11:57 AM
Subject: Re: [ontolog-forum] Context and Inter-annotator agreement
On Fri, Aug 02, 2013 at 10:38:29AM -0400, Patrick Cassidy wrote:
> In cooperative communication (excluding poetry and intentionally
> vague or emotionally evocative language) any word senses that are not
> part of a previously agreed and mutually understood lexicon may be
> very difficult to grasp, defeating the point of communication. Of course,
> new words or senses
> may be defined in a communication. Perhaps you have some examples of
> senses" that are not already part of the existing English lexicon that
> actually be *accurately* understood by the listener or reader?
I am not an expert but let me try.
In order to understand a sentence, it is not enough to pick the right word
sense. You have to understand the sense and how it can modify other senses.
You need context, background knowledge and the ability for abstraction and
generalization. You need to be able to follow the line of thinking of the
The word much has only one meaning in Wordnet: "a great amount or extent".
Great is defined by Wordnet as "relatively large in size or number or
How do you get a machine to understand the sentence "I ate much" without the
knowledge that "I" refers to a human and how much humans usually eat ?
Or take the sentence "Soldiers are potential murderers". The sense of
"murderers" is modified by "potential" to be something completely different.
Or take my own sentence:
>When you put words together, you often create completely new senses
>that cannot be grasped by looking at individual word senses only.
How do you get from "put" and "together" to "put together" ? There are many
senses for "looking" in Wordnet but I cannot find the right one. It is not
used literally here.
> For the purpose of research on language understanding, it seems to be
> a good idea to first try to solve the base problem, which has a great
> deal of practical utility, which is to understand a communication that the
> *intends* to be understood accurately. That is my current focus. People
> are really good at doing that, and I am concerned about how to get
> machines to do that too. That is where a common set of semantic
> primitives represented in a common foundation ontology is, I expect,
> likely to serve very well.
How about a dictionary without circular definitions ? I do not know of such
a thing and I bet it is not because of ignorance on my part.
The failure of the Cyc project also should be closely related to this
Maybe the people familiar with Cyc have more and better examples of the
problems you would face.
My opinion is that understanding of natural language is not possible without
true intelligence. It may also be a bit the other way round.
++ Michael Brunnbauer
++ netEstate GmbH
++ Geisenhausener Straße 11a
++ 81379 München
++ Tel +49 89 32 19 77 80
++ Fax +49 89 32 19 77 89
++ E-Mail brunni@xxxxxxxxxxxx
++ Sitz: München, HRB Nr.142452 (Handelsregister B München) USt-IdNr.
++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++ Prokurist: Dipl. Kfm. (Univ.) Markus Hendel