John, Richard:
> PC> But WordNet still represents a tremendous and useful effort,
> > and is useful for NL at a shallow semantic level.
>
> I agree with most of what you said about WordNet, including this
> sentence. However, the following sentence is asking for something
> totally different -- not just a revised WordNet.
>
> PC> It is a good start, but something similar with a more precise
> > semantics is needed.
>
> The synsets of WordNet are at the same level as the word senses of
> a typical English dictionary. The process of deriving a dictionary
> such as the OED begins with dozens or even hundreds of highly
> trained lexicographers who take millions of citations gathered by
> thousands of people (many of them volunteers) who extract those
> citations from a truly immense volume of English.
>
> The old shoe boxes full of paper slips have been computerized,
> but the amount of human effort is measured in person-centuries.
> What you find in the dictionary (or in WordNet) is a boiled-down
> or *condensed* extract of the "average" meaning over many, many
> different occurrences of each word sense.
>
> If you want precision, you won't get it by averaging from raw data.
> You can only get precision by examining the precise *microsenses*
> of each word as it is used in each and every citation in the total
> mass of raw data.
>
> This implies that the precise semantics will be truly immense.
> And instead of being listed in alphabetic order, the precise
> meanings will be grouped in something like the microtheories
> of Cyc. But there will be an enormous number of them. In 2004,
> Lenat & Co. estimated that they had about 6000 microtheories,
> and they may have many more by now. But every time they get
> a new application, they need at least one new microtheory,
> and often quite a few more microtheories.
>
> Remember the line that Amanda mentioned and I highlighted:
>
> Ontology is fractal.
>
> That means that the amount of detail that is necessary at each
> level is the same at every level you examine. That implies that
> we will need something of the size of WordNet for every topic
> of every branch of human knowledge and activity. The completely
> precise version you are asking for will dwarf the current WWW.
>
> Yet a child at the age of 3 has a command of language that is
> far better than any computer system today. And that child
> doesn't need Cyc or WordNet or formal logic. I believe we
> should focus on what makes a child smart -- and it's not Cyc
> or anything remotely like it.
>
> RHM> You [Pat C] consistently said mapping from WordNet to xxx.
> > Do you realize that OpenCyc is mapping from its concepts to WordNet?
>
> Lots of people have been mapping their ontologies to and from WordNet.
> But no computer can understand language as well as a 3-year-old child.
>
[PC] I agree with most of what John says, except for the 'fractal' part.
Ontology is certainly *capable* of being extended to indefinite levels of
detail, but practical applications do not require indefinite levels of
detail. One needs to describe all the detail that is important for the
application at hand, and that is enough. If more detail is needed, more
detail should be added.
What do we need next?
>> PC> But WordNet still represents a tremendous and useful effort,
> > and is useful for NL at a shallow semantic level.
>
> [JS] I agree with most of what you said about WordNet, including this
> sentence. However, the following sentence is asking for something
> totally different -- not just a revised WordNet.
>
Yes, that is the point: an ontology with pointers to the words that label
each concept is not just another WordNet, though the 'synsets' derivable
from it would *look* like WordNet synsets. It would have support for much
greater levels of semantic detail and semantic precision, and would include
important rules and functional representations that WordNet cannot
represent. The point of referencing WordNet in the ontology is to reassure
NL researchers that the ontology can be used in the same way as WordNet (if
an appropriate tagged corpus becomes available), but that the 'synsets' can
also be used for logical inferencing. This will hopefully encourage NL
researchers to try it out. Eventually users will need to do things that are
not, and cannot be, done with WordNet: logical inference on the information
derived from the NL processing.
I am well aware of the problems one encounters in identifying 'word senses'
for dictionary purposes with coherent ontological concepts, having been
concerned about precisely that for the past twenty years. What I am trying
to do now is to test the use of an ontology, with the ontology elements
serving as 'senses' and specifying which linguistic labels (English words or
phrases) are used to refer to those concepts in ordinary language. But this
is a large task, and I have to confine my own efforts to the basic
vocabulary of 2000-5000 words.
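
To make the arrangement concrete, here is a minimal sketch in Python. It
is only an illustration under stated assumptions: the class names, concept
names and labels are all hypothetical, and nothing in it is taken from
COSMO, Cyc or WordNet. Each ontology element carries its own list of
linguistic labels, the 'synset' of a concept is just that list, and a
simple inheritance check stands in for the logical inferencing that plain
synsets cannot support.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology element serving as a 'sense' (hypothetical structure)."""
    name: str
    labels: list = field(default_factory=list)   # words or phrases that label it
    parents: list = field(default_factory=list)  # names of direct superconcepts


class Ontology:
    def __init__(self):
        self.concepts = {}

    def add(self, name, labels=(), parents=()):
        self.concepts[name] = Concept(name, list(labels), list(parents))

    def synset(self, name):
        """The labels of one concept; this looks like a WordNet synset."""
        return set(self.concepts[name].labels)

    def senses(self, word):
        """All concepts for which `word` is a conventional label."""
        return sorted(c.name for c in self.concepts.values() if word in c.labels)

    def is_a(self, name, ancestor):
        """Inheritance-based inference: is `name` subsumed by `ancestor`?"""
        if name == ancestor:
            return True
        return any(self.is_a(p, ancestor) for p in self.concepts[name].parents)


# Illustrative entries only; the concept names and labels are invented.
onto = Ontology()
onto.add("Organization", labels=["organization"])
onto.add("FinancialInstitution",
         labels=["bank", "financial institution"],
         parents=["Organization"])
onto.add("Landform", labels=["landform"])
onto.add("RiverBank", labels=["bank", "riverbank"], parents=["Landform"])

print(onto.senses("bank"))
print(onto.is_a("FinancialInstitution", "Organization"))
```

The word "bank" resolves to two concepts here, and the subsumption check
can then tell them apart. That is the kind of step a flat list of synsets
does not provide by itself.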
As for existing mappings:
> RHM> You [Pat C] consistently said mapping from WordNet to xxx.
> > Do you realize that OpenCyc is mapping from its concepts to WordNet?
>
> Lots of people have been mapping their ontologies to and from WordNet.
> But no computer can understand language as well as a 3-year-old child.
>
As I said in my post:
[PC] >> the WordNet structure is not based on principles of inheritance; so
a simple 'mapping' of WordNet to an ontology like Cyc or SUMO is of limited
usefulness, and does not correct the problems.
The existing 'mappings' do not serve the same function as the kind of
'mappings' I am doing in COSMO, because the synsets are not atomic concepts,
and not all words in a synset are in fact conventional labels for the
ontology concepts. The Cyc 'mappings' don't serve the purpose that a
complete rethinking of the synsets themselves will serve - and a rethinking
of the synsets is what is necessary, and what is part of my efforts with the
COSMO. The existing WordNet synsets are excellent references and resources,
but one cannot take them as coherent ontological concepts, though that is
the way that some people attempt to use them.
Pat
Patrick Cassidy
MICRA, Inc.
908-561-3416
cell: 908-565-4053
cassidy@xxxxxxxxx
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/