ontolog-forum

Re: [ontolog-forum] Context and Inter-annotator agreement

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: John F Sowa <sowa@xxxxxxxxxxx>
Date: Wed, 31 Jul 2013 10:28:43 -0400
Message-id: <51F91F1B.9000602@xxxxxxxxxxx>
Pat,    (01)

I agree with the last sentence of the following comment.  I agree with
the first, but I would qualify the word 'know'.  The middle sentence
is more problematical.    (02)

PC
> The only ones who really know the "meaning" of a word are the ones who
> created the text.  It would not be too difficult to have creators of text
> label the senses that they intend, and through a series of iterations, find
> a set of senses that text creators and text annotators can agree on with a
> precision that would satisfy Miss Elliott.  I find little interest at NLP
> meetings for work of that kind.    (03)

There is no way that an author or speaker could write or speak
fluently about any subject while constantly thinking about which
word sense to select for every word.    (04)

But it is possible, for texts where clarity and precision are critical,
for the author to use tools that can help detect ambiguity, avoid words
that could be problematical, and suggest simpler syntax for phrases
that are overly complex.    (05)

Note the following passage from the Boeing web site:    (06)

Source: http://www.boeing.com/boeing/phantom/sechecker/checker.page
> A language checker is a software application that helps authors comply
> with a controlled-language specification. Examples of controlled languages
> include ASD Simplified Technical English, Attempto Controlled English,
> Caterpillar Technical English, Global English and the U.S. government's
> Plain Language specification.
>
> The Boeing Simplified English Checker helps writers comply with ASD
> Simplified Technical English (STE), developed by the AeroSpace and
> Defence Industries Association of Europe.    (07)

The result of such a checker is much easier for non-native speakers
to understand.  For aircraft maintenance, that is critical for manuals
used by workers at airports around the world.    (08)

That result isn't sufficiently precise to be translated to logic, but it
is usually easier for automated tools to translate to other natural
languages (NLs).    (09)

It can also be a first step toward a semi-automated generation of some
knowledge representation (KR) language.  It would also be very useful
for checking the comments in KR languages, such as OWL, where more of
the semantics is buried in the NL comments than in the formal operators.    (010)

However, it does require a lot of training and practice to use such
a language checker.  And the overwhelming amount of NL data on the
WWW has never been and never will be checked by such tools.    (011)
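
To make the idea of a language checker concrete, here is a minimal
sketch in Python.  It is not the Boeing checker and it does not
implement the ASD-STE100 rules.  The word list, the sentence-length
limit, and the names check_text, APPROVED_ALTERNATIVES, and
MAX_WORDS_PER_SENTENCE are all assumptions invented for illustration.
It only shows the general shape of such a tool: flag words outside an
approved vocabulary, suggest a simpler alternative, and warn about
sentences that are too long.

```python
import re

# Hypothetical mini-specification for illustration only.
# These entries are NOT taken from ASD-STE100 or any real controlled language.
APPROVED_ALTERNATIVES = {
    "utilize": "use",
    "commence": "start",
    "terminate": "stop",
}
MAX_WORDS_PER_SENTENCE = 20  # assumed limit for this sketch


def check_text(text):
    """Return a list of (sentence_number, message) warnings for the text."""
    warnings = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    for number, sentence in enumerate(sentences, start=1):
        words = re.findall(r"[A-Za-z']+", sentence)
        if len(words) > MAX_WORDS_PER_SENTENCE:
            warnings.append(
                (number,
                 f"sentence has {len(words)} words; limit is {MAX_WORDS_PER_SENTENCE}")
            )
        for word in words:
            suggestion = APPROVED_ALTERNATIVES.get(word.lower())
            if suggestion:
                warnings.append((number, f"replace '{word}' with '{suggestion}'"))
    return warnings


if __name__ == "__main__":
    for number, message in check_text("Utilize the wrench. Stop the engine."):
        print(f"sentence {number}: {message}")
```

A checker of this kind looks only at surface forms.  It does nothing
to select word senses, which is why its output is easier to read but
still not precise enough to translate to logic.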

PC
> I would expect computers to be able, eventually, to do better
> than any given pair of annotators at finding the right meaning,
> provided that the "meanings" are in fact distinguishable.    (012)

I agree, with qualifications.  But the two main qualifications
are (1) when do you expect "eventually" to occur, and (2) what
do you mean by "distinguishable" meanings?    (013)

For point #1, semi-automated checkers, such as Boeing's and others,
can be and should be more widely used in conjunction with KR tools.    (014)

Some of the VivoMind software that was developed, paid for, and
used for practical applications has detected issues that humans,
working without such aids, missed.  See slides 111 to 157 of    (015)

    http://www.jfsowa.com/talks/goal.pdf    (016)

Tools that use or extend such technologies should be the focus
of much more R & D than is currently being devoted to them.    (017)

For point #2, every unabridged dictionary has a different set
of word senses.  The number of word senses is constantly growing
and changing, and many linguists have strong doubts about the
possibility of having any precisely defined set.    (018)

Alan Cruse made the point that there is no limit to the number
of fine-grained senses that can be useful.  Slide 54 of goal.pdf
(copy below) summarizes and illustrates that point.  For more
on this issue, see slides 46 to 78 of goal.pdf    (019)
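
The effect of sense granularity on annotator agreement can be shown
with a small sketch.  The code below is an illustration, not something
taken from goal.pdf.  It uses Cohen's kappa, a standard
chance-corrected agreement measure that I am bringing in here, and the
labels for the word 'bank' are toy data made up for the example.  The
names cohen_kappa, COARSE, annotator_a, and annotator_b are
hypothetical.  The two annotators disagree only about two fine-grained
financial senses, so merging those senses into one coarse sense makes
the disagreement vanish.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Toy data (invented): eight occurrences of 'bank', three fine-grained senses.
annotator_a = ["river", "institution", "building", "institution",
               "river", "building", "institution", "river"]
annotator_b = ["river", "building", "building", "institution",
               "river", "institution", "institution", "river"]

# Hypothetical coarse inventory that merges the two financial senses.
COARSE = {"river": "river", "institution": "financial", "building": "financial"}

fine_kappa = cohen_kappa(annotator_a, annotator_b)
coarse_kappa = cohen_kappa([COARSE[s] for s in annotator_a],
                           [COARSE[s] for s in annotator_b])

if __name__ == "__main__":
    print(f"fine-grained kappa:   {fine_kappa:.3f}")
    print(f"coarse-grained kappa: {coarse_kappa:.3f}")
```

With these made-up labels the fine-grained kappa is 13/21 (about 0.62)
and the coarse-grained kappa is 1.0.  The same annotations look
unreliable or perfect depending on which sense inventory is chosen.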

John
________________________________________________________________    (020)

Slide 54 of http://www.jfsowa.com/talks/goal.pdf    (021)

                            MICROSENSES    (022)

The linguist Alan Cruse coined the term microsense for a
specialized sense of a word in a particular application.    (023)

Examples of microsenses:    (024)

  ● Spatial terms in different situations and points of view.
  ● The many kinds of chairs or numbers in the egg whites.
  ● The kinds of balls in various ball games: baseball, basketball,
    billiard ball, bowling ball, football, golf ball, softball, tennis ball.
  ● Computer science requires precise definitions, but the meanings
    change whenever programs are revised or extended.
  ● Consider the term 'file system' in Unix, Apple OS X, Microsoft
    Windows, and IBM mainframes.    (025)

Microsenses develop through usage in different situations.    (026)

The number and kinds of new uses and innovations grow
independently of any attempt to limit the meanings of words.    (027)
