Re: [ontolog-forum] XML tags as natural language words

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Adrian Walker" <adriandwalker@xxxxxxxxx>
Date: Mon, 15 Dec 2008 16:45:55 -0500
Message-id: <1e89d6a40812151345g1860b3c6o59ffe8bbc2e7831f@xxxxxxxxxxxxxx>
Hi John --

Aha, meaning as use rears its head again!

As you know, for some purposes it is useful to let meaning rest on how words and sentences are used computationally, rather than on a separate static dictionary, taxonomy, or ontology.

Consider the following syllogism-like rule [1]:

some-paper is related by fact#:title to some-title
that-paper is related by fact#:author to some-description
that-description is related by some-rdf-node to some-home-page
that-home-page is related by fact#:name to some-name
--------------------------------------------------------------------------------------------
that-name is an author of the publication that-title


One can write and run [2] such a rule without reference to any source of predefined terminology.  One can even extract an explanation like this:

Paper is related by fact#:title to  An Overview of RDF Query Languages 
Paper is related by fact#:author to __Description1
__Description1 is related by rdf:_2 to aeb
aeb is related by fact#:name to  Andreas Eberhart 
------------------------------------------------------------------------------------------------------------------------------------------
Andreas Eberhart  is an author of the publication  An Overview of RDF Query Languages 
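
For readers who want to see the chaining spelled out, here is a minimal Python sketch of the same four-premise rule applied to the four facts in the explanation. It is not how the Internet Business Logic system [2] works internally; the tuple layout, the `FACTS` list, and the function name `authors_of_publications` are assumptions made only for illustration. Note that the third premise matches any linking predicate, just as "some-rdf-node" does in the rule.

```python
# Hypothetical encoding of the facts as (subject, predicate, object) tuples.
FACTS = [
    ("Paper", "fact#:title", "An Overview of RDF Query Languages"),
    ("Paper", "fact#:author", "__Description1"),
    ("__Description1", "rdf:_2", "aeb"),
    ("aeb", "fact#:name", "Andreas Eberhart"),
]


def authors_of_publications(facts):
    """Chain the four premises of the rule and return (name, title) pairs."""
    conclusions = []
    for paper, pred1, title in facts:
        if pred1 != "fact#:title":
            continue
        for subj2, pred2, description in facts:
            if subj2 != paper or pred2 != "fact#:author":
                continue
            # Third premise: any predicate may link description to home page.
            for subj3, _any_node, home_page in facts:
                if subj3 != description:
                    continue
                for subj4, pred4, name in facts:
                    if subj4 == home_page and pred4 == "fact#:name":
                        conclusions.append((name, title))
    return conclusions


for name, title in authors_of_publications(FACTS):
    print(f"{name} is an author of the publication {title}")
```

Nothing in the sketch consults a dictionary or ontology for "title", "author", or "name"; the strings matter only through the way the premises link them together.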


What seems to happen is that the meanings of the words, and of sentences, lie in the relations between sentences as used.

Of course, while very useful in some circumstances, this is not "deep" natural language understanding, but it is a phenomenon one should probably keep in mind as one digs deeper.

By the way, congratulations on your Stanford NL talk/paper.  Most interesting.

                                    -- Adrian

[1]  www.reengineeringllc.com/demo_agents/RDFQueryLangComparison1.agent

[2] Internet Business Logic
A Wiki and SOA Endpoint for Executable Open Vocabulary English over SQL and RDF
Online at www.reengineeringllc.com    Shared use is free

Adrian Walker
Reengineering


On Sat, Dec 13, 2008 at 2:33 PM, John F. Sowa <sowa@xxxxxxxxxxx> wrote:
One of the goals of formal ontologies has been the precise
definition and axiomatization of the tags used in the Semantic
Web and other media.  But the most widely used tags in RDF
and other XML-based notations are only defined by statements
in ordinary natural languages.

That observation implies that the XML tags are no more
formal or reliable than ordinary terms in any of the large
numbers of terminologies used in various fields of science,
engineering, medicine, business, and law.

Question:  How can we rely on any deductions that use the
formally defined axioms of some ontology when the input tags
have no formal definition?

Second question:  Even for those tags that are formally
defined, how can we be sure that the people who selected
the tags, either by annotating the raw data or by clicking
on a menu, had read or understood the formal definitions?

Those are issues that many people have raised.  Recently,
Yorick Wilks, who has been working in NL semantics for over
forty years, has written some papers to address those topics:

 1. "The Semantic Web as the apotheosis of annotation, but what
    are its semantics?"
    http://www.dcs.shef.ac.uk/~yorick/papers/IEEE.SW.untrak.pdf

 2. "On whose shoulders?"
    http://www.dcs.shef.ac.uk/~yorick/papers/YW.acl.pdf
    (The most relevant section is on pp. 9-12.)

Following is a quotation from p. 11 of ref #2:

The SW accords a key role to ontologies as knowledge structures:
partially hierarchical structures containing key terms -- primitives
again under another guise -- whose meanings must be made clear,
particularly at the more abstract levels.  The old AI tradition in
logic-based knowledge structuring -- descending from McCarthy and
Hayes (1969) -- was simply to declare what these primitive predicates
meant.  The problem was that predicates, normally English words
written in capital letters (as all linguistic primitives in the end
seem to be), became affected by their inferential roles over time
and the process of coding itself.  This became very clear in the
long-term CyC project (Lenat 1995) where the key predicates changed
their meanings over 30 years of coding, but there was no way of
describing that fact within the system, so as to guarantee consistency.
In Nirenburg and Wilks (2000), Nirenburg and I debate this issue in
depth, and I defend the position that one cannot simply maintain
the meanings of such terms by fiat and independent of their usage
-- they look like words and they function like words because, in
the end, they are words.

Ref #2 was published in the December 2008 issue of the _Computational
Linguistics_ journal.  In that same issue, there was another paper,
"Inter-Coder Agreement for Computational Linguistics," on the
question of reliability of human annotations.  For a copy, see

   http://cswww.essex.ac.uk/Research/nle/arrau/icagr-short.pdf

Section 4.5 of that paper discusses the task of distinguishing word
senses for words that have more than one sense (i.e., nearly all).
For naive coders (i.e., people who are not professional lexicographers)
typical agreement between coders varied from 67% to 78%.  In a study
that used professional lexicographers and arbitration to resolve
disagreements, they achieved 95.5% agreement.

Note that even professionals under carefully controlled conditions
followed by arbitration did not achieve 100% agreement.  In formal
deduction, even the slightest error can cause a theorem prover to
collapse in contradiction.  If something between 4.5% and 33% of the
data is incorrect, all the claims about the need for formal precision
in the axioms and proof procedures become questionable.
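
The range quoted above follows directly from the agreement figures: 100% minus 95.5% gives 4.5%, and 100% minus 67% gives 33%. A small Python sketch of that arithmetic is below. The dictionary keys are labels invented for illustration, and reading inter-coder disagreement as a proxy for the share of incorrect data is the assumption the paragraph above makes, not something the sketch establishes.

```python
# Agreement percentages as cited from Section 4.5 of the paper.
# The key names are hypothetical labels, not terms from the paper.
AGREEMENT_PERCENT = {
    "naive_coders_low": 67.0,
    "naive_coders_high": 78.0,
    "professionals_with_arbitration": 95.5,
}

# Disagreement is taken as 100% minus agreement.
disagreement_percent = {
    label: round(100.0 - agreement, 1)
    for label, agreement in AGREEMENT_PERCENT.items()
}

lowest = min(disagreement_percent.values())
highest = max(disagreement_percent.values())
print(f"Disagreement ranges from {lowest}% to {highest}%")
```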

These issues do not imply that formal logic and ontology are useless,
but they do imply that we have to revisit the assumptions about
using formal logic on the XML tags of the WWW or SW.

John Sowa


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx



