Re: [ontolog-forum] Semantic Web shortcomings [was Re:ANN: GoodRelations

To: rick@xxxxxxxxxxxxxx, "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Sat, 16 Aug 2008 11:37:32 -0400
Message-id: <48A6F43C.7080907@xxxxxxxxxxx>

Rick,    (01)

Although I have a strong interest in philosophical issues as an
inspiration for doing research, I believe that a computational
method must be evaluated on its results.  Evaluating methods on
the basis of terminology derived from realism or idealism is
as meaningful as deciding among algorithms because they were
invented by a Christian, Jew, Muslim, or atheist.    (02)

Following is a note I sent to the Corpora list.    (03)

John    (04)

-------- Original Message --------
Subject: Re: [Corpora-List] Bootcamp: 'Quantitative Corpus Linguistics
with R'--re Louw's endorsement
Date: Sat, 16 Aug 2008 11:06:35 -0400
From: John F. Sowa <sowa@xxxxxxxxxxx>
To: Wolfgang Teubert <w.teubert@xxxxxxxxxx>
CC: corpora@xxxxxx    (05)

Wolfgang,    (06)

The fact that some approach has been inspired by cognitive theories
does not disqualify it from being applied to corpora.  And there's
no reason why you can't mix and match multiple methods of various
kinds -- logical, analogical, statistical, heuristic, or whatever.    (07)

  > A number of responses I have received via the list or in private
  > suggest that the future will see the integration of corpus
  > linguistics with cognitive approaches.  I disagree.    (08)

I have no idea what you mean by "integration" or why you assume that
a cognitive approach must be based on introspection:    (09)

  > The problem is that the mind does not allow introspection. No one
  > has ever presented evidence for a single mental concept.    (010)

I have been working with some colleagues who have been using
conceptual graphs to represent data from multiple sources, either
unstructured, untagged documents or structured data from any source,
such as relational DBs or tags of any kind on any sources.  As an
example of a query stated in several English sentences, which was
answered from a collection of 79 untagged English documents, see
slides 26 to 37 of the following talk:    (011)

     Pursuing the Goal of Language Understanding    (012)

The approach uses multiple heterogeneous agents, which can use
different techniques to interpret a text.  If an ontology is
available, some agent will use it to interpret a sentence as it
is being parsed.  If multiple ontologies are available, multiple
agents, each one using a different ontology, will attempt to
interpret a sentence or part of a sentence.  If no ontology is
available, some agents will use statistical methods.  It's even
possible for different agents to use different techniques with
different ontologies on the *same* sentence.  Some agents use
logic, but most don't.    (013)

In case of conflicts (which are the norm, not the exception),
higher-level agents or a committee of higher-level agents
will choose what they consider the best interpretation for
each phrase.  Individually, the agents don't have to be very
intelligent. (Imagine them as judges at the Olympic Games.)    (014)
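
The committee idea above can be illustrated with a minimal sketch. It is
not the implementation described in the talk: the names Proposal,
trimmed_mean, and choose_best are hypothetical, and the Olympic-style
scoring rule (drop the highest and lowest judge score, average the rest)
is an assumption chosen only to match the analogy.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Proposal:
    """One agent's reading of one phrase (hypothetical structure)."""
    agent: str    # name of the agent that produced the reading
    phrase: str   # the phrase being interpreted
    reading: str  # the proposed interpretation


def trimmed_mean(scores):
    """Drop one highest and one lowest score, then average the rest.

    With fewer than three scores there is nothing to trim, so the
    plain mean is returned.
    """
    if len(scores) < 3:
        return mean(scores)
    return mean(sorted(scores)[1:-1])


def choose_best(proposals, judges):
    """Return, for each phrase, the proposal the committee scores highest.

    Each judge is a simple function from a Proposal to a number; no
    single judge needs to be very intelligent.
    """
    best = {}
    for proposal in proposals:
        score = trimmed_mean([judge(proposal) for judge in judges])
        current = best.get(proposal.phrase)
        if current is None or score > current[0]:
            best[proposal.phrase] = (score, proposal)
    return {phrase: entry[1] for phrase, entry in best.items()}
```

A caller would collect competing proposals for the same phrase from an
ontology-based agent and a statistical agent, pass them with a list of
judge functions to choose_best, and keep one reading per phrase.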

If a sentence happens to be about a single unified topic, it is
likely that all the phrases will be interpreted by agents working
with the same ontology.  But if it mixes or relates different
topics, different parts might be interpreted by different agents
working with radically different methods.    (015)

Then the CGs are indexed (with pointers back to the original
documents), and the analogy engine is used to find the best
match (or matches) to a given query (which may be one sentence,
multiple sentences, or an arbitrary document).  The time to
index the graphs grows as (N log N), and the time to find
a graph that is similar to a given graph grows as (log N).    (016)
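
The growth rates quoted above match those of a sorted index with binary
search, which the following sketch illustrates. It makes a strong
simplifying assumption that is not in the note: that each graph has
already been reduced to a single numeric signature. How the analogy
engine actually encodes and compares graphs is not described here, and
build_index and nearest are hypothetical names.

```python
import bisect


def build_index(entries):
    """Sort (signature, doc_id) pairs by signature: O(N log N)."""
    return sorted(entries)


def nearest(index, query_signature):
    """Return the entry whose signature is closest to the query: O(log N).

    Binary search finds the insertion point; only the entries on either
    side of that point can be the closest match.
    """
    if not index:
        raise ValueError("index is empty")
    pos = bisect.bisect_left(index, (query_signature,))
    candidates = index[max(0, pos - 1):pos + 1]
    return min(candidates, key=lambda entry: abs(entry[0] - query_signature))
```

The doc_id stored with each signature plays the role of the pointer back
to the original document.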

John Sowa    (017)

