Hi John and Gary,
Comments below:
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
John F. Sowa wrote:
Rich and Gary,
Rich's note on the new thread of Quality is related to Gary's
comment about quality in this thread. So I'll discuss them
both in this thread.
No, there is a small but significant difference - Gary
is concerned with philosophical quality of the representation and terminology,
while I am concerned with the functional quality of the implementation of some
representation.
The number of instances is a QUANTITATIVE measure of the implementation
that can be calculated, fed back to the ontologizer, and used to tune for
higher quality over iterative development of an ontological configuration.
But the number of relations, and the total number of columns as distributed
among these relations, also measure the complexity of the relational model,
which IMHO is another aspect of quality, if _X is correct.
RC
> I found this quote on a SEMWEB list:
>
> “(_X) tells me that empirical evidence suggests that using a
> larger number of relationships correlates to poorer ontologies.”
>
> Note that _X reportedly used the descriptive word “relationship”,
> not the usual suspect “relation”, so the total number of
> tuples/rows/records in each relation that involves the ontology,
> summed over all such relations, would seem to capture the intuitive
> meaning of that phrase.
The word 'relationship' usually refers to instances rather than types.
For example, a person with a thousand Facebook "friends" has a thousand
relationships of the same type.
I suspect that _X was counting instances rather than types. Perhaps
_X meant that a better ontology might choose relation types that
could represent the same information with fewer instances.
For example, the verb 'buy' could be defined by two instances of 'give':
   X buys Y from Z for W amount of money:
   X gives W to Z in order to cause Z to give Y to X.
If your ontology does not have the concept type Buy, it would have
more instances of the simpler relations. Perhaps that is _X's
measure of quality. But you might ask _X.
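A minimal sketch of the Buy example, to make the instance counting concrete. The names `Give` and `expand_buy` are hypothetical and used only for illustration. The "in order to cause" link between the two acts is not modeled here, only the two Give instances themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Give:
    """One instance of the simpler relation: giver gives thing to recipient."""
    giver: str
    thing: str
    recipient: str


def expand_buy(buyer, item, seller, price):
    """Rewrite 'buyer buys item from seller for price' as two Give instances."""
    payment = Give(giver=buyer, thing=price, recipient=seller)
    delivery = Give(giver=seller, thing=item, recipient=buyer)
    return [payment, delivery]


# With a Buy type, the purchase is one instance; without it, two.
facts_with_buy = [("Buy", "X", "Y", "Z", "W")]
facts_without_buy = expand_buy("X", "Y", "Z", "W")
print(len(facts_with_buy), len(facts_without_buy))
```

Under this reading, an ontology lacking Buy stores twice as many instances for every purchase it records.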
_X has his own problems right now, so I will offer my own view. The
number of signatures which a verb can take on in well formed sentences is
limited to quite a small number, a la the “verb alternations” work (what was
her name?). It’s the huge range of designations for subject, object,
auxiliary, adverb, etc. that makes sentences so complex. Therefore the verb
signature class, with its various type subclasses and component parts, should
be deduced more easily than a fully parsed sentence if one jettisons aspects of
grammar in the signatures and simply implements the expected grammatical effect
in the interpreter, where it is most flexibly evaluated - at interpretation time.
In a Q&A system, often only a fragment
of a sentence in the database may be needed and relevant to an answer for a
given question. A signature case base for most English verbs, with their
alternations as subclasses of each signature class, would therefore seem to be
the most effective way to relate stored sentence data to the concepts detected
in questions.
So for that example, _X’s suggested “relationship”
quality metric would combine the total number of relations, plus the total
number of instances in each relation, plus the total number of columns used in
the set of all relations. Of course, it might be better to treat those subordinate
measures orthogonally and multiply by a weighting vector that reflects the
conformity of each with principles of compression, speed, or other performance measures.
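A rough sketch of how those three counts and a weighting vector could be computed, assuming a relational model held as a plain dictionary. The function name `relationship_metric`, the data layout, and the unit weights are illustrative assumptions, not an established metric.

```python
def relationship_metric(relations, weights=(1.0, 1.0, 1.0)):
    """Return the three raw counts and their weighted sum.

    relations maps a relation name to a (columns, rows) pair.
    weights applies, in order, to: number of relations,
    total number of instances (rows), total number of columns.
    """
    n_relations = len(relations)
    n_instances = sum(len(rows) for _, rows in relations.values())
    n_columns = sum(len(cols) for cols, _ in relations.values())
    measures = (n_relations, n_instances, n_columns)
    score = sum(w * m for w, m in zip(weights, measures))
    return measures, score


# Hypothetical model: one relation 'give' holding the two Buy instances.
model = {
    "give": (
        ["giver", "thing", "recipient"],
        [("X", "W", "Z"), ("Z", "Y", "X")],
    ),
}
measures, score = relationship_metric(model)
print(measures, score)
```

Keeping the three counts separate in `measures` lets each be tuned independently before they are collapsed into one score.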
-Rich
GBC
> John’s example was the cutting up of some water domain into various
> categories (rivers, streams etc.) that often come to have a term
> associated with it. We might have an application dealing with floods
> in which these distinctions are important. To start on a quality
> ontology for such an application it should be able to make meaningful
> statements about what exists in its focused domain.
I agree that if a term is significant for a particular domain a good
ontology should provide some way to express it. Sometimes, the formal
definitions have a very direct mapping to the informal terms, but
sometimes a different choice of formal relations might be more useful.
For the example about floods, you might represent the basic idea
of water flowing in a channel at a certain rate (say cubic meters
per second). The common terms, such as creek, stream, and river
might be less useful, since a flooded creek might have a greater
flow than a river during a dry spell.
You might also choose to represent the maximum desired channel
depth and define a flood by the distance above that depth.
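A small sketch of that flow-and-depth representation, assuming a hypothetical `Channel` record. The field names and the sample numbers are invented for illustration and are not real hydrological data.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Water flowing in a channel, described by rate and depth, not by a name."""
    label: str                  # informal term, kept only for display
    flow_m3_per_s: float        # current flow rate, cubic meters per second
    depth_m: float              # current water depth, meters
    max_desired_depth_m: float  # depth above which we call it a flood

    def flood_height_m(self):
        """Distance above the maximum desired depth (0.0 if not flooded)."""
        return max(0.0, self.depth_m - self.max_desired_depth_m)

    def is_flooded(self):
        return self.flood_height_m() > 0.0


# Hypothetical values: a flooded creek versus a river in a dry spell.
creek = Channel("creek", flow_m3_per_s=40.0, depth_m=2.5, max_desired_depth_m=1.5)
river = Channel("river", flow_m3_per_s=25.0, depth_m=1.0, max_desired_depth_m=4.0)

for ch in (creek, river):
    print(ch.label, ch.flow_m3_per_s, ch.is_flooded(), ch.flood_height_m())
```

Here the informal label plays no part in deciding what counts as a flood; only the measured quantities do.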
GBC
> The hydrological concepts are more basic and underlie the real world
> phenomena at the river-stream level. For many applications they will
> therefore organize things.
That depends on your application. If you happen to be a landscape
designer, you might consider the common terms as useful ways to
characterize the visual features in your design.
GBC
> ... for IT applications we need to formalize these axioms in a language
> that on the one hand faithfully reflects this conceptualization and
> on the other can be processed by applications.
Yes, but I would emphasize the word 'application'. The quality of
an ontology depends critically on the way it's used in an application.
Even for the same basic phenomena, different ontologies might be better
suited to one purpose or another.
John