Bart, Dick, and Pat, (01)
BG> John, do you think the recent work on standardization,
> specifically on ontologies, has brought us any closer to
> the formalization step? (02)
Formalization is not some kind of ultimate goal that we're
striving to achieve. It is a very mundane, everyday process
that children learn in elementary arithmetic. (03)
The word 'formal' comes from the word 'form'. The definition of a
notation is formal when its meaning is totally determined by its
*form* (AKA its syntax, grammar, or pattern). (04)
The first formally defined notation was Aristotle's notation for
syllogisms. For example, the pattern named Barbara has the form (05)
Every A is a B.
Every B is a C.
Therefore, every A is a C. (06)
With patterns like these, Aristotle introduced the first recorded
use of variables. If you replace the letters A, B, and C with any
three nouns or noun phrases, you get a valid inference. For
example, (07)
Every woozle is a slithy tove.
Every slithy tove is a one-eyed elephant.
Therefore, every woozle is a one-eyed elephant. (08)
If the first two premises are true, then the conclusion is
guaranteed to be true. If either one of the premises is
false, then the truth of the conclusion is undetermined. (09)
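In modern first-order notation (one standard textbook reading;
Aristotle himself did not write with quantifiers), the Barbara
pattern can be stated as

    \forall x\,(A(x) \rightarrow B(x)), \quad
    \forall x\,(B(x) \rightarrow C(x))
    \;\vdash\; \forall x\,(A(x) \rightarrow C(x))

Substituting predicates for A, B, and C is the same move as
substituting nouns in the informal version above.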
In summary, formalization means nothing more and nothing less than
the common practice of mathematicians and logicians for the past
two and a half millennia. It merely means that the meaning of a
notation, and of the transformations on it, is defined by
operations on explicit grammatical patterns. (010)
BG> Fields such as fuzzy logic and probability theory let us
> make statements about the world based on empirical data. (011)
Not quite. Those two systems, fuzzy logic and probability
theory, are defined formally by patterns. Although the kinds
of patterns are slightly different from Aristotle's patterns,
they are related to reality in the same way: the results of
processing a pattern are true of the world if and only if the
starting data happens to be true of the world. (012)
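For one concrete example of such a pattern, take Bayes's rule:

    P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}

The arithmetic is purely formal. The number it produces says
something true about the world only if the probabilities fed into
it were accurate estimates in the first place.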
BG> Classification algorithms, specifically tree-based ones
> like C4.5, are completely based on attributes and their values. (013)
Again, you have to recognize that the C4.5 procedure is purely
formal. It's a different kind of pattern than Aristotle's,
but its meaning is totally determined by the patterns and the
operations on those patterns. The C4.5 algorithms do *not*
process reality. The only things that they process are
patterns of character strings that represent somebody's best
guess about reality. (014)
The GIGO principle still holds: Garbage In, Garbage Out. (015)
BG> I'm wondering whether there is a point where statistical
> analysis can take over where we simply don't know enough
> about a topic to infer information using logic alone. (016)
There is no difference in principle. Logic and statistics
measure different aspects of the same things, and they are
completely compatible. Logic can be used to define what is
an A, what is a B, and what is a C. Statistics counts how
many As, how many Bs, and how many Cs. You use whichever
one is appropriate to the data and what you're looking for. (017)
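As a minimal sketch of that division of labor (the predicate and
the data below are hypothetical, purely for illustration):

    # Logic: a formal definition of what counts as an A.
    def is_a(item):
        return item["legs"] == 4 and item["sound"] == "moo"

    # Statistics: counting how many items in the data satisfy it.
    data = [
        {"legs": 4, "sound": "moo"},
        {"legs": 2, "sound": "quack"},
        {"legs": 4, "sound": "moo"},
    ]
    print(sum(1 for item in data if is_a(item)))   # prints 2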
As for the C4.5 algorithm, it is more closely related to
logic than it is to statistics. It's a kind of learning
algorithm, but what it learns is a *decision tree* that
can be expressed as a very large nest of if-then-else
statements. That tree can be mapped to a program in any
language that supports if-then-else statements, such as
C or Java. It can also be mapped to first-order logic. (018)
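Here is a hypothetical fragment of such a tree, written in Python
rather than C or Java (the attributes and thresholds are invented
for illustration; they are not actual C4.5 output):

    def classify(outlook, humidity, wind):
        # Each test on an attribute value is one if-then-else
        # branch; each leaf returns a class label.
        if outlook == "sunny":
            return "don't play" if humidity > 75 else "play"
        elif outlook == "overcast":
            return "play"
        else:  # outlook == "rain"
            return "don't play" if wind == "strong" else "play"

    print(classify("sunny", 80, "weak"))   # prints: don't play

Each root-to-leaf path can also be read as a first-order
implication, such as 'if outlook is sunny and humidity > 75, then
the class is don't-play', which is the sense in which the tree
maps to logic.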
RHM> How about a language that uses synsets instead of words?
> Do you know if anyone has researched that?
>
> A synset is an equivalence class, similar to your
> definition of proposition. (019)
The fact that two different subjects happen to use equivalence
classes does not imply that they are similar in any real sense. (020)
In defining propositions as equivalence classes, I started with
formally defined statements in some version of logic, and I
defined a way of grouping them by logical equivalences. (021)
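As a toy sketch of that construction (not the original definition,
just an illustration), one can group small propositional formulas
by their truth tables, so that logically equivalent formulas fall
into the same class:

    from itertools import product

    # Each formula is represented as a function of truth values p, q.
    formulas = {
        "p and q":             lambda p, q: p and q,
        "q and p":             lambda p, q: q and p,
        "not(not p or not q)": lambda p, q: not (not p or not q),
        "p or q":              lambda p, q: p or q,
    }

    # A formula's truth table over all assignments is its canonical form.
    def table(f):
        return tuple(f(p, q) for p, q in product([False, True], repeat=2))

    classes = {}
    for name, f in formulas.items():
        classes.setdefault(table(f), []).append(name)

    print(list(classes.values()))
    # The first three formulas form one equivalence class (one
    # proposition); "p or q" is a different proposition.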
For WordNet, George Miller and his colleagues started with the
informally defined words of English and grouped them according
to their informal word senses, as determined by English speakers
who used their subjective judgments and background knowledge. (022)
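For concreteness, a synset can be inspected through the NLTK
interface to WordNet (assuming NLTK and its wordnet corpus data
are installed; the word chosen is arbitrary):

    from nltk.corpus import wordnet as wn

    # Each synset groups the words that lexicographers judged
    # to share one informal sense.
    for synset in wn.synsets("bank")[:3]:
        print(synset.name(), synset.lemma_names(), synset.definition())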
If you start with words and their informal meanings and group
them according to somebody's informal judgments, the result
is definitely *not* formal. It may be a very useful grouping
for many purposes. (WordNet is a widely used resource, and we
use it for our language processors at VivoMind.) But those
synsets are much closer in principle to informal words and
word senses than they are to the formal entities of mathematics. (023)
PC> But WordNet still represents a tremendous and useful effort,
> and is useful for NL at a shallow semantic level. (024)
I agree with most of what you said about WordNet, including this
sentence. However, the following sentence is asking for something
totally different -- not just a revised WordNet. (025)
PC> It is a good start, but something similar with a more precise
> semantics is needed. (026)
The synsets of WordNet are at the same level as the word senses of
a typical English dictionary. A dictionary such as the OED is
derived by dozens or even hundreds of highly trained lexicographers,
who work from millions of citations gathered by thousands of people
(many of them volunteers) from a truly immense volume of English
text. (027)
The old shoe boxes full of paper slips have been computerized,
but the amount of human effort is measured in person-centuries.
What you find in the dictionary (or in WordNet) is a boiled-down
or *condensed* extract of the "average" meaning over many, many
different occurrences of each word sense. (028)
If you want precision, you won't get it by averaging over the raw
data. You can only get precision by examining the precise
*microsenses* of each word as it is used in each and every citation
in the total mass of raw data. (029)
This implies that the precise semantics will be truly immense.
And instead of being listed in alphabetic order, the precise
meanings will be grouped in something like the microtheories
of Cyc. But there will be an enormous number of them. In 2004,
Lenat & Co. estimated that they had about 6000 microtheories,
and they may have many more by now. But every time they get
a new application, they need at least one new microtheory,
and often quite a few more. (030)
Remember the line that Amanda mentioned and I highlighted: (031)
Ontology is fractal. (032)
That means that the amount of detail necessary is the same at every
level you examine. That implies that
we will need something of the size of WordNet for every topic
of every branch of human knowledge and activity. The completely
precise version you are asking for will dwarf the current WWW. (033)
Yet a child at the age of 3 has a command of language that is
far better than any computer system today. And that child
doesn't need Cyc or WordNet or formal logic. I believe we
should focus on what makes a child smart -- and it's not Cyc
or anything remotely like it. (034)
RHM> You [Pat C] consistently said mapping from WordNet to xxx.
> Do you realize that OpenCyc is mapping from its concepts to WordNet? (035)
Lots of people have been mapping their ontologies to and from WordNet.
But no computer can understand language as well as a 3-year-old child. (036)
John Sowa (037)