Hi John,
In essence, you are correct - data mining is not
sufficient, but it does provide a very capable way of doing the OBSERVATION
part of semantic discovery.
It's true that data mining only finds patterns - not causal, or even
necessarily inferable, relationships - just patterns that repeat in
the data. There is no pattern in the data itself; the observer
actively discovers patterns. The important issue for pattern
discovery is validation: does the pattern usefully identify anything
that can be observed in proxy? For that, one must validate some
model of the pattern, in its actual variety, on actual instances of
data - observations from reality - not on instances generated from
the model. The data is about reality; the model is a useful
association with reality.
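As a rough illustration of what I mean by validating on observations
rather than on the model's own instances, here is a small Python
sketch; the pattern, fields and records are all hypothetical:

    def mined_pattern(record):
        # Hypothetical pattern from data mining: "fever" and "cough"
        # co-occur with the label "flu".
        return ("fever" in record["symptoms"]
                and "cough" in record["symptoms"])

    def validate(pattern, observations):
        # Score the pattern against held-out observations from
        # reality, not against instances generated from the model.
        hits = sum(1 for r in observations
                   if pattern(r) == (r["label"] == "flu"))
        return hits / len(observations)

    # Hypothetical held-out observations (stand-ins for real data).
    held_out = [
        {"symptoms": {"fever", "cough"}, "label": "flu"},
        {"symptoms": {"fever"}, "label": "cold"},
        {"symptoms": {"cough", "fatigue"}, "label": "flu"},
    ]

    print("agreement with held-out observations:",
          validate(mined_pattern, held_out))

The only point of the sketch is the split: the model of the pattern
is scored against data it did not generate.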
We use OBSERVATION processes to try to identify and pair each
pattern with some meaningful objects, activities or events in the
universe of discourse, so they can be understood in the context in
which we found them. That presumes some concept of classes, which
gets richer as we pursue validated activities. Each activity refines
the definitions of the classes applied so far, with the new
refinements added judiciously to the model.
We use CLASSIFICATION to organize the patterns, and pattern
components, into classes that help us remember the characteristics
of whole groups of pattern instances. That refines our ability to do
valid identification. Formal concept analysis (FCA), as you
mentioned, is one of the algorithms available for implementing this
step.
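For concreteness, here is a minimal brute-force FCA sketch in
Python; the object-attribute context is made up for illustration,
and a real implementation would use a proper FCA algorithm rather
than enumerating attribute subsets:

    from itertools import chain, combinations

    # Hypothetical object-attribute context (stand-ins for pattern
    # components).
    context = {
        "doc1": {"fever", "cough"},
        "doc2": {"fever", "rash"},
        "doc3": {"fever", "cough", "rash"},
    }
    attributes = set().union(*context.values())

    def extent(attr_set):
        # Objects having every attribute in attr_set.
        return {o for o, attrs in context.items() if attr_set <= attrs}

    def intent(objs):
        # Attributes shared by every object in objs.
        if not objs:
            return set(attributes)
        return set.intersection(*(context[o] for o in objs))

    # Every formal concept arises as (A', A'') for some attribute
    # set A, so brute force is fine for a tiny context.
    concepts = set()
    for attrs in chain.from_iterable(combinations(attributes, r)
                                     for r in range(len(attributes) + 1)):
        e = extent(set(attrs))
        concepts.add((frozenset(e), frozenset(intent(e))))

    for e, i in sorted(concepts, key=lambda c: len(c[0])):
        print(sorted(e), sorted(i))

Each printed pair is a class in the above sense: a set of pattern
instances together with exactly the characteristics they all share.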
The THEORIZATION step involves describing WHAT the patterns, and
instances of patterns, MEAN - the context as viewed by an observer,
again represented by proxy. That may not amount to anything at all,
or it may bring useful insights about the world. Theorizing can be
done completely automatically, but much of this step's effectiveness
requires the kind of insight that people supply well and algorithms
so far do not. Computer assistance is a leverage factor for some
applications, and designing relevant heuristic functions for each
domain may be useful.
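As one hypothetical example of such a heuristic - nothing from the
patent, just an illustration - a candidate interpretation of a
pattern could be scored by how much of the observed data it covers,
penalized by how complicated it is:

    def theory_score(covers, observations, description_length,
                     weight=0.1):
        # Hypothetical domain heuristic: prefer interpretations that
        # cover many observations while staying simple.
        coverage = (sum(1 for obs in observations if covers(obs))
                    / len(observations))
        return coverage - weight * description_length

    # Example: a one-clause theory covering 8 of 10 observations.
    observations = list(range(10))
    print(theory_score(lambda o: o < 8, observations,
                       description_length=1))

The weights and the notion of description length have to be designed
per domain, which is exactly where the human insight comes in.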
With theories, classes and observations, we can design and carry out
EXPERIMENTATION steps to confirm or deny the theories, or to create new
observations to drive classification and theorization. Experiments
provide feedback to refine observations, theories and classes, or even to
suggest new ones.
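A rough sketch of that feedback loop, with observe, classify,
theorize and test as hypothetical placeholders for whatever a given
project supplies:

    def discovery_loop(observe, classify, theorize, test, rounds=3):
        # Hypothetical cycle: each round produces observations,
        # classes and candidate theories, then keeps only the
        # theories that survive an experimental test against the
        # new observations.
        theories = []
        for _ in range(rounds):
            observations = observe()
            classes = classify(observations)
            candidates = theories + theorize(classes)
            theories = [t for t in candidates if test(t, observations)]
        return theories

The survivors of each round feed the next one, which is how the
experiments refine, or replace, the observations, theories and
classes.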
All four processes in a text discovery project - experimenting,
observing, theorizing and classifying - are used in addition to data
mining to get the full results. Figures 13 and 14 in the patent
describe these four processes, but they are too large to post to
this list given the email size restrictions. If you would like to
investigate the four discovery processes further, see
http://www.englishlogickernel.com/Patent-7-209-923-B1.PDF
and read the sections on Figures 13 and 14, which describe them in
detail.
HTH,
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx]
On Behalf Of John F. Sowa
Sent: Wednesday, January 19, 2011 5:49 AM
To: ontolog-forum@xxxxxxxxxxxxxxxx
Subject: Re: [ontolog-forum] Ontology of Rough Sets
On 1/19/2011 12:01 AM, Rich Cooper wrote:
> But instances DO define the types, WITH/USING pattern proxies.
> That works well, in text mining, linguistics, and the social and
> medical sciences.
>
> I think it's the distinction between empirical sciences (instances
> to types) and so called pure sciences, based on very limited views
> of reality (types to instances).
This gets into critical issues about scientific methodology and
the problems with blindly using the results of data mining.
What you get from data mining is *not* a definition of a new type
from instances. What you get is a "co-occurrence pattern" or a
"correlation". Such patterns can often be clues to a useful type
definition, but they can often be misleading or worse.
That's why scientists don't accept an observed correlation as
a law until it has (a) made reliable predictions of future
observations, and (b) been connected by reasonable chains
of inference to other established laws.
But I'll grant that for many applications, such as sending
junk mail, the cost of testing the correlation is higher than
the cost of dumping unwanted mail on people for whom the
prediction fails.
John