Rob, (01)
As I said, I only partially agree with Ibn Taymiyya because
there is a great deal of data that can be very well generalized.
(Actually both Taymiyya and Anderson recognized that point, but
they didn't generalize it as well as they could.) (02)
RF> Anyway, are you arguing for the idea that some data cannot be fully
> generalized, are you arguing against it, or are you just arguing? (03)
That question itself is an abstraction from the situation that
attempts to force one or another generalization on the matter.
In general, people are constantly generalizing, and it's
impossible for human beings to avoid generalizing. And any
data they derive will be biased by previous generalizations. (04)
To clarify the issues, I suggest that we replace the word
'generalization' with the term 'data compression'. To avoid
fine points of modern science, suppose we look at data available
since ancient times: the positions of the stars and other
objects in the sky. (05)
The first generalization is that most of the objects in the sky are
*fixed* -- they stay in exactly the same relative positions night
after night for as long as anybody is able to detect. A theory
with very high accuracy is that they are attached to a "celestial
sphere" that rotates relative to the earth. (It's irrelevant
which one is assumed to be stationary.) That theory compresses
an enormous amount of data in one swell foop. (06)
But there are two basic motions in the sphere: the 24-hour
rotation, and the yearly movement of the sun through the zodiac.
Those generalizations enable much more data about the stars to
be summarized in a theory about fixed positions and two kinds
of rotations. That compresses the data much further. (07)
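To make the compression concrete, here is a minimal sketch in Python.
Everything in it is an assumption for illustration: the three stars, their
coordinates, the single uniform rotation rate, and the observation schedule
are hypothetical, and the "observations" are generated from the model itself
rather than taken from any real catalog. It only shows how a long table of
sightings collapses into a few fixed positions plus one rotation.

```python
# Hypothetical fixed coordinates on the sphere: (longitude, latitude) in degrees.
FIXED_STARS = {"A": (10.0, 45.0), "B": (95.5, -12.0), "C": (250.25, 60.0)}

# Assumed: the sphere makes one full turn per day at a uniform rate.
DAILY_ROTATION_DEG = 360.0


def predicted_position(star, t_days):
    """Apparent position of a star at time t under the rotating-sphere theory."""
    lon, lat = FIXED_STARS[star]
    return ((lon + DAILY_ROTATION_DEG * t_days) % 360.0, lat)


# The "uncompressed" record: one row per star per observation time.
times = [n / 8 for n in range(8 * 30)]  # 8 observations a day for 30 days
raw_table = [(s, t, predicted_position(s, t)) for s in FIXED_STARS for t in times]

raw_numbers = 2 * len(raw_table)          # two coordinates per sighting
model_numbers = 2 * len(FIXED_STARS) + 1  # fixed coordinates plus one rate

print(f"numbers in the raw table: {raw_numbers}")
print(f"numbers in the theory:    {model_numbers}")
```

The second rotation (the sun's yearly path through the zodiac) would add
only a few more parameters to the theory, while the raw table keeps growing
with every night of observation.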
Then there are the seven known "planets", which "wander" through
the heavens at distances closer than the celestial sphere: the moon,
the sun, Mercury, Venus, Mars, Jupiter, and Saturn. Their positions
require more complex theories in order to compress the data. The
labels 'day', 'month', and 'year' are attached to those theories,
and a lot of math is necessary to make the generalizations precise. (08)
Truly random data cannot be compressed at all, but perception itself
is a process of compression. And the "salient" things, relations,
and events are the important ones, the ones most likely to be
compressed and labeled by words. So all compression is done with
respect to importance and for the purpose of preserving the most
important details in the data. That implies multiple different
ways of compressing the same data for different purposes. (09)
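The point about random data can be checked with any general-purpose
compressor. The following is a minimal sketch using zlib from the Python
standard library. The sample strings and the helper name compression_ratio
are my own choices for illustration, and zlib is only one lossless
compressor: it takes no account of importance or purpose, so it illustrates
the first sentence of the paragraph above, not the rest.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly regular "observations": the same short record repeated many times.
regular = b"star at fixed position; " * 4096

# Bytes from the operating system's random source, same length as above.
random_bytes = os.urandom(len(regular))

regular_ratio = compression_ratio(regular)
random_ratio = compression_ratio(random_bytes)

print(f"regular data: {regular_ratio:.4f}")
print(f"random data:  {random_ratio:.4f}")
```

The regular data should shrink to a small fraction of its size, while the
random bytes should come out no smaller than they went in (typically a few
bytes larger, because of the container overhead).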
In short, people and other animals are constantly compressing their
perceived data -- i.e., they form generalizations. Theories are
nothing more nor less than systematic records of the most successful
data compressions for some of the most important perceptions. (010)
The words of language are labels attached to those generalizations
that people consider important. Some of them, like 'cat' and 'tree',
refer to frequently observed patterns, and others like 'zodiac'
refer to theory-laden generalizations of generalizations. (011)
So when you compress data on the WWW by statistics or search it
without compression, you are always using the labels that were
derived by millennia of human beings from generalizations that
were important to their lives. Even if you store all the raw
data on the WWW and go back to it for every use, you will never
avoid generalizations. The so-called "raw" data cannot be
purged of the generalizations by which they were derived. (012)
John (013)