Ed and Azamat, (01)
The amount of garbage on the WWW is overwhelming. But even with well
edited data from respected sources, it's always important to check
and recheck the sources of the sources and the line of reasoning
that led to any conclusion. Then get a second opinion, and another
opinion to check that one, and so forth. (02)
AA> That's why I am suspicious of all sorts of cross-divisions,
> cross-classifications and poly-hierarchies. One may classify some
> people, say actors, according to age and gender, or according to
> marriage status, age and gender, or by promiscuity, marriage status,
> age and gender, so on ad infinitum. (03)
I certainly agree that it's important to be suspicious of anything
that has that many cross links. I would also be extremely suspicious
of anything produced by unnamed editors on Wikipedia or DBpedia.
Some of that data is extremely good, and it can sometimes be superior
to the Encyclopedia Britannica. But much of it is extremely uneven,
and I would never trust even the Encyclopedia Britannica (or any
other source) without getting multiple opinions. (04)
But I would add that there is *zero* correlation between the number
of cross links and the validity of the data. In fact, I would be
even more suspicious of any ontology that was a simple tree. I
would consider a tree to be evidence that the people who developed
the data had prematurely terminated their analysis. (05)
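To make the contrast between a simple tree and a cross-classification
concrete, here is a minimal Python sketch. The category names and the
PARENTS table are invented for illustration only; they are assumptions,
not taken from any actual ontology.

```python
# Hypothetical categories, invented for illustration only.
# Each category maps to the list of its direct parents.
PARENTS = {
    "Actor": ["Person"],
    "Adult": ["Person"],
    "Married": ["Person"],
    # A cross-classified category: it has three parents at once.
    "MarriedAdultActor": ["Actor", "Adult", "Married"],
}


def ancestors(category, parents=PARENTS):
    """Return every category reachable by following parent links."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_tree(parents=PARENTS):
    """A hierarchy is a simple tree only if no category has two parents."""
    return all(len(direct) <= 1 for direct in parents.values())


print(sorted(ancestors("MarriedAdultActor")))
print(is_tree())
```

Forcing this table into a tree would mean discarding two of the three
parent links of the cross-classified category, which is the kind of
lost analysis a tree can hide.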
AA> This [multiple cross links] is the sure way to generate an
> enormous heap of useless data... (06)
As Einstein said, "Make everything as simple as possible, but not
simpler." Whether a link is valuable or useless depends on the type
of problem. (07)
AA> ... leading to raw data big crunch, on which only statistical
> engines, like Google, might profit... (08)
There are many very efficient ways of processing large volumes of data,
and it is possible to use them to check and cross check data from highly
varied sources. As an example, see our paper on analogical reasoning: (09)
http://www.jfsowa.com/pubs/analog.htm (010)
Note the example of comparing computer programs to the English
documentation in order to detect errors and inconsistencies. The
methods used a combination of analogical reasoning and deduction. (011)
More recently, we have been extending those techniques with a variety
of different reasoning methods, including analogies, deduction, and
many different varieties of statistics. See the following slides: (012)
http://www.jfsowa.com/talks/pursuing.pdf (013)
The last slide has pointers to further references, including the
article with the same title as the talk. (014)
In short, don't trust any reasoning method by itself. Formal logic is
only as reliable as the starting assumptions. Get multiple opinions,
preferably based on different sources and methods of reasoning. (015)
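As a rough illustration of getting multiple opinions, the following
Python sketch accepts a value only when enough sources report it and
no competing value has equal support. The function name cross_check,
the min_agreement parameter, and the source labels are hypothetical.
Agreement of this kind is only a first filter: it says nothing about
whether the sources are independent or the shared value is correct.

```python
from collections import Counter


def cross_check(claims, min_agreement=2):
    """Check reported values from several sources against each other.

    claims maps a source name to the value that source reports.
    Returns (value, supporters) when one value has at least
    min_agreement supporters and strictly more than any other value;
    otherwise returns (None, []).
    """
    if not claims:
        return None, []
    counts = Counter(claims.values())
    top = counts.most_common(2)
    value, count = top[0]
    # Reject a tie for first place: the sources disagree evenly.
    if len(top) == 2 and top[1][1] == count:
        return None, []
    if count < min_agreement:
        return None, []
    supporters = sorted(s for s, v in claims.items() if v == value)
    return value, supporters


# Hypothetical sources and placeholder values, for illustration only.
reports = {"source_a": "x", "source_b": "x", "source_c": "y"}
print(cross_check(reports))
```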
John (016)