Date: Sat, 16 Feb 2008 14:04:39 +0800
Hi Sean,    (01)

On Feb 16, 2008 12:58 AM, Barker, Sean (UK) <Sean.Barker@xxxxxxxxxxxxxx> wrote:
> 5) As a working hypothesis, one might like to try the following:
> a) Web pages are generated by a finite set of random processes;
> b) Each process has a set of probability distribution and correlations
> functions that determine the probability of words appearing on the page.
> c) An investigation into the properties of the web from the words
> contained on a web page is an attempt to infer from the distributions
> what the set of generating processes is.    (02)

Have you heard of the "Hutter Prize" for the compression of text by
human knowledge?    (03)

"...in 1950, Claude Shannon estimated the entropy (compression limit)
of written English to be about 1 bit per character [3]. To date, no
compression program has achieved this level."
(http://cs.fit.edu/~mmahoney/compression/rationale.html)    (04)
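To make the "bits per character" figure concrete, here is a minimal sketch (my own illustration, not anything taken from the prize rules) that measures how many bits per character some general-purpose compressors need on a piece of text, next to the order-0 entropy of its character distribution. The function names and the sample text are assumptions made up for the example. On a sample this small and repetitive the numbers are not comparable to Shannon's estimate, so substitute a large, real corpus before reading anything into them.

```python
import bz2
import lzma
import math
import zlib
from collections import Counter


def order0_entropy_bits_per_char(text: str) -> float:
    """Empirical entropy of the single-character distribution, in bits per character.

    This ignores all context between characters, so it is only a crude
    upper-level figure, not Shannon's estimate for English.
    """
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def compressed_bits_per_char(text: str, compress) -> float:
    """Bits of compressed output per character of input text."""
    data = text.encode("utf-8")
    return 8 * len(compress(data)) / len(text)


if __name__ == "__main__":
    # Hypothetical sample; repetition makes it far more compressible than real prose.
    sample = (
        "Web pages are generated by a finite set of random processes. "
        "Each process has a set of probability distributions. "
    ) * 50

    print(f"order-0 entropy: {order0_entropy_bits_per_char(sample):.3f} bits/char")
    for name, fn in [("zlib", zlib.compress), ("bz2", bz2.compress), ("lzma", lzma.compress)]:
        print(f"{name:>5}: {compressed_bits_per_char(sample, fn):.3f} bits/char")
```

The gap between what such a measurement gives on a large corpus and Shannon's roughly 1 bit per character is the gap the prize is asking people to close.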

There's an annual Euro 50,000 prize for the best effort.
(http://prize.hutter1.net/)    (05)

The idea of the prize is the old one that we need human knowledge to
"disambiguate" natural language, and that without it we can't predict,
and thus compress, text as much as we expect. I believe almost the
opposite. But the prize, and the work of Marcus Hutter which motivated
it, are interesting for what they say about the predictability of
natural language, and in particular the "randomness" of meaning. By
the "randomness of meaning" I mean that Hutter's work (like
Schmidhuber's "New AI") assumes it is necessary to use a probabilistic
model of intelligence.    (06)

Note that it is also a definition of intelligence dependent on goals
(cf. W. J. Freeman). Hutter: "No Intelligence without Goals."    (07)

Hutter has written a book on this:    (08)

Universal Artificial Intelligence - Sequential Decisions based on
Algorithmic Probability    (09)

http://www.hutter1.net/ai/uaibook.htm    (010)

-Rob    (011)

