To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Barker, Sean (UK)" <Sean.Barker@xxxxxxxxxxxxxx>
Date: Fri, 15 Feb 2008 16:58:25 -0000
Message-id: <E18F7C3C090D5D40A854F1D080A84CA4B1F74C@xxxxxxxxxxxxxxxxxxxxxx>

This mail is publicly posted to a distribution list as part of a process of public discussion, any automatically generated statements to the contrary notwithstanding. It is the opinion of the author, and does not represent an official company view.    (01)

Just to be clear,    (02)

1) Random means subject to variation which is not predictable. This covers situations such as tossing coins, the time to failure of a light bulb, or the kinetic theory of gases. Treating something as random does not imply that there is no underlying mechanism, or indeed that the mechanism is not deterministic, but only that we have no information about the process by which the event is generated. Consequently, we can treat radar reflections as random, even though, with a sufficiently complex model, we could calculate the signal.    (03)

2) Two events are independent if, knowing the outcome of one, you have no more information about the outcome of the other. The probability of cutting an ace from a pack of cards is unaffected by the outcome of a previous cut, even if that cut was an ace. This probability does not have to be 0.5. Conversely, if knowing the outcome of one event alters the probability of the outcome of another, then the events are correlated. See also Bayes' theorem.    (04)

3) A statement is ambiguous if it can validly be interpreted in two or more ways (although not all interpretations need be true). This seems unrelated to the word random.    (05)

4) One could, in theory, generate every single valid (syntactically correct, semantically coherent) web page (assuming a maximum number of words from a defined set of languages), and number the pages according to some arbitrary schema. Accessing a web page would not allow us to predict its number in that schema. In that sense, we could describe the content of the web as random, and the content would not be compressible below the encoding length of the schema.
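A small sketch (my own illustration, not part of the original post) of why a general-purpose compressor beats the enumeration bound for real text: DEFLATE, the algorithm behind ZIP, only has to model the regularities of the words that actually occur on the page, so redundant prose compresses well below its raw encoding length.

```python
import zlib

# Hypothetical sample text standing in for a real web page: prose is
# highly repetitive, which is exactly what DEFLATE exploits.
text = ("the quick brown fox jumps over the lazy dog " * 50).encode("utf-8")

# Compress at the highest level; the output models only the patterns
# actually present in this text, not every page that could exist.
compressed = zlib.compress(text, level=9)

ratio = len(compressed) / len(text)
print(f"raw: {len(text)} bytes, compressed: {len(compressed)} bytes, "
      f"ratio: {ratio:.3f}")
```

The compressed size depends only on the regularities of this one page, which is the sense in which ZIP "need not account for any other pages that may or may not occur".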
This might be interesting from a theoretical perspective, but I would expect ZIP compression to be more efficient, since it compresses only the words that actually occur, and need not account for any other pages that may or may not occur.    (06)

5) As a working hypothesis, one might like to try the following:    (07)

a) Web pages are generated by a finite set of random processes;
b) Each process has a set of probability distributions and correlation functions that determine the probability of words appearing on the page;
c) An investigation into the properties of the web from the words contained on a web page is an attempt to infer, from those distributions, what the set of generating processes is.    (08)

This raises the question: how much information do we need to process to have a given probability of correctly identifying common processes, and what is the threshold below which we have no reasonable chance of doing so?    (09)

Sean Barker
BAE SYSTEMS - Advanced Technology Centre
Bristol, UK
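The working hypothesis in (07)-(09) can be sketched as a toy experiment (my own illustration, with invented vocabularies and probabilities): model each generating process as a unigram word distribution, sample a "page" from one process, and infer the generator by maximum likelihood. Accuracy as a function of page length gives a feel for the information threshold asked about in (09).

```python
import math
import random

# Two hypothetical generating processes, each a unigram word distribution.
# The shared words ("the", "report") make short pages genuinely ambiguous.
PROCESSES = {
    "news":   {"the": 0.40, "report": 0.35, "market": 0.25},
    "poetry": {"the": 0.50, "report": 0.25, "moon":   0.25},
}

def generate_page(process, n_words, rng):
    """Sample a page of n_words from the named process."""
    words, probs = zip(*PROCESSES[process].items())
    return rng.choices(words, weights=probs, k=n_words)

def log_likelihood(page, dist):
    # Words outside a process's vocabulary get a tiny floor probability.
    return sum(math.log(dist.get(w, 1e-9)) for w in page)

def classify(page):
    """Infer the generating process by maximum likelihood."""
    return max(PROCESSES, key=lambda p: log_likelihood(page, PROCESSES[p]))

rng = random.Random(42)
accuracy = {}
for n in (1, 5, 50):
    trials = 1000
    correct = sum(classify(generate_page("news", n, rng)) == "news"
                  for _ in range(trials))
    accuracy[n] = correct / trials
    print(f"{n:3d} words per page: {accuracy[n]:.0%} correctly identified")
```

With one word per page the processes are often indistinguishable; with fifty, a single occurrence of a process-specific word ("market" or "moon") all but settles the question. The real web version of this problem is harder in every dimension, but the shape of the question is the same.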
