ontolog-forum

Re: [ontolog-forum] What words mean

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Barker, Sean (UK)" <Sean.Barker@xxxxxxxxxxxxxx>
Date: Fri, 15 Feb 2008 16:58:25 -0000
Message-id: <E18F7C3C090D5D40A854F1D080A84CA4B1F74C@xxxxxxxxxxxxxxxxxxxxxx>


This mail is publicly posted to a distribution list as part of a process
of public discussion, any automatically generated statements to the
contrary notwithstanding. It is the opinion of the author, and does not
represent an official company view.    (01)

Just to be clear,    (02)

1) Random means subject to variation which is not predictable.
This covers situations such as tossing coins, the time to failure of a
light bulb, or the kinetic theory of gases. The treatment of something
as random does not imply that there is no underlying mechanism, or
indeed that the mechanism is not deterministic, but that we do not have
any information about the process by which the event is
generated. Consequently, we can treat radar reflections as random, even
though, with a sufficiently complex model, we could calculate the
signal.    (03)

2) Two events are independent if, knowing the outcome of one, you have no
more information about the outcome of the other. The probability of
cutting an ace in a pack of cards is unaffected by the outcome of a
previous cut, even if it was an ace. This probability does not have to
be 0.5. Conversely, if knowing the outcome of one event alters the
probability of the outcome of another, then the events are correlated.
See also Bayes' theorem.    (04)
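
The independence claim in (2) can be checked by exact enumeration. The
following is only a sketch: it assumes a standard 52-card pack with four
aces that is reassembled before each cut, and the function names are
illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: a standard 52-card pack with 4 aces, reassembled between
# cuts, so each cut is a uniform draw with replacement.
PACK = ["ace"] * 4 + ["other"] * 48

def p_ace():
    """Unconditional probability that a single cut shows an ace."""
    return Fraction(PACK.count("ace"), len(PACK))

def p_ace_given_previous_ace():
    """P(second cut is an ace | first cut was an ace), by enumeration."""
    pairs = [(a, b) for a, b in product(PACK, repeat=2) if a == "ace"]
    hits = sum(1 for _, b in pairs if b == "ace")
    return Fraction(hits, len(pairs))

print(p_ace())                     # 1/13
print(p_ace_given_previous_ace())  # 1/13
```

The two values agree, and neither is 0.5. If the first card were not
returned to the pack, the second value would change and the events would
no longer be independent.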

3) A statement is ambiguous if it can validly be interpreted in two or
more ways (although not all interpretations need be true). This seems
unrelated to the word random.    (05)

4) One could, in theory, generate every single valid (syntactically
correct, semantically coherent) web page (assume a maximum number of
words from a defined set of languages), and number them according to
some arbitrary schema. Accessing a web page would not allow us to
predict the number in the arbitrary schema. In that sense, we could
describe the content of the web as random, and the content would not
be compressible below the encoding length of the schema. This might be
interesting from a theoretical perspective, but I would expect ZIP
compression to be more efficient, since it only compresses the words
that actually occur, and need not account for any other pages that may
or may not occur.    (06)
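
To put rough numbers on (4), here is a sketch comparing the length of an
index into the enumeration with the output of zlib (the DEFLATE method
that ZIP also uses). The vocabulary size, the page length and the sample
text are invented for illustration, and the enumeration counts every
word sequence, not only the valid pages, so the index length is an upper
bound on the schema described above.

```python
import zlib

def index_bits(vocab_size: int, max_words: int) -> int:
    """Bits needed to number every word sequence of length 0..max_words."""
    total = sum(vocab_size ** k for k in range(max_words + 1))
    # Indices run from 0 to total - 1.
    return (total - 1).bit_length()

# Hypothetical figures: a 50,000-word vocabulary, pages of up to 200 words.
bits = index_bits(50_000, 200)
print(bits // 8, "bytes to index any page in the enumeration")

# A repetitive sample page of about 200 words, compressed with zlib.
page = ("the web may be large and complex but it is not random " * 17).encode()
print(len(page), "bytes raw,", len(zlib.compress(page)), "bytes compressed")
```

The index length is the same for every page, whereas the zlib figure
depends on the redundancy of the particular page. How the two compare
depends entirely on the assumed vocabulary and page length.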

5) As a working hypothesis, one might like to try the following:    (07)

a) Web pages are generated by a finite set of random processes.
b) Each process has a set of probability distributions and correlation
functions that determine the probability of words appearing on the page.
c) An investigation into the properties of the web from the words
contained on a web page is an attempt to infer from the distributions
what the set of generating processes is.    (08)
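
A minimal sketch of this hypothesis, under strong simplifying
assumptions: two invented processes, each reduced to a single word
distribution, with words drawn independently (so the correlation
functions in (b) are ignored). The process names, words and
probabilities are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical generating processes: each is a word distribution.
# Words and probabilities are invented for illustration.
PROCESSES = {
    "radar": {"signal": 0.4, "noise": 0.3, "model": 0.2, "page": 0.1},
    "web":   {"signal": 0.1, "noise": 0.1, "model": 0.2, "page": 0.6},
}

def log_likelihood(words, dist):
    """Log-probability of the words under a process, assuming independent draws."""
    counts = Counter(words)
    return sum(n * math.log(dist[w]) for w, n in counts.items())

def most_likely_process(words):
    """Name of the process under which the page is most probable."""
    return max(PROCESSES, key=lambda name: log_likelihood(words, PROCESSES[name]))

page = "signal noise signal model noise signal".split()
print(most_likely_process(page))
```

This only handles words that appear in both distributions. A real
investigation would also have to estimate the distributions themselves,
which is the harder half of (c).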

This raises the question: how much information do we need to process to
have a given probability of correctly identifying common processes, and
what is the threshold below which we have no reasonable chance of doing
so?    (09)
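
For the simplest possible case the question can be answered exactly. The
sketch below assumes just two equally likely processes that differ only
in how often they emit one marker word (probabilities p_a and p_b, both
hypothetical), and computes the chance that a maximum-likelihood guess
from n observed words picks the right one.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k marker words among n independent words."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def p_correct(n, p_a, p_b):
    """Chance of identifying the right process from n words, equal priors.

    For each possible count k the guess goes to the process with the
    larger likelihood, so the guess is right with probability
    0.5 * max(likelihood under A, likelihood under B), summed over k.
    """
    return sum(
        0.5 * max(binom_pmf(k, n, p_a), binom_pmf(k, n, p_b))
        for k in range(n + 1)
    )

# Hypothetical marker-word rates of 30% and 10%.
for n in (0, 1, 10, 50, 200):
    print(n, round(p_correct(n, 0.3, 0.1), 4))
```

With no words observed the result is 0.5, which is pure guessing, and it
rises towards 1 as n grows. The closer p_a and p_b are, the more words
are needed to reach a given level of confidence.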

Sean Barker
BAE SYSTEMS - Advanced Technology Centre
Bristol, UK
+44(0) 117 302 8184    (010)

BAE Systems (Operations) Limited
Registered Office: Warwick House, PO Box 87, Farnborough Aerospace
Centre, Farnborough, Hants, GU14 6YU, UK
Registered in England & Wales No: 1996687     (011)

> -----Original Message-----
> From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx 
> [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of 
> Rob Freeman
> Sent: 15 February 2008 11:53
> To: [ontolog-forum]
> Subject: Re: [ontolog-forum] What words mean
> 
> OK, John.
> 
> Replace that use with "ambiguous" for now, then. I think that works.
> 
> Personally I thought Sean Barker's sense for "random" from 
> signal processing theory was very relevant (Axiom. ont. Feb 12):
> 
> "Radar signal processing theory treats signals and 
> noise as being random, but with different probability distributions."
> 
> The natural language of which the Web chiefly consists 
> behaves in exactly this way.
> 
> Or the sense, also noted by Sean, in which you can search the 
> Web for a name and find a wide range of people.
> 
> But perhaps ambiguous can serve as well with both of these 
> senses for most purposes.
> 
> -Rob
> 
> On Fri, Feb 15, 2008 at 12:47 PM, John F. Sowa 
> <sowa@xxxxxxxxxxx> wrote:
> > Rob,
> >
> >  Although I agree with some of the points you're trying to make,
> >  you could make your case more convincing if you would avoid
> >  making statements that are blatantly false if the words are
> >  interpreted in their normal senses.
> >
> >  In the following passage, I suggest that you avoid the word
> >  'random' or 'randomness':
> >
> >
> >   > To try and be a little concrete for a minute, what this means
> >   > in terms of the Web is that the Web itself (by virtue of its
> >   > randomness) is the most compact representation of the knowledge
> >   > it contains.
> >
> >  The web may be large and complex, but it is definitely *not* random
> >  in any sense; almost every page on the web can be compacted by a
> >  large fraction; and the entire web contains an enormous amount of
> >  duplication that would permit great compaction if anybody wanted
> >  to spend the time and money to do so.
> >
> >  To demonstrate that point, download a typical web page and try
> >  compacting it using any compression program on your computer
> >  (ZIP, although not great, is usually available).
> >
> >  If you're using Windows, you can also open any zipped directory,
> >  use the details view, and note the percentage compression of each
> >  file.  A text file is typically compressed to less than a third of
> >  its original size, but diagrams are usually compressed much less
> >  because most of them have already been compressed.
> >
> >  John
>     (012)



_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx    (014)
