
Re: [ontolog-forum] Looking to the Future of Data Science - NYTimes.com

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: David Price <dprice@xxxxxxxxxxxxxxx>
Date: Sun, 31 Aug 2014 10:26:57 +0100
Message-id: <15C9D82D-08BD-4C1D-8966-F9B82D5F04AC@xxxxxxxxxxxxxxx>
Hi All,

I'm not actually involved in that ISO group, just know it exists. Was only 
suggesting two things:

1) Big Data is not only a buzzword

2) there is a group looking into this and John, Philip, etc. might want to 
provide some input to them if you want to make an impact beyond this forum.

Cheers,
David

UK +44 7788 561308
US +1 336 283 0606




On 31 Aug 2014, at 07:33, John F Sowa <sowa@xxxxxxxxxxx> wrote:

> On 8/30/2014 11:39 PM, John Bottoms wrote:
>> I disagree.
> 
> I have no idea what you're disagreeing with.  I basically agree with
> what you wrote, and I can't see anything that is inconsistent with
> my recommended definition:
> 
> To repeat:
>> I suggest a very simple definition for Big Data:
>> 
>> Data whose size N (in bytes) is so large that any algorithm that
>> takes time that is polynomial in N (for any exponent greater than 1)
>> is prohibitively expensive with existing hardware.
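
To make the quoted definition concrete, here is a rough back-of-envelope sketch. The throughput figure (one billion basic operations per second) and the data size (one terabyte) are assumptions chosen purely for illustration; neither appears in the thread.

```python
import math

# Assumed throughput of a single machine, for illustration only.
OPS_PER_SECOND = 1e9
SECONDS_PER_YEAR = 365 * 24 * 3600


def n_log_n(n):
    """Operation count for an N log N algorithm (e.g. sorting)."""
    return n * math.log2(n)


def n_squared(n):
    """Operation count for a quadratic algorithm (exponent 2)."""
    return n ** 2


def seconds_needed(n_bytes, cost):
    """Estimated running time in seconds under the assumed throughput."""
    return cost(n_bytes) / OPS_PER_SECOND


N = 10 ** 12  # hypothetical data size: one terabyte

hours_nlogn = seconds_needed(N, n_log_n) / 3600
years_quadratic = seconds_needed(N, n_squared) / SECONDS_PER_YEAR

print(f"N log N: about {hours_nlogn:.0f} hours")
print(f"N^2:     about {years_quadratic:.1e} years")
```

Under these assumptions the N log N pass finishes in well under a day, while the quadratic one would take tens of millions of years, which is the sense in which polynomial time with an exponent greater than 1 becomes prohibitive.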
> 
> JB
>> When I was developing our browser in '87 I was told absolutely,
>> by known experts, that it was not possible to search large documents
>> in a reasonable time (3 seconds in those days). The argument was
>> that text could only be searched using relational database tables and
>> that was too slow to make it usable.
> 
> Clearly, those so-called experts didn't know what they were talking
> about.  But it's irrelevant to the definition I proposed.  Fundamental
> principles:
> 
>  1. With indexes, you can find what you're looking for in (log N) time.
> 
>  2. But to create the indexes, you need algorithms that take no more
>     than (N log N) time.
> 
>  3. For finding data, linear searches are bad.  For creating indexes,
>     polynomial time is hopelessly inefficient.
> 
>  4. There's always room for innovation in finding (log N) algorithms
>     for searching more complex data in more flexible ways.
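
Principles 1 to 3 above can be sketched in a few lines. This is a minimal illustration, not the method anyone in the thread used: it stands in a sorted list for a real index structure such as a B-tree, and the names `build_index` and `lookup` and the sample records are hypothetical.

```python
from bisect import bisect_left


def build_index(records):
    """Build a sorted list of (key, position) pairs.

    Sorting dominates the cost, so this takes O(N log N) time.
    """
    return sorted((key, pos) for pos, key in enumerate(records))


def lookup(index, key):
    """Find the position of `key` in the original records, or None.

    Binary search over the sorted index takes O(log N) time,
    versus O(N) for a linear scan of the unindexed records.
    """
    # (key,) sorts just before every (key, pos) pair with the same key.
    i = bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return index[i][1]
    return None


records = ["delta", "alpha", "charlie", "bravo"]  # hypothetical data
index = build_index(records)

print(lookup(index, "charlie"))  # position of "charlie" in records
print(lookup(index, "echo"))     # absent key
```

The index is paid for once at N log N cost, after which each query touches only about log N entries.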
> 
> JB
>> We adopted B-trees and that has become one of the mainstays, along
>> with B+ trees, red-black trees and PageRank, of search today.
> 
> Of course.  Those algorithms are logarithmic.  With any kind of
> hardware, you need logarithmic algorithms to search indexed data.
> And you must have no worse than (N log N) to create the indexes.
> 
> JB
>> This was just one of the myths that made education about search
>> difficult at the time.
> 
> Back in the 1960s, Donald Knuth was very clear about these issues.
> Anybody who had studied Knuth could never have made the kind of
> claims you're talking about.
> 
> JB
>> But the point I would like to make is that creative use of data
>> structures and architectures can be useful when algorithmic solutions
>> can't be found. We have always had a practice in CS of dealing with
>> large data; I date my entry into the discipline at 1964.
> 
> The field known as computer science (or informatics) was established
> in the mid-60s.  There were many pioneers who established the basic
> principles.  By the 1970s, the principles were fairly well known.
> 
> But the PCs of the 1980s brought a new generation of kiddies who
> had no training in any of the hard-won results of the '60s and '70s.
> 
> Those people you met in 1987 either (a) belonged to the younger and
> stupider generation or (b) belonged to the older generation that had
> never studied computer science.  In my 30 years at IBM, I met plenty
> of both -- along with some who had invented many of the ideas.
> 
> John
> 
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/ 
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J

