Hi All,
I'm not actually involved in that ISO group; I just know it exists. I was only
suggesting two things:
1) Big Data is not only a buzzword.
2) There is a group looking into this, and John, Philip, etc. might want to
provide some input to them if they want to make an impact beyond this forum.
UK +44 7788 561308
US +1 336 283 0606
On 31 Aug 2014, at 07:33, John F Sowa <sowa@xxxxxxxxxxx> wrote:
> On 8/30/2014 11:39 PM, John Bottoms wrote:
>> I disagree.
> I have no idea what you're disagreeing with. I basically agree with
> what you wrote, and I can't see anything that is inconsistent with
> my recommended definition:
> To repeat:
>> I suggest a very simple definition for Big Data:
>> Data whose size N (in bytes) is so large that any algorithm that
>> takes time that is polynomial in N (for any exponent greater than 1)
>> is prohibitively expensive with existing hardware.
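To make the quoted definition concrete, here is a rough back-of-the-envelope sketch. The rate of 10^9 basic operations per second and the data size of 10^12 bytes are illustrative assumptions, not figures from the thread, and `estimated_seconds` is a hypothetical helper name.

```python
import math

def estimated_seconds(n_bytes, ops_per_second=1e9):
    """Rough running times for N log N work and N^2 work.

    ops_per_second is an assumed machine speed, chosen only for illustration.
    """
    n_log_n_ops = n_bytes * math.log2(n_bytes)   # e.g. building an index by sorting
    n_squared_ops = n_bytes ** 2                 # polynomial with exponent 2
    return n_log_n_ops / ops_per_second, n_squared_ops / ops_per_second

# Assumed data size: one terabyte (10^12 bytes).
fast, slow = estimated_seconds(10**12)
print(f"N log N: {fast / 3600:.1f} hours")
print(f"N^2:     {slow / (3600 * 24 * 365):.2e} years")
```

Under these assumptions the N log N job finishes in roughly half a day, while the N^2 job would need tens of millions of years, which is the sense in which polynomial time with an exponent greater than 1 is "prohibitively expensive" at this scale.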
>> When I was developing our browser in '87 I was told absolutely,
>> by known experts, that it was not possible to search large documents
>> in a reasonable time (3 seconds in those days). The argument was
>> that text could only be searched using relational database tables and
>> that was too slow to make it usable.
> Clearly, those so-called experts didn't know what they were talking
> about. But it's irrelevant to the definition I proposed.
> Fundamental principles:
> 1. With indexes, you can find what you're looking for in (log N) time.
> 2. But to create the indexes, you need algorithms that take no more
> than (N log N) time.
> 3. For finding data, linear searches are bad. For creating indexes,
> polynomial time is hopelessly inefficient.
> 4. There's always room for innovation in finding (log N) algorithms
> for searching more complex data in more flexible ways.
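Points 1 and 2 above can be sketched in a few lines. This is only a minimal illustration using a sorted in-memory list as the index, not the B-tree discussed in the thread; `build_index` and `lookup` are hypothetical names introduced here.

```python
import bisect

def build_index(records):
    """Build a sorted list of (key, position) pairs.

    Sorting costs O(N log N) comparisons, matching point 2.
    """
    return sorted((key, pos) for pos, key in enumerate(records))

def lookup(index, key):
    """Find the position of key by binary search.

    This takes O(log N) comparisons, matching point 1.
    Returns None when the key is absent.
    """
    # (key,) sorts just before any (key, pos), so this lands on the first match.
    i = bisect.bisect_left(index, (key,))
    if i < len(index) and index[i][0] == key:
        return index[i][1]
    return None

records = ["delta", "alpha", "charlie", "bravo"]
index = build_index(records)
print(lookup(index, "charlie"))  # position of "charlie" in records
print(lookup(index, "echo"))     # None, since "echo" is not indexed
```

A linear scan of `records` would also find the key, but it touches every entry; the sorted index pays the N log N cost once so that each later lookup needs only about log2(N) comparisons.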
>> We adopted B-trees and that has become one of the mainstays, along
>> with B+ trees, red-black trees and PageRank, of search today.
> Of course. Those algorithms are logarithmic. With any kind of
> hardware, you need logarithmic algorithms to search indexed data.
> And you must have no worse than (N log N) to create the indexes.
>> This was just one of the myths that made education about search
>> difficult at the time.
> Back in the 1960s, Donald Knuth was very clear about these issues.
> Anybody who had studied Knuth could never have made the kind of
> claims you're talking about.
>> But the point I would like to make is that creative use of data
>> structures and architectures can be useful when algorithmic solutions
>> can't be found. We have always had a practice in CS of dealing with
>> large data; I date my entry into the discipline at 1964.
> The field known as computer science (or informatics) was established
> in the mid-60s. There were many pioneers who established the basic
> principles. By the 1970s, the principles were fairly well known.
> But the PCs of the 1980s brought a new generation of kiddies who
> had no training in any of the hard-won results of the '60s and '70s.
> Those people you met in 1987 either (a) belonged to the younger and
> stupider generation or (b) belonged to the older generation that had
> never studied computer science. In my 30 years at IBM, I met plenty
> of both -- along with some who had invented many of the ideas.
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J