ontolog-forum

Re: [ontolog-forum] Big Data Buzzwords From A to Z

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: John Bottoms <john@xxxxxxxxxxxxxxxxxxxx>
Date: Sun, 02 Dec 2012 12:33:25 -0500
Message-id: <50BB90E5.80107@xxxxxxxxxxxxxxxxxxxx>
Also of concern,    (01)

Hadoop is designed to work with social networking data, which tends to
be sparse.
My interest in step-dancing hamsters is one of about 2000 categories of
hamster hobbies.
This is sparse data. Hadoop is designed to work with sparse matrices.    (02)

It appears that ontological data is designed to be dense, and I'm not
sure how to convert it to sparse data for use with the big data
tools.    (03)
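
One possible way to picture such a conversion, as a minimal sketch only: treat each subject as a row and each (predicate, object) pair as a column, and store just the cells that are actually asserted. The triples, the function name to_sparse, and the row/column encoding are all assumptions made up for illustration; they do not come from any particular ontology or from Hadoop itself.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples, invented for illustration.
triples = [
    ("john", "hasHobby", "step_dancing_hamsters"),
    ("john", "livesIn", "concord"),
    ("mary", "hasHobby", "hamster_racing"),
]

def to_sparse(triples):
    """Index subjects as rows and (predicate, object) pairs as columns,
    keeping only the asserted cells as {row: {col: 1}}."""
    row_ids, col_ids = {}, {}
    cells = defaultdict(dict)
    for s, p, o in triples:
        r = row_ids.setdefault(s, len(row_ids))        # new row index on first sight
        c = col_ids.setdefault((p, o), len(col_ids))   # new column index on first sight
        cells[r][c] = 1
    return row_ids, col_ids, dict(cells)

row_ids, col_ids, cells = to_sparse(triples)

# Fraction of the full rows-by-columns matrix that is actually stored.
n_stored = sum(len(cols) for cols in cells.values())
density = n_stored / (len(row_ids) * len(col_ids))
```

With thousands of possible categories per subject and only a handful asserted for each, the stored fraction becomes very small, which is the sense of "sparse" used above.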

What about MapReduce?    (04)
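
As a rough illustration of the map/reduce pattern applied to ontology-style data, here is a single-process sketch that counts how often each predicate occurs. It only imitates the map, shuffle, and reduce phases in plain Python; it is not Hadoop's API, and the triples and function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical (subject, predicate, object) triples, invented for illustration.
triples = [
    ("john", "hasHobby", "step_dancing_hamsters"),
    ("john", "livesIn", "concord"),
    ("mary", "hasHobby", "hamster_racing"),
]

def map_phase(triple):
    """Emit one (key, value) pair per triple: the predicate and a count of 1."""
    _subject, predicate, _obj = triple
    yield (predicate, 1)

def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)

def run_job(triples):
    mapped = chain.from_iterable(map_phase(t) for t in triples)
    return dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())

counts = run_job(triples)
```

In a real cluster the map and reduce calls would run on different machines over partitions of the data; the sketch only shows that each triple can be processed independently, which is what makes the pattern parallelizable.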

-John Bottoms
  FirstStar Systems
  Concord, MA    (05)

On 12/2/2012 10:47 AM, John F Sowa wrote:
> Dear Matthew,
>
> MW
>> OK, so based on this list, big data is mostly about massively parallel
>> data warehousing and the low level technologies that support various
>> approaches to this, particularly on cheap hardware.
> I agree that many of the terms in the list suggest that conclusion --
> the terms 'Hadoop', 'map/reduce', and the names of software designed
> to process huge amounts of data.  Map/reduce is an algorithm published
> by Google, and Hadoop is an open source implementation by Yahoo.
>
> But Google and Yahoo process huge amounts of web data. Cost/performance
> is very important for them, but so is semantics.  They also process,
> store, and analyze the links in and among web pages and their contents.
>
> The letter D is represented by Data warehousing.  But note slide 5:
>> But as data volumes explode, data warehouse systems are rapidly
>> changing. They need to store more data -- and more kinds of data
>> -- making their management a challenge.
> http://www.crn.com/slide-shows/data-center/240142568/big-data-buzzwords-from-a-to-z.htm?pgno=5
>
> MW
>> In addition, analysis of unstructured data is thrown in as well, but
>> I guess that is just an input to the data warehousing.
> Note the terms 'text analytics', 'geospatial analysis', 'quantitative
> data analytics', and 'visualization'.  They are definitely concerned
> about the semantics of everything in the warehouse.
>
> As slide 5 says, data warehousing means much more than it did when
> the term was introduced 25 years ago.  Google and Yahoo, for example,
> have enormous data warehouses, but each web page has as much semantic
> information about its content as they can derive from it and from
> all the pages it's linked to and from.
>
> Note all the terms that refer to databases:  'columnar database',
> 'NoSQL', and 'relational database'.  The term 'extract, transform,
> and load' addresses the issues of aligning and mapping independent
> databases.  Note 'sharding' for partitioning databases -- that
> requires a lot of semantics about a database and how it's used.
> Note 'Whirr' for "running libraries for data cloud services".
>
> Note Kafka -- a messaging system developed by LinkedIn and
> contributed to the Apache Foundation.  LinkedIn maintains a lot
> of semantic information about their members and their interests.
>
> The tools contributed to the Apache Foundation are developed and used
> by multibillion dollar corporations.  They hire the best graduates from
> the best universities.  Compared to them, the academics who designed
> OWL and SPARQL are amateurs who don't understand the problems.
>
> I don't believe we should dump everything developed by the SW.
> But if the academics want mainstream IT to adopt their toys, they
> have to understand the problems of mainstream IT.  They can start
> by studying what industry does now and needs to do in the future.
>
> John
>   
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
>   
>
>
>
>    (06)


