ontolog-forum

Re: [ontolog-forum] So you want to be a Data Scientist?

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: John Bottoms <john@xxxxxxxxxxxxxxxxxxxx>
Date: Fri, 28 Dec 2012 15:03:02 -0500
Message-id: <50DDFAF6.4010000@xxxxxxxxxxxxxxxxxxxx>

JohnS,

I recently went to a roll-out of MS Windows 8 Embedded: The Internet of Things.
Microsoft is looking to developers to incorporate the Embedded 8 components in
their embedded systems. The presentation was about half technical and half marketing.

Their projection for 2020 is about $0.5 trillion for Internet embedded systems.
The database system for MS Embedded 8 products is MS SQL.

There was no mention of taxonomies or ontologies.

-John Befuddled Bottoms
 FirstStar Systems
 Concord, MA USA
(The train is leaving the station, but it's not my train!)


On 12/28/2012 10:48 AM, John F Sowa wrote:
That's the title of an article by Charles Rose:

    http://www.dataversity.net/so-you-want-to-be-a-data-scientist/

The skills required for data scientists have a high overlap with skills
that would be very useful for ontologists.  Some excerpts:

CR
Do you ever wonder what Facebook does with all those likes that
people give and receive? Or how Netflix figures out exactly what
movies to recommend to you? Or how Google can infer exactly what
you are trying to search for as soon as you begin typing something
into the search box? What about the ads on LinkedIn that relates
directly to your profile, the music lists in iTunes and any other
scores of connections that just happen? Those examples are just
a few of the multitudes of data-related instances constantly being
collected and analyzed all the time.

Rose quotes people he calls "some of the biggest names in the data
management field" who define what a data scientist does:

“Data scientists are part digital trend spotter and part storyteller
stitching various pieces of information together. These are people
or teams at organizations that sift through the explosion of data to
discover what the data is telling them.” Anjul Bhambhri, Vice President
of Big Data Products at IBM.

IBM has a VP of Big Data Products, but not a VP of ontology or the
Semantic Web.

“A data scientist is that unique blend of skills that can both unlock
the insights of data and tell a fantastic story via the data.”
Dr. DJ Patil is a Data Scientist in Residence at Greylock Partners,
as well as the former Chief Scientist, Chief Security Officer and
Head of Analytics and Data Teams at the LinkedIn Corporation.

They have "Data Teams", but not ontology teams.

“A data scientist is a rare hybrid, a computer scientist with the
programming abilities to build software to scrape, combine, and manage
data from a variety of sources and a statistician who knows how to
derive insights from the information within. S/he combines the skills
to create new prototypes with the creativity and thoroughness to ask and
answer the deepest questions about the data and what secrets it holds.”
Jake Porway, Data without Borders and New York Times.

Note the emphasis on integrating theory and practice.  One ontologist
who contributes to this forum claimed that "tools aren't interesting".
That attitude helps explain why people who have real work to do don't
find ontology interesting.

Data scientists are “analytically-minded, statistically and
mathematically sophisticated data engineers who can infer insights
about business and other complex systems from large quantities of data.”
Steve Hillion, Vice President of Analytics at EMC Greenplum.

Neither Charles Rose nor the people he quotes mention logic, ontology,
the Semantic Web, or any of the SW notations and tools.

CR
What does it take to be a Data Scientist?
1. Mathematics: Data Scientists must be competent mathematicians...
2. Statistical Analysis: A strong knowledge of R, SAS, SciPy, Stata, SPSS...
3. Programming/Scripting Languages: ... C/C++, Java, PHP, Ruby, Perl, Python...
4. Relational Databases: Know your way around SQL-based systems...
5. Distributed Computing Systems and Tools: NoSQL platforms...
6. Data Mining: Learn the primary tools used in Data Mining today...
7. Data Modeling: ... able to understand the models, present them to
   C-Level Executives and ... the many modeling tools/techniques/methodologies
   such as ERWin, Agile, ORM diagrams, UML class diagrams, CRC cards,
   conceptual/logical/physical schema, DDL, Bachman diagrams, Zachman Framework...
8. Visualization: ... tools such as Flare, HighCharts, AmCharts, D3.js,
   Google Visualization API, Raphael.js ...  Data Scientists have to tell
   a story with their data; they must provide a data narrative that anyone
   in the enterprise can follow, understand and utilize.
9. Creativity and Innovation: ... Data Scientists must be able to innovate
   the collection, analysis and usage of data ... in novel and fantastic
   ways so all that “enterprise-critical” data is put to advantageous use.
10. Communication and Business Perspicacity: Data Scientists are crossbreeds,
   the amalgamation of IT expertise and business smarts...
11. Education: ... Math, Statistics, Computer Science, Engineering or some
   other related technical field... A BS in one of those fields is a must and
   an MS shows the ability to work within a system, complete tasks with deadlines
   and a background in theoretical principles. Add to that MS many years of
   experience in the field...

That web site also has links to a talk on "Practical Data Modeling"
by Peter Aiken:

http://www.dataversity.net/data-ed-slides-practical-data-modeling/9792/

Type 50 in the box to jump to slide 50, which summarizes "7 mistakes
you can't afford to make in enterprise data modeling."  Ontologists
who want to make ontology useful should also avoid those mistakes.

John
 


_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
