Re: [ontolog-forum] Big Data Buzzwords From A to Z

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: John F Sowa <sowa@xxxxxxxxxxx>
Date: Mon, 03 Dec 2012 00:57:04 -0500
Message-id: <50BC3F30.2070809@xxxxxxxxxxx>
Leo, Kingsley, William, John B, and Bob,    (01)

The point I was trying to make with that list of buzzwords is that
the current ontology tools are totally isolated from the kinds of
things that mainstream IT is using.    (02)

Leo
> Implicit semantics in procedural code and structural data models just
> don't get us where we need to go. Humans simply cannot do all the
> semantic interpretation, or you never break out of this bottleneck.
> Machines have to lend a hand, by doing some explicit semantic
> interpretation.    (03)

We all agree.  And many of us have been preaching those ideas for years.
I was teaching courses in AI and knowledge representation at IBM in the
1980s.  The first question I would get is "That sounds great.  But what
AI software can you give me that works with the application programs
I (or my customers) have to write?"    (04)

At IBM in the 1980s, the only AI software I could offer them was LISP
and Prolog things written by academics who had never seen a mission-
critical application in their lives.  Furthermore, they were running
on the IBM VM/370 system, but not on IBM's mainstream MVS with DB2.    (05)

The software could be ported from VM to MVS by rewriting the bindings,
but none of the IBM mainstream software was designed to run with LISP
and Prolog.  Furthermore, few of the IBM programmers or IBM customers 
knew LISP or Prolog.  And I could not honestly tell them that any of
the LISP or Prolog stuff could help them do their job any better.    (06)

It's now 30 years later, and it's deja vu all over again.  The SW tools
and notations -- RDF, RDFS, OWL, and SPARQL -- are just as isolated from
mainstream IT today as LISP and Prolog were in the 1980s.    (07)

KI
> entity relationship model semantics can exist in self-describing structured
> data.    (08)

Yes.  E-R diagrams were introduced in 1976.  Even earlier, there were
Bachman diagrams, type hierarchies, and Petri nets.  Versions of all
those diagrams were combined in UML, and mainstream programmers used
them to specify programs and databases. They developed tools to draw
the diagrams and map them to and from the software.    (09)

Just two kinds of UML diagrams provide 99% of the useful subset of OWL:
type hierarchies and E-R diagrams.  Controlled English could be added
as a very readable supplement or extension.  If the SW had adopted that
as the official strategy, mainstream IT would have a smooth migration
path to ontology-based tools.    (010)

WF
> I have found  some very skilled developers in the Hadoop space believing
> that because they are not using a relational database, they have no need
> for E/R-style concept models (be these expressed in traditional E/R, UML,
> etc.).  I have seen them go along fine, based on their being so smart
> they can keep the underlying relationships in their heads, till their
> systems start to grow.    (011)

I agree.  I have one general-purpose recommendation for all such cases:    (012)

    If you want people to be virtuous,
    you have to make virtue the path of least resistance.    (013)

UML diagrams are easy to get started with, and they can be supplemented
with controlled English for other subsets of logic.  Aristotle's
syllogisms (in English) can handle the type hierarchy.  Another subset
can be used for (Horn clause) rules.  And other subsets can be used to
supplement, relate, and extend each of the UML diagram types.    (014)
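
As a rough illustration of the two subsets just mentioned, here is a
minimal Python sketch.  It is not any existing tool: the sentence
pattern "Every X is a Y.", the function names, and the sample facts
and rules are all assumptions made up for this example.  It treats
the type hierarchy as single inheritance and the Horn clauses as
purely propositional, which is far simpler than a real controlled
English system would be.

```python
import re

# Hypothetical sketch: the sentence pattern, names, and data are illustrative only.

def parse_syllogisms(sentences):
    """Parse 'Every X is a Y.' sentences into a child -> parent map."""
    pattern = re.compile(r"^Every (\w+) is an? (\w+)\.$")
    parents = {}
    for sentence in sentences:
        match = pattern.match(sentence.strip())
        if match is None:
            raise ValueError(f"not in the controlled subset: {sentence!r}")
        child, parent = match.groups()
        parents[child.lower()] = parent.lower()
    return parents

def is_a(parents, sub, sup):
    """Chain 'Every A is a B' premises up the hierarchy (syllogism in Barbara)."""
    seen = set()
    current = sub
    while current is not None and current not in seen:
        if current == sup:
            return True
        seen.add(current)
        current = parents.get(current)
    return False

def forward_chain(facts, rules):
    """Apply propositional Horn rules (body, head) until nothing new is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and all(atom in derived for atom in body):
                derived.add(head)
                changed = True
    return derived

# Type hierarchy stated in controlled English.
hierarchy = parse_syllogisms([
    "Every dog is a mammal.",
    "Every mammal is an animal.",
])

# Horn clause rules: if every atom in the body holds, the head holds.
rules = [
    (("order_placed", "payment_received"), "order_confirmed"),
    (("order_confirmed",), "shipment_scheduled"),
]
facts = forward_chain({"order_placed", "payment_received"}, rules)

print(is_a(hierarchy, "dog", "animal"))
print(sorted(facts))
```

The same child-to-parent map could be read off a UML class diagram,
which is why the diagram and the English sentences can serve as two
views of one hierarchy.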

If the SW had adopted UML + Controlled English as its official
notation, it would have provided a smooth path toward ontology-based
tools.    (015)

JB
> It appears that ontological data is designed to be dense and I'm not
> sure how to convert ontological data to sparse data to use the big data
> tools.    (016)

The size of the ontology is tiny compared to the size of the data.
For example, the largest English dictionaries have only a few hundred
thousand words.  That's trivial by today's standards.  But the amount
of text that uses those words is huge.    (017)

BN
> Most people immersed in Big Data processing today would probably have
> gone with "Advanced Analytics"* as the most representative "A" word
> (not "ACID")...    (018)

I agree.  But I just cited that list to show that the Semantic Web
isn't even on the radar screen of the Big Data people.    (019)

John    (020)

_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J    (021)