Re: [ontolog-forum] Big Data Challenges

To:	"'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From:	"Brand Niemann" <bniemann@xxxxxxx>
Date:	Tue, 10 Apr 2012 10:20:55 -0400
Message-id:	<001801cd1725$24849570$6d8dc050$@cox.net>

John, what do you think about Digital Reasoning?    (01)

-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John F Sowa
Sent: Tuesday, April 10, 2012 9:08 AM
To: '[ontolog-forum] '
Subject: [ontolog-forum] Big Data Challenges    (02)

Following is a slightly edited version of a note I sent to a different list,
but it's also relevant to this list.    (03)

John Sowa    (04)

-------- Original Message --------    (05)

I'd like to mention three projects that illustrate ways that people use
highly expressive languages to process Big Data in sophisticated ways:    (06)

   1. Experian's use of logic programming to determine everybody's credit
      rating.  This is a multi-billion dollar company that used Prolog
      so heavily that they bought Prologia, the company founded by
      Alain Colmerauer, who built the first Prolog interpreter.    (07)

   2. Mathematica's use of logic programming and compilation technology
      to support engineers, mathematicians, and statisticians.  Among
      their users are the "quants" who analyze the stock markets and
      process huge volumes of data and make decisions at microsecond
      speeds.  This is merely a multi-million dollar company.    (08)

   3. The knowledge-compilation technology that led Bill Andersen
      and his colleagues to found Ontology Works, now High Fleet.
      They're not as big as the above two companies, but they've
      been in business for over a dozen years, and they build
      applications for large customers who process Big Data.    (09)

Re Experian:  They are highly secretive about their technology, rules, and
methods for detecting fraud and checking credit. It's not possible to cite
actual examples of what they do, but the general trends are clear.  They
started with commercial Prolog software, and they didn't buy Prologia to
sell Prolog software.  I'm sure that they wanted to develop their rule
language and compilation technology to make them more user friendly for the
people who write rules and more efficient for processing Big Data from all
sources, including the WWW.    (010)

Re Mathematica:  They also started with Prolog as the rule language for
writing all the software for analyzing and reasoning with and about
mathematical statements of any kind.  Over the years, their rule language
evolved far beyond Prolog, but they definitely did *not* reduce expressive
power.    (011)

Instead, they made it more "user friendly" for their target audience, who
know mathematics, but are not experts in computational complexity.
The tools give the users maximum expressive power, and they compile or
transform the input language to forms the system can process efficiently.
After the users are satisfied with the results, the tools can translate the
algorithms to code in FORTRAN or C.    (012)

As for the stock-market gang, those companies moved their computers to a
location that is as close as possible to where the high-speed Internet feed
enters Manhattan.  That gives them an 8-microsecond advantage over offices
on Wall Street.  That shows how much they value performance.  They process
Big Data, but they would never tolerate RDF bloat.    (013)

Re High Fleet:  I referred to fflogic.pdf for further discussion of their
1998 paper.  For convenience, I copied the paragraph that discusses it at
the end of this note.  It shows how compilation technology can extract info
from a very expressive language (CycL) and map it to forms that can be
processed efficiently by other tools.
The people who entered the knowledge had the freedom to express it in any
way that was convenient for them.  Then the compiler translated it to forms
that could be efficiently processed for the given problems.    (014)

After the authors started their own company, they no longer had access to
Cyc.  But they continued to use knowledge compilation techniques.
Their primary language is Prolog, but they generate anything their customers
want.  They also accept any kind of input they get, including RDF and OWL,
but they translate those languages to Prolog to improve expressive power
*and* efficiency.    (015)
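
As a rough illustration of the RDF-to-Prolog step mentioned above, here is a
minimal sketch in Python.  It is not High Fleet's actual translator, which is
not described in this note.  The rdf/3 fact shape, the function names, and the
ex: identifiers are all assumptions made for the example.

```python
# Sketch only: render RDF-style triples as Prolog facts of the form
# rdf(Subject, Predicate, Object).  The rdf/3 shape and all names
# here are assumptions for illustration, not any vendor's real format.

def quote_atom(text):
    """Return text as a single-quoted Prolog atom, escaping \\ and '."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"

def to_fact(triple):
    """Render one (subject, predicate, object) triple as a Prolog fact."""
    subject, predicate, obj = triple
    args = ", ".join(quote_atom(term) for term in (subject, predicate, obj))
    return "rdf(" + args + ")."

def to_program(triples):
    """Render a list of triples as Prolog source text, one fact per line."""
    return "\n".join(to_fact(t) for t in triples)

# Hypothetical input triples.
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
]

if __name__ == "__main__":
    print(to_program(triples))
```

Once the data is in this form, ordinary Prolog rules can be written over the
rdf/3 facts, which is where the added expressive power would come from.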

Note that all three of these companies are commercially successful, and they
have stayed in business for many years.    (016)

JFS
>> Then there are the questions about who or what is going to do those 
>> transformations and adaptations.  The SME?  The knowledge engineer?    (017)

> Who does the programming -- the programmer.  Who does the extract from 
> a complex ontology -- the knowledge engineer.    (018)

Consider the three examples above.  It's very hard to distinguish the KE
from the SME.  The people who write rules for Experian know a great deal
about the subject, and they also know how to write rules in a very
expressive language -- Prolog or whatever Prologia now produces.    (019)

Or consider the "quants" who use Mathematica or whatever to analyze the
stock market.  They combine the functions of SME and KE.  And for all of
them, a software tool is the low-level programmer.    (020)

JFS
>> SMEs are experts in their subject, not in any kind of calculation.    (021)

> They should be expert in the sort of calculation performed in their 
> field of expertise.    (022)

Consider the people who use Mathematica.  They are experts in using
mathematics to state their problems.  But the system uses a very wide range
of methods for solving those problems.  It automatically chooses the
inference algorithms.  For any specific type of problem, it can translate
the algorithms to efficient code in FORTRAN or C.    (023)

But the people who enter the knowledge don't need to know, learn, or even
worry about computational complexity.  The tools handle that.
And the *tools* can warn the KE, SME, or end user about any issues of
computational complexity for any specific problem.    (024)

John
______________________________________________________________________    (025)

Source:  http://www.jfsowa.com/pubs/fflogic.pdf    (026)

Although controlled NLs are easy to read, writing them requires training for
the authors and tools for helping them. Using the logic generated from
controlled NLs in practical systems also requires tools for mapping logic to
current software. Both of these tasks could benefit from applied research:
the first in human factors, and the second in compiler technology. An
example of the second is a knowledge compiler developed by Peterson et al.
(1998), which extracted a subset of axioms from the Cyc system to drive a
deductive database. It translated Cyc axioms, stated in a superset of FOL,
to constraints for an SQL database and to Horn-clause rules for an inference
engine.
Although the knowledge engineers had used a very expressive dialect of
logic, 84% of the axioms they wrote could be translated directly to
Horn-clause rules (4667 of the 5532 axioms extracted from Cyc).
The remaining 865 axioms were translated to SQL constraints, which would
ensure that all database updates were consistent with the axioms.    (027)
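
The split described above can be sketched in Python.  This is only an
illustration under stated assumptions, not the compiler from Peterson et al.:
it assumes each axiom is already in clausal form, routes Horn clauses (at most
one positive literal) to the rule side, and sends everything else to the
constraint side.  The Literal type and the sample axioms are hypothetical.
The last lines check the figures quoted in the paragraph.

```python
# Sketch only: partition clauses into Horn clauses (candidate rules)
# and non-Horn clauses (candidate constraints).  The routing policy
# is an assumption; the original paper's criteria may differ.
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    predicate: str
    positive: bool = True

def is_horn(clause):
    """A clause (disjunction of literals) is Horn if it has at most one positive literal."""
    return sum(1 for lit in clause if lit.positive) <= 1

def partition(clauses):
    """Return (horn_clauses, other_clauses), preserving input order."""
    rules = [c for c in clauses if is_horn(c)]
    constraints = [c for c in clauses if not is_horn(c)]
    return rules, constraints

def percent(part, total):
    """Share of part in total, rounded to a whole percent."""
    return round(100 * part / total)

# Hypothetical axioms in clausal form.
clauses = [
    # human(X) implies mortal(X): one positive literal, so Horn.
    [Literal("mortal"), Literal("human", positive=False)],
    # person(X) implies male(X) or female(X): two positive literals, not Horn.
    [Literal("male"), Literal("female"), Literal("person", positive=False)],
    # Nothing is both parent and child of itself: no positive literal, so Horn.
    [Literal("parent", positive=False), Literal("child", positive=False)],
]

rules, constraints = partition(clauses)

# Figures quoted in the paragraph above.
horn_share = percent(4667, 5532)
remaining = 5532 - 4667

if __name__ == "__main__":
    print(len(rules), len(constraints), horn_share, remaining)
```

The arithmetic agrees with the text: 4667 of 5532 rounds to 84%, and the
difference is 865.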

Peterson, Brian J., William A. Andersen, & Joshua Engel (1998) Knowledge
bus: generating application-focused databases from large ontologies, Proc.
5th KRDB Workshop, Seattle, WA.
http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-10/    (028)

_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J    (029)

