ontolog-forum

Re: [ontolog-forum] Natural Language based SPARQL Generator

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: Kingsley Idehen <kidehen@xxxxxxxxxxxxxx>
Date: Fri, 01 Feb 2013 09:43:31 -0500
Message-id: <510BD493.1060506@xxxxxxxxxxxxxx>
On 1/31/13 11:36 PM, John F Sowa wrote:
> Kingsley and Doug,
>
> The Quepy developers use the NLTK toolkit, which is an open-source
> set of Python-based software for NLP processing.  It's widely used
> for teaching purposes.  But it is not state of the art NLP software.
>
> KI
>> it's only using DBpedia whereas if it used the LOD cloud cache
>> there would be a much broader knowledgebase.
> Google answered every one of my five questions, but Quepy could
> only answer one of them.  I also tried Bing, which did just
> as well as Google on all five.    (01)

But I am not presenting this as the ultimate question-and-answer 
machine. Far from it. I am presenting it as a showcase of NLP aiding 
SPARQL generation.    (02)
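
To make "NLP aiding SPARQL generation" concrete, here is a minimal sketch in 
Python. It is not Quepy's API: the regex templates, the question_to_sparql 
helper, and the choice of dbo:leader and rdfs:comment are all assumptions made 
for illustration, and no entity linking is attempted (the name in the question 
is used as a literal label).

```python
import re

# Prefixes for the generated query. The properties used in the templates
# below are illustrative assumptions, not what Quepy actually emits.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
)

# Hypothetical question templates: (regex, SPARQL graph pattern).
TEMPLATES = [
    (
        re.compile(
            r"^who is the (?:president|leader) of (?:the )?(?P<name>.+?)\??$",
            re.I,
        ),
        '?place rdfs:label "{name}"@en .\n  ?place dbo:leader ?answer .',
    ),
    (
        re.compile(r"^what is (?:a |an |the )?(?P<name>.+?)\??$", re.I),
        '?thing rdfs:label "{name}"@en .\n  ?thing rdfs:comment ?answer .',
    ),
]


def question_to_sparql(question):
    """Return a SPARQL query string for a supported question, else None."""
    text = question.strip()
    for pattern, body in TEMPLATES:
        match = pattern.match(text)
        if match:
            # Escape double quotes so the name stays a valid string literal.
            name = match.group("name").strip().replace('"', '\\"')
            return (
                PREFIXES
                + "SELECT DISTINCT ?answer WHERE {\n  "
                + body.format(name=name)
                + "\n}"
            )
    return None


if __name__ == "__main__":
    print(question_to_sparql("Who is the president of the UK?"))
```

Note that "president" and "leader" map to the same property in this sketch, 
which is the kind of looseness Doug describes further down when he asks for 
the President of the UK.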

>
> In fact, Bing got a better answer for the question "When did
> the Revolutionary War end?"  In addition to hits that were similar
> to Google's, Bing gave the following answer above the list of hits:
>
> Bing    (03)

See my comments above.    (04)

Google, Bing, and others are silos. I am in the business of silo-busting 
via Web architecture and open standards. I am most interested in a 
global distributed database (offering equal billing to extensional and 
intensional functionality) where hyperlinks are super keys that resolve 
to entity relationship graphs endowed with machine and human 
comprehensible entity relationship semantics.
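
A rough sketch of what I mean by hyperlinks as super keys, in Python. The two 
sources, the predicate names ("leader", "role"), and the build_index and 
describe helpers are hypothetical, and the data is held in memory rather than 
fetched over HTTP. The only point is that independently published data joins 
up because both sides use the same hyperlink as the key.

```python
from collections import defaultdict

# Two hypothetical, independently published sources. Each is a list of
# (subject, predicate, object) triples; the data is illustrative only.
SOURCE_A = [
    ("http://dbpedia.org/resource/United_Kingdom", "leader",
     "http://dbpedia.org/resource/David_Cameron"),
    ("http://dbpedia.org/resource/United_Kingdom", "leader",
     "http://dbpedia.org/resource/Elizabeth_II"),
]
SOURCE_B = [
    ("http://dbpedia.org/resource/David_Cameron", "role", "Prime Minister"),
    ("http://dbpedia.org/resource/Elizabeth_II", "role", "Head of State"),
]


def build_index(*sources):
    """Merge any number of triple lists into one index keyed by subject URI."""
    index = defaultdict(list)
    for triples in sources:
        for subject, predicate, obj in triples:
            index[subject].append((predicate, obj))
    return index


def describe(index, uri, depth=1):
    """Return the relationships of `uri`, expanding linked URIs `depth` levels."""
    description = defaultdict(list)
    for predicate, obj in index.get(uri, []):
        if depth > 0 and obj in index:
            # The object is itself a key: follow the link into its own graph.
            obj = {obj: describe(index, obj, depth - 1)}
        description[predicate].append(obj)
    return dict(description)


if __name__ == "__main__":
    index = build_index(SOURCE_A, SOURCE_B)
    print(describe(index, "http://dbpedia.org/resource/United_Kingdom"))
```

Neither source knows about the other; the join happens only because the same 
URIs appear in both.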
>> The American Revolutionary War began on Wednesday, April 19, 1775
>> and ended on Wednesday, September 3, 1783.
> DF
>> I asked for the President of the UK, and since the SPARQL query
>> was for a leader, not a president, the answers returned were
>> David Cameron and Queen Elizabeth II.
> I typed "Who is the president of the UK?" to Google and Bing.
> Both of them found the following plus some other relevant hits:
>
> Bing and Google
>>      Who is the president of the United Kingdom - The Q&A wiki
>>      wiki.answers.com    United Kingdom  UK Politics
>>
>>      The United Kingdom is a parliamentary constitutional monarchy
>>      and has no president. HM Queen Elizabeth II is Head of State.
>>      The Right Honourable David Cameron MP is ...
> Of course, Google and Microsoft (Bing) are multi-billion dollar
> corporations with huge R & D budgets.  Quepy is OK for homework
> exercises in a course on NLP.    (05)

Yes, but you continue to present examples that really aren't aligned to 
my core point. Again, Google, Bing, etc. are all silos. Being a 
corporation doesn't mean they have to be data silo vectors. They will 
ultimately be far more successful once they understand the virtues of 
de-silo-fication, at Web-scale.    (06)

>
> DF
>> It seems to generate SPARQL without using any ontology.
> I read some of the Quepy documentation, which indicates that
> they do recognize "classes" and "subclasses".  But Google,
> Bing, and many other commercial companies have much richer
> resources.    (07)

Resource riches don't matter so much on the Web. Google didn't even 
exist 20 years ago. Imagine telling a VC (circa 1992) that Google and 
others would emerge in the future, at the expense of Microsoft. 
99.99% of the time you would have been laughed out of the room.    (08)

The Web inflection is still very much in its infancy.
>
> KI
>> To conclude, the key point I sought to make via this post is that
>> natural language based SPARQL generation is an emerging frontier.
> I doubt that.  Neither Google nor Bing use RDF, SPARQL, or OWL.
> Instead, they do pattern matching directly to the raw, unannotated
> natural language texts.    (09)

They don't really matter as much as you assume. They aren't the beacons 
of measurement in this realm.    (010)

They only become interesting whenever they plug into the global Web DBMS 
and start publishing hyperlink based super keys. Whenever they come to 
that reality, SPARQL's utility will be crystal clear to them.
>
> I'll admit that there is a large and growing corpus of tagged
> documents, for which RDF processing can be useful.  But the raw NL
> documents are growing at a much faster rate than the tagging.    (011)

They are increasingly being fused. There are many collaborations in the 
Linked Data realm that leverage NLP services. I've been involved with 
several (and counting) with lots of simple live demonstrations in hand 
[1][2][3].    (012)

>
> KI
>> that never happened on the SQL front, in any kind of webby way, of
>> course, I would happily look at a live Web accessible SQL based system
>> to see if it can match the most basic SPARQL functionality demonstrated
>> by Quepy fronting SPARQL
> SQL has a superset of the expressive power of RDF.  People had developed
> very sophisticated NLP query systems for DB queries 30 years ago.  For
> examples, see http://www.jfsowa.com/pubs/futures.pdf .  Most of those
> systems never became profitable or they remained niche products.    (013)

I want links to existing live systems based on SQL that can deliver 
heterogeneous access to disparate Web accessible data sources via 
hyperlink based super keys. Where are those systems? They don't exist 
for a very good reason: they can't handle the nature of the Web:    (014)

1. Unpredictable query request volume
2. Unpredictable query request scope
3. Unpredictable query result set navigation across entity 
relationship dimensions via cursors (static, keyset, dynamic, or mixed)
4. Unpredictable attention span of users.    (015)
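
For readers unfamiliar with the cursor models named in item 3, here is a 
simplified in-memory sketch in Python contrasting a static cursor with a 
keyset cursor. The class names and the dict-based "table" are hypothetical and 
not any particular DBMS API. A static cursor snapshots whole rows when opened; 
a keyset cursor fixes only the set of keys and reads current values at fetch 
time. A dynamic cursor (not shown) would also see rows added after opening.

```python
class StaticCursor:
    """Snapshots complete rows at open time; later changes are invisible."""

    def __init__(self, table):
        # Copy every row so later updates to `table` do not show through.
        self._rows = [(key, dict(row)) for key, row in sorted(table.items())]

    def fetch(self, position):
        return self._rows[position]

    def __len__(self):
        return len(self._rows)


class KeysetCursor:
    """Snapshots only the keys at open time; values are read at fetch time."""

    def __init__(self, table):
        self._table = table
        self._keys = sorted(table)  # membership and order are fixed here

    def fetch(self, position):
        key = self._keys[position]
        # Returns None for the row if it was deleted after the cursor opened.
        return key, self._table.get(key)

    def __len__(self):
        return len(self._keys)


if __name__ == "__main__":
    table = {"a": {"v": 1}, "b": {"v": 2}}
    static, keyset = StaticCursor(table), KeysetCursor(table)
    table["a"]["v"] = 10   # update after both cursors were opened
    table["c"] = {"v": 3}  # insert after both cursors were opened
    print(static.fetch(0), keyset.fetch(0), len(static), len(keyset))
```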

There is a showdown point on the horizon that will ultimately bring 1-4 
into scope for the likes of Google, Bing, and any other data silo player.
>
> But some of them have been connected to speech systems for those
> annoying automated telephone systems.  Replacing SQL with SPARQL
> will do nothing to make them less annoying.    (016)

Not my point. See my comments above. It's all about data virtualization 
via hyperlink based super keys.
>
> As for webifying a version of SQL, that would be fairly easy to do.    (017)

Where is it?    (018)

> In fact, Tim B-L included SQL as one of the languages that had to be
> supported.  (See his DAML proposal of 2000.)  Oracle and IBM do that
> with their products.  But the clueless academics who jumped on the
> DAML bandwagon refused to support SQL.    (019)

Somewhat inaccurate; they contributed to the development of SPARQL.    (020)

SPARQL is SQL for the new Web-scale DBMS frontier. It's extremely 
powerful and utterly useful :-)    (021)


Kingsley
>
> John
>   
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
>   
>
>    (022)


--     (023)

Regards,    (024)

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen    (025)


