ontolog-forum

Re: [ontolog-forum] Natural Language based SPARQL Generator

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: Kingsley Idehen <kidehen@xxxxxxxxxxxxxx>
Date: Thu, 31 Jan 2013 17:20:33 -0500
Message-id: <510AEE31.2070907@xxxxxxxxxxxxxx>
On 1/31/13 4:54 PM, doug foxvog wrote:
> On Thu, January 31, 2013 09:28, Kingsley Idehen wrote:
>> On 1/30/13 11:16 PM, John F Sowa wrote:
>>> On 1/30/2013 1:00 PM, Kingsley Idehen wrote:
>>>> Quick FYI re: http://quepy.machinalis.com/ .
>>>> Examples:
>>>> 1. What is the capital of Nigeria? --http://bit.ly/WvXAIb.
>>>> 2. Who is the President of Nigeria? --http://bit.ly/XIsc5x.
>>>> 3. What is the population of Nigeria? --http://bit.ly/XQbAJM.
>>> I was *extremely* unimpressed by that system.
>> Remember, this is a bare bones system. ...
>> I did not present this as an excellent answer service. It's about a tool
>> for generating SPARQL.
> I didn't look through the code, but it seems to be hand-coded from
> English words to SPARQL queries.  I asked for the population of
> Washington, to see if it would go for the state, DC, or give me
> answers for a set of different Washingtons, but the SPARQL queried
> a Country.
>
> One of the example queries asked for when Pink Floyd was founded,
> so I asked when the US was founded.  The generated SPARQL queried
> for a band with the name "US".
>
> I asked for the President of the UK, and since the SPARQL query
> was for a leader, not a president, the answers returned were
> David Cameron and Queen Elizabeth II.
>
> I asked for the president of some major companies, but the SPARQL
> query was for countries with the names of those companies.
>
> I don't find any indication of ontology use in the generation of the
> SPARQL queries.
>
> -- doug foxvog

Yes, that was one of the very points I made in my response to John. It's
very basic, but it demonstrates the ability to front SPARQL with a natural
language processor.
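
To make the "hand-coded" flavor concrete, here is a minimal Python sketch
of a question-pattern-to-SPARQL mapping. It is not taken from Quepy's
source: the TEMPLATES table, generate_sparql, escape_literal, and the
choice of DBpedia properties are all assumptions made for illustration.
Note that the Country type is baked into the template, which is the kind
of behavior Doug saw with "Washington".

```python
import re

# Assumed prefixes; both also appear in the generated query quoted below.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>\n"
)

# Hypothetical template table: question pattern -> property to select.
TEMPLATES = [
    (re.compile(r"^What is the capital of (?P<name>.+?)\?$", re.IGNORECASE),
     "dbpedia-owl:capital"),
    (re.compile(r"^What is the population of (?P<name>.+?)\?$", re.IGNORECASE),
     "dbpedia-owl:populationTotal"),
]


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def generate_sparql(question):
    """Return a SPARQL query string, or None if no pattern matches."""
    for pattern, prop in TEMPLATES:
        match = pattern.match(question.strip())
        if match:
            name = escape_literal(match.group("name"))
            return (
                PREFIXES
                + "SELECT DISTINCT ?x1 WHERE {\n"
                # The entity type is hard-coded, not inferred from the question.
                + "    ?x0 a dbpedia-owl:Country .\n"
                + f'    ?x0 rdfs:label "{name}"@en .\n'
                + f"    ?x0 {prop} ?x1 .\n"
                + "}"
            )
    return None


if __name__ == "__main__":
    print(generate_sparql("What is the capital of Nigeria?"))
```

Any question outside the template table simply returns None, and any
name, country or not, gets queried as a Country.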

There are many ways this system can be improved so that it does leverage
inference, etc. First off, they need to make the generated SPARQL
editable :-)
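
As one hedged illustration of leveraging inference, a generator could emit
a SPARQL 1.1 property path so that instances of subclasses also match the
type constraint. The sketch below is an assumption about how that might
look, not a description of Quepy; typed_label_query and its parameters are
hypothetical names.

```python
def typed_label_query(label, class_iri, infer_subclasses=True):
    """Build a query for entities with the given label and type.

    With infer_subclasses=True the type constraint uses the SPARQL 1.1
    path rdf:type/rdfs:subClassOf*, which also matches entities typed
    with any (transitive) subclass of class_iri.
    """
    type_path = "rdf:type/rdfs:subClassOf*" if infer_subclasses else "rdf:type"
    # Escape backslashes and double quotes for the string literal.
    safe_label = label.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT DISTINCT ?x0 WHERE {",
        f"    ?x0 {type_path} <{class_iri}> .",
        f'    ?x0 rdfs:label "{safe_label}"@en .',
        "}",
    ])


if __name__ == "__main__":
    print(typed_label_query("Einstein", "http://xmlns.com/foaf/0.1/Person"))
```

Whether the path returns anything extra depends on the subclass triples
actually present in the target data space.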

Kingsley
>
>>> I typed each of those three sentences to Google and got the
>>> answer just from the brief excerpts quoted in the first
>>> few hits.
>> Please post the URLs of the Google responses.
>>
>> Here is a Google URL for the question: Who is the president of Nigeria?
>> <http://www.google.com/search?client=safari&rls=en&q=who+is+the+president+of+nigeria&ie=UTF-8&oe=UTF-8>
>>
>> The problem with the response is that it's giving me a report (data
>> contextualized by a document) that doesn't return the actual identifier
>> that denotes the president of Nigeria. I can't use the output from
>> Google in a program to construct or navigate a graph as part of a
>> knowledge processing pipeline. That's the problem with Google's approach.
>>
>> The same question posed to Wolfram Alpha gets the answer but once again
>> without an identifier that denotes the president of Nigeria:
>> <http://www.wolframalpha.com/input/?i=who+is+the+president+of+Nigeria> .
>>
>>> I didn't even have to click on any of the URLs.
>> The URIs are the issue here. We want to query an HTTP accessible data
>> space (comprised of entity relationship graphs) and have the ability to
>> incorporate super keys (URIs) into the query result set. In addition, we
>> want to be able to share query results and their definitions via
>> hyperlinks (URIs).
>>
>>> Then I typed the following sentences to both Quepy and Google.
>>> In each case, I typed the full sentence to Quepy & did a cut
>>> and paste of exactly the same string to Google.  Since Quepy
>>> would complain about capitalization, I was careful about that:
>>>
>>>     1. Who was Einstein?
>>>
>>>     2. What did Einstein do?
>>>
>>>     3. When did Lindbergh cross the Atlantic?
>>>
>>>     4. What is the chemical formula for acetone?
>>>
>>>     5. When did the Revolutionary War end?
>> See my comments above.
>>
>> The generated query reads:
>>
>> ## query start ##
>> PREFIX owl: <http://www.w3.org/2002/07/owl#>
>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
>> PREFIX quepy: <http://www.machinalis.com/quepy#>
>> PREFIX dbpedia: <http://dbpedia.org/ontology/>
>> PREFIX dbpprop: <http://dbpedia.org/property/>
>> PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
>>
>> SELECT DISTINCT ?x1 WHERE {
>>     ?x0 rdf:type foaf:Person.
>>     ?x0 rdfs:label "Einstein"@en.
>>     ?x0 rdfs:comment ?x1.
>> }
>>
>> ##query end ##
>>
>> That's just basic query generation, which can be easily fixed via the
>> toolset (which is openly available to anyone). Note that 'Albert Einstein'
>> can denote any entity; there is a notability factor implicit in your
>> Google searches, i.e., you've already concluded which 'Albert Einstein'
>> you seek information about.
>>
>> SPARQL that factors in disambiguation [3][4][5] would look something
>> like the following:
>>
>> ## query start ##
>>
>>    SELECT ?s1 AS ?c1,
>>           ( bif:search_excerpt ( bif:vector ( 'ALBERT', 'EINSTEIN' ) , ?o1 ) ) AS ?c2,
>>           ?sc,
>>           ?rank,
>>           ?g
>>    WHERE
>>     {
>>         {
>>           SELECT ?s1,
>>           ( ?sc * 3e-1 ) AS ?sc,
>>           ?o1,
>>           ( sql:rnk_scale ( <LONG::IRI_RANK> ( ?s1 ) ) ) as ?rank,
>>           ?g
>>           WHERE
>>           {
>>             QUAD MAP virtrdf:DefaultQuadMap
>>             {
>>               GRAPH ?g
>>               {
>>                 ?s1 ?s1textp ?o1 .
>>                 ?o1 bif:contains ' ( ALBERT AND EINSTEIN ) ' option ( score ?sc ) .
>>                 ?s1 a <http://schema.org/Person> .
>>                 ?s1 <http://purl.org/dc/terms/subject> ?s2 .
>>                 FILTER ( ?s2 = <http://dbpedia.org/resource/Category:Nobel_laureates_in_Physics> ) .
>>
>>               }
>>              }
>>            }
>>          ORDER BY DESC ( ?sc * 3e-1 + sql:rnk_scale ( <LONG::IRI_RANK> ( ?s1 ) ) )
>>          LIMIT 20 OFFSET 0
>>         }
>>      }
>>
>>    ## query end ##
>>
>> To conclude, the key point I sought to make via this post is that
>> natural language based SPARQL generation is an emerging frontier, one
>> that never happened on the SQL front in any kind of webby way. Of
>> course, I would happily look at a live, Web-accessible, SQL-based system
>> to see if it can match the most basic SPARQL functionality demonstrated
>> by Quepy fronting SPARQL :-)
>>
>>
>> Links:
>>
>> 1. https://github.com/machinalis/quepy .
>> 2. http://bit.ly/UydU9t -- example of inlined inference via SPARQL 1.1
>> property paths functionality .
>> 3. http://bit.ly/Vxtki9 -- 'Albert Einstein' disambiguated .
>> 4. http://bit.ly/14zcUHB -- SPARQL Query Results Link .
>> 5. http://bit.ly/11lH6kQ -- SPARQL Query Definition Link .
>>
>>
>> --
>>
>> Regards,
>>
>> Kingsley Idehen
>> Founder & CEO
>> OpenLink Software
>> Company Web: http://www.openlinksw.com
>> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>> Twitter/Identi.ca handle: @kidehen
>> Google+ Profile: https://plus.google.com/112399767740508618350/about
>> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>>
>> _________________________________________________________________
>> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
>> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
>> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
>> Shared Files: http://ontolog.cim3.net/file/
>> Community Wiki: http://ontolog.cim3.net/wiki/
>> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
>>


--

Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
