Re: [ontolog-forum] Natural Language based SPARQL Generator

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "doug foxvog" <doug@xxxxxxxxxx>
Date: Thu, 31 Jan 2013 17:01:04 -0500
Message-id: <1f59937a2ba4f7751635a78205005ae1.squirrel@xxxxxxxxxxxxxxxxx>
On Thu, January 31, 2013 09:28, Kingsley Idehen wrote:
> On 1/30/13 11:16 PM, John F Sowa wrote:
>> On 1/30/2013 1:00 PM, Kingsley Idehen wrote:
>>> Quick FYI re: http://quepy.machinalis.com/ .

>>> Examples:

>>> 1. What is the capital of Nigeria? --http://bit.ly/WvXAIb.
>>> 2. Who is the President of Nigeria? --http://bit.ly/XIsc5x.
>>> 3. What is the population of Nigeria? --http://bit.ly/XQbAJM.
>> I was *extremely* unimpressed by that system.

> Remember, this is a bare bones system. It's open source [1] and
> extensible. ...
>
> I did not present this as an excellent answer service. It's about a tool
> for generating SPARQL.

It seems to generate SPARQL without using any ontology.

A query for a President generates SPARQL for the leader of a country.
[So the two answers for President of the UK are Cameron and QE II.]

A query for the population or capital of a geopolitical entity also
restricts that entity to being a country.

A query for when something was founded (I asked about the US)
generated SPARQL for the dates that a music band with that name existed.

One can certainly hard code a mapping from English queries to
SPARQL queries, but I see no sign that this "bare bones system"
has anything to do with ontologies.
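
To make the "hard coded mapping" point concrete, here is a minimal sketch
of what such a mapping could look like. It is not Quepy's actual code: the
patterns, the function name to_sparql, and the DBpedia class and property
names (dbpedia-owl:Country, dbpprop:leaderName, dbpedia-owl:capital) are
assumptions chosen only for illustration.

```python
import re

# Hypothetical hard-coded question patterns paired with SPARQL bodies.
# The class/property names are illustrative assumptions, not taken
# from Quepy's real templates.
TEMPLATES = [
    (re.compile(r"^Who is the President of (?P<name>.+)\?$"),
     '  ?x0 rdf:type dbpedia-owl:Country .\n'
     '  ?x0 rdfs:label "{name}"@en .\n'
     '  ?x0 dbpprop:leaderName ?x1 .'),
    (re.compile(r"^What is the capital of (?P<name>.+)\?$"),
     '  ?x0 rdf:type dbpedia-owl:Country .\n'
     '  ?x0 rdfs:label "{name}"@en .\n'
     '  ?x0 dbpedia-owl:capital ?x1 .'),
]


def to_sparql(question):
    """Return a SPARQL query string for a matching question, else None.

    No ontology is consulted: "President" is simply rewritten to
    "leader of a country" by the pattern that happens to match.
    (Names containing quote characters are not escaped in this sketch.)
    """
    for pattern, body in TEMPLATES:
        match = pattern.match(question)
        if match:
            filled = body.format(name=match.group("name"))
            return "SELECT DISTINCT ?x1 WHERE {\n" + filled + "\n}"
    return None


if __name__ == "__main__":
    print(to_sparql("Who is the President of Nigeria?"))
```

In a table like this, the first pattern returns whatever is recorded as a
leader of the country, which is how a question about the "President" of
the UK could come back with both a head of government and a head of state.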

-- doug foxvog

>> I typed each of those three sentences to Google and got the
>> answer just from the brief excerpts quoted in the first
>> few hits.
>
> Please post the URLs of the Google responses.
>
> Here is a Google URL for the question: Who is the president of Nigeria?
> 
> <http://www.google.com/search?client=safari&rls=en&q=who+is+the+president+of+nigeria&ie=UTF-8&oe=UTF-8>
>
> The problem with the response is that it's giving me a report (data
> contextualized by a document) that doesn't return the actual identifier
> that denotes the president of Nigeria. I can't use the output from
> Google in a program to construct or navigate a graph as part of a
> knowledge processing pipeline. That's the problem with Google's approach.
>
> The same question posed to Wolfram Alpha gets the answer but once again
> without an identifier that denotes the president of Nigeria:
> <http://www.wolframalpha.com/input/?i=who+is+the+president+of+Nigeria> .
>
>> I didn't even have to click on any of the URLs.
> The URIs are the issue here. We want to query an HTTP accessible data
> space (comprised of entity relationship graphs) and have the ability to
> incorporate super keys (URIs) into the query result set. In addition, we
> want to be able to share query results and their definitions via
> hyperlinks (URIs).
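
As a rough illustration of the point about URIs in result sets, the sketch
below pulls the URI-typed bindings out of a response in the W3C SPARQL
Query Results JSON format, so that a program can follow them as
identifiers. The sample document and the helper name uri_bindings are made
up for this example; they are not output captured from Quepy or DBpedia.

```python
import json

# Illustrative sample in the SPARQL Query Results JSON format.
# The bindings are invented for this sketch, not a captured response.
SAMPLE = """
{
  "head": {"vars": ["x1"]},
  "results": {
    "bindings": [
      {"x1": {"type": "uri",
              "value": "http://dbpedia.org/resource/Goodluck_Jonathan"}},
      {"x1": {"type": "literal", "xml:lang": "en",
              "value": "Goodluck Jonathan"}}
    ]
  }
}
"""


def uri_bindings(results_json, var):
    """Return the values bound to `var` that are URIs, skipping literals."""
    doc = json.loads(results_json)
    return [row[var]["value"]
            for row in doc["results"]["bindings"]
            if var in row and row[var]["type"] == "uri"]


if __name__ == "__main__":
    for uri in uri_bindings(SAMPLE, "x1"):
        print(uri)
```

A text excerpt from a search engine would correspond to the literal
binding only; the URI binding is what can be dereferenced or reused in a
follow-up query.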
>
>>
>> Then I typed the following sentences to both Quepy and Google.
>> In each case, I typed the full sentence to Quepy & did a cut
>> and paste of exactly the same string to Google.  Since Quepy
>> would complain about capitalization, I was careful about that:
>>
>>    1. Who was Einstein?
>>
>>    2. What did Einstein do?
>>
>>    3. When did Lindbergh cross the Atlantic?
>>
>>    4. What is the chemical formula for acetone?
>>
>>    5. When did the Revolutionary War end?
>
> See my comments above.
>
> The generated query reads:
>
> ## query start ##
> PREFIX owl: <http://www.w3.org/2002/07/owl#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
> PREFIX quepy: <http://www.machinalis.com/quepy#>
> PREFIX dbpedia: <http://dbpedia.org/ontology/>
> PREFIX dbpprop: <http://dbpedia.org/property/>
> PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
>
> SELECT DISTINCT ?x1 WHERE {
>    ?x0 rdf:type foaf:Person.
>    ?x0 rdfs:label "Einstein"@en.
>    ?x0 rdfs:comment ?x1.
> }
>
> ##query end ##
>
> That's just basic query generation which can be easily fixed via the
> toolset (which is openly available to anyone). Note, 'Albert Einstein'
> could denote any entity; there's a notability factor implicit in your
> Google searches, i.e., you've already concluded which 'Albert Einstein'
> you seek information about.
>
> SPARQL factoring in disambiguation [3][4][5] would look something like
> the following:
>
> ## query start ##
>
>   SELECT ?s1 AS ?c1,
>          ( bif:search_excerpt ( bif:vector ( 'ALBERT', 'EINSTEIN' ) , ?o1 ) ) AS ?c2,
>          ?sc,
>          ?rank,
>          ?g
>   WHERE
>    {
>        {
>          SELECT ?s1,
>          ( ?sc * 3e-1 ) AS ?sc,
>          ?o1,
>          ( sql:rnk_scale ( <LONG::IRI_RANK> ( ?s1 ) ) ) as ?rank,
>          ?g
>          WHERE
>          {
>            QUAD MAP virtrdf:DefaultQuadMap
>            {
>              GRAPH ?g
>              {
>                ?s1 ?s1textp ?o1 .
>                ?o1 bif:contains ' ( ALBERT AND EINSTEIN ) ' option ( score ?sc ) .
>                ?s1 a <http://schema.org/Person> .
>                ?s1 <http://purl.org/dc/terms/subject> ?s2 .
>                FILTER ( ?s2 = <http://dbpedia.org/resource/Category:Nobel_laureates_in_Physics> ) .
>
>              }
>             }
>           }
>         ORDER BY DESC ( ?sc * 3e-1 + sql:rnk_scale ( <LONG::IRI_RANK> ( ?s1 ) ) )
>         LIMIT 20 OFFSET 0
>        }
>     }
>
>   ## query end ##
>
> To conclude, the key point I sought to make via this post is that
> natural language based SPARQL generation is an emerging frontier, one
> that never happened on the SQL front in any kind of webby way. Of
> course, I would happily look at a live Web accessible SQL based system
> to see if it can match the most basic SPARQL functionality demonstrated
> by Quepy fronting SPARQL :-)
>
>
> Links:
>
> 1. https://github.com/machinalis/quepy .
> 2. http://bit.ly/UydU9t -- example of inlined inference via SPARQL 1.1
> property paths functionality .
> 3. http://bit.ly/Vxtki9 -- 'Albert Einstein' disambiguated .
> 4. http://bit.ly/14zcUHB -- SPARQL Query Results Link .
> 5. http://bit.ly/11lH6kQ -- SPARQL Query Definition Link .
>
>
> --
>
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Company Web: http://www.openlinksw.com
> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca handle: @kidehen
> Google+ Profile: https://plus.google.com/112399767740508618350/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen



