
Re: [ontolog-forum] Constructs, primitives, terms

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: Kingsley Idehen <kidehen@xxxxxxxxxxxxxx>
Date: Fri, 16 Mar 2012 12:20:37 -0400
Message-id: <4F636855.3070800@xxxxxxxxxxxxxx>

On 3/16/12 11:27 AM, doug foxvog wrote:
>> >  In different guises, blending disparate structured data
> By "structured data", do you include traditional structured data which
> appears in file records, data bases, and spreadsheets?    (01)

Yes.    (02)

>   Or are you
> restricting yourself to RDF-structured data?    (03)

No.    (04)

> Since you are discussing
> converting traditional software to URIs, i'm guessing that you include
> traditional structured data as well as SW-structured data.    (05)

Yes.    (06)

>> >  boils down to establishing identifiers for:
>> >  1. data objects
>> >  2. data object access addresses.
> By "data object" do you mean OOP objects, which have attached methods;
> normal CS objects, including data base cells, variables, constants, arrays,
> strings, and floats; instances of RDF-S definitions for particular
> classes; or
> something else?    (07)

Traditional OO objects modulo methods. Basically, the data members.    (08)

> I would suggest that blending traditional structured data from disparate
> sources involves far more than establishing URIs for cells, columns, and
> rows of databases and addresses for the content of such data.    (09)

It does, but you don't go wrong establishing a basic foundation. For 
example, make an ontology from an RDBMS schema, then map it to 
ontologies closely aligned to the integration problem at hand, tweak 
where necessary, reiterate if need be...    (010)
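
A minimal sketch of that first step, assuming a SQLite database and a hypothetical base namespace (http://example.org/schema/). It only mints one class URI per table and one property URI per column; the function name schema_to_terms and the URI layout are illustrative choices, not a standard. Mapping the result to domain ontologies would follow as a separate step.

```python
import sqlite3

BASE = "http://example.org/schema/"  # hypothetical namespace, for illustration only


def schema_to_terms(conn, base=BASE):
    """Return {table: {"class": uri, "properties": {column: uri}}}."""
    terms = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        terms[table] = {
            "class": f"{base}{table}",
            "properties": {col[1]: f"{base}{table}#{col[1]}" for col in columns},
        }
    return terms


# Example with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept_code TEXT)"
)
terms = schema_to_terms(conn)
print(terms["employee"]["class"])
print(terms["employee"]["properties"]["dept_code"])
```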

> What is needed is URIs for the semantic entities and classes which the
> structured data explicitly or implicitly refers to.  Sometimes a record or
> database column or row refers to an individual or class for which data
> might be shared.  Sometimes it refers to a relation which should have
> a URI.  But other times one set of columns for a single row in a database
> (or set of fields in a file record) refer to one entity, while another set
> in the same row/record refers to another entity.    (011)

Yes, this is all covered and achievable today using the R2RML mapping 
language. An RDBMS allows you to shape data in a myriad of ways that 
ultimately can be mapped to virtual data objects via R2RML. Of course, 
with replication and delta handling in play you can even persist said 
objects without utterly compromising change-sensitivity.    (012)
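
To illustrate the case of one row describing two entities, here is a rough plain-Python analogue of what a mapping does. It is not R2RML itself: the MAPPINGS structure, the table, and the example.org URIs are all hypothetical, chosen only to show one set of columns yielding one subject and another set yielding a second subject.

```python
import sqlite3

BASE = "http://example.org/data/"  # hypothetical namespace

# Hypothetical mapping: each entry gives a subject URI template and the
# columns that describe that subject (loosely echoing a triples map).
MAPPINGS = [
    {"subject": BASE + "person/{person_id}", "columns": ["person_name"]},
    {"subject": BASE + "employer/{employer_id}", "columns": ["employer_name"]},
]


def rows_to_triples(conn, table, mappings):
    """Turn every row of `table` into (subject, property, value) triples."""
    conn.row_factory = sqlite3.Row
    triples = []
    for row in conn.execute(f'SELECT * FROM "{table}"'):
        record = dict(row)
        for m in mappings:
            subject = m["subject"].format(**record)
            for col in m["columns"]:
                triples.append((subject, BASE + "prop/" + col, record[col]))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employment "
    "(person_id INTEGER, person_name TEXT, employer_id INTEGER, employer_name TEXT)"
)
conn.execute("INSERT INTO employment VALUES (1, 'Alice', 7, 'Acme')")
triples = rows_to_triples(conn, "employment", MAPPINGS)
for t in triples:
    print(t)
```

The single row produces statements about two distinct subjects, which is the point doug raises above.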

> A database or file field may be restricted to providing information about
> a certain type of thing, even if the field filler is a text string.  Are such
> restrictions considered to be "data objects"?    (013)

The restrictions will still apply to the process of generating transient 
or persistent data objects using something like R2RML as per my comments 
above.    (014)

>   Fields are often restricted
> to being filled by different codes, each of which has a specific meaning.
> Do such codes require unique URIs?    (015)

Some of these codes (typed literals) would be mapped to 
inverseFunctional properties which ultimately aids co-reference 
algorithms and reasoning via OWL reasoners.    (016)

>   What about cases in which the
> same string is a code for different things in different fields?

Semantically, you will know which properties are inverseFunctional and 
which ones aren't. The loose coupling of TBox and ABox will enable you 
to handle this kind of fidelity.    (017)
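
A small sketch of how inverse-functional properties drive co-reference, assuming made-up identifiers (ex:rec1, ex:taxId, ex:statusCode). Two subjects that share a value for an inverse-functional property are treated as the same entity; the same string under a property that is not inverse-functional merges nothing. This is a simplified stand-in for what an OWL reasoner would conclude, not a reasoner.

```python
def coreferent_groups(triples, inverse_functional):
    """Group subjects that share a value on an inverse-functional property."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    seen = {}  # (property, value) -> first subject seen with that pair
    for s, p, o in triples:
        find(s)  # register every subject, merged or not
        if p in inverse_functional:
            key = (p, o)
            if key in seen:
                union(seen[key], s)
            else:
                seen[key] = s

    groups = {}
    for s in list(parent):
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())


# The string "123" appears under two different properties.
triples = [
    ("ex:rec1", "ex:taxId", "123"),
    ("ex:rec2", "ex:taxId", "123"),
    ("ex:rec3", "ex:statusCode", "123"),
    ("ex:rec4", "ex:statusCode", "123"),
]
groups = coreferent_groups(triples, inverse_functional={"ex:taxId"})
print(groups)
```

Only ex:rec1 and ex:rec2 end up grouped, because only ex:taxId is declared inverse-functional in this example.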

>   What about
> the case in which different codes in different fields represent the same
> type of thing?    (018)

See my comments above.    (019)

>> >  The above bring portability and dexterity to:
>> >  1. data object representation models
>> >  2. data object representation formats
>> >  3. data object access protocols.
> I agree that shared URIs enable such portability.    (020)

Yep!    (021)

>> >  Context fluidity is a timeless challenge for data access and
>> >  integration. The above take us a long way towards alleviating
>> >  said challenges.
> In that the use of URIs would encourage coders not to broaden, narrow,
> or otherwise alter the meaning of/restrictions on a field?    (022)

Correct. They have the semantics of resulting relations in the TBox for 
handling that. Of course, if all else fails you look to rules. These days, 
SPARQL even serves as a pretty nifty rules language via CONSTRUCT patterns.    (023)
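
As a rough illustration of CONSTRUCT used as a rule, the sketch below shows a hypothetical rule in a comment and a plain-Python analogue of its effect on a set of triples. The ex: terms are invented for the example, and the Python function only mimics this one rule; it is not a SPARQL engine.

```python
# The rule, as it might be written in SPARQL (the ex: prefix is hypothetical):
#   CONSTRUCT { ?org ex:employs ?person }
#   WHERE     { ?person ex:worksFor ?org }


def apply_rule(triples):
    """Return the input triples plus the statements derived by the rule."""
    derived = {
        (o, "ex:employs", s) for s, p, o in triples if p == "ex:worksFor"
    }
    return set(triples) | derived


data = {
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:alice", "ex:name", "Alice"),
}
result = apply_rule(data)
for t in sorted(result):
    print(t)
```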

>> >  Today, we hear a lot about BigData or 'Big Data' and very little about
>> >  the fundamental realities associated with:
>> >  1. exponential growth of data volume
>> >  2. exponential growth of data velocity
>> >  3. exponential growth of data variety (heterogeneity)
>> >  3. exponential growth of data location disparity.
>> >  You can't deal with these matters without URIs in your arsenal.
> URIs seem to me to address the first point #3.  I don't see offhand how
> growth of volume and velocity of homogeneous data or even growth
> in the number of homogeneous sensors (and thus data location) requires
> the use of URIs.    (024)

As volume and velocity grow, your potential for serendipitous discovery 
increases rather than decreases. Scalability is no longer the challenge 
it used to be. We've demonstrated these patterns at 29 Billion+ triple 
scale using servers available for anyone to use on the Web. Most recently 
we've upped that to 52 Billion+ [2].    (025)

>> >  Ontologies, Inference Rules, Data Objects etc.. can all be combined in
>> >  powerful ways that enable subject matter experts solve complex data
>> >  integration problems. When done right, the end products of said efforts
>> >  scale by fitting naturally into virtuous Linked Data clouds
> I agree that data encoded in ontologies can fit naturally into constrained
> Linked Data clouds.  This would especially be the case if the clouds
> easily represented ternary and higher-arity relations as well as ordered
> lists.    (026)

Yep!    (027)

>> >  as exemplified by the burgeoning Web of Linked Data.
> I am not convinced that the burgeoning Web of Linked Data exemplifies
> "virtuous" LD "done right".  I see massive heterogeneity  with multiple
> URIs promulgating for the same classes, the same relations, and the
> same individuals.  I see linked data removed from context.    (028)

When I make statements like that I am basically referring to the 
combined prowess of OWL reasoning and the Linked Open Data cloud.    (029)

>> >  Note, when I refer to Linked
>> >  Data I am not implying a use-case that only applies to the public HTTP
>> >  network aka. the World Wide Web.
> Context-restricted and context-identified Linked Data could be more
> useful, imho.  That which i find appearing on the WWW does not have
> these features.    (030)

Depends where you look :-)    (031)

>> >  My mantra is very simple re. contemporary data access and integration
>> >  matters: URI Everything and Everything is Cool!:-)
> I've got a hammer; it'd be great if the rest of you turned everything into
> nails!  8)#    (032)

No, you need URIs to get going. Otherwise, you have to reinvent the same 
thing.    (033)

Links:    (034)

1. https://docs.google.com/spreadsheet/ccc?key=0AihbIyhlsQSxdHIxc3hhdk82UFdYd1ppaGw3WDNrVGc#gid=0    (035)

-- Google Spreadsheet with some benchmark numbers based on SPARQL queries    (036)

2. https://plus.google.com/s/owl%20reasoning%20linked%20data%20idehen -- 
collection of posts (with live demo links in most cases) showcasing 
combined prowess of OWL and Linked Data    (037)

3. http://delicious.com/kidehen/faceted_browsing -- faceted browsing 
examples; wherever the host is "lod.openlinksw.com", note that it's 
working against a live instance with 29 Billion+ triples (soon you'll 
see "lod2.openlinksw.com", which has 52 Billion+ triples, doing the very 
same things).

> -- doug
>    (038)

--     (039)

Regards,    (040)

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen    (041)


Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J    (01)
