ontolog-forum

Re: [ontolog-forum] Accommodating legacy software

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "doug foxvog" <doug@xxxxxxxxxx>
Date: Tue, 4 Sep 2012 15:41:19 -0400
Message-id: <7f5c662e57d9460ef2d17b7c207fbf79.squirrel@xxxxxxxxxxxxxxxxx>
On Tue, September 4, 2012 12:34, Kingsley Idehen wrote:
> On 9/4/12 11:55 AM, doug foxvog wrote:
>> On Sun, September 2, 2012 10:43, Kingsley Idehen wrote:
>>> On 9/1/12 1:37 PM, John F Sowa wrote:
>>>> Denise, Michael B, Kingsley, and Andries,
>>>>
>>>> Before commenting on the details of your notes, I'd like to emphasize
>>>> an excerpt from the original DAML report by Tim Berners-Lee:
>>>>
>>>> TB-L
>>>>> The goal of interoperability between heterogeneous components that
>>>>> we build is one that will test the extent to which the Semantic Web
>>>>> is achieving its promise. The more diverse the systems
>>>>> interoperating, the greater the merit of the Semantic Web.
>>>> The date of the proposal is February 2000:
>>>> http://www.w3.org/2000/01/sw/DevelopmentProposal    (01)

>>>> ...
>>>> Finally, the Final Technical Report from September 2006:
>>>>       http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA458366
>>>> ...
>>>> No legacy systems, no relational databases, and no possibility
>>>> of interoperating with any data not represented in RDF.    (02)

>>> You are right in some ways, but there are nuances that need to be
>>> considered. Let's start with RDF, what is it actually?
>>> ...
>>> It's about the entity-attribute-value model enhanced with explicit
>>> semantics.    (03)

>> This is where the trouble comes in.  RDF is a triples model with some
>> semantics attached.  Many things are difficult to model with triples
>> (but not impossible).    (04)

> Yes, there's always an issue with context formalization and semantics.
> Basically, the 4th column in quad patterns is something that needs
> formalization (model theory and specs). Pat Hayes and others are
> tackling this matter, orthogonally, on the current RDF working group.    (05)

Quads allow a pointer to be attached to a triple.  That pointer can reference
an object that is split into context and provenance (or equivalently a
context which has specified provenance).  Quads do not allow for higher
arity relations -- except for ternary relations if context and provenance
are ignored.    (06)
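
As a rough illustration of the two readings of a quad (a Python sketch
only; the Quad type, the graph_info table and all ex: terms are made up
for this example and come from no RDF specification):

```python
from typing import NamedTuple

class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    fourth: str  # usually a pointer to a context/provenance record

# Hypothetical record that the fourth element points to.
graph_info = {
    "ex:g1": {"context": "ex:Year2012", "provenance": "ex:SurveyTeam"},
}

# Usual reading: a binary statement plus a pointer to its context.
q1 = Quad("ex:Alice", "ex:worksFor", "ex:Acme", "ex:g1")
print(graph_info[q1.fourth]["provenance"])

# Alternative reading: spend the fourth slot on a third argument.
# This gives a ternary relation but leaves no room for context.
q2 = Quad("ex:A", "ex:between", "ex:B", "ex:C")
print(q2.fourth in graph_info)
```

Either the fourth slot carries context, or it carries one more argument;
it cannot do both, and it never reaches arity four or higher.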

And yes, i know that a module for provenance has recently been patched
onto RDF.    (07)

...    (08)

>> The main problem that i find with RDF is its being wedded to triples.    (09)

> Before you get to that matter, you have the fact that it's still wedded
> to formats, so the model and its semantics are often lost.    (010)

There are multiple formats for RDF.  RDF is not wedded to any one format.    (011)

I'm discussing semantic issues, not syntax.  RDF being wedded to
entity-attribute-value is a semantic issue.    (012)

...
>>> Of late, I coined the phrase "R-D-F reflux". I use it to refer to folks
>>> who have a gut reaction to those letters due to the ill effects of the
>>> conflation and zealotry with which it is (unfortunately) now
>>> associated.    (013)

>> The fight over the best syntax for RDF does blur the issue of whether
>> the RDF model (triples) is really the best model to use.    (014)

> Triples are a powerful starting point, at the very least.    (015)

A steam engine is also a powerful starting point.  However, to get
the automotive industry off the ground, a newer paradigm was needed.    (016)

In this case, however, it is not a newer paradigm that is being proposed:
higher-level computer languages since the 1950s have used routines
which accepted more than two input values to return an output value.
LISP was using n-tuples for encoding logic statements in 1958.    (017)

>>> Let's try to move past the mistakes of the past to a future that is
>>> beneficial to all. Let's try to veer the conversation towards the
>>> entity-attribute-value model distinct from any RDF specificity since at
>>> best, it's just an optional route.    (018)

>> Mistakes of the past, in this case, are at two levels: the one you are
>> referring to -- forcing a messaging syntax on a public not involved in
>> the intricacies of messaging -- and the one you are not --
>> forcing semantic sentences on the web to be encoded
>> using an entity-attribute-value model.    (019)

> We start somewhere or we go nowhere.    (020)

So, start with an n-tuple.  It is not so restricted as a triple invented
decades later.    (021)

> The entity-attribute-value model has been with us forever.    (022)

Forever?    (023)

> It works. It doesn't solve everything in the easiest
> fashion, but it works, and it's widely understood and actually used
> (knowingly and unknowingly).    (024)

An n-tuple works, is widely understood, and is actually used
knowingly and unknowingly.  The only difference with a triple
is that many cases which are difficult to deal with using triples,
are easy to deal with using n-tuples.    (025)

>>> What ultimately matters is the ability to refer to entities (real world
>>> or otherwise) using URIs
>> Fine.
>>
>>> combined with the ability to craft their
>>> digital representations via Web documents
>> Fine.
>>
>>> where content format is varied
>> Fine.    (026)

>>> albeit constrained by the same fundamental data model.    (027)

>> Such a "same fundamental data model" for all data on
>> the Semantic Web -- which i don't see as a requirement in the early SW
>> documents -- should (imho) not use such a restrictive data model as
>> entity-attribute-value.    (028)

> The model can be tweaked or built upon. The critical item is to use
> a foundation that's already in use.    (029)

N-tuples are a foundation that's already in use.    (030)

> RDF went off the rails by not
> aggressively leveraging how it relates to EAV.    (031)

I think you are discussing syntax here, not semantics.    (032)

You earlier defined RDF as EAV plus semantics.    (033)

>> I would suggest that a "fundamental data model" should model the
>> following types of data:
>> * (terms for) types/classes of things
>>    + specification of subtype/subclass to another type/class
>>    + specification of disjointness with another type/class
>> * (terms for) instances of such types/classes
>> * (terms for) relations among represented things
>>    + specification of arity of relations
>>      - single arity (can be binary)
>>      - variable arity
>>    + restriction on argument fillers of relation
>>      - to being an instance of given type(s)/class(es)
>>      - to being a subclass of given type(s)/class(es)
>>      - restrictions based on the type/superclass of another
>>        argument filler
>>    + specification of properties of relations
>>      - transitive, symmetric, reflexive, functional in arg N
>>      - antitransitive, asymmetric, irreflexive, ...
>>    + specification of relations among relations
>>      - subrelation, inverse subrelation, transitive closure, ...
>> * (terms for) functions of represented things
>>    + specification of arity of function
>>    + restriction on argument fillers of function
>>    + specification of type/class of result of function applied
>>      to arguments
>>      - to being an instance of given type(s)/class(es)
>>      - to being a subclass of given type(s)/class(es)
>>      - to being an instance/subclass of one of its arguments
>>      - restriction based on the type/superclass of an argument filler
>> * provenance of data
>> * contexts for data
>>    + define contexts
>>    + specify semantics of a context using statements based on relations
>> * context of data
>>    + specification of the context in which data is valid    (034)

>> Such a data model would not require a fixed syntax.    (035)

> From my vantage point, if the RDF working group sort out Named Graphs
> and Inference Context, we are set. I am not seeing what isn't expressible
> in triples from what you outline above.    (036)

> At best, I see stuff that could be awkward to express in triples    (037)

E.g., the variable arity relations.  The accessibility of triples without
their context or provenance.    (038)
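
For the variable arity case, a minimal Python sketch of what "specification
of arity" could look like when statements are n-tuples.  The Relation class
and the between/collinear examples are hypothetical names chosen for
illustration, not taken from any existing system:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Relation:
    name: str
    min_arity: int
    max_arity: Optional[int]  # None means no upper bound (variable arity)

    def accepts(self, args: tuple) -> bool:
        """True if the number of arguments fits this relation's arity."""
        if len(args) < self.min_arity:
            return False
        return self.max_arity is None or len(args) <= self.max_arity

between = Relation("between", 3, 3)         # fixed arity
collinear = Relation("collinear", 3, None)  # variable arity: 3 or more points

print(between.accepts(("A", "B", "C")))               # True
print(collinear.accepts(("A", "B", "C", "D", "E")))   # True
print(between.accepts(("A", "B")))                    # False
```

With triples, the same check would have to be made over a set of separate
statements that jointly encode one assertion.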

But why add a restriction (to triples) that complicates encoding and
makes things hard to express?    (039)

I admit that anything can be expressed in triples.  It can also be expressed
using a Turing Machine.  In both cases there are efficiency issues.    (040)

> or be awkward to grok at first blush etc..    (041)

>> Under the covers (i.e., at what is now the XML level), messaging
>> syntaxes should allow variable arity relations (as XML currently does).
>> If someone wants to code a messaging syntax that requires triples
>> for higher arity relations (and takes 20 times as much bandwidth),
>> such a syntax should not be banned.    (042)

> I don't see an XML or any other format level :-)    (043)

XML is the bottom layer in the SW layer cake.    (044)

Doesn't the SW require a common format for data transmission?   A common
format is what gave us the WWW.    (045)

Sure, we could have multiple formats, but transmitted semantic data not
in a defined format could not be successfully received.    (046)

>>> With the
>>> aforementioned in place, we can then appreciate the virtues of denoting
>>> real world entities with URIs that resolve to descriptor documents (or
>>> data objects) via indirection.
>> I note that this benefit does not require a model restriction to
>> triples.    (047)

> See triples as a base. Applying functions to triples etc.. still means
> you are working from a base.    (048)

I see n-tuples as a base.  The restriction of n to 3 only complicates
matters.    (049)

>>> TimBL veered back to the Linked Data meme because he clearly realized
>>> that RDF and the Semantic Web Project were veering off course.
>>> Unfortunately, once the Linked Data meme took off, the gut reaction of
>>> the RDF & Semantic Web crowd was to once again make a power-play
>>> by conflating both things. Yes! They decided that it was beneficial to
>>> conflate RDF and Linked Data,    (050)

>> I agree that the Linked Data meme does not require the RDF model
>> (triples).    (051)

> I believe Data denotes Subject Observation.    (052)

> I believe all observations are comprised of:    (053)

> 1. a subject
> 2. subject attributes
> 3. subject attribute values.    (054)

I note that 3.) is plural.    (055)

One common type of observation is that A is between B and C.
How would you express this with a single triple?    8)#    (056)
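
To make the question concrete, here is a small Python sketch.  The term
names (ex:between, ex:arg1 and so on) are invented for the example, and the
reification shown is only one common way of forcing an n-ary statement into
triples:

```python
from itertools import count

_ids = count(1)

def reify(relation, *args):
    """Encode an n-ary statement as triples about a fresh node."""
    node = f"_:stmt{next(_ids)}"
    triples = [(node, "rdf:type", relation)]
    for i, arg in enumerate(args, start=1):
        triples.append((node, f"ex:arg{i}", arg))
    return triples

# One n-tuple: the relation followed by its three arguments.
as_tuple = ("ex:between", "ex:A", "ex:B", "ex:C")

# The same observation forced into triples.
as_triples = reify("ex:between", "ex:A", "ex:B", "ex:C")
for t in as_triples:
    print(t)
print(len(as_triples), "triples for 1 statement")
```

Under this encoding the one observation becomes four triples plus an
invented node, and none of the four is meaningful on its own.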


> Then to the above one has to deal with the mercurial matter of context.
> In some cases #4 in a quad pattern. Then if we get really adventurous
> there's temporality. For instance, all of these observations and their
> contexts occurred in a specific time-frame.    (057)

What is the problem with this?   Databases commonly have sets of rows
that fit together.    (058)

>>> even though history provides ample
>>> evidence for the folly inherent in such thinking and execution
>>> strategy.    (059)

>>>> ...
>>>> It's time to rethink the goals of the SW and redefine the layer cake
>>>> to incorporate a broader vision that is closer to original (plus the
>>>> new ideas that have been introduced since 2000).    (060)

>>> Linked Data is trying to achieve this goal. It isn't RDF syntax or
>>> serialization specific. It's basically the entity-attribute-value model    (061)

>> Which makes it RDF-model specific.  It is this generic syntax
>> restriction that is the basic problem that John is describing.    (062)

> We have to start somewhere.    (063)

That somewhere is n-tuples.  Restricting n to 3 causes lots of problems.    (064)

> From my study of John's concerns, enemy #1 was putting syntax
> ahead of semantics by placing semantics atop syntax.
> With that visual and mental model in place, everything fell apart.    (065)

John?    (066)

>> Accepting a requirement for the entity-attribute-value model
>> is not agreeing with John's points.    (067)

> I beg to differ as I see the issue of context as mercurial and not the
> basis for dithering over the critical first step i.e., have something
> that's built upon etc..    (068)

We agree that we should start with something agreed upon to build on.
I don't see it as dithering to avoid problematic restrictions on that first
step.    (069)

>>> enhanced with URIs and explicit semantics re. subject-predicate-object
>>> or entity-attribute-value.  When all is said and done we have:
>>> 1. entity-attribute-value + classes and relationships model --
>>> relationship semantics are implicit    (070)

>> It is broken at this point.  What (imho) is needed is:    (071)

>> 0. classes - class instances - predicates - functions model (as
>> specified above)    (072)

> IMHO, it's not broken; it's suboptimal in certain circumstances. Those
> circumstances don't carry enough weight to invalidate what's been
> achieved. What, we take down the LOD cloud, and rebuild it with what?    (073)

Removing the restriction on n=3 for tuples does not invalidate tuples
in which n happens to equal 3.    (074)

TimBL's initial position (which John Sowa refers to) is that multiple formats
should be able to be handled on the SW.  Thus archaic formats such as
RDF and OWL should still be able to be handled in a modern SW.  This
means that nothing would be required to be taken down.    (075)

Of course, programs could translate legacy RDF & OWL into more modern
languages without any semantic loss.    (076)

>>> 2. ditto + URIs and explicit relationship semantics -- pitched as the
>>> RDF model
>>> 3. ditto + RDF Schema + OWL - additional relationship semantics.
>>> As I am sure you know, the RDF zealotry can be so chronic that some
>>> even believe RDF and Semantics are one and the same thing :-(    (077)

>> I agree.  I note that you said above:    (078)

>>>      what is RDF actually? ... It's about the entity-attribute-value
>>>      model enhanced with explicit semantics.    (079)

>> Zealotry for RDF as you (correctly) define it is something we have to
>> move past.  Expressing semantics using this tight RDF
>> constraint is problematic.    (080)

> I would say it's imperfect in certain situations. As I am sure John will
> attest, everything is ultimately imperfect in some situation.    (081)

But why go for a problematic solution when a less problematic
one (removing the constraint for triples) is easily available?    (082)

> In the context above, 'It' refers to the entity-attribute-value model
> combined with de-referencable URIs as a mechanism for structured data
> representation in a form that's webby or web oriented.    (083)

The problematic part is the EAV-only model -- not its combination with
de-referencable URIs.    (084)

>> I'm afraid the zealotry for triples comes from the layer cake.  TimBL's
>> signature accomplishment was the WWW based on a common messaging
>> syntax (after all, devices that don't have a common syntax can not
>> communicate).  I note that this syntax intentionally did not restrict
>> the syntax of the files that could be transmitted over the WWW --
>> access to masses of existing files was required.    (085)

> I wouldn't blame TimBL for the zealotry. I put that on the W3C's table.    (086)

Who put RDF on top of XML in the layer cake?    (087)

> Got to separate TimBL and the W3C, they aren't one albeit associated in
> certain contexts.    (088)

Agreed.  TimBL certainly did not initially propose the SW as being
based on triples.    (089)

>> Fixing a common syntax below the Semantic Web should (imho) be similar;
>> it should provide a syntax for accessing data, but should not limit the
>> semantics of the files being transmitted.    (090)

> Sorta.    (091)

>> An XML messaging format does just this.  Adding an RDF layer above the
>> XML layer is highly restrictive, however.  XML does not require a triple
>> model.  Neither do FOL or HOL languages which can be used to encode
>> semantics that one might expect the SW to be able to transmit.  Placing
>> a restrictive funnel on an intermediate level
>> of the layer cake seems to me to be counterproductive.    (092)

>> One can certainly argue with the XML base to the layer cake.    (093)

> XML should never have been in the diagram. Period. The base of the whole
> thing should have been URIs.    (094)

>> Perhaps
>> such a verbose syntax is only required for specifying and enveloping
>> data sets.  A more compact syntax might be useful for encoding the
>> semantic statements.  But this issue with XML is about compactness
>> and readability.  The issue with RDF is semantic.    (095)

> Yes, but as you can see, we have to untangle the mess first,    (096)

To me, it is clear where the tangle is: the restriction to triples.    (097)

> then tweak/evolve etc..    (098)

I see no problem creating a model that legacy RDF can be mapped
into.  I'd suggest a LISP-ish notation (a la KIF and Cyc) with terms
that can be mapped to URIs through a namespace convention.    (099)
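
A minimal sketch of that mapping in Python, under stated assumptions: the
ex: namespace URI is hypothetical, literals are not handled, and the output
only imitates the general shape of KIF/Cyc notation rather than following
either one exactly:

```python
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://example.org/terms#",  # hypothetical namespace
}

def abbreviate(uri):
    """Replace a known namespace URI with its prefix, e.g. ex:Alice."""
    for prefix, base in NAMESPACES.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    return f"<{uri}>"  # unknown namespace: keep the full URI

def triple_to_sexpr(subject, predicate, obj):
    """Map a legacy (subject, predicate, object) triple to (pred subj obj)."""
    return "(" + " ".join(abbreviate(t) for t in (predicate, subject, obj)) + ")"

def statement_to_sexpr(relation, *args):
    """The same notation handles a relation of any arity."""
    return "(" + " ".join(abbreviate(t) for t in (relation, *args)) + ")"

EX = NAMESPACES["ex"]
print(triple_to_sexpr(EX + "Alice", NAMESPACES["rdf"] + "type", EX + "Person"))
print(statement_to_sexpr(EX + "between", EX + "A", EX + "B", EX + "C"))
```

A legacy triple simply becomes a statement whose relation takes two
arguments, so nothing already expressed in triples is lost.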

A knowledge base would, like OWL, define a set of used ontologies
with namespace abbreviations.  The knowledge base would represent
a context and would specify properties of that context: temporal,
provenance, legal, geospatial, corporate, etc.  Statements made in
the knowledge base would be true in that context and could have
additional provenance and other metadata attached.    (0100)

Accessing data from such a knowledge base would yield statements
wrapped in a "that" (isTrueInContext C (that S)).  Systems that use
S as true could ignore C at their own risk or could accept, reject, or
modify S depending upon the properties of C.    (0101)
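
One way a consuming system might treat such wrapped statements, sketched in
Python.  The Context fields, the trust rule and the year check are all
assumptions made for the example; a real system would choose its own
properties of C to inspect:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    name: str
    provenance: str
    valid_from: int  # year, inclusive
    valid_to: int    # year, inclusive

def is_true_in_context(context, statement):
    """Build the wrapper (isTrueInContext C (that S)) as nested tuples."""
    return ("isTrueInContext", context, ("that", statement))

def accept(wrapped, trusted_sources, year):
    """Return S if the consumer is willing to use it given C, else None."""
    _, context, (_, statement) = wrapped
    if context.provenance not in trusted_sources:
        return None
    if not (context.valid_from <= year <= context.valid_to):
        return None
    return statement

kb = Context("ex:Survey2012", "ex:SurveyTeam", 2012, 2012)
w = is_true_in_context(kb, ("ex:between", "ex:A", "ex:B", "ex:C"))

print(accept(w, {"ex:SurveyTeam"}, 2012))   # the statement S
print(accept(w, {"ex:SomeoneElse"}, 2012))  # None: provenance not trusted
```

A system that ignores C would just unwrap S unconditionally, which is the
"at their own risk" case.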

> My biggest problem is that circa 2012 we still haven't recovered from
> 2004, re. RDF :-(    (0102)

Agreed.  "RDF ... [is] about the entity-attribute-value model enhanced
with explicit semantics." -- KI    (0103)

We need to get beyond that.    (0104)

-- doug foxvog    (0105)

> Kingsley
>>
>> -- doug
>>
>>>> ...
>>>> John
>>> --
>>>
>>> Regards,
>>>
>>> Kingsley Idehen
>>> Founder & CEO
>>> OpenLink Software
>>> Company Web: http://www.openlinksw.com
>>> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>>> Twitter/Identi.ca handle: @kidehen
>>> Google+ Profile: https://plus.google.com/112399767740508618350/about
>>> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>>
>> _________________________________________________________________
>> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
>> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
>> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
>> Shared Files: http://ontolog.cim3.net/file/
>> Community Wiki: http://ontolog.cim3.net/wiki/
>> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J



