Re: [ontolog-forum] Solving the information federation problem

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Peter Yim <peter.yim@xxxxxxxx>
Date: Thu, 27 Oct 2011 19:30:50 -0700
Message-id: <CAGdcwD1RZN4jbVDusECi9Zs943+-xeshgA3E-X3R4d2OOqpeNA@xxxxxxxxxxxxxx>
David,

> [DP]  I .. am ..  using semantic mediation languages that may become
> W3C standards ... semantics-driven IDIOM framework might be a more
> reasonable place to start.

[ppy]  Ah! That's where you are coming from ... sounds great! Maybe
when you resurface from your big information federation project,
David, it would be great if you can give us a talk about that
semantics-driven IDIOM framework (wherever you plan to get the
standard developed at ... Ontolog is not an SDO anyway.)

All the best to your project!

Regards. =ppy
--

On Thu, Oct 27, 2011 at 7:10 PM, David Price <dprice@xxxxxxxxxxxxxxx> wrote:
> Hi Ed,
> As usual ... Wow! I may try to dissect this at some point, but am in the
> middle of a big information federation project with a reference ontology
> using semantic mediation languages that may become W3C standards ... so
> may not have time:-)
> My one sentence summary of your email 'It's too early to make standards
> for information federation as a whole, and there are already enough
> standards in place for the components' ... about right?  I mostly agree
> with that. However, the question I was addressing  (which may not be the
> question in which you are interested, but seems of interest to Cory) was
> 'If in the next 5 years there need to be standards in support of
> information federation, where should they be placed?' and nothing in
> this reply does anything but strengthen the view that the W3C is the
> most sensible answer. You may not approve of all the W3C WG activities
> as they don't fit into the NIST semantic mediation architecture view,
> but these activities are pushing in the right direction and are doing
> far more to add to the components of a solution than other standards
> activities of which I'm aware.
> FWIW ... I'm not even convinced a semantic mediation tools suite is
> what's actually required. In order to make progress, making things
> clearly specified for humans (but computer navigable) independent of any
> specific tool suite could be a big step in the right direction, so
> something more like the loosely-coupled, but semantics-driven IDIOM
> framework might be a more reasonable place to start.
> Anyway, back to my real job...
> Cheers,
> David
> On 10/28/2011 1:10 AM, Ed Barkmeyer wrote:
>> David Price wrote:
>>> There are of course things that organizations can do to start improving
>>> the situation, but they have little to do with Ontolog-typical concerns
>>> and so I doubt that the Ontolog Forum is the place to 'get on with' this
>>> problem.
>>> I think it's pretty clear now that the OMG cannot do it either - as has
>>> been proven by the lack of progress on SIMF despite a valiant effort on
>>> your part. FWIW it's very hard to push through the OMG 'everything is a
>>> meta-model' and 'vested interests' barriers. Luckily, it seems to me
>>> that a new language is actually pretty far down the list of important
>>> mechanisms/approaches wrt information federation anyway.
>> Well, I can agree to some extent.  The problem that OMG has in this
>> regard is that Cory is pushing for a *standard* that supports 'semantic
>> integration tools', and he can't name one.  I pointed out then that, in
>> spite of 2 EU FP6 projects and millions of euros invested in this, the
>> result was only weak academic tooling, and the three collections of
>> tools I saw chose different organizations and different integrating
>> mechanisms.  The OMG Telecomm group put out an RFI for the current state
>> of the art in semantic integration tools and got only one response, from
>> Cory's AESIG.  NIST itself is now on its 4th project in trying to define
>> a feasible toolset for some known mediation problems.  Part of the
>> difficulty is in agreeing on what the modules should be and do, and part
>> of the difficulty is agreeing on an adequate form for the integrating
>> model.
>> But the main problem is simply that it is easier to build a one-off
>> mapping of your business data from representation1 to representation2
>> using XSLT or Java, than to learn to use, and use, the tools to create
>> the ontology for your business data and the tools to map the XML schemas
>> to the ontology and the tools to perform the runtime transformations.
>> You have to see a broader, longer-term value to the reference ontology
>> to realize any value at all from the extra work.
>> And the reference ontology has to be able to capture the rules of usage
>> that you will write into your XSLT script.  OWL can't.  RDF can, if the
>> tool provider invents enough special vocabulary, but what modeling tool
>> will you use to create the RDF ontology?  UML with stereotypes and OCLv2
>> can, but it isn't any easier to write OCL than Java.  So there is a
>> serious practical barrier to getting /useable/ and /cost-effective/
>> semantic mediation tooling.  And that is why there are not lots of
>> commercial tools.
>> Yes, there is enormous value to be realized, IF you can figure out how
>> to create it.  We at NIST justify our work in this area as 'research',
>> because we have not yet seen a tool set that is even effective, without
>> getting into useable or cost-effective.  And OMG has been given to
>> understand that the IBM evaluation of the situation is similar.  So I
>> applaud Cory's idea that this could be an interesting topic for the
>> Ontology Summit, if nothing more than to get a clearer handle on the
>> state of the art in semantic mediation in 2012.  The state of the
>> practice is nearly non-existent, which is why a standards project is of
>> doubtful value.
>>> Cory, this problem belongs in the W3C.  I suggested that to you
>>> previously, and the events of the past year have made that fact even
>>> more clear in my mind - the solution has to be based in Web and Internet
>>> standards and technologies.
>> That is certainly true, but all of OMG, W3C, OASIS and other bodies are
>> working on solutions to various problems based on XML and XML Schema and
>> WSDL/SOAP, and all their dialects and add-ons, which is the meaning of
>> 'Web and Internet standards'.  Then we come to who is actually working
>> on solutions using OWL and RDF, and suddenly we have a much smaller and
>> more scattered contingent, but there are active committees in all of
>> those, and all in various states of disorganization.
>> I don't see that W3C is a better choice.  The W3C RIF project, for
>> example, had the problem of having to work with OWL and having to work
>> with SPARQL, because those were the W3C invested technologies, even
>> though none of the non-academic rules engines, and at most half the
>> academic ones, had anything to do with either one.  (David's employer
>> falls into the non-academic category; TopQuadrant support for OWL was an
>> afterthought.)  In short, going to W3C just begets a different set of
>> politics and prejudices.
>> The problem is not what technologies to use, or where to do the
>> standards work.  The problem is to have a community that has semantic
>> mediation tooling and is interested in getting a standard to enable some
>> tools to work together.  All of the tool sets I have seen perform the
>> entire mediation function.  They need to be able to read XML schemas,
>> and ASN.1 schemas (in HL7), and EDI schemas (in many business
>> applications), and EXPRESS schemas (in manufacturing and construction),
>> and read and write the corresponding standard message forms.  They need
>> to have an internal representation for the integrating model (aka
>> reference ontology), and they probably rely on some off-the-shelf
>> modeling tools to provide the input from which that model is created.
>> It may be advantageous to convert UML to OWL or vice versa, and they
>> probably need to add UML stereotypes or something the like to mark up
>> the incoming model to meet their internal needs for the content of the
>> reference ontology.  In addition, they need a runtime capability that is
>> based on a central engine with interface and schema plugins on the input
>> side and the output side, and the semantic maps and reference ontology
>> as inputs.
>> Now given that you are building a semantic mediation tool suite, you
>> have a list of tool components (which the last draft of the SIMF RFP was
>> still not clear on):  reference ontology creation tool, semantic mapping
>> creation tool, general runtime conversion engine, semantic mapping tool
>> plugins for XML schema, EDI, ASN.1, EXPRESS (according to your target
>> market), runtime plugins for the schemas and the corresponding data
>> encodings for input and output, and runtime plugins for WSDL/SOAP and
>> ebMS, and probably other protocols (again depending on target market).
>> If you build all the tool components as part of your suite, the only
>> standards you need are the existing standards for the schemas and the
>> data forms.
>> There are already standards for all schemas and encodings, and there are
>> probably open source libraries for reading both and writing encodings.
>> Unless you want to standardize the Java APIs for that, there is no
>> opportunity for standards there.
>> Similarly, you probably want the reference ontology creation tool to be
>> some off-the-shelf product of a vendor that does that kind of thing
>> well, and spits out some standard form, like UML XMI or OWL/RDF or RDF
>> or CLIF (if John Sowa has convinced anyone).  Alternatively, you could
>> probably use one of these do-it-yourself graphical DSL tools to make
>> your own tool, and then use your own internal reference ontology format
>> as the direct output of your tool.  In either case, however, you don't
>> need a standard, unless you need a new language.
>> Finally, you will need a tool that can take an exchange schema in its
>> left hand, and a reference ontology in its right hand, and enable the
>> domain expert to define the links between the model elements, path to
>> path.  This is the critical Semantic Mediation Rules Tool.  And you need
>> to define two sets of links -- one is an interpretation rule: data to
>> concept; the other is an encoding rule: concept to data.  They are not
>> always symmetric, because the starting points are usually different.
>> The Semantic Mediation Rules Tool needs to record and export the mapping
>> rules it generates, because those rulesets are the critical input to the
>> runtime engine -- the Mediator.  If you expect that one organization
>> will build a Semantic Mediation Rules Tool that can be used by someone
>> else's Mediator, you need a standard for the representation of semantic
>> mediation rules.  If not, then not.  Does any commercial or academic
>> project not envisage building both the Rules Tool and the Mediator as
>> part of its toolkit?  None that I know of.  Why would you?  Is there any
>> reason to create a standard for communication between my Rules Tool and
>> my Mediator?  Not only is it my design choice, it is my IP, and I can
>> improve my capabilities by improving the capabilities of that interface
>> whenever I discover a new and exciting feature that I can add.  And I
>> might find it useful to patent my design.  The last thing I want is a
>> standard.
>> In summary, there is the issue of defining a standard architecture, but
>> we would have to do that before trying to standardize any of the
>> interfaces.  It strikes me that a useful output of the OMG AESIG would
>> be the whitepaper that clearly defines the semantic mediation
>> architecture and assesses the opportunities for standardization, rather
>> than an RFP for several not clearly necessary standards.
>> I see only three areas for interface standardization:
>>   - the form of the reference ontology that is input and presented at the
>> interface between the human knowledge engineer and the reference
>> ontology capturing tool.  It is probably a combined graphical and text
>> form, a la UML+OCL, or OWL+RDF.
>>   - the form of the reference ontology that is exported by the capturing
>> tool for use by other tools, including but not limited to the Semantic
>> Mediation Rules tool and the Mediator.  It is probably an RDF dialect.
>> What all is captured here, or can be captured here, has some impact on
>> the capabilities and possible behaviors of the Semantic Mediation Rules
>> tool.  So this interface may be an important part of the tool-builder
>> IP.  If there were enough experience to know what all might be useful to
>> express, you could get agreement on a standard, even though most tools
>> would only be able to use some of it.  Most importantly, however, a
>> standard in this area that is not just a UML profile, or something the
>> like, would require the toolsmith to build some kind of back-end for the
>> off-the-shelf UML or OWL tool that is the primary ontology input tool.
>> And I would expect that many semantic mediation toolkits might just
>> assume that a UML or OWL tool can be used and will generate the standard
>> XMI or RDF formats.
>>   - the form of mediation rules that is input and presented at the
>> interface between the knowledge engineer and the Semantic Mediation
>> Rules tool.  This is an area that is by no means ripe for
>> standardization, because the workings of this tool are very different in
>> various designs.  Part of the rules generation process can be automated,
>> and part of it requires human input, and how much is which, and how the
>> automation is enabled, and what algorithms it uses, and how complex the
>> executable rules for the Mediator can be, are all design decisions.
>> This is a primary area of tool-builder IP.
>> So, IMO, the big question is what the form of the reference ontology
>> is.  Do we need a new language for creating them?  Do we need a set of
>> RDF additions to OWL, or a UML Profile for Reference Ontologies?  If we
>> don't need a new language at all, then we already have all the standards
>> we need, and we need to get some experience with commercial tools.
>> If we need a new language, then we also need to standardize its export
>> form.  A UML profile can be processed by off-the-shelf UML tools and the
>> models can be exported in XMI.  Similarly, an RDF add-on to OWL might be
>> supported by an extension to an existing OWL tool and exported as
>> described in OWL/full.  (Clark/Parsia are already doing this kind of
>> thing with Pellet.)  CLIF may be a desirable export form for some
>> mediation tools, but it is a highly undesirable input form for knowledge
>> engineers working with domain experts.  Domain experts can glean most of
>> the content of UML and graphical OWL models with a little experience,
>> but CLIF is about as intelligible as OWL/RDF or XMI or Old Church
>> Slavonic.  A wholly new language requires a new set of tools and
>> standards for both ends; a CLIF tool requires a new input form.  (One of
>> the failures of OMG SBVR is that it exemplifies a possibly viable input
>> form for rules and definitions that it does not standardize, and then
>> standardizes an output form that merely competes with OCL and CLIF/IKL
>> -- a new kind of train on existing tracks with no doors for the passengers.)
>> And at this time in history, I think the standardization of input to the
>> Mediation Rules generator would be a mistake.  There is no agreement on
>> how to generate such rules, or even what capabilities of the Mediator
>> they must drive.  So, let us by all means have conferences and
>> whitepapers on the subject, but please not as standards development
>> projects.
>>> The Government Linked Open Data WG and the
>>> RDB2RDF WG are examples of practical things happening in the W3C that
>>> will hopefully make some real progress possible. More of that kind of
>>> thing, perhaps more focused at this particular problem, seems like the
>>> only practical way forward to me.
>> Linked Open Data is the latest in a long line of webheaded information
>> integration technologies, which is in no way related to semantic
>> mediation, as far as I can tell.  RDB2RDF is a knowledge-free technical
>> transformation of SQL relational database schemas to RDF Schema + SQL
>> RDF dialect.  The object seems to be to allow the implementors of
>> triple-store databases to use real industrial information that is stored
>> in relational data management systems in a predictable way.  It is
>> almost the antithesis of semantic mediation, in which the objective is
>> to relate the database-engineered SQL schema to a knowledge-engineered
>> domain ontology.  But it is the case that some mediation tools take
>> exactly the RDB2RDF approach to the semantic mapping process, and similar
>> projects use XML Schema as the basis.  And let us not forget that Cory
>> is working the OMG MOF2RDF standard to make standard RDF export forms
>> for UML models and BPMN models, etc., as RDF Schema + MOF RDF dialect.
>> This is exactly why W3C is not a better place.  I don't think we want
>> semantic integration standards to be strongly influenced by RDB2RDF or
>> Linked Open Data, any more than we want them to be influenced by MOF and
>> SBVR and UML.
>> I suggest that we can make better progress by getting a whitepaper out
>> there that identifies the architecture, standardizes a component and
>> interface nomenclature, discusses the state of the art in mediation
>> technology and the opportunities for standardization.  And I strongly
>> agree that the Ontology Summit could contribute to the 'state of the art
>> in mediation technology' part, which is critical to the assessment of
>> opportunities for standardization.  The AESIG has been so busy trying to
>> generate acceptable RFPs that it has lost sight of its primary value as
>> an Architecture Board SIG -- to provide education on the technology and
>> guidance on the development of a program of work in this area.
>> -Ed
>> P.S.  In spite of NIST's strong interest in semantic mediation, we
>> (primarily I, no surprise there) have been a thorn in Cory's side since
>> the beginning of the SIMF RFP effort.  But I believe the suggestion for
>> a workshop topic for the Ontology Summit is a much more valuable step, on
>> the way to the whitepaper that would form the basis for any kind of
>> standardization plan, and by-the-by serve as a reference terminology for
>> the emerging papers on the subject.  Part of the reason why the EU had 3
>> different INTEROP projects doing semantic mediation (all differently) is
>> that none of them used the same terms to describe what they were doing.
> --
> Managing Director and Consultant
> TopQuadrant Limited. Registered in England No. 05614307
> UK +44 7788 561308
> US +1 336-283-0606
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J