
[ontology-summit] [Ontology Application Framework] Integration

To: "'Ontology Summit 2011 discussion'" <ontology-summit@xxxxxxxxxxxxxxxx>, "'Michael Gruninger'" <gruninger@xxxxxxxxxxxxxxx>, "'Michael F Uschold'" <uschold@xxxxxxxxx>, "'Nicola Guarino'" <guarino@xxxxxxxxxx>
From: "Matthew West" <dr.matthew.west@xxxxxxxxx>
Date: Fri, 1 Apr 2011 08:41:33 +0100
Message-id: <4d9581ad.cc0ce30a.0ac1.6107@xxxxxxxxxxxxx>

Dear Michael G, Michael U and Nicola,

 

I was reading this section on integration during the call last night, and it did not ring true to me, but I was struggling to put my finger on what I thought the problem was:

In this class of applications, the primary functionality is the matching and mapping of concepts, while the primary architecture is within a set of multiple systems. Ontologies are typically used at runtime by application developers (who are in the best position to write translators among the systems).

  • information integration Multiple information resources are combined using ontologies at runtime to match concepts with similar meaning.

Examples: web service composition, mashups, information aggregation, data fusion, linked data

  • database integration Queries that require multiple databases are specified using common ontologies and data schema are matched using these ontologies at runtime.

After a troubled night’s sleep, I think I have a better idea.

 

In industry, when people talk about integration, they mean what it takes to turn a bunch of separate information systems into a single coherent system of systems. There are typically four types of integration, each with different characteristics:

 

1.       Core Process Integration: An enterprise will have at least one and maybe several core processes. These are the processes whose execution is the purpose of the enterprise. In business they are often around the provision of goods or services to customers. These core processes are supported by information systems, and information needs to pass between them in order to perform the overall process efficiently, integrating the disparate systems involved. A key part of this will be Master Data Management, so that all the systems use the same identification for key things like products and customers.

2.       Lifecycle Integration: There will be a number of things needed to support the core processes, such as factories and vehicles, each of which has a lifecycle that needs to be integrated. Particularly with large and long-lived assets like factories and ships, information needs to be passed between lifecycle stages (e.g. design to construction, design and construction to operations, etc.) and between the information systems that support the different processes in those different stages.

3.       Supply Chain Integration: When enterprises procure goods in particular, but also services, they do not just capture price information. Part of the product will be information about it and its use. This information needs to be integrated within the systems that are involved with the operation and use of those products.

4.       Business Intelligence (Performance Management): To understand how well an enterprise is performing its processes, you need to be able to analyse the data that arises from the performance of those processes. This usually involves collecting and integrating the data from multiple systems, and populating star or snowflake schemas so that, for example, sales can be analysed by customer type, by product type, by geographic region, by volume, by value, by profitability, etc.

 

I’m afraid I did not get any sense of this from the description above.
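
As a rough illustration of the fourth type, the star schema idea can be sketched with Python's built-in sqlite3 module. This is only a minimal sketch under stated assumptions: the table names (fact_sales, dim_customer, dim_product), the columns and the sample rows are all hypothetical, and SQLite simply stands in for whatever warehouse an enterprise would really use.

```python
import sqlite3

# In-memory database; all table names and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_type TEXT);
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, product_type  TEXT);
-- Fact table: one row per sale, keyed to the dimension tables.
CREATE TABLE fact_sales (
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    product_id  INTEGER REFERENCES dim_product(product_id),
    region      TEXT,
    value       REAL
);
INSERT INTO dim_customer VALUES (1, 'retail'), (2, 'wholesale');
INSERT INTO dim_product  VALUES (10, 'fuel'), (11, 'lubricant');
INSERT INTO fact_sales VALUES
    (1, 10, 'UK', 100.0),
    (1, 11, 'UK',  40.0),
    (2, 10, 'NL', 900.0),
    (2, 10, 'UK', 500.0);
""")

ALLOWED_DIMENSIONS = {"customer_type", "product_type", "region"}


def sales_by(dimension):
    """Return total sales value grouped by one of the allowed dimensions."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    # The column name comes from the fixed whitelist above, so it is
    # safe to interpolate it into the SQL text.
    rows = conn.execute(f"""
        SELECT {dimension}, SUM(f.value)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        GROUP BY {dimension}
        ORDER BY {dimension}
    """).fetchall()
    return dict(rows)


by_customer_type = sales_by("customer_type")
by_product_type = sales_by("product_type")
by_region = sales_by("region")
```

The same fact table answers each question just by changing the grouping column, which is the point of arranging the integrated data around dimensions.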

 

The other thing that was implicit was that data resides in one system and is queried from there (single source of truth). This is actually extremely unusual. Much more common is managed replication, for reasons of both performance and resilience. Take the business intelligence type of integration above. The source data is typically in one or more Transaction Processing (TP) systems, where there are no indices, because building the indices when you create records is what hits performance. So to perform the queries, not only do you need to do joins across the transaction data and the master data, but most queries will require a table scan. This of course will affect not only the performance of the query, but that of the TP system as well. So a copy of the TP data is made at regular intervals and combined with the Master Data into massively denormalized tables that have indices on most columns. Performance is transformed.
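
The replication pattern described here can be sketched in a few lines of Python with sqlite3. Everything named below (tp_sales, md_product, md_customer, sales_report, refresh_reporting_copy, query_plan) is a hypothetical stand-in invented for illustration, and in a real installation the reporting copy would live in a separate database rather than alongside the TP tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Transaction processing table: deliberately unindexed so inserts stay cheap.
CREATE TABLE tp_sales (product_id INTEGER, customer_id INTEGER, amount REAL);
-- Master data shared by the systems involved.
CREATE TABLE md_product  (product_id  INTEGER PRIMARY KEY, product_type TEXT);
CREATE TABLE md_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
INSERT INTO md_product  VALUES (10, 'fuel'), (11, 'lubricant');
INSERT INTO md_customer VALUES (1, 'UK'), (2, 'NL');
INSERT INTO tp_sales VALUES (10, 1, 100.0), (11, 1, 40.0), (10, 2, 900.0);
""")


def refresh_reporting_copy(conn):
    """Rebuild the denormalized reporting table from TP data and master data.

    Intended to be run at regular intervals, so analytical queries never
    touch the TP tables directly.
    """
    conn.executescript("""
    DROP TABLE IF EXISTS sales_report;
    CREATE TABLE sales_report AS
        SELECT p.product_type, c.region, s.amount
        FROM tp_sales AS s
        JOIN md_product  AS p ON p.product_id  = s.product_id
        JOIN md_customer AS c ON c.customer_id = s.customer_id;
    -- Index the columns that reports filter on.
    CREATE INDEX idx_report_product_type ON sales_report(product_type);
    CREATE INDEX idx_report_region ON sales_report(region);
    """)


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # the fourth column is the detail text


refresh_reporting_copy(conn)

# Filtering the unindexed TP table forces a full scan of it.
tp_plan = query_plan(
    conn, "SELECT SUM(amount) FROM tp_sales WHERE customer_id = ?", (1,))

# The same kind of question against the reporting copy can use an index.
report_plan = query_plan(
    conn, "SELECT SUM(amount) FROM sales_report WHERE region = ?", ("UK",))

uk_total = conn.execute(
    "SELECT SUM(amount) FROM sales_report WHERE region = ?", ("UK",)
).fetchone()[0]
```

Comparing tp_plan with report_plan shows the trade-off in miniature: the TP table pays nothing for indices on insert but must be scanned, while the periodically rebuilt copy carries the indices and answers the query without touching the TP table.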

 

There are in fact a variety of architectures possible for achieving data integration; which one is suitable will depend on the circumstances. This paper of mine from 1998 outlines some of the possible approaches.

 

http://www.matthew-west.org.uk/documents/TheRoleOfEnterpriseModelsInIntegration.PDF

 

Regards

 

Matthew West                           

Information Junction

Tel: +44 1489 880185

Mobile: +44 750 3385279

Skype: dr.matthew.west

matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx

http://www.informationjunction.co.uk/

http://www.matthew-west.org.uk/

 

This email originates from Information Junction Ltd. Registered in England and Wales No. 6632177.

Registered office: 2 Brookside, Meadow Way, Letchworth Garden City, Hertfordshire, SG6 3JE.

 

 

