ontology-summit

Re: [ontology-summit] [Requirements] FW: promised material on expressing

To: Ontology Summit 2008 <ontology-summit@xxxxxxxxxxxxxxxx>, "Ontology Summit 2008" <ontology-summit@xxxxxxxxxxxxxxxx>, doug@xxxxxxx
From: Mala Mehrotra <mm@xxxxxxxxxxxxxxx>
Date: Thu, 03 Apr 2008 23:36:03 -0700
Message-id: <E1JhfXV-0007zl-7f@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
This is really great, Doug. Thanks so much for sharing this wealth. 
Even though I had analyzed quite a few microtheories in the RKF days, 
I hadn't really come across the mappings aspect of Cyc then.

I was reassured to see that quite a few of the categories of mappings I 
alluded to in my talk today appear in Cyc as well. I found the detailed 
treatment of database mappings especially interesting, as Pat and I 
have taken a very similar approach in CL.

However, I would be interested in learning whether one could express, 
in CycL, partial mappings across database columns (from two different 
databases) that have a hidden existential, such as the last example 
on my slide deck. I have a feeling one could, but it would get quite 
hairy in CycL.

                 Mala



At 04:37 PM 4/3/2008, Obrst, Leo J. wrote:
>Forward from Doug based on questions asked at the OOR panel session today.
>
>Thanks, Doug!
>
>
>Here were the questions, responses:
>
>[14:30] LeoObrst: Question to Doug: Given Cyc's long experience with 
>such matters, can you provide the OOR group with what you would 
>suggest as a "small set of inter-ontology alignment relations"? 
>Which are necessary and which are desirable?
>
>[14:45] doug lenat: Response to Leo: Yes, I would be happy to 
>provide the set of (surprisingly few) predicates we use to state 
>those inter-ontology correspondences.  I will send that out and/or 
>post it today or tomorrow.
>
>See the chat session for other interesting discussion.
>
>Leo
>
>-----Original Message-----
>From: Doug Lenat [mailto:doug@xxxxxxx]
>Sent: Thursday, April 03, 2008 5:46 PM
>To: Obrst, Leo J.; Pat Hayes; Peter P. Yim
>Subject: promised material on expressing the mappings Cyc to other ontologies
>
>This is the material I promised during the conference call, about
>mapping Cyc to other ontologies.  Please circulate or post this, as
>appropriate.
>
>There are two different sorts of "mapping to other ontologies/schemata"
>that we do from Cyc:
>* mapping between terms in Cyc and terms in other ontologies
>* mapping between terms in Cyc and terms in databases (or SUBMIT-able
>web pages).  We call this SKSI for Semantic Knowledge Source Integration.
>
>
>The next two sections treat these two asterisked processes in turn.
>
>To be clear, this is not HOW we figure out the mappings; this is the
>vocabulary of CycL predicates, collections, etc. we employ to express
>that mapping formally, in ways that let the standard Cyc inference
>engine make use of information stored in those alien sources, e.g., as
>part of discharging some sub-sub-...-problem.
>
>There is a third sort of mapping which is NOT covered here, namely where
>we are exporting entire sets of Cyc assertions to languages (e.g., OWL)
>which are strictly less expressive than Cyc's representation language,
>and need to have special ways of flattening HOL assertions into FOPC, or
>full FOPC assertions into description logic, etc.
>
>--------------------------------------------------------
>
>Here's a summary of the predicates etc. we use to map to other ontologies:
>---------------------------------------------------------------------------
>
>exact mapping:
>
>  (synonymousExternalConcept CYC-TERM SOURCE SOURCE-TERM)
>
>CYC-TERM maps to SOURCE-TERM in SOURCE (ontology/representation system
>where SOURCE-TERM 'lives')
>
>E.g.,
>
>  (synonymousExternalConcept DrinkingMug
>  LSCOMObjectAndSituationOntology "Mug")
>
>;;;;;;;;;;;;;;;
>
>close mapping:
>
>   (overlappingExternalConcept CYC-TERM SOURCE SOURCE-TERM)
> 
>
>CYC-TERM closely maps to SOURCE-TERM in SOURCE.
> 
>
>This relation might hold under a number of conditions:
>
>* SOURCE-TERM conflates some concepts (so that there are
>   two or more Cyc-term entries with which it overlaps),
>
>* SOURCE-TERM is a predicate with the same meaning as CYC-TERM, but
>   has a different arg-order.
>
>* SOURCE-TERM denotes a predicate with different (usually tighter)
>   argument constraints than CYC-TERM (due to certain contextual assumptions
>   built into SOURCE ontology).
>
>So overlappingExternalConcept assertions are sometimes inferred from
>more specific mapping predicates, e.g.,
>
>;;;;
>
>Different Arg Constraints:
>
>  (synonymousExternalPredWRTTypes CYC-PRED SOURCE S-PRED TYPE1 TYPE2)
>
>* S-PRED denotes a specialization of CYC-PRED, where the
>specialization is defined by the tightening of the 1st and 2nd
>arguments as specified by TYPE1 and TYPE2.
>
>E.g.,
>
>  (synonymousExternalPredWRTTypes dateOfDeath JRC-EMMOntology
>   "dateDeath" Person Date)
>
>The JRC-EMM ontology deals only with deaths of people, when it deals
>with deaths at all, so its "dateDeath" predicate maps to our
>#$dateOfDeath, but is more restrictive (our predicate relates living
>organisms to their death-dates).
>
>;;;;
>
>Different arg-order:
>
>  (synonymousExternalPred-Inverse PRED SOURCE STRING)
>
>E.g.,
>
>  (synonymousExternalPred-Inverse primeMinister JRC-EMMOntology
>   "isPrimeMinisterFor")
>
>(self-explanatory)
>
>;;;;
>
>Combination of arg-order and constraint-variance:
>
>  (synonymousExternalPredWRTTypes-Inverse RELN SOURCE STRING TYPE1 TYPE2)
>
>E.g.,
>
>  (synonymousExternalPredWRTTypes-Inverse children JRC-EMMOntology
>   "childOf" Person Person)
>
>The Cyc predicate #$children relates two animals, in order of parent
>to child.  JRC-EMM ontology relates two people, in order of child to
>parent.
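
A minimal sketch, in Python rather than CycL, of how the predicate mappings above might be applied to rewrite a statement from an external ontology into Cyc's vocabulary. The names `PredMapping`, `MAPPINGS` and `to_cyc` are hypothetical and are not part of Cyc; the three entries restate the JRC-EMM examples above, and the argument values in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PredMapping:
    """One predicate-level mapping assertion (hypothetical model)."""
    cyc_pred: str
    source: str
    source_pred: str
    inverse: bool = False  # True for the -Inverse mapping predicates
    arg_types: Optional[Tuple[str, str]] = None  # TYPE1, TYPE2 for WRTTypes


MAPPINGS = [
    PredMapping("primeMinister", "JRC-EMMOntology", "isPrimeMinisterFor",
                inverse=True),
    PredMapping("dateOfDeath", "JRC-EMMOntology", "dateDeath",
                arg_types=("Person", "Date")),
    PredMapping("children", "JRC-EMMOntology", "childOf",
                inverse=True, arg_types=("Person", "Person")),
]


def to_cyc(source: str, source_pred: str, arg1: str, arg2: str):
    """Rewrite (source_pred arg1 arg2) as a Cyc-side triple, or None."""
    for m in MAPPINGS:
        if m.source == source and m.source_pred == source_pred:
            if m.inverse:
                # The external predicate uses the opposite argument order.
                arg1, arg2 = arg2, arg1
            return (m.cyc_pred, arg1, arg2)
    return None
```

For example, `to_cyc("JRC-EMMOntology", "childOf", "child-1", "parent-1")` yields `("children", "parent-1", "child-1")`, i.e. parent first, as in #$children. The type constraints are only recorded here; a fuller model would also assert them of the arguments.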
>
>
>------------------------------------------------
>
>Here's a summary of the SKSI mapping vocabulary we employ:
>----------------------------------------------------------------
>
>The term #$StructuredKnowledgeSource in the CycL language denotes the
>collection of all knowledge sources which have a well defined schema
>that can be used to access and interpret the source's data.  When a new
>database instance is integrated with Cyc, it is denoted by a unique term
>in CycL that is an instance of the collection #$Database-Physical, a
>specialization of #$StructuredKnowledgeSource.
>
>A knowledge source is related to its parts using the CycL predicate
>#$subKS-Direct. We create an instance of #$DatabaseTable-Physical in
>CycL for each table contained within a database, relate it to the CycL
>term for the parent database using #$subKS-Direct, and assert the name
>of the table.
>
>The literal structure of part or all of a structured knowledge source is
>described by its physical schema, denoted in Cyc by the collection
>#$PhysicalSchema. A physical field is an abstraction of a column in a
>database table.  The CycL term #$PhysicalField denotes the collection of
>all physical fields. A physical schema has one physical field for each
>column of a table and bears the same name as the column it represents. A
>physical field is determined uniquely by its associated physical schema
>and name, so we introduce a binary function #$PhysicalFieldFn which
>takes an instance of #$PhysicalSchema as its first argument and a string
>as its second argument. Functional expressions denote unique instances
>of #$PhysicalField. A physical schema is related to its fields using the
>CycL predicate #$physicalFields.
>
>Here are some examples drawn from a mapping of Cyc to the USGS GNIS
>database:
>
>(#$physicalFields #$USGS-GNIS-PS (#$PhysicalFieldFn #$USGS-GNIS-PS "fid"))
>(#$physicalFields #$USGS-GNIS-PS (#$PhysicalFieldFn #$USGS-GNIS-PS "name"))
>(#$physicalFields #$USGS-GNIS-PS (#$PhysicalFieldFn #$USGS-GNIS-PS "type"))
>(#$physicalFields #$USGS-GNIS-PS (#$PhysicalFieldFn #$USGS-GNIS-PS
>"state_fips"))
>(#$physicalFields #$USGS-GNIS-PS (#$PhysicalFieldFn #$USGS-GNIS-PS
>"county_fips"))
>
>Field data types are represented by associating the field with a CycL
>term denoting its datatype:
>
>(#$fieldDataType (#$PhysicalFieldFn #$USGS-GNIS-PS "fid") #$Integer)
>(#$fieldDataType (#$PhysicalFieldFn #$USGS-GNIS-PS "name")
>#$CharacterString)
>(#$fieldDataType (#$PhysicalFieldFn #$USGS-GNIS-PS "type")
>#$CharacterString)
>(#$fieldDataType (#$PhysicalFieldFn #$USGS-GNIS-PS "state_fips")
>(#$StringOfLengthFn 2))
>(#$fieldDataType (#$PhysicalFieldFn #$USGS-GNIS-PS "county_fips")
>(#$StringOfLengthFn 3))
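
As a rough illustration only, the physical schema and its #$fieldDataType assertions can be pictured as a table of per-column checks. The sketch below is Python, not CycL; `FIELD_DATA_TYPES`, `string_of_length` and `row_conforms` are hypothetical names, and reading (#$StringOfLengthFn N) as "a string of exactly N characters" is an assumption.

```python
from typing import Any, Callable, Dict


def string_of_length(n: int) -> Callable[[Any], bool]:
    # Assumed reading of (#$StringOfLengthFn n): a string of exactly n chars.
    return lambda v: isinstance(v, str) and len(v) == n


# One entry per physical field of the USGS GNIS physical schema.
FIELD_DATA_TYPES: Dict[str, Callable[[Any], bool]] = {
    "fid": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "name": lambda v: isinstance(v, str),
    "type": lambda v: isinstance(v, str),
    "state_fips": string_of_length(2),
    "county_fips": string_of_length(3),
}


def row_conforms(row: Dict[str, Any]) -> bool:
    """True if the row has exactly the schema's columns and valid values."""
    if set(row) != set(FIELD_DATA_TYPES):
        return False
    return all(check(row[name]) for name, check in FIELD_DATA_TYPES.items())
```

With an invented row such as `{"fid": 1, "name": "Example Cave", "type": "cave", "state_fips": "00", "county_fips": "000"}`, `row_conforms` returns `True`; shortening `state_fips` to one character makes it return `False`.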
>
>The logical schema of a knowledge source (database or database table) is
>the semantic analogue of its physical schema. It describes how the
>content of a table is interpreted in the broader Cyc ontology. The
>collection of all logical schemas is denoted in CycL by the term
>#$LogicalSchema. Typically, one logical schema is associated with one
>physical schema for each database table.
>
>In CycL, a logical field type is a collection in the ontology, and each
>instance of #$LogicalField is related to some instance of #$Collection,
>which determines its type. As with #$PhysicalField, we construct
>instances of #$LogicalField using a function, #$LogicalFieldFn, that
>takes the logical schema as its first argument and the logical field's
>type as its second argument. However, since a table may have multiple
>columns that correspond to the same type of object, stating the type
>alone is not sufficient to uniquely identify a logical field within a
>logical schema. So in addition we use a unique integer as the third
>argument of the function. The choice of the integer is arbitrary, as
>long as it results in a term that is distinct from the other logical
>fields for a schema. We relate a logical schema to its fields using the
>CycL predicate #$logicalFields.
>
>(#$logicalFields #$USGS-GNIS-LS (#$LogicalFieldFn #$USGS-GNIS-LS #$Place
>1))
>(#$logicalFields #$USGS-GNIS-LS (#$LogicalFieldFn #$USGS-GNIS-LS
>#$ProperNameString 1))
>(#$logicalFields #$USGS-GNIS-LS (#$LogicalFieldFn #$USGS-GNIS-LS
>#$CartographicFeatureType 1))
>(#$logicalFields #$USGS-GNIS-LS (#$LogicalFieldFn #$USGS-GNIS-LS
>#$State-UnitedStates 1))
>(#$logicalFields #$USGS-GNIS-LS (#$LogicalFieldFn #$USGS-GNIS-LS
>#$USCounty 1))
>
>In object oriented databases and many relational databases, tables often
>correspond to natural classes of objects in the world, and the rows of
>such tables may implicitly denote the objects of the class. In such
>cases, the table's primary key provides an identifier that may be used
>to uniquely distinguish between objects of the class, in addition to
>merely distinguishing between rows of a table. We distinguish such
>tables by assigning to them a special type of logical schema, called an
>object defining schema, which is denoted in CycL by the collection
>#$ObjectDefiningSchema, a specialization or subcollection of the
>collection #$LogicalSchema.
>
>For example, the gnis.type column contains coded values that describe
>the type of cartographic feature represented by a feature in the
>database. The physical field corresponding to this column is
>
>(#$PhysicalFieldFn #$USGS-GNIS-PS "type")
>
>and the logical field corresponding to this column is
>
>(#$LogicalFieldFn #$USGS-GNIS-LS #$CartographicFeatureType 1)
>
>The correspondence between the values for the physical field and the
>values for the logical field (instances of #$CartographicFeatureType)
>is recorded in the reified mapping #$USGS-FeatureType-CMLS using the
>#$codeMapping predicate. Here is a sample of these sentences:
>
>(#$codeMapping #$Usgs-FeatureType-CMLS "cave" #$Cave)
>(#$codeMapping #$Usgs-FeatureType-CMLS "other" #$Place)
>(#$codeMapping #$Usgs-FeatureType-CMLS "ruin" #$RuinedArtifact)
>(#$codeMapping #$Usgs-FeatureType-CMLS "unknown" #$Place)
>(#$codeMapping #$Usgs-FeatureType-CMLS "summit" #$Summit)
>(#$codeMapping #$Usgs-FeatureType-CMLS "slope" #$Slope-Topographical)
>(#$codeMapping #$Usgs-FeatureType-CMLS "ridge" #$Ridge-Hill)
>(#$codeMapping #$Usgs-FeatureType-CMLS "ppl" #$PopulatedPlace)
>
>Finally, the following sentence tells Cyc that the
>#$USGS-FeatureType-CMLS reified mapping should be used to translate the
>logical field above:
>
>(#$logicalFieldMapping (#$LogicalFieldFn #$USGS-GNIS-LS
>#$CartographicFeatureType 1) #$Usgs-FeatureType-CMLS)
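
The #$codeMapping sentences above behave much like a lookup table from coded column values to Cyc terms. A minimal Python sketch, with hypothetical names (`CODE_MAPPING`, `decode_feature_type`, `codes_for`) and only the sample entries listed above:

```python
from typing import List, Optional

# Sample of the code mapping: gnis.type code -> Cyc collection name.
CODE_MAPPING = {
    "cave": "Cave",
    "other": "Place",
    "ruin": "RuinedArtifact",
    "unknown": "Place",
    "summit": "Summit",
    "slope": "Slope-Topographical",
    "ridge": "Ridge-Hill",
    "ppl": "PopulatedPlace",
}


def decode_feature_type(code: str) -> Optional[str]:
    """Translate a physical code to the Cyc term it maps to, if any."""
    return CODE_MAPPING.get(code)


def codes_for(cyc_term: str) -> List[str]:
    """All codes mapped to a Cyc term; the mapping is many-to-one."""
    return sorted(c for c, t in CODE_MAPPING.items() if t == cyc_term)
```

Note that both "other" and "unknown" map to #$Place in the sample, so going from a Cyc term back to a code is not always unique; how Cyc resolves that is not stated above and is not modeled here.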
>
>Whereas the SQL standard and database management systems conflate the
>two at the conceptual level, we distinguish between a physical field and
>an arbitrary value of the physical field, and between a logical field
>and an arbitrary value of the logical field. The values of a physical
>field are called physical field indexicals and are denoted in CycL by
>the collection #$PhysicalFieldIndexical. Similarly, the values of a
>logical field are called logical field indexicals and are denoted in
>CycL by the collection #$LogicalFieldIndexical. Exactly one physical
>field indexical is created for each physical field, and one logical
>field indexical for each logical field. To denote the instances of
>#$PhysicalFieldIndexical and #$LogicalFieldIndexical we introduce two
>additional functions, #$ThePhysicalFieldValueFn and
>#$TheLogicalFieldValueFn. They are the indexical analogues of
>#$PhysicalFieldFn and #$LogicalFieldFn and have exactly the same
>argument signature. Physical and logical indexicals are related to their
>schema using the CycL predicates #$physicalFieldIndexicals and
>#$logicalFieldIndexicals respectively, and to their corresponding
>physical and logical fields using the CycL predicates
>#$indexicalForPhysicalField and #$indexicalForLogicalField
>respectively. The sentences and terms for the USGS GNIS example fields are:
>
>(#$physicalFieldIndexicals #$USGS-GNIS-PS (#$ThePhysicalFieldValueFn
>#$USGS-GNIS-PS "fid"))
>(#$physicalFieldIndexicals #$USGS-GNIS-PS (#$ThePhysicalFieldValueFn
>#$USGS-GNIS-PS "name"))
>(#$physicalFieldIndexicals #$USGS-GNIS-PS (#$ThePhysicalFieldValueFn
>#$USGS-GNIS-PS "type"))
>(#$physicalFieldIndexicals #$USGS-GNIS-PS (#$ThePhysicalFieldValueFn
>#$USGS-GNIS-PS "state_fips"))
>(#$physicalFieldIndexicals #$USGS-GNIS-PS (#$ThePhysicalFieldValueFn
>#$USGS-GNIS-PS "county_fips"))
>
>(#$indexicalForPhysicalField (#$PhysicalFieldFn #$USGS-GNIS-PS "fid")
>(#$ThePhysicalFieldValueFn #$USGS-GNIS-PS "fid"))
>(#$indexicalForPhysicalField (#$PhysicalFieldFn #$USGS-GNIS-PS "name")
>(#$ThePhysicalFieldValueFn #$USGS-GNIS-PS "name"))
>(#$indexicalForPhysicalField (#$PhysicalFieldFn #$USGS-GNIS-PS "type")
>(#$ThePhysicalFieldValueFn #$USGS-GNIS-PS "type"))
>(#$indexicalForPhysicalField (#$PhysicalFieldFn #$USGS-GNIS-PS
>"state_fips") (#$ThePhysicalFieldValueFn #$USGS-GNIS-PS "state_fips"))
>(#$indexicalForPhysicalField (#$PhysicalFieldFn #$USGS-GNIS-PS
>"county_fips") (#$ThePhysicalFieldValueFn #$USGS-GNIS-PS "county_fips"))
>
>(#$logicalFieldIndexicals #$USGS-GNIS-LS (#$TheLogicalFieldValueFn
>#$USGS-GNIS-LS #$Place 1))
>(#$logicalFieldIndexicals #$USGS-GNIS-LS (#$TheLogicalFieldValueFn
>#$USGS-GNIS-LS #$ProperNameString 1))
>(#$logicalFieldIndexicals #$USGS-GNIS-LS (#$TheLogicalFieldValueFn
>#$USGS-GNIS-LS #$CartographicFeatureType 1))
>(#$logicalFieldIndexicals #$USGS-GNIS-LS (#$TheLogicalFieldValueFn
>#$USGS-GNIS-LS #$State-UnitedStates 1))
>(#$logicalFieldIndexicals #$USGS-GNIS-LS (#$TheLogicalFieldValueFn
>#$USGS-GNIS-LS #$USCounty 1))
>
>(#$indexicalForLogicalField (#$LogicalFieldFn #$USGS-GNIS-LS #$Place 1)
>(#$TheLogicalFieldValueFn #$USGS-GNIS-LS #$Place 1))
>(#$indexicalForLogicalField (#$LogicalFieldFn #$USGS-GNIS-LS
>#$ProperNameString 1) (#$TheLogicalFieldValueFn #$USGS-GNIS-LS
>#$ProperNameString 1))
>(#$indexicalForLogicalField (#$LogicalFieldFn #$USGS-GNIS-LS
>#$CartographicFeatureType 1) (#$TheLogicalFieldValueFn #$USGS-GNIS-LS
>#$CartographicFeatureType 1))
>(#$indexicalForLogicalField (#$LogicalFieldFn #$USGS-GNIS-LS
>#$State-UnitedStates 1) (#$TheLogicalFieldValueFn #$USGS-GNIS-LS
>#$State-UnitedStates 1))
>(#$indexicalForLogicalField (#$LogicalFieldFn #$USGS-GNIS-LS #$USCounty
>1) (#$TheLogicalFieldValueFn #$USGS-GNIS-LS #$USCounty 1))
>
>The schema translation process begins by relating a physical schema to
>its associated logical schema. This is done using the CycL predicate
>#$logicalPhysicalSchemaMap. For the USGS GNIS table, we have
>
>(#$logicalPhysicalSchemaMap #$USGS-GNIS-LS #$USGS-GNIS-PS)
>
>Physical and logical field translations are stated as relationships
>between their corresponding indexical terms. These relationships
>describe explicitly how to manipulate the raw data object from its
>representation as a physical field value to its representation as a
>logical field value, and vice versa. This is accomplished using two CycL
>predicates, #$fieldDecoding and #$fieldEncoding. Below are the schema
>translation templates for the gnis.fid physical and logical fields that
>describe how the data values are translated, using the physical and
>logical field indexicals as placeholders:
>
>(#$fieldDecoding
>  #$USGS-GNIS-LS
>  (#$TheLogicalFieldValueFn #$USGS-GNIS-LS #$Place 1)
>  #$USGS-GNIS-PS
>  (#$SourceSchemaObjectFn
>    #$USGS-KS
>    #$USGS-GNIS-LS
>    (#$ThePhysicalFieldValueFn #$USGS-GNIS-PS "fid")))
>
>(#$fieldEncoding
>  #$USGS-GNIS-PS
>  (#$ThePhysicalFieldValueFn #$USGS-GNIS-PS "fid")
>  #$USGS-GNIS-LS
>  (#$SourceSchemaObjectIDFn
>    #$USGS-KS
>    #$USGS-GNIS-LS
>    (#$TheLogicalFieldValueFn #$USGS-GNIS-LS #$Place 1)))
>
>The #$fieldDecoding says, in essence, that to convert a value for
>gnis.fid into an instance of #$Place, plug it into the third argument of
>the #$SourceSchemaObjectFn term, with the given terms #$USGS-KS and
>#$USGS-GNIS-LS in the first and second arguments respectively, while the
>#$fieldEncoding says that to convert an instance of #$Place to a raw data
>value that is consistent with the constraints on gnis.fid, plug it into
>the third argument of the #$SourceSchemaObjectIDFn term with the given
>terms #$USGS-KS and #$USGS-GNIS-LS in the first and second arguments
>respectively.
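
The templates above can be read as terms with a placeholder (the indexical) that is replaced by an actual value. The following Python sketch models CycL terms as nested tuples and performs only that substitution; `substitute`, `decode_fid` and the constant names are hypothetical, and evaluating #$SourceSchemaObjectIDFn back to a raw value is deliberately not modeled.

```python
# CycL terms modeled as nested tuples; "#$" prefixes are dropped.
PHYS_FID = ("ThePhysicalFieldValueFn", "USGS-GNIS-PS", "fid")
LOGICAL_PLACE = ("TheLogicalFieldValueFn", "USGS-GNIS-LS", "Place", 1)

# Right-hand sides of the fieldDecoding / fieldEncoding sentences above.
DECODING_TEMPLATE = ("SourceSchemaObjectFn", "USGS-KS", "USGS-GNIS-LS",
                     PHYS_FID)
ENCODING_TEMPLATE = ("SourceSchemaObjectIDFn", "USGS-KS", "USGS-GNIS-LS",
                     LOGICAL_PLACE)


def substitute(template, indexical, value):
    """Replace every occurrence of the indexical term with value."""
    if template == indexical:
        return value
    if isinstance(template, tuple):
        return tuple(substitute(part, indexical, value) for part in template)
    return template


def decode_fid(raw_fid):
    """Build the logical (Place-denoting) term for a raw gnis.fid value."""
    return substitute(DECODING_TEMPLATE, PHYS_FID, raw_fid)
```

For an invented identifier 12345, `decode_fid(12345)` gives `("SourceSchemaObjectFn", "USGS-KS", "USGS-GNIS-LS", 12345)`, with the raw value in the third argument as described above.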
>
>--
>----------------------------------
>Douglas Lenat
>CEO, Cycorp
>7718 Wood Hollow Drive, Suite 250
>Austin, TX 78731
>
>phone: 512-342-4001
>cell: 512-773-1709
>email: Doug@xxxxxxx
>----------------------------------
>
>
>_________________________________________________________________
>Msg Archives: http://ontolog.cim3.net/forum/ontology-summit/
>Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontology-summit/
>Unsubscribe: mailto:ontology-summit-leave@xxxxxxxxxxxxxxxx
>Community Files: http://ontolog.cim3.net/file/work/OntologySummit2008/
>Community Wiki: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2008
>Community Portal: http://ontolog.cim3.net/

Mala Mehrotra
Pragati Synergetic Research Inc.  MS 19-46Q, NASA Research Park, 
Moffett Field, CA 94035
Voice:
(650)-625-0274(Office)
(408)-861-0939 (Home Office)
(408)-910-4115 (Cell)
Fax: (408)-516-9599
URL: http://www.pragati-inc.com                 
Email: mm@xxxxxxxxxxxxxxx


