OntologySummit2012: Session-09 - Thu 2012-03-08    (35V5)

Summit Theme: OntologySummit2012: "Ontology for Big Systems"    (35V6)

Track (4) Title: Large-Scale Domain Applications    (365W)

Session Topic: Large-scale domain applications – Biomedical, earth & environmental science & engineering    (365X)

Session Chairs: Dr. TrishWhetzel (NCBO; Stanford) and Dr. SteveRay (CMU) - intro slides    (365Y)

Panelists:    (365Z)

Archives:    (3666)

Conference Call Details    (35V7)

Attendees    (3675)

Abstract:    (367P)

Large-Scale Domain Applications - II : Biomedical, earth & environmental science & engineering    (367Q)

This is our 9th Ontology Summit, a joint initiative by NIST, Ontolog, NCOR, NCBO, IAOA & NCO_NITRD with the support of our co-sponsors. The theme adopted for this Ontology Summit is: "Ontology for Big Systems." The event today is our 9th virtual session.    (367R)

The principal goal of the summit is to bring together and foster collaboration between the ontology community, systems community, and stakeholders of some of the "big systems." Together, the summit participants will exchange ideas on how ontological analysis and ontology engineering might make a difference, when applied in these "big systems." We will aim towards producing a series of recommendations describing how ontologies can create an impact; as well as providing illustrations where these techniques have been, or could be, applied in domains such as bioinformatics, electronic health records, intelligence, the smart electrical grid, manufacturing and supply chains, earth and environmental, e-science, cyberphysical systems and e-government. As is traditional with the Ontology Summit series, the results will be captured in the form of a communiqué, with expanded supporting material provided on the web.    (367S)

The large-scale domain applications track will help to ground the discussions in the other tracks and bring key challenges to light by describing current large-scale systems and systems of systems that either use, or could use, ontologies in their deployment. "Large-scale" can mean either very large data sets, very complex data sets, federated systems, highly distributed systems, or real-time, continuous data systems. Examples of large data sets might include scientific observations and studies; complex data sets could be technical data packages for manufactured products, or electronic health records; federated systems could include information sharing to combat terrorism; highly distributed systems include items such as the smart electrical grid (aka Smart Grid); and real-time systems include network management systems. Of course, some big systems might include all five aspects.    (367T)

Today’s speakers will describe experiences in adapting ontology technology for biomedical applications, biological plant studies, earth science, a hydrology system, and the oil and gas industry. As with the prior session in this track, each presentation will try to highlight what systems engineering or operational functions were impacted by the use of ontology, and how. These examples have been chosen to help ground the discussions in the other summit tracks of where ontology could or should be used.    (367U)

More details about this Summit at: OntologySummit2012 (home page for the summit)    (367V)

Agenda:    (367W)

Ontology Summit 2012 - Panel Session-09    (367X)

Proceedings:    (3689)

Please refer to the above    (368A)

IM Chat Transcript captured during the session:    (368B)

 see raw transcript here.    (368C)
 (for better clarity, the version below is a re-organized and lightly edited chat-transcript.)
 Participants are welcome to make light edits to their own contributions as they see fit.    (368D)
 -- begin in-session chat-transcript --    (368E)
	PeterYim: Welcome to the    (36ED)
	 = OntologySummit2012: Session-09, Thursday 2012-03-08 =    (36EE)
	Summit Theme: OntologySummit2012: "Ontology for Big Systems"    (36EF)
	Track (4) Title: Large-Scale Domain Applications    (36EG)
	Session Topic: Large-scale domain applications – Biomedical, earth & environmental science & engineering    (36EH)
	Session Chairs: Dr. TrishWhetzel (NCBO; Stanford) and Dr. SteveRay (CMU)    (36EI)
	Panelists:    (36EJ)
	* Mr. DavidPrice (TopQuadrant) 
	  - "Experiences from a Large Scale Ontology-Based Application Development for Oil Platforms"    (36EK)
	* Dr. MichaelKellen (Sage Bionetworks)
	  - "Collaborative Clinical Genomics Data Analysis with Sage Bionetworks Synapse"    (36EL)
	* Dr. DamianGessler (iPlant Collaborative) & Dr. BlazejBulka (Clark & Parsia)
	  - "The iPlant Collaborative Semantic Web Platform: Using OWL and SSWAP (Simple Semantic Web Architecture and Protocol) for On-Demand Semantic Pipelines"    (36EM)
	* Dr. IlyaZaslavsky (San Diego Supercomputing Center)
	  - "Managing observation semantics in CUAHSI Hydrologic Information System"    (36EN)
	* Dr. LinePouchard (Oak Ridge National Laboratory)
	  - "Linked Science as a producer and consumer of big data in the Earth Sciences"    (36EO)
	Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_03_08    (36EP)
	Mute control: *7 to un-mute ... *6 to mute (please make sure your own phone is not muted as well)    (36EQ)
	Can't find Skype Dial pad? ... it's under the "Call" dropdown menu as "Show Dial pad"    (36ER)
	 == Proceedings: ==    (36ES)
	 anonymous morphed into DamianGessler    (36ET)
	anonymous morphed into ChristopherSpottiswoode    (36EU)
	anonymous1 morphed into DavidFlater    (36EV)
	anonymous morphed into JimSchoening    (36EW)
	anonymous morphed into DavidPrice    (36EX)
	anonymous morphed into MichaelKellen    (36EY)
	anonymous morphed into JimRhyne    (36EZ)
	anonymous morphed into ScottHills    (36F0)
	anonymous morphed into ByronDavies    (36F1)
	anonymous morphed into PavithraKenjige    (36F2)
	LinePouchard: testing    (36I4)
	PeterYim: == DavidPrice presenting ...    (36F4)
	anonymous morphed into ElisaKendall    (36I5)
	MikeBennett: Does the reported data include SCADA data from on-platform systems such as Fire and 
	Gas, ESD and so on? Just curious.    (36I6)
	DavidPrice: The data is about Drilling and Production. Some of it is measurement, but not really 
	SCADA.    (36F7)
	MikeBennett: @David thanks. Just seeing an opportunity there.    (36F8)
	Ernani Santos morphed into ErnaniSantos    (36F9)
	anonymous morphed into ThomasGetgood    (36FA)
	DougFoxvog: What is "warm fallover"?    (36FB)
	HaroldBoley: Re Slide 10: Would the 300 million triples be structured/modularized in some way, e.g. 
	as named graphs?    (36FC)
	DavidPrice: The triples are managed in graphs based on Licenses for Fields in the sea. This allows 
	us to control the set of data over which aggregations and queries are allowed and also allows us to 
	use these graphs as the basis for access control and security.    (36FD)
	MichaelGruninger: @David -- you say that it is hard to test an ontology. Can you share any of your 
	ontologies which we can test?    (36FE)
	anonymous morphed into DWiz    (36FF)
	AmandaVizedom: @David - to what extent do you think the "vague/ambiguous" character of the 
	ontologies or their documentation (your slide 17) is a result of the application type being 
	comparatively shallow (that is, not too much axiomatization or reasoning)?    (36FG)
	DavidPrice: I attribute vague-ness entirely to projects running out of money - in fact I have heard 
	that from the people who created some of the background ontologies/reference data we are re-using.    (36FH)
	DavidPrice: I cannot yet share the ontologies, but they will be made public eventually ... probably 
	3Q2012. The project is not yet complete, the Drilling stuff is going into production use this month. 
	The Production use will follow between now and June/July.    (36FI)
	DWiz: David: Do you have a mandatory XSD? Otherwise, how do you generate a true ontology from XML?    (36FJ)
	DavidPrice: Our 'XSD Proxy Ontology' capability does not try to create a 'true ontology' - as the 
	name suggests it's a proxy for the XSD that allows us to import an XML data file into the workspace 
	and do SPARQL over it directly through the proxy ontology-based triples.    (36FK)
	AmandaVizedom: @David - Have you tried developing test beds and test apps with which to evaluate the 
	ontologies? Given the well-developed application context, this seems like an approach you could use 
	to ontology testing & proofing.    (36FL)
	DavidPrice: We are just now getting into the detailed use of proper software testing tools to try to 
	do a better job wrt testing our ontology. We are working on large, complete test cases, test 
	scenarios, automated tests using REST-like services, etc. to make testing the ontology fit into the 
	more traditional testing apparatus used by our software team.    (36FM)
	AmandaVizedom: @David - Ah, yes, I see what you mean. Not unlike undocumented code from abandoned 
	software projects, then.    (36FN)
	DavidPrice: FWIW we use Github for the source code/ontology management and SpiraTest for the testing 
	tool.    (36FO)
	DavidPrice: TopBraid Composer is an eclipse-based tool and that's what we use to develop ontology 
	and SPIN/SPARQL/SWP.    (36FP)
	DavidPrice: I have to leave for a while but will respond to any other questions in 30 mins or so. 
	Thanks!    (36FQ)
	SteveRay: Thanks David. Fascinating talk.    (36FR)
	AmandaVizedom: @David - I can't remember who it was, now, but one presenter earlier in the summit 
	discussed an approach in which they used "real" (sound, computational) ontologies but also 
	intermediate artifacts that are ontological in format (OWL) but are not used (or usable) as 
	ontologies; rather, they are fairly direct models of the data source's data model. This is similar 
	to what you describe, yes?    (36FS)
	DavidPrice: @Amanda For ontologies of a domain, we usually follow Leo Obrst's use of the term 'Strong 
	ontologies' meaning they are about the domain of interest.    (36FT)
	DavidPrice: @Amanda - wrt direct models of the data source ... yes, we call that a proxy ontology.    (36FU)
	AmandaVizedom: @David - Thanks. I think a similar approach is used for the DoD EIW. Independently, 
	we discussed using something like "proxy ontologies" on the USAF project. Though non-technical 
	factors/authorities mooted that discussion, I thought (think) that it was promising. I sense a 
	pattern.    (36FV)
	DavidPrice: @Amanda For us the main thing is to use a semantic language like SPARQL to define the 
	transforms between data sources and targets and so everything must be presented as at least RDF, and 
	preferably as an OWL ontology.    (36FW)
	DavidPrice: We also have ontologies of SPARQL, etc. which we often call 'system ontologies' as in 
	software system ... just to confuse things even more    (36FX)
	AmandaVizedom: @David - though I did and do think there needs to be some explicit (meta)data on/in 
	the proxy ontology to make clear that it is not a full ontology - that is, under the formal 
	semantics of the language used, it would not likely be computationally sound.    (36FY)
	MikeBennett: @Amanda et al - we need a metaontology. Or at least an ISO 1087 compliant terminology 
	and vocabulary setting out rather less messy uses of words like "Ontology" - people think they are 
	listening about the same thing when someone is talking about a different thing - dog food much?    (36FZ)
	DavidPrice: We have a taxonomy of ontology-related artifacts we use (and modify as required in 
	various projects). It can be used as metadata but we also use it in the base of the URIs for things 
	and even in the name of graphs so it's visible to the ontologies/software developer.    (36G0)
	DavidPrice: We actually often use the phrase 'schema' when we mean the 'ontology' in many projects 
	because customers are more familiar with that terminology. We then have graphs that are 
	'transforms', that are 'swp', that are 'spin', that are 'testcase', etc. as you would in a typical 
	software development team.    (36G1)
	AmandaVizedom: @Mike - Yup. And we've probably gotten far enough along since we started saying that 
	that we could actually build one, identifying main types of artifact. We would then run into the 
	problem of every domain, in that we would disagree over what to call the various types. Were we then 
	to get over the names problem (via multiple, contextual labels and/or other techniques), we'd then 
	have a very useful product *and* a useful methodological example!    (36G2)
	DavidPrice: @Amanda We have published some work on a metadata for ontology at 
	http://linkedmodel.org/doc/vaem/1.2/ Vocabulary for Essential Metadata ... Ralph Hodgson has pushed 
	that effort.    (36G3)
	PeterYim: == MikeKellen presenting ...    (36G4)
	anonymous morphed into GiulianoLancioni    (36G5)
	SimonSpero: @MichaelKellen: SKOS is for controlled vocabularies; SKOS concepts are "Subjects", not 
	the things that subjects are about    (36G6)
	SimonSpero: @Michael: is that the intended semantics?    (36G7)
	MichaelKellen: Yes, we aren't trying to create a model of the relationships among domain objects 
	that we can reason about    (36G8)
	MichaelKellen: We are simply trying to consistently structure information to help scientists pull 
	together appropriate data so that they can reason about it    (36G9)
	SimonSpero: @Michael: as long as it's just for guiding people to data sets, that's a safe use    (36GA)
	MichaelKellen: There are other projects in life sciences trying to use the richer semantics to 
	actually model the domain    (36GB)
	MichaelKellen: The problem they hit is that there are so many unknowns in our domain that this is 
	hard to do    (36GC)
	SimonSpero: @Michael: right - it's just important to keep the distinctions clear so that a KOS isn't 
	used directly as an ontology    (36GD)
	MikeBennett: @Simon @Michael we have a labeling problem: if we get into the habit of referring to 
	everything that is in triple-store formats as "An ontology" then we need a new word for ontologies. 
	Syntax is not semantics...    (36GE)
	PeterYim: == DamianGessler presenting ...    (36GF)
	BobbinTeegarden: @Damian where are the transformation decisions made between services in the 
	pipeline?    (36GG)
	DamianGessler: @BobbinTeegarden third-parties hosting SSWAP semantic web services run a servlet that 
	we provide from our Software Development Kit. This servlet handles the semantics and ensures that 
	both input and output follow the protocol. So at the pipeline end, we can look at the outputs and 
	required inputs, and orchestrate the interaction. Third-party data need not pass through us: it goes 
	directly from the upstream service to the downstream service.    (36GH)
	BobbinTeegarden: @Damian Thank you, grand.    (36GI)
	DougFoxvog: How do you deal when a large number of possible services are available at one point? 
	E.g., there may be hundreds of services available for converting images from Format A to Format B.    (36GJ)
	DamianGessler: @DougFoxvog The key is in service choice prioritization--just like Google prioritizes 
	web pages on its search results page. But here, we are not nearly as sophisticated as Google and 
	currently use a very simple algorithm. We'll put focus here down the road a little.    (36GK)
	AmandaVizedom: @Damian - There are many similarities between the iPlant approach you describe and 
	the Semantic SOA - service discovery approach being developed by USAF. I think that the approach 
	used in the DoD EIW is also strongly similar (perhaps DWiz - DennisWisnosky - will comment). In each 
	case, ontologies are primarily being used to provide semantic description, model, or wrapper for 
	(mostly natively non-semantic) data services, and ontological reasoning and search technologies are 
	used to enable service discovery given user needs. Your statement about ontology alignment, however, 
	stands out. I understand service matching operationally and ephemerally based on reasoning over 
	lightly-aligned ontologies. But you seem to be saying something else, that the alignment of the 
	locally-developed ontologies in which the services are described is not manual, not static, and not 
	axiomatic. Can you say something more about how you align, or connect, or reason across such 
	ontologies without any prior / stable alignment points?    (36GL)
	DamianGessler: @AmandaVizedom The key is that we are not aligning ontologies on *data* per se; we 
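	let services make the mapping statements (e.g., the (possibly complex) data they take in and the 
	(possibly complex) data they give back). So we "simply" need to determine and operate on subsumption 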
	questions: can this service operate on my data and return what I want? This is essentially a 
	dynamic, operational alignment question.    (36GM)
	PeterYim: == IlyaZaslavsky presenting ...    (36GN)
	anonymous morphed into CarlosRueda    (36GO)
	SimonSpero: @Ilya: if the hierarchies are genuine hierarchies - that is, subordinate terms always 
	entail the superordinate term, then you have traditional Knowledge Organization System semantics    (36GP)
	ElisaKendall: @Ilya, have you used any vocabulary such as ISO 1087 to define relationships such as 
	synonymy, polysemy, etc.? Just wondering ...    (36GQ)
	MatthewWest: Sorry I have to go.    (36GR)
	BobbinTeegarden: @Ilya what did you use for the visualizations, and how well received were they?    (36GS)
	IlyaZaslavsky: @BobbinTeegarden: Inxight startree, semantic wiki, also recently Silverlight. 
	Hydrologists were comfortable using Inxight startree in the tagging application    (36GT)
	IlyaZaslavsky: @ElisaKendall: no, we haven't. This sounds interesting. The current plan is to use 
	SKOS    (36GU)
	IlyaZaslavsky: @SimonSpero. No, in many cases these are not trees. We try to present them as trees 
	where possible    (36GV)
	HaroldBoley: @Elisa, do you use SBVR rules for mapping?    (36GW)
	ElisaKendall: @Ilya: I have a draft ontology for ISO 1087 that we're planning to standardize at OMG, 
	together with ISO TC 37. I'm guessing that we will be publishing a draft sometime this 
	spring/summer, with one of the goals being to use it to assist in mapping SBVR vocabularies to 
	ODM/OWL. If you'd like to chat more about this offline, please feel free to contact me directly, at 
	ekendall at thematix.com.    (36GX)
	IlyaZaslavsky: @ElisaKendall, thanks Elisa, I'd be interested    (36GY)
	ElisaKendall: @Harold, not so far, but Mark Linehan (IBM) has been doing some work in this area for 
	a Date Time vocabulary we're creating at OMG. The current alpha (or maybe beta) spec is available at 
	http://www.omg.org/spec/DTV/1.0/Beta1/, but doesn't include much of the OWL work we've been doing 
	more recently in finalization.    (36GZ)
	ElisaKendall: @Harold, you might take a look even at the Beta spec for the Date Time effort, as it 
	does include some OCL and CLIF statements for the SBVR definitions, which we're still refining, but 
	could give you a better sense of what the SBVR actually is intended to say.    (36H0)
	HaroldBoley: @Elisa, thanks, I just opened the huge http://www.omg.org/spec/DTV/1.0/Beta1/ PDF.    (36H1)
	FrankOlken: @ElisaKendall The URL you posted for DTV yields a 404 (page missing) error ...    (36H2)
	ElisaKendall: @Frank, hmmm... for me it downloads, so I'm glad you were able to get to it.    (36H3)
	ElisaKendall: @Harold, Sorry, but hopefully you'll find it useful. When we've added the OWL it will 
	only get bigger ... as you might imagine, but I think the result could provide a next generation OWL 
	Time ontology, and includes a number of business oriented definitions.    (36H4)
	MikeBennett: Is it the mental model of hydrologists that doesn't map to Protege, or the fact that 
	to use Protege you first need a mental model of ontology, since Protege in no way presents a visual 
	or other model of ontology to the person looking at it?    (36H5)
	FrankOlken: @ElisaKendall The DTV link that HaroldBoley posted seems to work.    (36H6)
	HaroldBoley: @Frank, my quote of the URL omitted the comma, wrongly assumed to be part of the URL by 
	the chat software.    (36H7)
	MikeBennett: @Frank et al - there is a rogue comma in the original URL.    (36H8)
	HaroldBoley: @Elisa, yes it looks very exhaustive. Can I mention it in the OASIS TC on 
	LegalRuleML?    (36H9)
	ElisaKendall: @Harold - Yes, of course, and please give people the link to the specification. Our 
	next iteration will have more OWL, and ultimately we will have a set of OWL ontologies corresponding 
	to the SBVR    (36HA)
	PeterYim: == LinePouchard presenting ...    (36HB)
	LinePouchard: @Peter: I sent you a new set of slides that contain page numbers. Would you have time 
	to upload them?    (36HC)
	PeterYim: @Line - ack .... will try now    (36HD)
	MichaelGruninger: my copy has page numbers    (36HE)
	GiulianoLancioni: mine too    (36HF)
	JimRhyne: So does mine.    (36HG)
	AmandaVizedom: Copy just downloaded from refreshed call page does have the page numbers.    (36HH)
	DougFoxvog: The link to the slides is the same. Re-download & you'll get the page numbers    (36HI)
	SteveRay: Strange, I just did this and do not get slide numbers...    (36HJ)
	anonymous morphed into NancyWiegand    (36HK)
	DougFoxvog: @Line: attaching multiple textual terms to individual ontology terms would assist in 
	search, not relying on searching only on the ontology term's name.    (36HL)
	PeterYim: == SteveRay moderating the open discussion ...    (36HM)
	SimonSpero: [ref. MichaelGruninger's verbal comment] Testing; Competency or performance?    (36HN)
	LinePouchard: DataONE is offering an Internship program for Summer 2012 at 
	http://www.dataone.org/internships. I am co-mentoring for Project #7 to continue the work described 
	today. In particular, this work needs to extend parts of the SWEET ontologies w.r.t. soil science, 
	and a candidate with domain knowledge would be ideal. The intern works remotely from their own 
	institution for ten weeks. The deadline is March 12. Please share with your students.    (36HO)
	PeterYim: @ALL - the memory and the work of Dr. Robert Raskin 
	(http://ontolog.cim3.net/cgi-bin/wiki.pl?RobRaskin ) who passed away last Friday, will be with us. 
	Rob was the PI of the SWEET Ontology (Semantic Web for Earth and Environmental Terminology) project, 
	and an active contributor to this community    (36HP)
	MikeBennett: [ref. IlyaZaslavsky's comment about domain experts having problems with Protege, and 
	PeterYim's comment about limitations imposed by the expressivity of the tools or the language being 
	used] Expressivity versus what is expressed - these are two distinct matters.    (36HQ)
	LarryLefkowitz: And formalism vs content is another distinction. Having a great grammar but a small 
	vocabulary is certainly going to limit expressivity. Yes, in theory you can create new vocabulary on 
	the fly, but that could easily overtake the initial modeling task.    (36HR)
	MikeBennett: Exactly. You can have a very expressive model of data, but it's still a model of data. 
	Or you can have a more or less expressive model of real things in the problem domain, and it's an 
	ontology.    (36HS)
	ElisaKendall: @Frank, the units/dimensions part of the model is limited, fyi, but the SysML effort 
	for quantities and units and this units model are in the process of being aligned now at OMG, with 
	the SysML version being more comprehensive, as you might imagine/hope.    (36HT)
	DavidFlater: @Steve What happened with OASIS Quantities & Units of Measure Ontology?    (36HU)
	DavidPrice: There is also the QUDT ontology from NASA Ames that was input to the OASIS QUOMOS 
	activity ... http://qudt.org/    (36HV)
	ElisaKendall: @David, Folks from OMG who are working on the SysML QUDV effort are looking at 
	alignment with that as well, fyi, but I don't know much about the differences. I have heard there 
	are some, which they are working through, and that JPL is involved along with ESA.    (36HW)
	IlyaZaslavsky: @LinePouchard: do you know if there are already soil vocabularies available in some 
	form? Is there any relationship with SoilML? CZO project would be interested in this.    (36HX)
	LinePouchard: @Ilya: yes, I had talked to Nancy W. about this, and last Fall, SoilML was not 
	released yet.    (36HY)
	DamianGessler: Thank you    (36HZ)
	GaryBergCross: bye all    (36I0)
	PeterYim: great session!    (36I1)
	SteveRay: Thanks everybody for making the session stimulating.    (36I2)
	PeterYim: -- session ended: 11:26am PST --    (36I3)
 -- end of in-session chat-transcript --    (368F)

Audio Recording of this Session    (368L)

Additional Resources:    (368U)


For the record ...    (369A)

How To Join (while the session is in progress)    (369B)