OntologySummit2012: Session-10, Thursday 2012-03-15    (365S)

Summit Theme: OntologySummit2012: "Ontology for Big Systems"    (36TB)

Track 3 Title: Challenge: Ontology and Big Data    (36TC)

Session Topic: Big Data Developing Challenges    (36TD)

Session Chairs: Ms. MaryBrady (NIST) and Mr. ErnieLucier (NCO/NITRD) - intro-slides    (36TE)

Panelists:    (36TG)

Archives:    (36TL)

Conference Call Details    (365T)

Attendees    (36UK)

ABSTRACT:    (365U)

Session Topic: Meeting Big Data Challenges through Ontology - III    (36UV)

This is our 7th Ontology Summit, a joint initiative by NIST, Ontolog, NCOR, NCBO, IAOA & NCO_NITRD with the support of our co-sponsors. The theme adopted for this Ontology Summit is "Ontology for Big Systems." The event today is our 10th virtual session.    (36UW)

The principal goal of the summit is to bring together and foster collaboration among the ontology community, the systems community, and stakeholders of some of these "big systems." Together, the summit participants will exchange ideas on how ontological analysis and ontology engineering might make a difference when applied in these "big systems." We will aim towards producing a series of recommendations describing how ontologies can create an impact, as well as providing illustrations where these techniques have been, or could be, applied in domains such as bioinformatics, electronic health records, intelligence, the smart electrical grid, manufacturing and supply chains, earth and environmental, e-science, cyberphysical systems and e-government. As is traditional with the Ontology Summit series, the results will be captured in the form of a communiqué, with expanded supporting material provided on the web.    (36UX)

The goal of "Meeting Big Data Challenges through Ontology" Track 3 is to identify issues that can be addressed using an ontology challenge. Challenges can take many forms and target many issues.    (36UY)

Potential issues to be addressed by challenges:    (36UZ)

Potential challenge directions    (36VO)

The first session of Track 3 - ConferenceCall_2012_02_09 - was designed to help everyone understand the relationships between big data challenges and ontologies. At this second session today - ConferenceCall_2012_03_15 - we hope to talk about solutions and benefits. At the OntologySummit2012_Symposium, we would like to present various approaches to implementing ontologies using challenges and a sample from the NITRD Big Data working group.    (36VX)

More details about this Summit at: OntologySummit2012 (home page for the summit)    (36VY)

Agenda:    (365V)

Ontology Summit 2012 - Panel Session-10    (36VZ)

Proceedings:    (36W5)

Please refer to the above    (36W6)

IM Chat Transcript captured during the session:    (36W7)

 see raw transcript here.    (36W8)
 (for better clarity, the version below is a re-organized and lightly edited chat-transcript.)
 Participants are welcome to make light edits to their own contributions as they see fit.    (36W9)
 -- begin in-session chat-transcript --    (36WA)
	PeterYim: Welcome to the    (378J)
	 = OntologySummit2012: Session-10, Thursday 2012-03-15 =    (378K)
	Summit Theme: OntologySummit2012: "Ontology for Big Systems"    (378L)
	Track (3) Title: Challenge: Ontology and Big Data    (378M)
	Session Topic: Big Data Developing Challenges    (378N)
	Session Chairs: Ms. MaryBrady (NIST) and Mr. ErnieLucier (NCO/NITRD)    (378O)
	Panelists:    (378P)
	* Professor TimFinin (UMBC) - "Making the Semantic Web Easier to Use"    (378Q)
	* Dr. KyoungsookKim (NICT, JP) - "Use cases of cyber-physical data cloud computing"    (378R)
	* Dr. MikeFolk (HDF Group) - "The HDF5 technology suite"    (378S)
	* Dr. MarioPaolucci (LABSS/ISTC/CNR, Rome, Italy) - "FuturICT: Global Participatory Computing for Our Complex World"    (378T)
	* Dr. UrsulaKattner (NIST) - "Data Needs for the Materials Genome Initiative (MGI) at NIST"    (378U)
	* Dr. EdinMuharemagic (HPCC Systems; LexisNexis) - "HPCC Systems Machine Learning"    (378V)
	Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_03_15    (378W)
	Mute control: *7 to un-mute ... *6 to mute    (378X)
	Can't find Skype Dial pad? ... it's under the "Call" dropdown menu as "Show Dial pad"    (378Y)
	 == Proceedings: ==    (378Z)
	anonymous morphed into TimFinin    (3790)
	anonymous morphed into HasanSayani    (3791)
	anonymous morphed into MatthewHettinger    (3792)
	anonymous morphed into MikeFolk    (3793)
	anonymous1 morphed into EdLowry    (3794)
	anonymous morphed into AndreaWesterinen    (3795)
	anonymous morphed into ChristopherSpottiswoode    (3796)
	KyoungsookKim: can you hear me    (3797)
	anonymous1 morphed into MarioPaolucci    (3798)
	anonymous morphed into DavidOrloff    (3799)
	anonymous morphed into CoryCasanave    (379A)
	anonymous morphed into RosarioUcedaSosa    (379B)
	anonymous morphed into ElizabethFlorescu    (379C)
	BobSchloss: Peter, Leo et al: I am thinking about the April 12-13 F2F at NIST. I may not be able to 
	be in Maryland by first thing in the morning on April 12th. Would you aim to put some details about 
	start time, and agenda, on the page for the Symposium sometime in the next week? Thank you    (379D)
	BobSchloss: I do see that the page says that if you arrive at NIST at 8am, there will be time to get 
	through security before things start. I just want to check that I could still arrive later and get 
	through Security.... If there is one person I should talk to about such logistics, just give me 
	their name, e-mail address, phone number. Thank you.    (379E)
	MaryBrady: Bob, you can most certainly arrive later at NIST and get through security. Arriving at 
	8:00 will just allow you to make it through prior to the start of the meeting.    (379F)
	MichaelGruninger: @BobSchloss: We will be putting an initial agenda up on the Symposium page 
	sometime tomorrow    (379G)
	PeterYim: @BobSchloss - ref. person you can call - the official contact I have from the NIST 
	registration page is at: 
	http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2012/WorkshopRegistration#nid36LP ... so feel 
	free to call me on my mobile ( 650 578.9998 ) too    (379H)
	PeterYim: == Chairs doing intro ...    (379I)
	DougFoxvog: I've been unable to connect to Shared-screen support at http://vnc2.cim3.net:5800/ . Is 
	it up and running?    (379J)
	PeterYim: @Doug ... it should be - does anyone else have problems (especially if you have been able 
	to do that before)    (379K)
	DougFoxvog: I was unable to connect with Firefox. But i managed to dredge up an Internet Explorer, 
	and there was no problem. If anyone else is having a problem connecting with Firefox, i would 
	suggest they try a different browser.    (379L)
	MarioPaolucci: I'm seeing the VNC but it took quite a while to establish connection    (379M)
	PeterYim: @Mario - don't worry about it (if you cannot get a "good" connection to the vnc server) 
	just use your own slides (on your desktop) but please remember to call out slide advances and call 
	out the slide numbers each time    (379N)
	anonymous morphed into JosephTennis    (379O)
	anonymous1 morphed into JodyDesRoches    (379P)
	anonymous2 morphed into BuckNimz    (379Q)
	anonymous morphed into UrsulaKattner    (379R)
	anonymous1 morphed into ChristopherSpottiswoode    (379S)
	PeterYim: == TimFinin presenting ...    (379T)
	DeborahMacPherson: Hi Everyone! Interesting discussion this all has been. I have several entire 
	threads set aside to read again fully    (379U)
	SteveRay: @Tim Finin: Very cool. Is there a way to point your GOR tool to other SPARQL endpoints?    (379V)
	LeoObrst: @Tim: concerning Varish Mulwad's research (inferring semantics of tables), was Formal 
	Concept Analysis considered?    (379W)
	TimFinin: @leo --- No, not to my knowledge. Good idea. We'll look into it and think about its value 
	for this problem.    (379X)
	ErnieLucier: HTTP copies as hFp for all reference URLs in TimFinin's presentation. Replacing hFp 
	with http works.    (379Y)
	PeterYim: == KyoungsookKim presenting ...    (379Z)
	KyoungsookKim: hi    (37A0)
	MaryBrady: Dr. Kim, we are ready, *7 to un-mute    (37A1)
	JosephTennis: This has been GREAT! Sorry I was late. And sadly, I'll have to leave early. Looking 
	forward to participating more in the near future!    (37A2)
	PeterYim: @JosephTennis - Thank you for the participation, Joe    (37A3)
	JosephTennis: ciao!    (37A4)
	NicolaGuarino: Folks, unfortunately I have a car emergency, I have to leave. Perhaps I'll manage to 
	connect again in 40 minutes or so, not sure. Sorry missing Mario's presentation.    (37A5)
	PeterYim: @Nicola - thank you for the heads up ... bye!    (37A6)
	ErnieLucier: Is permission to use social networks required?    (37A7)
	HaroldBoley: @Dr. Kim, in Real-world Awareness Computing, could the 
	Observation / Perception / (Communication) / Action sequence be formalized using 
	Event / Condition / Action rules?    (37A8)
	KyoungsookKim: currently, I try to use a rule-based language.    (37A9)
	KyoungsookKim: like datalog    (37AA)
	HaroldBoley: Maybe the premises of datalog rules need to be partitioned into Event and Condition 
	parts?    (37AB)
	HaroldBoley: Events are 'sensed' as external observations. Conditions 'test' the internal knowledge 
	base.    (37AC)
	PeterYim: == MikeFolk presenting ...    (37AD)
	BobSchloss: Just an observation -- many people are working on highly scalable triplestores, some 
	with interesting partitioning and distribution and federation functions. We might sometime convene a 
	panel with all of these people showing what they did. It is not just graph stores that are advancing 
	-- the entire NoSQL movement is starting to develop various interesting strategies, in which some of 
	the classic ACID properties are slightly relaxed.    (37AE)
	AliHashemi: The pdf version of this presentation is not rendering appropriately for me. Is it just 
	me?    (37AF)
	KyoungsookKim: me too    (37AG)
	AliHashemi: @Kyoungsook - I downloaded the file and opened it in Reader - it works ok there, I think 
	my browser's pdf reader doesn't show it correctly.    (37AH)
	KyoungsookKim: thank you.    (37AI)
	AmandaVizedom: @MikeFolk - HDF5 is new to me, so I find I have some "what *is* HDF5?" questions 
	below the level of your talk. I see that at http://www.hdfgroup.org/HDF5/, there are links for "What 
	is HDF5?" "Questions (FAQ)," and "HDF5 Tutorial" links. Would you recommend these as the best source 
	for an overview of the fundamentals of HDF5?    (37AJ)
	DavidOrloff: Sorry need to run to a meeting - if anyone is looking for data to work with that is 
	already tagged with ontologies (9 in use) look at http://www.cellimagelibrary.org and contact me 
	DavidOrloff at dorloff[at]ascb.org - thanks for letting me sit in - I will be back for future calls.    (37AK)
	PeterYim: == MarioPaolucci presenting ...    (37AL)
	PeterYim: @MarioPaolucci - how does the project mitigate between its desire to be "open" versus its 
	dependencies (ref. your slides) on commercial products (like skype, facebook, etc. ... which are 
	usually non-open)?    (37AM)
	MarioPaolucci: @PeterYim: There are several strategies possible. We plan to build alternative data 
	sources and provide access to them. We will not depend on commercial platforms (the names in the 
	slide were there more as examples of changes brought about by technology), but we hope to create 
	our own data sharing platform, where users themselves authorize access to their data. In a sense, we 
	hope to convince people to reclaim access to their data.    (37AN)
	MarioPaolucci: @PeterYim: We think users would be happier to share data with a privacy preserving, 
	non profit project, but it's a risky bet, I agree.    (37AO)
	PeterYim: @Mario - thank you ... but then, the challenge comes in the form of how (and if at all) 
	one can build out a user base of hundreds of millions of people (like the success some of these 
	commercial social network platforms have achieved)    (37AP)
	HaroldBoley: @MarioPaolucci, what would be the initial steps for moving from a Strongly Coupled 
	System to a Weakly Coupled System?    (37AQ)
	MarioPaolucci: @Harold: I can only provide stylized examples; it depends on the specific problem. 
	But of course you can hardly intervene on the self-organization, so what can be done is changing the 
	terrain where things happen. The examples that come to my mind are the roundabout instead of the 
	intersection; or in a sand pile model, breaking up the table so that cascades remain limited.    (37AR)
	HaroldBoley: @Mario, You could 'overlay' the roundabout -- with 4 quarter-circle 'bypasses' -- over 
	the intersection, so at least to help those not in the center of the congestion.    (37AS)
	MarioPaolucci: @Harold: Exactly. Also, adding new dimensions helps - either by digging a tunnel, or - 
	better - providing cars with vertical mobility. Think of this as a metaphor - new dimensions are 
	easier to create in virtual worlds, of course    (37AT)
	MarioPaolucci: oops. I forgot to put the links.    (37AU)
	MarioPaolucci: list of supporters: http://www.futurict.eu/the-project/whos-involved    (37AV)
	MarioPaolucci: to join: http://www.futurict.eu/the-project    (37AW)
	PeterYim: == UrsulaKattner presenting ...    (37AX)
	UrsulaKattner: The link for the Materials Genome Initiative is 
	http://www.whitehouse.gov/blog/2011/06/24/materials-genome-initiative-renaissance-american-manufacturing    (37AY)
	DeborahMacPherson: Interested in discussing the differences between calculated and measured values 
	referred to by the speaker just now    (37AZ)
	UrsulaKattner: @DeborahMacPherson: Measured values have a confidence resulting from the error of the 
	measurement, calculated values have no such error. However, a confidence for these data is needed to 
	properly judge them in the context of data that describe a material.    (37B0)
	DeborahMacPherson: Need to sign off - thanks to all the speakers -    (37B1)
	PeterYim: == EdinMuharemagic presenting ...    (37B2)
	DougFoxvog: I think we're on slide 3    (37B3)
	PeterYim: @Mary - please ask the speaker to call out the slide advance AND slide number    (37B4)
	MaryBrady: Please, if you have questions, post them here...we'll be sure to engage the speakers in 
	answering the questions over e-mail    (37B5)
	MatthewHettinger: @Edin, What open source products are you using?    (37B6)
	PeterYim: == MaryBrady (co-chair) moderating open discussion    (37B7)
	MaryBrady: In particular a number of technologies and use cases for BIG DATA have been presented 
	this afternoon. Any thoughts on potential uses for Ontology?    (37B8)
	MarioPaolucci: @Mary: We have ontology components in all parts of the FuturICT architecture, of 
	course. Nicola Guarino knows more about them. But we have a critical need of ontologies that allow 
	the different components to communicate - think of aligning models and simulation results along 
	disciplines (sociology, complex science) and along levels of detail (individual agents, 
	organizations, groups, etc.) There should be a world of modeling components in which ontology is very 
	important.    (37B9)
	FrankOlken: @PeterYim Will all of the Ontology Summit sessions be webcast? I am thinking of 
	attending remotely ...    (37BA)
	PeterYim: @FrankOlken - are you referring to the OntologySummit2012_Symposium (at NIST on 4/12 & 
	13)?    (37BB)
	PeterYim: @Frank - ... assuming that, the answer is "yes" - remote participation will be supported 
	for all the sessions    (37BC)
	FrankOlken: @PeterYim Yes, I am referring to the Ontology Summit on April 12-13 at NIST. I am 
	already committed to a trip the previous week (Data Engineering conf) and am reluctant to commute to 
	DC twice in 2 weeks.    (37BD)
	SteveRay: @Frank: We may have a problem using Skype when we are at NIST, because they ban Skype 
	there for security reasons.    (37BE)
	PeterYim: @Steve, @Frank - we will be hosting calls in a way (and with the same tools) similar to 
	all Ontolog virtual session ... except that we may not support shared-screen (vnc), but then, that 
	has never been a show stopper for us    (37BF)
	FrankOlken: @PeterYim I think that your infrastructure for supporting the teleconferences has worked 
	quite well. I use skype to listen.    (37BG)
	anonymous morphed into NicolaGuarino    (37BH)
	MarioPaolucci: @Nicola: bentornato!    (37BI)
	MaryBrady: Any thoughts on the integration of ontology components with output from machine learning 
	techniques?    (37BJ)
	FrankOlken: @MaryBrady I recall that some folks have suggested using ontologies to suggest concepts to 
	be learned.    (37BK)
	MaryBrady: @FrankOlken Yes, here at NIST we have used combination techniques between ontologies and 
	machine learning. Simple queries can sometimes take days to complete.    (37BL)
	FrankOlken: Nearly every machine learning algorithm is available under R.    (37BM)
	FrankOlken: There are versions of R that run on clouds with Hadoop.    (37BN)
	FrankOlken: There is recent work at IBM and Univ. of Wisconsin on parallel implementation of 
	stochastic gradient descent for machine learning.    (37BO)
	LeoObrst: Must go now. Very interesting session. Thanks to all!    (37BP)
	DougFoxvog: @Mario You discuss using "Crowd sourcing" and "citizen science" for a platform for 
	economic and political participation. People on different sides of various issues would have 
	competing "science". How would you deal with this?    (37BQ)
	NicolaGuarino: @Doug: here is exactly one of the roles of ontologies in this project: exposing 
	disagreements about different opinions...    (37BR)
	NicolaGuarino: @Doug: the point is *understanding* the different models, not necessarily forcing 
	them to align with each other    (37BS)
	DougFoxvog: @Nicola So long as the different theories/models are kept separate, i strongly agree. 
	The problem i saw was with an "open" system which would allow people to modify theories that they 
	didn't create.    (37BT)
	NicolaGuarino: @Doug: You are right. Definitely people shouldn't be allowed to modify things at 
	their ease... especially if the underlying assumptions are not shared...    (37BU)
	MarioPaolucci: Thank you everybody for listening and for the questions. I have to leave now, bye!    (37BV)
	PeterYim: @KyoungsookKim - how effectively did the systems cited in your use cases turn out (in real 
	life) ... were there metrics available?    (37BW)
	AmandaVizedom: Dr. Kim, someone responded to my G+ posting about your presentation by mentioning 
	evacuation response research such as some at Univ. of Minnesota ( 
	http://gradworks.umi.com/32/05/3205248.html, 
	http://www.spatial.cs.umn.edu/paper_ps/evac_SSTD05.pdf). My response is that this sort of research, 
	in itself valuable, would relate to your use case as a contribution to *one* of the areas of 
	computation involved in the response. It seems to me that what makes your use case such a Grand 
	Challenge type case is that it brings together a variety of such areas, including route-planning and 
	information fusion across very heterogeneous sensor and information types and disaster surveillance 
	over networks. Do you agree?    (37BX)
	AmandaVizedom: @KyoungsookKim - I should add that I think it's really a very good Grand Challenge, 
	for a few reasons, including that it is so well grounded in a real need *and* real, existing data 
	environments, and success has such clear benefits.    (37BY)
	ErnieLucier: @Dr. Kim, Is permission to use social networks required or a problem?    (37BZ)
	KyoungsookKim: using social network, we don't try to use personal information itself. We aggregate a 
	group of messages and extract trend information or changing information.    (37C0)
	SteveRay: Must run. Thanks for a stimulating session.    (37C1)
	PeterYim: wonderful session ... great presentations!    (37C2)
	PeterYim: -- session ended: 11:49am PDT --    (37C3)
 -- end of in-session chat-transcript --    (36WB)
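
Illustrative note on the Event / Condition / Action exchange in the transcript above (HaroldBoley and KyoungsookKim): the suggestion was that the premises of a datalog-style rule could be partitioned into an Event part, which is 'sensed' as an external observation, and a Condition part, which 'tests' the internal knowledge base. The following is only a minimal sketch of that partitioning in Python, not a description of Dr. Kim's system. All names here (EcaRule, dispatch, flood_alert, the water-level threshold, the region_populated fact) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration only; nothing here comes from the session.
Event = Dict[str, object]          # an externally sensed observation
KnowledgeBase = Dict[str, object]  # internal facts that conditions are tested against


@dataclass
class EcaRule:
    name: str
    on_event: Callable[[Event], bool]              # Event part: matches a sensed observation
    condition: Callable[[KnowledgeBase], bool]     # Condition part: tests the internal knowledge base
    action: Callable[[Event, KnowledgeBase], str]  # Action part: runs only when both parts hold


def dispatch(event: Event, kb: KnowledgeBase, rules: List[EcaRule]) -> List[str]:
    """Return the action results of every rule whose Event and Condition parts both hold."""
    results = []
    for rule in rules:
        if rule.on_event(event) and rule.condition(kb):
            results.append(rule.action(event, kb))
    return results


# Assumed example rule: raise an alert when a water-level reading exceeds
# an arbitrary threshold AND the knowledge base says the region is populated.
flood_alert = EcaRule(
    name="flood_alert",
    on_event=lambda e: e.get("type") == "water_level" and e.get("value", 0) > 3.0,
    condition=lambda kb: bool(kb.get("region_populated", False)),
    action=lambda e, kb: f"alert: water level {e['value']} m at {e['sensor']}",
)
```

Keeping the Event and Condition predicates as separate fields means the same observation can be checked against different knowledge-base states without rewriting the rule, which is the separation HaroldBoley's question points at.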

Audio Recording of this Session    (36WH)

Additional Resources:    (36WQ)


For the record ...    (36X7)

How To Join (while the session is in progress)    (36X8)