Ontolog Invited Speaker Presentation - Dr. Ramanathan V. Guha - Thu 2011.12.01    (2Z1H)

Agenda & Proceedings:    (2ZEF)

http://ontolog.cim3.net/file/resource/presentation/Schema.org--RVGuha_20111201/RVGuha.jpg [ Dr. Ramanathan V. Guha ]    (2ZMU)

Schema.org provides a collection of schemas, i.e., HTML tags, that webmasters can use to mark up their pages in ways recognized by major search providers. Search engines including Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results, making it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, search engines have come together to provide a shared collection of schemas that webmasters can use.    (2ZMV)
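
As a rough illustration of how on-page markup lets a program recover structured data from HTML, here is a minimal Python sketch. It assumes the HTML microdata attributes (itemscope, itemtype, itemprop) and the schema.org Movie type; the page fragment, the class name ItemPropParser and the function extract_item are hypothetical, and the markup is simplified (for instance, the director is given as plain text rather than as a nested Person item). It is not how any search engine actually processes pages.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with microdata and the schema.org Movie type.
PAGE = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <span itemprop="genre">Science fiction</span>
  <span>Directed by <span itemprop="director">James Cameron</span></span>
</div>
"""


class ItemPropParser(HTMLParser):
    """Collects the item type and the text of each itemprop element.

    Simplification: handles a single flat item; nested items and
    non-text values (e.g. href or content attributes) are ignored.
    """

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current = None  # itemprop whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        if "itemprop" in attrs:
            self._current = attrs["itemprop"]
            self.properties[self._current] = ""

    def handle_data(self, data):
        # Only text inside an itemprop element is kept.
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Any closing tag ends the current property (flat items only).
        self._current = None


def extract_item(html):
    """Return (item type URL, {property name: text}) for the fragment."""
    parser = ItemPropParser()
    parser.feed(html)
    return parser.item_type, {k: v.strip() for k, v in parser.properties.items()}


if __name__ == "__main__":
    print(extract_item(PAGE))
```

The point of the sketch is that the consumer never has to guess which text is the title or the director: the shared vocabulary names each field, so the same small extractor works on any page that uses it.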

This session will be structured as a Q&A session where Google Fellow Ramanathan Guha will provide a brief introduction to the Schema.org activity and then answer your questions regarding the relation between this work and the broader ontology world.    (2ZMW)

Speaker Bio (with credit to Wikipedia): Ramanathan V. Guha (born 1965) is an Indian computer scientist. He graduated with a B.Tech (Mechanical Engineering) from Indian Institute of Technology Madras, an MS (Mechanical Engineering) from University of California Berkeley and a Ph.D. (Computer Science) from Stanford University. Since May 2005, he has been working at Google.    (2ZMX)

Guha was one of the early co-leaders of the Cyc Project, where he worked from 1987 through 1994 at Microelectronics and Computer Technology Corporation. He was responsible for the design and implementation of key parts of the Cyc system, including the CycL knowledge representation language, the upper ontological layers of the Cyc Knowledge Base and some parts of the original Cyc Natural Language understanding system. Leaving what became Cycorp, Guha founded Q Technology, which created a database schema mapping tool called Babelfish. In 1994, he moved to work at Apple Computer, reporting to Alan Kay, where he developed the Meta Content Framework (MCF) format. In 1997 he joined Netscape Corporation where, together with Tim Bray, he created a new version of MCF that used the XML language and which became the main technical precursor to W3C's Resource Description Framework (RDF) standard. Guha also contributed to the "smart browsing" features of Netscape 4.5 and was instrumental in Netscape's acquisition of the Open Directory Project. In March 1999, he created the first version of RSS as part of Netscape's personalized home page project. In 1999 he left Netscape and in May co-founded Epinions, where he worked until 2000. Guha founded Alpiri in late 2000, which created TAP, a semantic web application and knowledge base. In 2002, he became a researcher at IBM Almaden Research Center. In 2005 Guha joined Google. He currently leads development of Google Custom Search and is one of the champions of the current Schema.org activity being promoted by Google, Microsoft Bing, Yahoo! and Yandex.    (2ZMY)

Transcript of the online chat during the session:    (2ZMZ)

 see raw transcript here.    (2ZN1)
 (for better clarity, the version below is a re-organized and lightly edited chat-transcript.)
 Participants are welcome to make light edits to their own contributions as they see fit.    (2ZN2)
    -- begin of chat session --    (2ZN3)
	PeterYim: Welcome to the    (2ZRC)
	 = Ontolog Invited Speaker Presentation - Dr. Ramanathan V. Guha - Thu 2011.12.01 =    (2ZRD)
	Session Chair: Dr. SteveRay (CMU)    (2ZRE)
	Invited Speakers: Dr. R V Guha (Google, schema.org)    (2ZRF)
	Session Topic: A conversation with R V Guha and Dan Brickley on "schema.org"    (2ZRG)
	Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01    (2ZRH)
	Phone (US): (206) 402-0100 ... PIN: 141184#
	Skype: call - "joinconference" ... PIN: 141184#
	 if you can't find the skype keypad, try the "Call" drop down menu, and select "Show Dial Pad"    (2ZRI)
	Phone keypad controls: To un-mute, press "*7" ... To mute, press "*6"    (2ZRJ)
	 == Proceedings: ==    (2ZRK)
	anonymous morphed into Guha    (2ZRL)
	SteveRay: Welcome, Guha. Glad you made it!    (2ZRM)
	Guha: thanks    (2ZRN)
	Guha: DanBrickley will be joining me in talking    (2ZRO)
	SteveRay: OK. Noted. I will start with an introduction, then hand things over to you.    (2ZRP)
	danbri just joined    (2ZRQ)
	danbri thanks Guha    (2ZRR)
	anonymous1 morphed into Roger Cutler    (2ZRS)
	K Goodier: Hi y'all    (2ZRT)
	anonymous2 morphed into PeterBenson    (2ZRU)
	anonymous1 morphed into DougFoxvog    (2ZRV)
	anonymous1 morphed into GeraldRadack    (2ZRW)
	anonymous1 morphed into shensley    (2ZRX)
	anonymous2 morphed into Kurt Kirkham    (2ZRY)
	anonymous1 morphed into AndreasHarth    (2ZRZ)
	anonymous2 morphed into Kavitha Srinivas    (2ZS0)
	anonymous1 morphed into KingsleyIdehen    (2ZS1)
	anonymous4 morphed into Stefano Bocconi    (2ZS2)
	anonymous3 morphed into Cirrus Shakeri    (2ZS3)
	anonymous1 morphed into Mike Ward    (2ZS4)
	anonymous2 morphed into Ted Bashor    (2ZS5)
	anonymous morphed into BobbinTeegarden    (2ZS6)
	anonymous morphed into ElizabethFlorescu    (2ZS7)
	anonymous morphed into AdrianWalker    (2ZS8)
	anonymous1 morphed into VladTanasescu    (2ZS9)
	anonymous morphed into Lora Aroyo    (2ZSA)
	anonymous morphed into Stefano Bortoli    (2ZSB)
	PeterYim: -- session formally started 9:38am PST --    (2ZSC)
	danbri: (re old Guha / Bray spec, see http://www.w3.org/Submission/1997/8/ )    (2ZSD)
	danbri: -> http://www.w3.org/TR/WD-rdf-syntax-971002/    (2ZSE)
	danbri: nitpick: "RDFa Lite" rather than "RDF Lite"; it's about the in-html notation    (2ZSF)
	danbri: Working Draft out next week    (2ZSG)
	anonymous1 morphed into FrankChum    (2ZSH)
	anonymous1 morphed into GaryBergCross    (2ZSI)
	danbri: discussion of http://en.wikipedia.org/wiki/ISO_8000 
	http://www.dataforge.com/wpblog/index.php/industry-news/iso-22745-standard-based-exchange-of-product-data/    (2ZSJ)
	SteveRay: PeterBenson: ISO 22745 is a set of standard tags with many entries already.    (2ZSK)
	PeterYim: Guha: target audience for schema.org is the "webmasters"    (2ZSL)
	danbri: example: http://schema.org/Movie    (2ZSM)
	DougFoxvog: schema.org could use a classification for PhysicalObject. A common superclass Agent of 
Person & Organization would be useful.    (2ZSN)
	danbri: http://www.rssboard.org/rss-0-9-0    (2ZSO)
	anonymous morphed into Alessander Botti Benevides    (2ZSP)
	LeoObrst: S-expressions in Lisp.    (2ZSQ)
	PeterYim: SteveRay paraphrasing JohnSowa's questions for Guha - ref: 
	http://ontolog.cim3.net/forum/ontolog-forum/2011-11/msg00141.html    (2ZSR)
	danbri: so RDF '97 was PICS-NG, which used s-expressions: http://www.w3.org/TR/NOTE-pics-ng-metadata    (2ZSS)
	danbri: (then XML happened)    (2ZST)
	KingsleyIdehen: John's actual post: 
	http://ontolog.cim3.net/forum/ontolog-forum/2011-11/msg00141.html    (2ZSU)
	JoelBender: (and then N3 happened)    (2ZSV)
	KingsleyIdehen: Then Linked Data happened    (2ZSW)
	danbri: (and then JSON happened...)    (2ZSX)
	DougFoxvog: XML is not restricted to triples. Why was/is RDF so restricted?    (2ZSY)
	KingsleyIdehen: Yes, Linked Data brings it back home to simplicity    (2ZSZ)
	JoelBender: (and now JSON-LD is happening ... maybe)    (2ZT0)
	KingsleyIdehen: Yes, but Linked Data is agnostic re. EAV/SPO based 3-tuples    (2ZT1)
	K Goodier: Keeping things simple and delivering value    (2ZT2)
	KingsleyIdehen: and via HTTP we can negotiate representation    (2ZT3)
	FrankChum: Doug, I like RDF for its simplicity and not as restricted    (2ZT4)
	anonymous morphed into Arnaud J Le Hors    (2ZT5)
	KingsleyIdehen: Good example of this all working, via Linked Data simplicity: 
	http://wiki.goodrelations-vocabulary.org/Microdata    (2ZT6)
	KingsleyIdehen: Yes, we have to "hold our noses" re. large scale adoption. +1    (2ZT7)
	NicolaGuarino: usual problem with skype, sorry    (2ZT8)
	KingsleyIdehen: Here is a link to a note showing how Schema.org mapped to DBpedia leads to network 
	effects: https://plus.google.com/112399767740508618350/posts/ck2yhgTWxtD    (2ZT9)
	KingsleyIdehen: A specific page showing LOD Cloud instance data based on Schema.org cross links: 
	http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fschema.org%2FLandmarksOrHistoricalBuildings&urilookup=1    (2ZTA)
	SteveRay: @Nicola: OK, I'll try you again after Ali is done with his second question, if you raise 
	your hand again.    (2ZTB)
	KingsleyIdehen: Final page showing links between Schema.org and DBpedia (and other vocabularies 
	which appear as you follow-your-nose through the Linked Data): 
	http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fschema.org%2FLandmarksOrHistoricalBuildings&p=1&lp=89&op=-1&last=&gp=1    (2ZTC)
	danbri: on the 'do we need rdf' question, .... we see two trends: (1) people who use RDF, find 
	frustration with the fiddly details of the spec (datatypes, etc.). Perhaps such things are just 
	inherently annoying. There needs to be a rule, but the rule is arbitrary. (2) people who don't use 
	RDF explicitly, often drift towards a data model that is very RDF-like, because RDF didn't appear 
	from nowhere. Graph-shaped data is a very common pattern (cf. Kingsley on EAV). Hence all recent 
	talk on 'social graph', 'interest graph', etc.    (2ZTD)
	KingsleyIdehen: Methinks: Schema.org and Linked Data have a mutually beneficial relationship that in 
	effect fans out to adding more semantic structure to links (actually relations) on the WWW. 
	Schema.org delivers immediate and palpable value    (2ZTE)
	NicolaGuarino: @Steve: sorry, I am not able to talk through skype, too bad    (2ZTF)
	PeterYim: @Nicola: please type out your question on the chat    (2ZTG)
	anonymous1 morphed into DuaneNickull    (2ZTH)
	PeterYim: schema.org - as DanBrickley puts it - characterized by a small working group, consensus, 
	ability to move and make decisions quickly    (2ZTI)
	NicolaGuarino: Here is the comment I wanted to make: The reason why super-simple ontologies like 
FOAF work is that the words are simple to understand. But there are words which everybody 
understands, and words that are ambiguous and difficult to define or explain (e.g., service, 
unemployed person). It is a fact that people doing markup don't care about the deep semantics of their 
tags. So if the goal is to get billions of pages marked up, that's fine. But what about USING these 
marked up pages for information integration, services mashup and so on, instead of just for search? 
BOTTOM LINE: extensive tagging with little semantics may be very useful for search, but not for 
integration of information    (2ZTJ)
	NicolaGuarino: @Guha: but even for application-dependent vocabularies we sometimes need very crisp 
formal definitions....    (2ZTK)
	danbri: (re starting points of Web: http://www.w3.org/History/1989/proposal-msw.html has seeds of 
	RDF in there too)    (2ZTL)
	NicolaGuarino: Deep semantics is needed (sometimes) also for application-dependent purposes, not 
	just for universal purposes    (2ZTM)
	AdrianWalker: To go beyond search applications, some degree of NLP is unavoidable?    (2ZTN)
	DougFoxvog: I suggest that small ontologies can build on larger existing ones. Those who use them do 
	not need to use everything from the larger ontologies. Deep ontologies would have rules and 
	reasoning structures that are immaterial to small systems that use parts of them.    (2ZTO)
	PeterBenson: our experience with ISO 8000 is that you need sufficient data to meet a defined 
	requirement - nothing more. As requirements grow so does the depth of data.    (2ZTP)
	NicolaGuarino: Besides schema.org, why not invest in a MINIMAL formal vocabulary, clarifying for 
instance the various notions of PART or DEPENDENCE?    (2ZTQ)
	Stefano Bortoli: being too narrow in the definition of schemas might end up in a higher cost of 
maintenance of the application after all. This is a lesson we should have learned from software 
engineering at least. So, deep thinking and generalization to some extent is necessary. Simple and 
easy is good in the short term, but we risk creating asbestos that will be very hard to handle in 
the future    (2ZTR)
	NicolaGuarino: @Stefano Bortoli: +1    (2ZTS)
	PeterYim: @James Sorace - you can click on the "Settings" button (at the top center of the window) 
to change "anonymous" to your real name    (2ZTT)
	DougFoxvog: Contexts can separate ontologies into subsets. Guha is talking about the problems of "an 
ontology of everything". Cyc developed the idea of Microtheories (but I'm not sure if it was after 
he left). By placing rules and relationships in such contexts (or microtheories) one can avoid many 
of the problems of an "ontology of everything". This becomes an issue on the Semantic Web, where 
triples make it hard to place statements within specific contexts.    (2ZTU)
	VladTanasescu: Any pointers to this ACM article?    (2ZTV)
	GaryBergCross: What consideration has schema.org given to controlled natural languages? Some efforts 
	have tried to make OWL and Common Logic easier to express.    (2ZTW)
	danbri: @DougFoxvog: Guha has 'Contexts: A Formalization and Some 
Applications'...    (2ZTX)
	Stefano Bortoli: @Doug I don't think that anyone is really aiming at the "philosophical ontology", 
not in the Semantic Web at least. Indeed, the first efforts were spent in automatic ontology 
mappings, rather than producing semantically annotated data. Contexts are particularly complex to 
manage in a context-less environment such as the Web. The least we can do is to try to be formal in 
defining concepts to reduce the risk of misunderstanding.    (2ZTY)
	GaryBergCross: One issue with Microtheories is when do you create a new one versus adapt an 
existing one.    (2ZTZ)
	PeterYim: Guha: adoption is currently on the order of thousands of sites and billions of pages    (2ZU0)
	SteveRay: Certainly some standards development efforts are importing existing external concepts or 
	"ontologies" to a much greater degree today.    (2ZU1)
	danbri: on re-use, one q is whether publishers/authors of instance data should bear the cost of that 
	sharing/re-use. Mainstream RDF / SemWeb culture is to have instance data cite several different 
	ontologies. Schema.org rather pre-packages things and offers the package as a single usable thing...    (2ZU2)
	danbri: re rNews - see http://blog.schema.org/2011/09/extended-schemaorg-news-support.html for 
	details    (2ZU3)
	danbri: http://www.iptc.org/site/Home/Media_Releases/schema.org_adopts_IPTC's_rNews_for_news_markup    (2ZU4)
	Roger Cutler: I don't think he said billions of pages. Thousands of sites & billions of pages means 
	millions of pages per site, right?    (2ZU5)
	danbri: (yup, we should make the various mappings to/from schema.org easier to find)    (2ZU6)
	DougFoxvog: @Gary -- You can create a new microtheory when describing a narrower field or using 
multiple existing contexts, or when presenting information about a specific event or other 
individual.    (2ZU7)
	danbri: http://wiki.creativecommons.org/LRMI/Specification_v0.5    (2ZU8)
	NicolaGuarino: A couple of problems I find in the current taxonomic structure of schema.org: 1. A 
	governmentOffice is both a place and an organization    (2ZU9)
	ChristopherSpottiswoode: What a privilege that was, to be able to listen in on that conversation, 
	with all that experience! Thank you all.    (2ZUA)
	DougFoxvog: @Gary -- adapt an existing context when providing more info @ same level    (2ZUB)
	Stefano Bortoli: thanks    (2ZUC)
	Stefano Bortoli: bye    (2ZUD)
	PeterYim: Great session ... thank you Guha, Dan and everyone for coming!    (2ZUE)
	Guha: Thank you everyone    (2ZUF)
	danbri: Thanks all    (2ZUG)
	PeterYim: -- session ended : 11:00am PST --    (2ZUH)
    -- end of chat session --    (2ZN4)

Audio Recording of this Session    (2ZN8)


For the record ...    (2ZNG)

How To Join (while the session is in progress)    (2ZNH)

Conference Call Details    (2ZLM)

Attendees    (2ZMD)