Ontolog Invited Speaker Presentation - Dr. Ramanathan V. Guha - Thu 2011.12.01 (2Z1H)
- Presentation Title: "schema.org" (2ZLF)
- Archive: (2ZLG)
- [ Agenda & Proceedings ] (2ZLH)
- [ Abstract ] (2ZLI)
- there will not be any slides for this talk (2ZLJ)
- [ audio recording of the session ] [ 1:21:31 ; mp3 ; 9.33 MB ] (2ZLK)
- [ Transcript of the online chat session ] during the panel discussion (2ZLL)
Agenda & Proceedings: (2ZEF)
- Session Format and Agenda: (2ZMO)
- this will be a virtual session over a phone conference setting, augmented by in-session chat and shared computer screen support (2ZMP)
- Introduction of the invited speakers - session chair: SteveRay (2ZMQ)
- Presentation by our invited speakers - RamanathanGuha (30~45 min.) (2ZMR)
- Q&A and Open discussion (30~45 min.) [Kindly identify yourself before speaking.] (2ZMS)
- Presentation Title: "Schema.org" (2ZMT)
[ Dr. Ramanathan V. Guha ] (2ZMU)
- Abstract: (2ZEE)
Schema.org provides a collection of schemas, i.e., html tags, that webmasters can use to mark up their pages in ways recognized by major search providers. Search engines including Bing, Google, Yahoo! and Yandex rely on this markup to improve the display of search results, making it easier for people to find the right web pages. Many sites are generated from structured data, which is often stored in databases. When this data is formatted into HTML, it becomes very difficult to recover the original structured data. Many applications, especially search engines, can benefit greatly from direct access to this structured data. On-page markup enables search engines to understand the information on web pages and provide richer search results in order to make it easier for users to find relevant information on the web. Markup can also enable new tools and applications that make use of the structure. A shared markup vocabulary makes it easier for webmasters to decide on a markup schema and get the maximum benefit for their efforts. So, in the spirit of sitemaps.org, search engines have come together to provide a shared collection of schemas that webmasters can use. (2ZMV)
This session will be structured as a Q&A session where Google Fellow Ramanathan Guha will provide a brief introduction to the Schema.org activity and then answer your questions regarding the relation between this work and the broader ontology world. (2ZMW)
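As a rough illustration of the on-page markup the abstract describes, the sketch below shows how a consumer might recover structured data from a page fragment annotated with schema.org microdata. It is only a minimal sketch, not how any search engine actually processes pages: the sample HTML, the `MicrodataExtractor` class and the `extract` helper are hypothetical names made up for this example, and nested items are deliberately ignored.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with schema.org microdata (illustrative only).
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <span itemprop="genre">Science fiction</span>
  <span>Not marked up</span>
</div>
"""


class MicrodataExtractor(HTMLParser):
    """Collects the item type and flat itemprop/text pairs (no nested items)."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current_prop = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An element carrying itemscope and itemtype declares the schema.org type.
        if "itemscope" in attrs and "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        # Remember which property (if any) the upcoming text belongs to.
        self._current_prop = attrs.get("itemprop")

    def handle_data(self, data):
        text = data.strip()
        if self._current_prop and text:
            self.properties[self._current_prop] = text
            self._current_prop = None

    def handle_endtag(self, tag):
        # Text after a closing tag no longer belongs to the last property.
        self._current_prop = None


def extract(html):
    """Return (item type URL, {property: text}) for a simple, flat item."""
    parser = MicrodataExtractor()
    parser.feed(html)
    return parser.item_type, parser.properties


if __name__ == "__main__":
    print(extract(SAMPLE_HTML))
```

The point of the sketch is that the type and property names come from the shared vocabulary (here http://schema.org/Movie), so a consumer does not have to guess the structure from the visual layout of the page.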
- About the Speakers: (2ZK8)
Speaker Bio (with credit to Wikipedia): Ramanathan V. Guha (born 1965) is an Indian computer scientist. He graduated with B.Tech (Mechanical Engineering) from Indian Institute of Technology Madras, MS (Mechanical Engineering) from University of California Berkeley and Ph.D (Computer Science) from Stanford University. Since May 2005, he has been working at Google. (2ZMX)
Guha was one of the early co-leaders of the Cyc Project where he worked from 1987 through 1994 at Microelectronics and Computer Technology Corporation. He was responsible for the design and implementation of key parts of the Cyc system, including the CycL knowledge representation language, the upper ontological layers of the Cyc Knowledge Base and some parts of the original Cyc Natural Language understanding system. Leaving what became Cycorp, Guha founded Q Technology, which created a database schema mapping tool called Babelfish. In 1994, he moved to work at Apple Computer, reporting to Alan Kay, where he developed the Meta Content Framework (MCF) format. In 1997 he joined Netscape Corporation where together with Tim Bray, he created a new version of MCF that used the XML language and which became the main technical precursor to W3C's Resource Description Framework (RDF) standard. Guha also contributed to the "smart browsing" features of Netscape 4.5 and was instrumental in Netscape's acquisition of the Open Directory Project. In March 1999, he created the first version of RSS as part of Netscape's personalized home page project. In 1999 he left Netscape and in May co-founded Epinions where he worked until 2000. Guha founded Alpiri in late 2000 which created TAP, a semantic web application and knowledge base. In 2002, he became a researcher at IBM Almaden Research Center. In 2005 Guha joined Google. He currently leads development of Google Custom Search and is one of the champions of the current Schema.org activity being promoted by Google, Microsoft Bing, Yahoo! and Yandex. (2ZMY)
Transcript of the online chat during the session: (2ZMZ)
see raw transcript here. (2ZN1)
(for better clarity, the version below is a re-organized and lightly edited chat-transcript.) Participants are welcome to make light edits to their own contributions as they see fit. (2ZN2)
-- begin of chat session -- (2ZN3)
PeterYim: Welcome to the (2ZRC)
= Ontolog Invited Speaker Presentation - Dr. Ramanathan V. Guha - Thu 2011.12.01 = (2ZRD)
Session Chair: Dr. SteveRay (CMU) (2ZRE)
Invited Speakers: Dr. R V Guha (Google, schema.org) (2ZRF)
Session Topic: A conversation with R V Guha and Dan Brickley on "schema.org" (2ZRG)
Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01 (2ZRH)
Phone (US): (206) 402-0100 ... PIN: 141184# Skype: call - "joinconference" ... PIN: 141184# if you can't find the skype keypad, try the "Call" drop down menu, and select "Show Dial Pad" (2ZRI)
Phone keypad controls: To un-mute, press "*7" ... To mute, press "*6" (2ZRJ)
== Proceedings: == (2ZRK)
anonymous morphed into Guha (2ZRL)
SteveRay: Welcome, Guha. Glad you made it! (2ZRM)
Guha: thanks (2ZRN)
Guha: DanBrickley will be joining me in talking (2ZRO)
SteveRay: OK. Noted. I will start with an introduction, then hand things over to you. (2ZRP)
danbri just joined (2ZRQ)
danbri thanks Guha (2ZRR)
anonymous1 morphed into Roger Cutler (2ZRS)
K Goodier: Hi y'all (2ZRT)
anonymous2 morphed into PeterBenson (2ZRU)
anonymous1 morphed into DougFoxvog (2ZRV)
anonymous1 morphed into GeraldRadack (2ZRW)
anonymous1 morphed into shensley (2ZRX)
anonymous2 morphed into Kurt Kirkham (2ZRY)
anonymous1 morphed into AndreasHarth (2ZRZ)
anonymous2 morphed into Kavitha Srinivas (2ZS0)
anonymous1 morphed into KingsleyIdehen (2ZS1)
anonymous4 morphed into Stefano Bocconi (2ZS2)
anonymous3 morphed into Cirrus Shakeri (2ZS3)
anonymous1 morphed into Mike Ward (2ZS4)
anonymous2 morphed into Ted Bashor (2ZS5)
anonymous morphed into BobbinTeegarden (2ZS6)
anonymous morphed into ElizabethFlorescu (2ZS7)
anonymous morphed into AdrianWalker (2ZS8)
anonymous1 morphed into VladTanasescu (2ZS9)
anonymous morphed into Lora Aroyo (2ZSA)
anonymous morphed into Stefano Bortoli (2ZSB)
PeterYim: -- session formally started 9:38am PST -- (2ZSC)
danbri: (re old Guha / Bray spec, see http://www.w3.org/Submission/1997/8/ ) (2ZSD)
danbri: -> http://www.w3.org/TR/WD-rdf-syntax-971002/ (2ZSE)
danbri: nitpic "RDFa Lite" rather than "RDF Lite"; it's about the in-html notation (2ZSF)
danbri: Working Draft out next week (2ZSG)
anonymous1 morphed into FrankChum (2ZSH)
anonymous1 morphed into GaryBergCross (2ZSI)
danbri: discussion of http://en.wikipedia.org/wiki/ISO_8000 http://www.dataforge.com/wpblog/index.php/industry-news/iso-22745-standard-based-exchange-of-product-data/ (2ZSJ)
SteveRay: PeterBenson: ISO 22745 is a set of standard tags with many entries already. (2ZSK)
PeterYim: Guha: target audience for schema.org is the "webmasters" (2ZSL)
danbri: example: http://schema.org/Movie (2ZSM)
DougFoxvog: schema.org could use a classification for PhysicalObject. A common superclass Agent of Person & Organization would be useful. (2ZSN)
danbri: http://www.rssboard.org/rss-0-9-0 (2ZSO)
anonymous morphed into Alessander Botti Benevides (2ZSP)
LeoObrst: S-expressions in Lisp. (2ZSQ)
PeterYim: SteveRay paraphrasing JohnSowa's questions for Guha - ref: http://ontolog.cim3.net/forum/ontolog-forum/2011-11/msg00141.html (2ZSR)
danbri: so RDF '97 was PICS-NG, which used s-expressions: http://www.w3.org/TR/NOTE-pics-ng-metadata (2ZSS)
danbri: (then XML happened) (2ZST)
KingsleyIdehen: John's actual post: http://ontolog.cim3.net/forum/ontolog-forum/2011-11/msg00141.html (2ZSU)
JoelBender: (and then N3 happened) (2ZSV)
KingsleyIdehen: Then Linked Data happened (2ZSW)
danbri: (and then JSON happened...) (2ZSX)
DougFoxvog: XML is not restricted to triples. Why was/is RDF so restricted? (2ZSY)
KingsleyIdehen: Yes, Linked Data brings it back home to simplicity (2ZSZ)
JoelBender: (and now JSON-LD is happening ... maybe) (2ZT0)
KingsleyIdehen: Yes, but Linked Data is agnostic re. EAV/SPO based 3-tuples (2ZT1)
K Goodier: Keeping things simple and delivering value (2ZT2)
KingsleyIdehen: and via HTTP we can negotiate representation (2ZT3)
FrankChum: Doug, I like RDF for its simplicity and not as restricted (2ZT4)
anonymous morphed into Arnaud J Le Hors (2ZT5)
KingsleyIdehen: Good example of this all working, via Linked Data simplicity: http://wiki.goodrelations-vocabulary.org/Microdata (2ZT6)
KingsleyIdehen: Yes, we have to "hold our noses" re. large scale adoption . +1 (2ZT7)
NicolaGuarino: usual problem with skype, sorry (2ZT8)
KingsleyIdehen: Here is a link to a note showing how Schema.org mapped to DBpedia leads to network effects: https://plus.google.com/112399767740508618350/posts/ck2yhgTWxtD (2ZT9)
KingsleyIdehen: A specific page showing LOD Cloud instance data based on Schema.org cross links: http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fschema.org%2FLandmarksOrHistoricalBuildings&urilookup=1 (2ZTA)
SteveRay: @Nicola: OK, I'll try you again after Ali is done with his second question, if you raise your hand again. (2ZTB)
KingsleyIdehen: Final page showing links between Schema.org and DBpedia (and other vocabularies which appear as you follow-your-nose through the Linked Data): http://lod.openlinksw.com/describe/?url=http%3A%2F%2Fschema.org%2FLandmarksOrHistoricalBuildings&p=1&lp=89&op=-1&last=&gp=1 (2ZTC)
danbri: on the 'do we need rdf' question, .... we see two trends: (1) people who use RDF, find frustration with the fiddly details of the spec (datatypes, etc.). Perhaps such things are just inherently annoying. There needs to be a rule, but the rule is arbitrary. (2) people who don't use RDF explicitly, often drift towards a data model that is very RDF-like, because RDF didn't appear from nowhere. Graph-shaped data is a very common pattern (cf. Kingsley on EAV). Hence all recent talk on 'social graph', 'interest graph', etc. (2ZTD)
KingsleyIdehen: Methinks: Schema.org and Linked Data have a mutually beneficial relationship that in effect fans out to adding more semantic structure to links (actually relations) on the WWW. Schema.org delivers immediate and palpable value (2ZTE)
NicolaGuarino: @Steve: sorry, I am not able to talk through skype, too bad (2ZTF)
PeterYim: @Nicola: please type out your question on the chat (2ZTG)
anonymous1 morphed into DuaneNickull (2ZTH)
PeterYim: schema.org - as DanBrickley puts it - characterized by a small working group, consensus, ability to move and make decisions quickly (2ZTI)
NicolaGuarino: Here is the comment I wanted to make: The reason why super-simple ontologies like FOAF work is that the words are simple to understand. But there are words which everybody understands, and words that are ambiguous and difficult to define or explain (e.g., service, unemployed person). It is a fact that people doing markup don't care about deep semantics of their tag. So if the goal is to get billions of pages marked up, that's fine. But what about USING these marked up pages for information integration, services mashup and so on, instead of just for search? BOTTOM LINE: extensive tagging with little semantics may be very useful for search, but not for integration of information (2ZTJ)
NicolaGuarino: @Guha: but even for application-dependent vocabularies we sometimes need very crisp formal definitions.... (2ZTK)
danbri: (re starting points of Web: http://www.w3.org/History/1989/proposal-msw.html has seeds of RDF in there too) (2ZTL)
NicolaGuarino: Deep semantics is needed (sometimes) also for application-dependent purposes, not just for universal purposes (2ZTM)
AdrianWalker: To go beyond search applications, some degree of NLP is unavoidable? (2ZTN)
DougFoxvog: I suggest that small ontologies can build on larger existing ones. Those who use them do not need to use everything from the larger ontologies. Deep ontologies would have rules and reasoning structures that are immaterial to small systems that use parts of them. (2ZTO)
PeterBenson: our experience with ISO 8000 is that you need sufficient data to meet a defined requirement - nothing more. As requirements grow so does the depth of data. (2ZTP)
NicolaGuarino: Besides schema.org, why not invest in a MINIMAL formal vocabulary, clarifying for instance the various notions of PART or DEPENDENCE? (2ZTQ)
Stefano Bortoli: being too narrow in the definition of schemas might end up in a higher cost of maintenance of the application after all. This is a lesson we should have learned from software engineering at least. So, deep thinking and generalization to some extent is necessary. Simple and easy is good in the short term, but we risk creating asbestos that will be very hard to handle in the future (2ZTR)
NicolaGuarino: @Stefano Bortoli: +1 (2ZTS)
PeterYim: @James Sorace - you can click on the "Settings" button (at the top center of the window) to change "anonymous" into your real name (2ZTT)
DougFoxvog: Contexts can separate ontologies into subsets. Guha is talking about the problems of "an ontology of everything". Cyc developed the idea of Microtheories (but i'm not sure if it was after he left). By placing rules and relationships in such contexts (or microtheories) one can avoid many of the problems of an "ontology of everything". This becomes an issue on the Semantic Web, where triples make it hard to place statements within specific contexts. (2ZTU)
VladTanasescu: Any pointers to this ACM article? (2ZTV)
GaryBergCross: What consideration has schema.org given to controlled natural languages? Some efforts have tried to make OWL and Common Logic easier to express. (2ZTW)
danbri: @DougFoxvog: Guha has 'Contexts: A Formalization and Some Applications'... (2ZTX)
Stefano Bortoli: @Doug I don't think that anyone is really aiming at the "philosophical ontology", not in the Semantic Web at least. Indeed, the first efforts were spent in automatic ontology mappings, rather than producing semantically annotated data. Contexts are particularly complex to manage in a context-less environment such as the WEB. The least we can do is to try to be formal in defining concepts to reduce the risk of misunderstanding. (2ZTY)
GaryBergCross: One issue with Microtheories is when do you create a new one versus adapt an existing one. (2ZTZ)
PeterYim: Guha: currently adoption is in the order of thousands of sites and billions of pages now (2ZU0)
SteveRay: Certainly some standards development efforts are importing existing external concepts or "ontologies" to a much greater degree today. (2ZU1)
danbri: on re-use, one q is whether publishers/authors of instance data should bear the cost of that sharing/re-use. Mainstream RDF / SemWeb culture is to have instance data cite several different ontologies. Schema.org rather pre-packages things and offers the package as a single usable thing... (2ZU2)
danbri: re rNews - see http://blog.schema.org/2011/09/extended-schemaorg-news-support.html for details (2ZU3)
danbri: http://www.iptc.org/site/Home/Media_Releases/schema.org_adopts_IPTC's_rNews_for_news_markup (2ZU4)
Roger Cutler: I don't think he said billions of pages. Thousands of sites & billions of pages means millions of pages per site, right? (2ZU5)
danbri: (yup, we should make the various mappings to/from schema.org easier to find) (2ZU6)
DougFoxvog: @Gary -- You can create a new microtheory when describing a narrower field or are using multiple existing contexts, or when presenting information about a specific event or other individual. (2ZU7)
danbri: http://wiki.creativecommons.org/LRMI/Specification_v0.5 (2ZU8)
NicolaGuarino: A couple of problems I find in the current taxonomic structure of schema.org: 1. A governmentOffice is both a place and an organization (2ZU9)
ChristopherSpottiswoode: What a privilege that was, to be able to listen in on that conversation, with all that experience! Thank you all. (2ZUA)
DougFoxvog: @Gary -- adapt an existing context when providing more info @ same level (2ZUB)
Stefano Bortoli: thanks (2ZUC)
Stefano Bortoli: bye (2ZUD)
PeterYim: Great session ... thank you Guha, Dan and everyone for coming! (2ZUE)
Guha: Thank you everyone (2ZUF)
danbri: Thanks all (2ZUG)
PeterYim: -- session ended : 11:00am PST -- (2ZUH)
-- end of chat session -- (2ZN4)
- ... More Questions (2ZN5)
- For those who have further questions or remarks on the topic, please post them to the [ontolog-forum] so that everyone in the community can benefit from the discourse. (2ZN6)
- if you are not a member of the Ontolog community (meaning to say you are not subscribed to the [ontolog-forum] list) yet, we cordially invite you to join us. See our "Membership" details at: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J (2ZN7)
Audio Recording of this Session (2ZN8)
- To download the audio recording of the session, click here (2ZN9)
- the playback of the audio files requires the proper setup, and an MP3 compatible player on your computer. (2ZNA)
- Conference Date and Time: 1-Dec-2011 9:38 ~ 11:00 am Pacific Standard Time (2ZNB)
- Duration of Recording: 1 Hour 21.5 Minutes (2ZNC)
- Recording File Size: 9.33 MB (in mp3 format) (2ZND)
- suggestion: it's best that you listen to the session while having the presentation opened in front of you. You'll be prompted to advance slides by the speaker. (2ZNE)
- Take a look, also, at the rich body of knowledge that this community has built together, over the years, by going through the archives of noteworthy past Ontolog events. (References on how to subscribe to our podcast can also be found there.) (2ZNF)
For the record ... (2ZNG)
How To Join (while the session is in progress) (2ZNH)
- 1. Dial in with a phone and connect through skype: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01#nid2ZLR (2ZNI)
- 2. Open chat in a new browser window: http://webconf.soaphub.org/conf/room/ontolog_20111201 (2ZNJ)
- 3. Download the speaker's presentation (slides) here: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01#nid2ZLJ (2ZNK)
- or, 3.1 access our shared-screen vnc server, if you are not behind a corporate firewall (2ZNL)
Conference Call Details (2ZLM)
- Date: Thursday, 1-Dec-2011 (2ZLN)
- Start Time: 9:30am PST / 12:30pm EST / 6:30pm CET / 17:30 UTC (2ZLO)
- ref: World Clock (2ZLP)
- Expected Call Duration: ~1.5 hours (2ZLQ)
- Dial-in: (2ZLR)
- Shared-screen support (VNC session), if applicable, will be started 5 minutes before the call at: http://vnc2.cim3.net:5800/ (2ZLW)
- view-only password: "ontolog" (2ZLX)
- if you plan to be logging into this shared-screen option (which the speaker may be navigating), and you are not familiar with the process, please try to call in 5 minutes before the start of the session so that we can work out the connection logistics. Help on this will generally not be available once the presentation starts. (2ZLY)
- people behind corporate firewalls may have difficulty accessing this. If that is the case, please download the slides above (where applicable) and run them locally. The speaker(s) will prompt you to advance the slides during the talk. (2ZLZ)
- In-session chat-room url: http://webconf.soaphub.org/conf/room/ontolog_20111201 (2ZM0)
- instructions: once you have access to the page, click on the "settings" button, and identify yourself (by modifying the Name field from "anonymous" to your real name, like "JaneDoe"). (2ZM1)
- You can indicate that you want to ask a question verbally by clicking on the "hand" button, and wait for the moderator to call on you; or, type and send your question into the chat window at the bottom of the screen. (2ZM2)
- thanks to the soaphub.org folks, one can now use a jabber/xmpp client (e.g. gtalk) to join this chatroom. Just add the room as a buddy - (in our case here) ontolog_20111201@soaphub.org ... Handy for mobile devices! (2ZM3)
- Discussions and Q & A: (2ZM4)
- Nominally, when a presentation is in progress, the moderator will mute everyone, except for the speaker. (2ZM5)
- To un-mute, press "*7" ... To mute, press "*6" (please mute your phone, especially if you are in a noisy surrounding, or if you are introducing noise, echoes, etc. into the conference line.) (2ZM6)
- we will usually save all questions and discussions till after all presentations are through. You are encouraged to jot down questions onto the chat-area in the meantime (that way, they get documented; and you might even get some answers in the interim, through the chat.) (2ZM7)
- During the Q&A / discussion segment (when everyone is muted), if you want to speak or have questions or remarks to make, please raise your hand (virtually) by clicking on the "hand button" (lower right) on the chat session page. You may speak when acknowledged by the session moderator (again, press "*7" on your phone to un-mute). Test your voice and introduce yourself first before proceeding with your remarks, please. (Please remember to click on the "hand button" again (to lower your hand) and press "*6" on your phone to mute yourself after you are done speaking.) (2ZM8)
- Please review our Virtual Session Tips and Ground Rules - see: VirtualSpeakerSessionTips (2ZM9)
- RSVP to peter.yim@cim3.com appreciated, ... or simply just by adding yourself to the "Expected Attendee" list below (if you are a member of the team.) (2ZMA)
- This session, like all other Ontolog events, is open to the public. Information relating to this session is shared on this wiki page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2011_12_01 (2ZMB)
- Please note that this session may be recorded, and if so, the audio archive is expected to be made available as open content, along with the proceedings of the call to our community membership and the public at-large under our prevailing open IPR policy. (2ZMC)
Attendees (2ZMD)
- Attended: (2ZME)
- SteveRay (chair) (2ZMH)
- R V Guha "Guha" (invited speaker) (2ZMI)
- DanBrickley "danbri" (discussant) (2ZP5)
- PeterYim (2ZMJ)
- RandyKerber (2ZP0)
- BobbinTeegarden (2ZP6)
- FrankChum (2ZP7)
- LeoObrst (2ZPC)
- ElizabethFlorescu (2ZPE)
- BobSchloss (2ZPL)
- DougFoxvog (2ZPI)
- DavidHau (2ZPM)
- BobSmith (2ZPO)
- GeraldRadack (2ZPP)
- StefanoBocconi (2ZPQ)
- Vlad Tanasescu (The University of Edinburgh) (2ZPX)
- Kurt Kirkham (Sallie Mae) (2ZPY)
- KatherineGoodier (2ZPZ)
- JoelBender (2ZQ2)
- GaryBergCross (2ZQ0)
- James Sorace (HHS) (2ZQ3)
- NicolaGuarino (2ZQ4)
- ChristopherSpottiswoode (2ZQ5)
- PeterBenson (2ZQ6)
- Melissa Hildebrand (Scheib) (ECCMA) (2ZQ7)
- FrankAlvidrez (2ZQ8)
- AdrianWalker (2ZQ9)
- AndreasHarth (2ZQB)
- Lora Aroyo (VU, NL) (2ZQD)
- Roger Cutler (Chevron) (2ZQE)
- RamSriram (2ZQF)
- KingsleyIdehen (2ZQH)
- YefimZhuk (2ZPN)
- Alessander Botti Benevides (2ZUI)
- AliHashemi (2ZUJ)
- Arnaud J Le Hors (2ZUK)
- BrianDavis (2ZUL)
- Cirrus Shakeri (2ZUM)
- DuaneNickull (2ZUN)
- Kavitha Srinivas (2ZUO)
- Mike Ward (2ZW5)
- MyCoyne (2ZUP)
- shenley (2ZUQ)
- Stefano Bortoli (2ZUR)
- Ted Bashor (2ZUS)
- YuLin (2ZUT)
- Expecting: (2ZMG)
- (2ZMK)
- (please add yourself to the list if you are a member of this community, or, rsvp to <peter.yim@cim3.com>) (2ZML)
- Regrets: (2ZMM)
- JohnSowa (cannot attend, but has questions that he will ask via the session chair) (2ZQA)
- ChristophLange (traveling) (2ZOZ)
- ToddSchneider (2ZPD)
- MartinHepp (2ZPG)
- FrankOlken (time conflict) (2ZPH)
- ChrisWelty (2ZQ1)
- ... (2ZMN)