All,
A message that I forgot to pass on to this list.
Kingsley
On 1/22/13 11:45 AM, Bernard Vatant wrote:
Hi Dan
I can't believe that such a rich and thoughtful message did not
get any answer (at least any public one) in ten days.
Thanks for putting it down anyway. I wanted to answer right away
when you posted it, but until today I did not have the bandwidth
to do so properly.
So here goes, answers below. A note for those who don't care to
drill down into such a long discussion: its main point is a call
to action that will certainly be duplicated on other channels,
but I extract it here:
ACTION
Make a list of "globally adopted schemas" (vocabularies) and put
a responsible agent name/email/URI (whatever Web identifier) in
front of each one:
https://docs.google.com/spreadsheet/ccc?key=0AiYc9tLJbL4SdHByWkRYUkYxZU5qS1lQOE5FV0hiNlE#gid=0
Free to edit by anyone. If you are currently responsible
for a vocabulary, put your name and contact email address.
Let's take a month to see what we can gather. A month from now I
will mail everyone declared responsible to get confirmation, lock
the document, and add this information to the LOV vocabulary
descriptions.
If you want to make sure what I mean by "responsible", read
details below.
Best
Bernard
2013/1/13 Dan Brickley <danbri@xxxxxxxxxx>
...
As a member of the RDF community since 1997, I'm painfully aware
of some of our failings. It is (as has been expressed already in
this thread) important to avoid over-burdening schema.org with
every hope and aspiration that attaches to the RDF, '[sS]emantic
[wW]eb', 'Linked [open] Data' etc labels. Or put another way;
schema.org has no intention of being overburdened with such
things.
Two particular failings of our community come to mind. One is
that we have an endearing and frustrating architecture of
politeness based on the use of namespaces that has led to a
situation in which we have a fragmented suite of independent
vocabularies that are hard for new parties to adopt.
I'm not sure that fragmentation and independence are the main
obstacle to adoption. Otherwise the same could be said of any
linked data and dataset. Vocabularies are a particular kind of
linked data, but they are linked data. Linked data are also
fragmented and managed by independent sources. Choosing a
reference vocabulary should be no more and no less of an issue
than choosing reference entities in an authority list, a
thesaurus or any kind of linked data base.
Main issues for adoption of reference URIs are quality and
sustainability of the resources and responsibility of the
publisher.
Discovering vocabularies might be tough, although we have more
and more tools for that (not to mention LOV again here), but
assessing those three key parameters (quality, sustainability,
responsibility) is a headache, mainly because many vocabulary
publishers do not take them seriously, as attested by the
glaring lack of documentation and metadata for many of them.
We now have serious people and organizations eager to enter the
linked data game, and more and more often I am asked: "can
we trust X or Y to be still available in 5-10 years?"
The culture around RDF is that you only publish schemas for the
'diffs', the missing vocabulary that wasn't covered by a jumbled
mix of existing terminology. So anyone doing document-like markup
would be frowned at - "Did you consider using Dublin Core?";
anyone publishing an RDF vocabulary describing people "Why didn't
you use FOAF?", and so on. And the very architecture that
supported this - namespaces - allowed us to continue to design
these parallel descriptive systems without being forced to sit
down together and work out how they can be combined to solve real
world problems.
Indeed. But previous architectures provided parallel
vocabularies in parallel formats that were not interoperable at
all, so we have made real progress. We had, and still have
around, people convinced that the linked data technical
infrastructure would work without social agreement. But "Publish
and let the Web do the REST" just does not work. Not only for
vocabularies, but again for linked data at large. It seems to me
that we now have more and more people convinced that the
technical interoperability ensured by the common linked data
infrastructure is not enough if there is no social
coordination. So let's sit down together, indeed.
See e.g. the theme of the next DC conference:
http://dcevents.dublincore.org/index.php/IntConf/dc-2013
But I also agree with others that this forum is not
necessarily the one to solve all problems, and certainly not
by bringing every other vocabulary under the schema.org
umbrella. Opening several focused tables of conversation is
certainly more profitable.
A couple of years ago, I did sit down and look at the words we'd
chosen in various deployed and popular-ish RDF vocabularies; I
called it "Zoo";
https://github.com/danbri/Zoo/blob/master/zoo.foaf.tv/index.html
https://github.com/danbri/Zoo/blob/master/zoo.foaf.tv/zoo/raw_manifest.txt
... this showed that 'Collection' was used in bibo:, swan:,
'Work' in skos:; cc: vcard:; 'description' in dcterms: doap: gr:
ical: sioc:, 'category' in 'doap: gr: po: vcard:', 'subject' in
dcterms: po: rdf: sioc:, title in 'dcterms: foaf: sioc: vcard:'
and so on.
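A minimal sketch of that kind of overlap check, in Python. It
assumes a hand-typed sample of prefix-to-term sets limited to four
of the terms quoted above; it does not read the actual Zoo
manifest, and the names SAMPLE_VOCABS and shared_terms are invented
here for illustration only.

```python
from collections import defaultdict

# Assumption: a toy sample, restricted to four of the terms quoted in
# the message. A real check would load each vocabulary's full term list.
SAMPLE_VOCABS = {
    "dcterms": {"description", "subject", "title"},
    "doap": {"description", "category"},
    "gr": {"description", "category"},
    "ical": {"description"},
    "sioc": {"description", "subject", "title"},
    "po": {"category", "subject"},
    "rdf": {"subject"},
    "vcard": {"category", "title"},
    "foaf": {"title"},
}


def shared_terms(vocabs):
    """Map each local name to the sorted prefixes that define it,
    keeping only names that occur in more than one vocabulary."""
    index = defaultdict(set)
    for prefix, terms in vocabs.items():
        for term in terms:
            index[term].add(prefix)
    return {term: sorted(prefixes)
            for term, prefixes in index.items()
            if len(prefixes) > 1}


overlaps = shared_terms(SAMPLE_VOCABS)
for term in sorted(overlaps):
    print(term, "->", " ".join(p + ":" for p in overlaps[term]))
```

The inverted index (term to prefixes) is the whole trick: once the
term lists are in hand, the overlaps fall out of a single pass.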
Part of my hope for this forum is that - yes, heavily nudged by
the creation of schema.org - RDF vocabulary managers and editors
could finally take the time to stay in touch.
Indeed!
That parties working on vocabularies designed to be deployed
alongside each other, could do the world a favour and talk to
each other a bit more.
YES !
It is good that we have the namespaces technical mechanism; but
it has for too long allowed us to sidestep the need to talk about
how different vocabularies fit together as more than mere
triples.
Having pursued the same objective inside the LOV project for
about three years now, I would say that the main obstacle
we've met is the pervasive lack of responsibility of
vocabulary owners/authors/creators/publishers/curators. We
have gathered more than 300 vocabularies, but for many of them
it is not possible to identify who is the current responsible
entity (person or organisation), under any definition of the
word at http://en.wiktionary.org/wiki/responsible.
In a nutshell, people don't take things seriously, and/or they
don't answer when called. I don't say it's a general rule, but
for potential adopters it is very difficult to tell whether there
is someone responsible behind a given vocabulary, in particular
in those frequent cases where the project is closed, or the
original editor has moved on or does not answer mails, etc.
It seems to me a simple basic action should be taken to start
with, either here or in any relevant forum, which would be,
in a nutshell: responsible people, step forward. Who wants to
play nicely in this game, how do you make it public, and how
would others know about it? We can define a simple markup on
vocabularies, similar in spirit to Creative Commons, showing the
level of engagement or responsibility involved in the
vocabulary publishing. Lists of vocabularies along with their
current curators, endorsing a certain number of social
rules, like taking part in a process where their vocabularies
are put on the table with other relevant ones, etc., could be
easily published and updated on a regular basis. We have
already exchanged with Tom Baker on this; DCMI have thought
seriously about those issues for a while, as you (Dan) are well
aware. 2013 should be a year of serious action on this.
The point is that in this community too many people have come
to know each other too well, so that they don't see why those
implicit connections and involvements should be made explicit
anywhere. But for people from outside, all this is currently
totally opaque.
So WebSchemas was designed to be something a bit more than 'the
schema.org mailing list at W3C', and I still believe that. We
(the larger 'we') need a forum in which all schemas intended for
planet-wide use are equally 'on topic'. The existence of
schema.org should not have a chilling effect on the design, use
and deployment of other RDF vocabularies. Even if the schema.org
partner companies are not in a position right now to collectively
promise to support/understand/use/endorse non-schema.org
vocabulary, it is still healthy to have multiple efforts,
initiatives and perspectives. (The move towards RDFa Lite is a
very positive thing here, btw.)
Very glad to read that. Diversity is good, but my above
suggestions might help to clarify who 'we' are to begin with.
The second failing of the community around RDF is that we have -
as the years have drifted by - acquired a reputation for enjoying
talk over action, and this isn't entirely undeserved.
But basically unfair. We've talked a lot, but achieved a lot
also.
There are an amazing number of people around able to talk and
code at the same time :)
Yesterday I was re-reading some old mail threads with the late
and lamented Aaron Swartz -
http://lists.foaf-project.org/pipermail/foaf-dev/2000-August/004215.html
http://lists.w3.org/Archives/Public/www-rdf-interest/2000Jul/0034.html
- that frustration was already present in 2000. In the charter
for this WebSchemas group i.e.
http://www.w3.org/2001/sw/interest/webschema.html
we list some semweb permathread themes explicitly as
out-of-scope.
"Out of scope topics include:
* Advocacy of data models or syntaxes without attention to
real-world use cases
* The use of inference
* debate over foundational ontologies"
This does not mean that inference and foundational ontologies are
uninteresting or unimportant, just that every successful forum
needs to have some core scope, and that we have plenty of other
places around W3C to debate those topics. What makes the
WebSchemas group special? Just that here, finally, we have
somewhere where parties responsible for globally adopted RDF
schemas can do the responsible thing and stay more carefully in
touch with each other.
You wrote the word : responsible. Now let's make a list of
"globally adopted schemas" and put the responsible agent
name/email/URI whatever Web identifier in front of it. Simple
action, I've started here :
https://docs.google.com/spreadsheet/ccc?key=0AiYc9tLJbL4SdHByWkRYUkYxZU5qS1lQOE5FV0hiNlE#gid=0
Free to edit by anyone. If you are currently responsible
for a vocabulary, put your name and contact email address.
Let's take a month to see what we can gather. A month from now
I will mail everyone declared responsible to get confirmation,
lock the document, and add this information to the LOV
vocabulary descriptions.
As Martin points out in a mail that arrived while typing this,
... one list is not going to be enough for everything. And in
terms of work style for getting (sub-)schemas created and
integrated, one size doesn't fit all. What we've found with
schema.org is that different collaboration styles make sense for
different domains. I suggested a W3C Community Group to Richard
Wallis and I'm pleased to see that it has independent existence
and activity. A few months ago I helped set up a 'sports schemas'
group (just a Google Group mailing list), but that initiative is
yet to thrive. We have a very active and largely independent
community around the LRMI vocabulary managed quite separately,
but linked to this one by mail, wiki and occasional audio
catchups. There is of course Good Relations, which also enjoys
independent existence.
And there is an ongoing effort to move the Time Ontology
forward beyond its current "draft status".
In general I think W3C community groups are a fine mechanism for
more focussed and intense vocabulary collaboration, and this
forum serves more for integration issues and high level overview
on how all the pieces of the jigsaw fit together. It could be
great, for example, to see a community group around modeling
fiction (and Comics?), but we also need a place where all such
efforts can report back to the wider community. The creation of
schema.org has made all this more urgent and timely, but it is
something we've needed for a while. In the Dublin Core world we
talk about this as 'application profiles'; templates and examples
explaining how independently designed pieces of vocabulary can be
mixed together to address real world descriptive needs. It should
happen at W3C, schema.org should engage with it, but the need is
broader. I think WebSchemas is the right place for it.
I should also mention that there are a few areas now where groups
elsewhere around W3C have come up with vocabulary (e.g.
Organization + Registered Organization vocabs; DCAT/ADMS; Geo and
post addresses) that will likely inform improvements to
schema.org. There is a need for somewhere public to work out
details around stability/versions, appropriate acknowledgement,
etc.
Exactly. What I call "sustainable vocabulary management".
I would like to mention that in France, in the framework of
the Datalift project (datalift.org), we have among the
partners the national institutions INSEE (statistics) and IGN
(geography) working together to publish linked data and
harmonize their vocabularies and data with each other and the
general vocabulary ecosystem. Those are "serious" "normal"
data publishers playing the game nicely.
The fundamental problem of schema design is that the world is not
tidily partitioned; that all use cases interact and overlap -
'Intertwingularity'. We can make focussed sub-fora for figuring
out how to describe sports, or fiction, or journals and books,
but the combinations and scope overlaps can be overwhelming.
While good design can help, perhaps even more important is
communication.
Again, triple YES !
And for that we need somewhere to talk. I don't think it
ultimately matters hugely whether there is a schema.org-specific
mailing list at W3C alongside a more general 'all vocabularies'
one, versus a single list as we have now. My preference is for a
unified forum, and we will likely spin off various
schema.org-specific lists for specific detailed schema.org
topics. But given schema.org's cross-domain nature, it seems
important for the project to remain highly visible in a
cross-domain, multi-schema forum.
Dan
--
Bernard
Vatant
Vocabularies & Data Engineering
Tel : + 33 (0)9 71 48 84 59
--------------------------------------------------------
Mondeca
3 cité Nollez 75018 Paris, France
--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca handle: @kidehen
Google+ Profile: https://plus.google.com/112399767740508618350/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen