
Re: [ontolog-forum] ONTOLOG community event planning and scheduling sess

To: "'[ontolog-forum] '" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "Hans Polzer" <hpolzer@xxxxxxxxxxx>
Date: Sun, 15 Sep 2013 21:32:43 -0400
Message-id: <00c501ceb27c$a49d8ac0$edd8a040$@verizon.net>
Paul,    (01)

Remember that enterprises are silos as well, when viewed from the
perspective of other enterprises, just a bit wider in scope than silos that
are constituent parts of an enterprise. Also, most enterprises/institutions
have sub-silos that are significantly segregated from each other for a
variety of legal and business reasons, most notably in those
enterprises/institutions that operate internationally or in different
jurisdictions with significant differences in laws/regulations that impact
the operations of the enterprise/institution. No amount of "silo-busting"
within the scope/power of the enterprise will change that externality. Some
enterprises even deliberately set up competition among their divisions,
encouraging "silo" behavior across a significant portion of each division's
"operational space", even if other portions of operational space are
mandated as shared or common services.    (02)

I'll also note that nothing much gets accomplished without silos. All
enterprises, systems, and projects are scope-limited in operational space, time
interval, and most importantly, budget/revenue.  That's true even for
individuals and "open-source"  or "crowd-sourced" projects. Put more
positively, silos are focused and goal oriented. They also may have
information about their internal environment (like personnel information)
and about the external environment (like customer info) that can't be shared
for regulatory or competitive survival reasons. Even for information that
could be shared externally, they may represent it internally in ways that
optimize their institutional objectives (which might include the interests
of supply chain partners, for example), rather than satisfy the way some
arbitrary entity external to the enterprise might want to access or
understand that information.    (03)

Silo behavior is neither good nor bad without taking the larger environment,
operational context, and institutional objectives into account. That's why
there are corporate mergers, divestitures, re-engineering, and
re-organizations, not to mention bankruptcies and start-ups - all forms of
silo formation and reformation and dissolution. I'll note that most silo
dissolution is not viewed as a positive outcome, unless we are talking
about, say, North Korea. Silo broadening can often be positive (mainly due
to economies of scale), but it runs the risk of scope creep and excess
internal complexity and interdependency (hence divestitures - which are a
form of silo narrowing).  Silo narrowing can also be positive when it
fosters greater awareness of external entities and flexibility and dynamism
in partnering with other silos (instead of doing everything inside the silo
organizational boundaries), but it runs the risk of focusing on an
ecological niche that might change in significant ways or evaporate
completely.     (04)

I think it is more productive to talk about reviewing/adjusting the
"open-ness" of silos and making them consider explicitly what should be made
accessible external to the silo and what the silo should try to obtain from
outside the silo. One should also ask how much continuous/periodic
monitoring of the silo's environment for changes that present threats or
opportunities for the silo (and its degree of "open-ness") might be
appropriate, given the silo's core objective(s) and resource constraints.
The answers to these questions would then drive the information models used
by the silos for internal operations and for interaction with their
environment. They will also drive the degree of dynamism appropriate in
information model strategy. Unused information model flexibility can be
viewed as wasted development/acquisition and run-time resources.
Insufficient flexibility can lead to institutional failure/dissolution. Best
to encourage a "Goldilocks" approach rather than saying that more
open-ness or more flexibility is always better.    (05)

Hans    (06)

-----Original Message-----
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx
[mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Paul Tyson
Sent: Sunday, September 15, 2013 9:11 AM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] ONTOLOG community event planning and scheduling
session - Thu 2013.09.12 & Thu 2013.09.19    (07)

I like Kingsley's "data-de-silo-fication" theme. (In fact, I'm soon to give
an internal tech talk called "Down With Silos! How linked data is
beautifying the information landscape").    (08)

I want to contribute a different narrative, orthogonal to the engineering
discussion in this thread, but leading I think to the same place Kingsley is
heading. For brevity I'll keep it to bullet points.    (09)

1. Enterprises depend for their success on people in the enterprise doing
the right thing at the right time.
2. People only know what is the right thing and how to do it by getting good
information in the form most useful to them at the time they need it.
3. They get the information they need primarily from "documents", taken in
the very general sense as some bounded, structured, purposeful package of
distinctions (glyphs, lines, colors, shapes, texture, sound, image, etc.).
4. Documents, being packages of differences, can be decomposed into
particles of significance related in particular ways.
5. RDF is a good way to represent, record, and exchange particles of
significance that are related in particular ways. Along with XML, HTML,
HTTP, and related W3C standards, we have a complete suite of tools for
delivering documents containing the information needed to the people who
need it to act for the success of the enterprise.    (010)
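Points 4 and 5 above can be illustrated with a minimal sketch, assuming a "document" is nothing more than a flat set of named fields. The `http://example.org/` namespace, the `decompose` and `to_ntriples` helpers, and the sample work order are all hypothetical, chosen only for illustration; a real system would use an RDF library and a published vocabulary.

```python
from typing import Dict, List, NamedTuple

BASE = "http://example.org/"  # hypothetical namespace, not from the thread


class Triple(NamedTuple):
    subject: str   # IRI of the document
    predicate: str  # IRI naming the relation
    obj: str       # literal value of the field


def decompose(doc_id: str, fields: Dict[str, str]) -> List[Triple]:
    """Break a flat document into one triple per field, in field-name order."""
    subject = BASE + "doc/" + doc_id
    return [
        Triple(subject, BASE + "vocab#" + name, value)
        for name, value in sorted(fields.items())
    ]


def escape(text: str) -> str:
    """Escape the characters that N-Triples forbids inside a quoted literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples as N-Triples lines, one statement per line."""
    return "\n".join(
        '<%s> <%s> "%s" .' % (t.subject, t.predicate, escape(t.obj))
        for t in triples
    )
```

Calling `to_ntriples(decompose("wo-17", {"title": "Replace pump", "status": "open"}))` yields two statements about the same subject, which is the sense in which a document becomes related "particles of significance" that can be recorded and exchanged independently of how the document was originally laid out.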

There should be no dispute about RDBMS as an efficient storage and retrieval
machinery for relational data. I appreciate hearing about the engineering
and theoretical issues about such systems. However, those issues are related
to the problems of getting information to people at the point of need only
to the extent that system designers choose to couple data persistence
components to data delivery mechanisms. One of the hallmarks of "legacy"
systems is the unfortunate choice to closely couple these components.    (011)
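The coupling point can be made concrete with a small sketch, assuming the delivery side needs only a "give me the record for this key" operation. The `RecordStore` interface, both store classes, the `record` table layout, and the `render` function are hypothetical names introduced for illustration; nothing here describes a particular product.

```python
import sqlite3
from typing import Dict, Protocol


class RecordStore(Protocol):
    """The only thing the delivery side is allowed to know about persistence."""

    def get(self, key: str) -> Dict[str, str]: ...


class InMemoryStore:
    """Persistence option 1: a plain dictionary."""

    def __init__(self, records: Dict[str, Dict[str, str]]) -> None:
        self._records = records

    def get(self, key: str) -> Dict[str, str]:
        return dict(self._records[key])


class SqliteStore:
    """Persistence option 2: a relational table record(id, field, value)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def get(self, key: str) -> Dict[str, str]:
        rows = self._conn.execute(
            "SELECT field, value FROM record WHERE id = ?", (key,)
        ).fetchall()
        if not rows:
            raise KeyError(key)
        return dict(rows)


def render(store: RecordStore, key: str) -> str:
    """Delivery: compose a display document without knowing the backend."""
    record = store.get(key)
    return "\n".join("%s: %s" % (name, record[name]) for name in sorted(record))
```

Because `render` depends only on `RecordStore`, swapping the dictionary for the SQL table changes nothing on the delivery side. A closely coupled design would instead embed the SQL inside `render`, and that is the choice that makes a system hard to evolve.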

I expect the discussion in this forum to focus on how to deliver information
to a human in the way that best meets his or her constantly changing and not
entirely predictable needs. Whether the data is persisted on disk or
papyrus, in Elbonian, SQL, NoSQL, or Linear B, may be of great concern to
the designers and engineers tasked with supporting the information needs of
an enterprise. But it should be immaterial to discussions about what happens
directly on each side of the computer screen: that is, how  documents are
composed for display, and how they are interpreted by the human on the other
side of the screen.    (012)

Those of us who focus on the 2-sides-of-the-screen problem domain have found
the W3C basic and semantic web technology stacks of inestimable value.    (013)

Regards,
--Paul    (014)

On Fri, 2013-09-13 at 10:07 -0400, Kingsley Idehen wrote: 
> On 9/13/13 9:33 AM, Michael Brunnbauer wrote:
> 
> > Hello Kingsley,
> > 
> > On Fri, Sep 13, 2013 at 08:37:01AM -0400, Kingsley Idehen wrote:
> > > > I agree wholeheartedly. RDF and SPARQL make data integration 
> > > > easier (without solving the fundamental issues of course).
> > > What is the fundamental issue, as you see it?
> > http://en.wikipedia.org/wiki/Heterogeneous_database_system#Problems_of_heterogeneous_database_integration
> ## In Turtle, for sake of clarity re, my world-view ##
> 
> <http://en.wikipedia.org/wiki/Heterogeneous_database_system#Problems_of_heterogeneous_database_integration>
> <#myLabel> "Data-de-silo-fication" ;
> <#sameAs> <#HeterogeneousDataFederation>, <#DataVirtualization>,
>     <#DataSpaces>, <#MasterDataManagement> ;
> <#comment> """This problem covers data disparity issues that include:
> shape, location, and relation semantics (or lack thereof)""" .
> 
> ## Turtle End ##
> 
> So I assume we are in agreement re., the problem? 
> 
> > 
> > http://lists.w3.org/Archives/Public/public-lod/2013Jun/0458.html
> > 
> > > I see the fundamental issue (or pain point) being data-de-silo-fication.
> > RDF is nice for Extract Transform Load. The problems start if you 
> > want to change data.
> 
> Change sensitivity is handled via the use of Linked Data Views over 
> disparate data sources. This is what R2RML facilitates albeit rarely 
> mentioned, sadly.
> 
> Views can be transient, materialized, or a configurable mix of both.
> That's certainly the case re. Virtuoso i.e., make a change in its SQL 
> DBMS (or a remote ODBC or JDBC accessible DBMS) and they are reflected 
> in all your SPARQL queries and Linked Data URI lookups. The same even 
> applies to RESTful or SOA services that are attached to Virtuoso (we 
> cover 100+ protocols and formats).
> 
> We have Replication (Snapshot and Transactional)  and HTTP (including 
> cache invalidation) baked into Virtuoso.
> 
> > 
> > > > But they are a bad option for data storage because maintaining 
> > > > consistency is so difficult (think about deleting a row or 
> > > > transactions).
> > > I don't know what that really means.
> > Suppose you have an App with user registration. If you store the 
> > user data in a triple store, deleting a user with SPARQL becomes difficult.
> 
> That doesn't apply to every triplestore. That doesn't apply to 
> Virtuoso. We even have a large customer running OLTP-like workflows with
> something like 40 million named graphs. BTW -- as part of the 
> workflow,  Virtuoso has to factor in deltas such that it doesn't 
> perform wholesale named graph deletions etc.
> 
> > Removing
> > a single triple is not enough. Storing the user in a named graph may 
> > help but probably creates other problems and definitely makes 
> > querying a lot more complicated.
> > 
> > What about SPARQL transactions ? Starting a transaction, reading and 
> > updating, committing the transaction.
> 
> We are a full blown ACID DBMS. See our benchmark reports. These simply 
> aren't new issues since we have a hybrid DBMS.
> 
> > Is there a triple store that supports this with all the fidelity of 
> > modern RDB systems ?
> 
> Yes. It's called Virtuoso !
> 
> > 
> > > I say that because we simply don't have that problem in our hybrid DBMS.
> > I don't know what that really means. Can I modify data with SPARQL 
> > *and* SQL in your DBMS ? If yes, how does that work ?
> 
> Of course you can. We support SPARQL 1.1 Update. We are SQL-99 
> compliant. We do ACID. We have serious customers doing OLTP like stuff 
> using RDF or SQL aspects of Virtuoso. [1][2][3][4]
> 
> Links:
> 
> 1. http://bit.ly/ZOCmaD -- shows we even have the performance
>    difference between SPARQL and SQL down to insignificant levels via
>    Star Schema Benchmark Report
> 2. http://bit.ly/10pvAbF -- blog post about this effort
> 3. http://bit.ly/Yf5etP -- Berlin SPARQL Benchmark Report (note: this
>    particular benchmark is SQL relational DBMS oriented)
> 4. http://bit.ly/14ULX2F -- 150 Billion triples scale report
> 5. http://bit.ly/RtdGjA -- CoRelational DBMS Concepts post that
>    includes live links to R2RML Views built atop SQL data
> 6. http://bit.ly/13fnIbr -- example of R2RML views atop an Oracle DBMS
>    hooked into Virtuoso via ODBC.
> 
> 
> Kingsley
> > 
> > Regards,
> > 
> > Michael Brunnbauer
> > 
> > 
> > 
> > _________________________________________________________________
> > Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> > Config Subscr: 
> > http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> > Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> > Shared Files: http://ontolog.cim3.net/file/ Community Wiki: 
> > http://ontolog.cim3.net/wiki/ To join: 
> > http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
> >  
> 
> 
> --
> 
> Regards,
> 
> Kingsley Idehen             
> Founder & CEO 
> OpenLink Software     
> Company Web: http://www.openlinksw.com
> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
> Twitter/Identi.ca handle: @kidehen
> Google+ Profile: https://plus.google.com/112399767740508618350/about
> LinkedIn Profile: http://www.linkedin.com/in/kidehen
> 
> 
> 
> 
>      (015)
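Michael's remark in the quoted exchange that "removing a single triple is not enough" can be sketched in a few lines, assuming a toy store that is just a set of (subject, predicate, object) string tuples. The `delete_one_triple` and `delete_resource` helpers and the example IRIs are hypothetical; the sketch says nothing about how Virtuoso or any other triple store implements deletion or transactions.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings


def delete_one_triple(triples: Set[Triple], triple: Triple) -> Set[Triple]:
    """Naive deletion: drop exactly one statement and nothing else."""
    return triples - {triple}


def delete_resource(triples: Set[Triple], resource: str) -> Set[Triple]:
    """Drop every statement that has `resource` as its subject or object.

    Simplification: a literal whose text equals the IRI would also match,
    and blank nodes hanging off the resource are not followed.
    """
    return {t for t in triples if resource not in (t[0], t[2])}


def mentions(triples: Set[Triple], resource: str) -> bool:
    """True if any remaining statement still refers to `resource`."""
    return any(resource in (t[0], t[2]) for t in triples)
```

With a user described by a name triple, an email triple, and a third triple in which an order points at the user, `delete_one_triple` leaves two statements still mentioning the user, while `delete_resource` leaves none. A relational row delete gets this for free from the row boundary and foreign keys; over triples the boundary of "the user" has to be stated explicitly, for example by a pattern like the one above or by keeping each user in its own named graph.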


