ontology-summit

Re: [ontology-summit] Data Quality - Was: Invitation to a brainstorming

To: "'Ontology Summit 2011 discussion'" <ontology-summit@xxxxxxxxxxxxxxxx>
From: "Matthew West" <dr.matthew.west@xxxxxxxxx>
Date: Tue, 25 Jan 2011 15:45:02 -0000
Message-id: <4d3ef003.6686d80a.3d42.ffff92d2@xxxxxxxxxxxxx>
Dear Azamat,

Thanks for the plug.

The diagram I have used in my book "Developing High Quality Data Models",
published by Morgan Kaufmann (ISBN: 978-0-12-375106-5), is in the attached PDF
(P2). You will see it is not so different from the original, and is annotated
with the ways that data models contribute to information quality.

I do not attempt to catalogue all properties of information, just those I have
most often found to be critical in practice.

Regards

Matthew West                            
Information  Junction
Tel: +44 560 302 3685
Mobile: +44 750 3385279
Skype: dr.matthew.west
matthew.west@xxxxxxxxxxxxxxxxxxxxxxxxx
http://www.informationjunction.co.uk/
http://www.matthew-west.org.uk/

This email originates from Information Junction Ltd. Registered in England and 
Wales No. 6632177.
Registered office: 2 Brookside, Meadow Way, Letchworth Garden City, 
Hertfordshire, SG6 3JE.




> -----Original Message-----
> From: ontology-summit-bounces@xxxxxxxxxxxxxxxx
> [mailto:ontology-summit-bounces@xxxxxxxxxxxxxxxx] On Behalf Of AzamatAbdoullaev
> Sent: 24 January 2011 19:51
> To: Ontology Summit 2011 discussion
> Subject: Re: [ontology-summit] Data Quality - Was: Invitation to a
> brainstorming call for the 2011 Ontology Summit
> 
> From Matthew West's study on high quality data models,
> http://www.matthew-west.org.uk/documents/princ03.pdf, there is a listing of
> key properties, as pictured below. One could also add relativity, utility,
> worth, correctness, originality, credibility, certainty, truth, etc. As far
> as I know, there is a new book coming.
> 
> ----- Original Message -----
> From: "MacPherson, Deborah" <dmacpherson@xxxxxxxxxxxxxxxx>
> To: "Ontology Summit 2011 discussion" <ontology-summit@xxxxxxxxxxxxxxxx>
> Sent: Monday, January 24, 2011 4:22 PM
> Subject: [ontology-summit] Data Quality - Was: Invitation to a brainstorming
> call for the 2011 Ontology Summit
> 
> 
> > From the US Dept of Transportation, Federal Highway Administration
> > Policy Information
> >
> > "...Accuracy is only one characteristic of quality, just as validity
> > or conformance to business rules is one characteristic of quality.
> > These characteristics are some of the information quality
> > characteristics categorized as inherent characteristics.
> >
> > Fitness for purpose is the characteristic of usefulness of data for a
> > specific requirement..."
> >
> > http://www.fhwa.dot.gov/policy/ohpi/dataquality.cfm
> >
> >
> >
> > DEBORAH MACPHERSON, CSI CCS, AIA
> > Specifications and Research
> >
> > Cannon Design
> > 1100 Wilson Boulevard, Suite 2900
> > Arlington, Virginia 22209
> >
> > Direct Line 703 907 2353
> > 4 Digit Dial 6353
> >
> > dmacpherson@xxxxxxxxxxxxxxxx
> > cannondesign.com
> >
> > Please consider the environment before printing this email.
> >
> > -----Original Message-----
> > From: ontology-summit-bounces@xxxxxxxxxxxxxxxx
> > [mailto:ontology-summit-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Peter R. Benson
> > Sent: Sunday, January 23, 2011 7:42 AM
> > To: Ontology Summit 2011 discussion
> > Cc: Ontology Summit 2011 discussion
> > Subject: Re: [ontology-summit] Invitation to a brainstorming call for the
> > 2011 Ontology Summit
> >
> > The definition of "quality" is pretty simple: "meets requirements". The
> > issue is how to define requirements. ISO 22745-30 does a pretty good job.
> >
> > Data and information are two different concepts, and their characteristics
> > are different. Timeliness and relevance are characteristics of
> > information; the data is what it is. Finally, there is no such thing as
> > accuracy, only assertions of accuracy. A lot of this ground is covered
> > in ISO 8000.
> >
> > Peter
> > Cell: +1 610 462 5923
> >
> > On Jan 22, 2011, at 9:18 AM, "Brian K Lucas" <lucasb@xxxxxxxx> wrote:
> >
> >> Greetings all,
> >>
> >> I like this discussion, and have a few thoughts of my own (as you
> >> come to know me more, you'll discover that to be the norm  ;=)
> >>
> >> A) @Jack : I agree that "adequate, accurate and timely" is a worthy goal.
> >> I'd like to hear more about your definition of "adequate" and "accurate".
> >>
> >> B) @Jack : Regarding quality:  In my experience, "fitness for purpose"
> >> is trinary, when taken from a single consumer's point of view: not
> >> fit, fit, overly fit.  In the real world, however, very few offerings
> >> are consumed by a single entity.  Therefore, I believe that, when
> >> taken in the context of the offering, and including all consumers,
> >> fitness for purpose/quality is indeed scalar, if the measurement is
> >> being taken from the perspective of the producer.  If you complete a
> >> histogram/pie chart of these three values for all consumers, you get
> >> three valuable metrics, which should not be combined: % fit for
> >> purpose, % not fit for purpose, and % overly fit for purpose.
> >> The requirements not met for the "not fit" group
> >> represent offering "defects"; the requirements met for the "overly
> >> fit" group represent waste of effort for that consumer group.  If
> >> requirements have been implemented that NOBODY in the "fit" group
> >> needs, then that is wasted producer effort - unless, of course, they
> >> serve a future consumer.  My conclusion?  There is value in measuring
> >> "quality" across the existing and intended user base, and improving
> >> the offering to move more consumers into the "fit for purpose" count,
> >> without removing anyone who is already there.
> >> And, because consumer requirements usually change over time,
> >> continuous improvement of "quality" is desirable.  Then, add in that
> >> even for a single consumer, they may present KANO-like ranking of
> >> requirements (must have, should have, could have, etc.), and binary
> >> gets even a little more fuzzy for me.
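
The tally described in (B) can be sketched in a few lines of Python. This is
only an illustration, assuming each consumer has already been assessed as
exactly one of "not fit", "fit" or "overly fit"; the function name, the
consumer names and the data are invented for the example and do not come from
the thread.

```python
from collections import Counter

# The three outcomes named in (B), from a single consumer's point of view.
OUTCOMES = ("not fit", "fit", "overly fit")


def fitness_distribution(assessments):
    """Return the percentage of consumers in each outcome.

    `assessments` maps a consumer name to one of OUTCOMES. The three
    percentages are returned separately and are never merged into one score.
    """
    if not assessments:
        raise ValueError("no consumers assessed")
    unknown = set(assessments.values()) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome(s): {sorted(unknown)}")
    counts = Counter(assessments.values())
    total = len(assessments)
    return {outcome: 100.0 * counts[outcome] / total for outcome in OUTCOMES}


# Made-up data: one assessment per consumer of the same offering.
assessments = {
    "consumer_a": "fit",
    "consumer_b": "not fit",
    "consumer_c": "fit",
    "consumer_d": "overly fit",
}
distribution = fitness_distribution(assessments)
for outcome in OUTCOMES:
    print(f"{outcome}: {distribution[outcome]:.1f}%")
```

Keeping the result as a dictionary of three values, rather than a single
number, mirrors the point that the three metrics should not be combined.
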
> >>
> >> C) Producer cost is also key here.  If the value exchange received by
> >> the consumer does not support the cost to meet their requirements,
> >> then a "not fit" offering may still be of value to the consumer, as
> >> they may augment it with other offerings.  An example of this is
> >> Microsoft Word.  It does NOT meet all of my requirements - and yet I
> >> find it "fit for purpose" because I can work around the "defects".
> >>
> >> D) @Nicola : I agree with the case for varying levels of "quality";
> >> however, I also think that case studies are notoriously hard to
> >> quantify.  In my experience, it is usually an opinion that one work
> >> method over another produced "better" results, because most people
> >> don't try it both ways and actually measure fitness for purpose
> >> afterwards.
> >>
> >> E) I have been working on an ontology of organizations and human
> >> value exchange.  I have tried traditional ERD-style modeling,
> >> object-oriented class modeling, and now OWL ontology modeling.  Each
> >> has strengths as a modeling method; each also has "defects" that
> >> limit its "fitness for purpose" for my work.
> >> It may be an understanding defect on my part, but one of the primary
> >> issues that I have been facing is polymorphism of modeled objects.
> >> As soon as you "declare" something to be of a "type" of some modeling
> >> class/entity/etc, you constrain it from taking on characteristics of
> >> other classes that it may also play a role in.  An example is a human
> >> - most people would model "human" as a class; but in my domain, a
> >> "human" is also the offering of an educational process (with an
> >> improved knowledge metric).  My solution so far has been to only use
> >> inferred classes to discover a thing's class by its relationships.
> >> So, from my perspective, I am finding ontologies "not fit for purpose"
> >> in their current description language and implementation.  I am
> >> interested in thoughts that the other summit participants may have to
> >> help me remove this obstacle - even if it is more training for me on
> >> building ontologies.  :=)
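
The idea in (E) of discovering a thing's classes from its relationships,
rather than declaring a single type up front, can be sketched as follows. This
is plain Python, not OWL, and it is not Brian's model: the facts, the
relationship names and the class definitions are all hypothetical and chosen
only to show one individual falling under several classes at once.

```python
# Hypothetical facts, stored as (subject, relationship, object) triples.
facts = {
    ("alice", "employed_by", "acme"),
    ("alice", "output_of", "degree_programme"),
    ("acme", "employs", "alice"),
}

# Hypothetical class definitions: a thing belongs to a class when it is the
# subject of at least one relationship of the named kind.
class_definitions = {
    "Employee": "employed_by",
    "EducationalOffering": "output_of",
    "Employer": "employs",
}


def inferred_classes(thing, facts, class_definitions):
    """Return every class whose defining relationship `thing` takes part in."""
    relationships = {rel for subj, rel, _obj in facts if subj == thing}
    return {
        cls for cls, rel in class_definitions.items() if rel in relationships
    }


# "alice" is never declared to be of any type; both classes are inferred.
print(sorted(inferred_classes("alice", facts, class_definitions)))
```

Because nothing is declared in advance, adding a new relationship for "alice"
can place her in a further class without touching the existing ones.
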
> >>
> >> Brian K. Lucas
> >> Sponsor, Worldwide Institute for Organization Ontologics
> >> Lucasb-at-wio2-dot-org
> >>
> >> -----Original Message-----
> >> From: ontology-summit-bounces@xxxxxxxxxxxxxxxx
> >> [mailto:ontology-summit-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Tim Wilson
> >> Sent: Saturday, January 22, 2011 8:11 AM
> >> To: ontology-summit@xxxxxxxxxxxxxxxx
> >> Subject: Re: [ontology-summit] Invitation to a brainstorming call for the
> >> 2011 Ontology Summit
> >>
> >> Gentlemen,
> >>
> >> If I may jump in here.  This discussion makes me think of the 85/15
> >> rule where finding and fixing 85% of all software bugs is relatively
> >> easy, the last 15% is much more difficult in terms of time and effort.
> >> There comes a point where developers have to say that the ontology is
> >> 'good enough'.  Jack is arguing that this does not constitute 'high
> >> quality', hence the comment on quality being binary.  Some person or
> >> persons must make a decision that the product is good enough (until
> >> the next serious bug is uncovered).  You may think that there are no
> >> more blue balls in the bin, but yet one is found.  Quality instantly
> >> goes from "1" to "0" until the issue is analyzed and a choice is made
> >> to either ignore it or fix it.
> >>
> >> Tim Wilson
> >>
> >> On 12/15/2010 3:22 AM, Matthew West wrote:
> >>> Dear Jack,
> >>>
> >>>> MW,
> >>>> Standing on the shoulders of Deming, Crosby, Juran, etc. I would
> >>>> first ask the owner a) Is the fifth one guaranteed irrelevant
> >>> MW: I am assuming it is relevant.
> >>>
> >>>> and b) what is your level of
> >>>> confidence there are not 6 errors?
> >>>> Jack
> >>> MW: Indeed, but then by the same token how can you be certain
> >>> anything is defect free, even if no defects are apparent?
> >>>
> >>> MW: I think it is more useful to think of quality as the degree to
> >>> which requirements are met. Then when you fix some bugs you have
> >>> improved the quality, though you may not have met all the requirements.
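
The two readings of quality discussed in this exchange can be put side by side
in a small Python sketch. It assumes, purely for illustration, that the
ontology has exactly five requirements and that each of the five defects
corresponds to one unmet requirement; the function names and requirement
labels are invented for the example.

```python
def binary_quality(requirements_met):
    """Quality as conformance, yes or no: True only if every requirement is met."""
    return all(requirements_met.values())


def degree_of_quality(requirements_met):
    """Quality as the degree to which requirements are met, between 0.0 and 1.0."""
    if not requirements_met:
        raise ValueError("no requirements given")
    return sum(requirements_met.values()) / len(requirements_met)


# Assumed starting point: five requirements, none met (five defects).
before = {"r1": False, "r2": False, "r3": False, "r4": False, "r5": False}
# After four of the five defects are fixed.
after = {"r1": True, "r2": True, "r3": True, "r4": True, "r5": False}

print(binary_quality(before), binary_quality(after))
print(degree_of_quality(before), degree_of_quality(after))
```

Under the binary reading the fixes change nothing, since the ontology fails
both before and after. Under the degree reading the same fixes register as an
improvement even though one requirement is still unmet.
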
> >>>
> >>> Regards
> >>>
> >>> Matthew West
> >>>
> >>>> On Dec 14, 2010, at 3:45 PM, Matthew West wrote:
> >>>>
> >>>>> Dear Jack,
> >>>>>
> >>>>>> Regarding Nicola's quite relevant concern (below) it may be
> >>>>>> useful to note that
> >>>>>> a) quality is binary, not a scalar (Crosby, Deming, Juran, etc.).
> >>>>>> Quality signifies conformance to requirements, Yes or No; therefore
> >>>>>> 'high quality' is meaningless.
> >>>>> MW: So presumably you would argue that if an ontology has 5
> >>>>> defects, and 4 of them are fixed, there is no improvement in
> >>>>> quality as a result....
> >>>>>
> >>>>> Regards
> >>>>>
> >>>>> Matthew West
> >>>>>
> >>>>>> b) note carefully that from the usage viewpoint the requirements
> >>>>>> amount to 'fit for purpose' (Checkland) or 'satisficing' (Simon).
> >>>>>> c) both proof of correctness and exhaustive test are futile,
> >>>>>> therefore not included.
> >>>>>> d) the goal becomes warranty that the ontology of interest is
> >>>>>> devoid of internal faults and external incompatibilities, wherein
> >>>>>> warranty means zero false positives and false negatives.
> >>>>>> e) an appropriate theme may be "Making the case for adequate,
> >>>>>> accurate and timely ontologies" which embraces both the result and
> >>>>>> the development activity.
> >>>>>> f) whether any ontology is viable or not depends on both the
> >>>>>> ontology and the intended usage.
> >>>>>> g) this means that any cadre of ontology developers must include
> >>>>>> members who are dedicated to independent and objective assessment
> >>>>>> of the viability of any ontology or patch thereof or ordered set
> >>>>>> of patches.
> >>>>>> h) fortunately, technologies, tools and methods exist (or are
> >>>>>> imminent) for viability assessment of algorithms of all classes
> >>>>>> and types with respect to intended usage. This includes
> >>>>>> ontologies. Even the spaghetti code in most OWL-based examples can
> >>>>>> be assessed, even simplified, and potentially made more "lean"
> >>>>>> without inducing 'brittle.'
> >>>>>> i) this is one reason why I suggested to Steve Ray that one
> >>>>>> corner of the Summit allow open-mind dialogue regarding new
> >>>>>> technologies.
> >>>>>>
> >>>>>> Onward,
> >>>>>> Jack Ring
> >>>>>>
> >>>>>>
> >>>>>> On Dec 14, 2010, at 5:00 AM, Nicola Guarino wrote:
> >>>>>>
> >>>>>>> Dear colleagues,
> >>>>>>>
> >>>>>>>    I also agree very much with John and Matthew concerning the
> >>>>>>> importance of high quality ontologies, and on their observation
> >>>>>>> that the quest for high quality data models in software
> >>>>>>> engineering definitely reflects a sensitivity to important
> >>>>>>> ontological aspects much higher than what we find in people just
> >>>>>>> focusing on ontology languages.
> >>>>>>>    In the light of this, I suggest specifying a bit more the
> >>>>>>> overall theme of our Summit, which in my opinion could be "Making
> >>>>>>> the case for ontological analysis" instead of "Making the case
> >>>>>>> for ontology". An alternative could be "Making the case for
> >>>>>>> high-quality ontologies".
> >>>>>>>    The reason for this proposal should be self-evident, I believe.
> >>>>>>> Deciding how much effort to put in developing a particular
> >>>>>>> ontology is a crucial choice, and it is very important to
> >>>>>>> distinguish the cases where a proper ontological analysis pays
> >>>>>>> off, and is indeed a crucial aspect of success, from those where
> >>>>>>> a "lightweight" approach is sufficient.
> >>>>>>>    Just brainstorming...
> >>>>>>>
> >>>>>>> Talk to you soon,
> >>>>>>>
> >>>>>>> Nicola
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 9 Dec 2010, at 16:03, John F. Sowa wrote:
> >>>>>>>
> >>>>>>>> Dear Matthew and Peter,
> >>>>>>>>
> >>>>>>>> MW:
> >>>>>>>>> ... my forthcoming book "Developing High Quality Data Models".
> >>>>>>>>> Substitute ontology for data model and the same argument
> >>>>>>>>> applies. The benefits come from improving and automating
> >>>>>>>>> decision making through fit-for-purpose information to support
> >>>>>>>>> those decisions.
> >>>>>>>> I very strongly agree.  Software engineers have been doing
> >>>>>>>> ontology (avant la lettre, as they say) for a very long time.
> >>>>>>>> And much of that work has been very good -- sometimes much
> >>>>>>>> better than what people are doing with so-called ontology
> >>>>>>>> languages.
> >>>>>>>>
> >>>>>>>
> >>>>>>> ________________________________________________________________
> >>>>>>> _ Msg Archives: http://ontolog.cim3.net/forum/ontology-summit/
> >>>>>>> Subscribe/Config:
> >>>>> http://ontolog.cim3.net/mailman/listinfo/ontology-summit/
> >>>>>>> Unsubscribe: mailto:ontology-summit-leave@xxxxxxxxxxxxxxxx
> >>>>>>> Community Files:
> >>>>>>> http://ontolog.cim3.net/file/work/OntologySummit2011/
> >>>>>>> Community Wiki:
> >>>>> http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2011
> >>>>>>> Community Portal: http://ontolog.cim3.net/wiki/
> >>>>>>
> >>>
> >>
> >> --
> >> Timothy C. Wilson
> >> Graduate Student in Knowledge Management, Kent State University
> >> Expected Completion: August 2011
> >>
> >>

Attachment: Data Quality.pdf
Description: Adobe PDF document


_________________________________________________________________
Msg Archives: http://ontolog.cim3.net/forum/ontology-summit/   
Subscribe/Config: http://ontolog.cim3.net/mailman/listinfo/ontology-summit/  
Unsubscribe: mailto:ontology-summit-leave@xxxxxxxxxxxxxxxx
Community Files: http://ontolog.cim3.net/file/work/OntologySummit2011/
Community Wiki: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologySummit2011  
Community Portal: http://ontolog.cim3.net/wiki/