
Re: [ontolog-forum] Science, Statistics and Ontology

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: Len Yabloko <lenyabloko@xxxxxxxxx>
Date: Sun, 13 Nov 2011 09:02:05 -0800 (PST)
Message-id: <1321203725.58971.YahooMailNeo@xxxxxxxxxxxxxxxxxxxxxxxxxxx>

Dear Len,    (01)

Thanks for the feedback.    (02)


On Thu, Nov 10, 2011 at 6:08 PM, Len Yabloko <lenyabloko@xxxxxxxxx> wrote:    (03)

Ali,
>Thank you for the post. I am not an "ontologist", but ontological questions 
>are inevitable if one is to make any sense of reality. They are also 
>inevitable in any scientific method. The real issue is efficiency, and that 
>has always been an issue with data analysis.      (04)

I'm not entirely sure how you came to this conclusion from this article. If I 
were to distil the article into three broad themes (incidentally, summarized 
quite well here http://xkcd.com/882/ ;), it'd be: 
    1. Misunderstanding the theory behind the statistics
    2. Incorrectly combining hypotheses (especially from a statistical 
perspective - but it rests on semantic misinterpretations)
    3. Incorrectly revising hypotheses 
Efficiency to me seems more ancillary than the semantic concerns in each. 
Admittedly, (1) arises from "just" misunderstanding and misapplying the basic 
theories underpinning entire experimental regimes. But (2) is really about 
incompletely combining two or more distinct but overlapping conceptualizations 
(the experimental design and the hypothesis).     (05)
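
To make theme (1) concrete, here's a toy simulation (my own sketch, not from 
the article; all the numbers are invented) of the multiple-comparisons trap 
that the xkcd strip lampoons - test enough hypotheses against pure noise at 
alpha = 0.05 and "significant" links appear almost for free:

    # Toy sketch of the multiple-comparisons trap (xkcd 882): test 20
    # "jelly bean colours" against data with no real effect and count how
    # often at least one test comes out "significant" at alpha = 0.05.
    import random

    random.seed(42)
    ALPHA = 0.05
    N_HYPOTHESES = 20    # e.g. 20 colours, all truly null
    N_RUNS = 10_000

    runs_with_false_positive = 0
    for _ in range(N_RUNS):
        # Under the null hypothesis, p-values are uniform on [0, 1].
        p_values = [random.random() for _ in range(N_HYPOTHESES)]
        if any(p < ALPHA for p in p_values):
            runs_with_false_positive += 1

    print("fraction of runs with a spurious 'discovery':",
          runs_with_false_positive / N_RUNS)
    # Analytically 1 - (1 - 0.05)**20 ~ 0.64, which is why a correction
    # such as Bonferroni (test each at alpha / 20) is needed.

The arithmetic here is trivial; the failure is semantic - a misunderstanding 
of what each individual p-value does and does not mean.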

Nature itself seems to have opted for statistical processing first, with 
logic following from it. The two sides can never be completely separated, but 
the balance is a matter of efficiency - not principle.     (06)

Could you elaborate what you mean by efficiency in this context? I certainly 
agree that there are a multitude of other issues at play.     (07)

 
Dear Ali, 
 
While these problems can be discussed in great technical detail, they are 
still rooted in the original compromise: replacing the explanation of 
observation in terms of cause and effect with statistical explanation. The way 
I see it, this compromise greatly increased the efficiency of scientific 
investigation and engineering beyond what could be analysed as a mechanical 
chain of events. The price, however, was replacing the ontological substance 
of observation with a statistical one. All subsequent development of the 
scientific method has had to recover the ontology from sequences of 
observations. Never mind that David Hume thought there was no cause and effect 
to begin with. The inductive method in science is necessary due to limited 
deductive power. Kolmogorov's method can be seen as an attempt to bring back 
some of that power in the form of "prior knowledge". But it does not unify 
(IMHO) the two original methods. This remains a challenge which goes beyond 
"fixing" or improving some techniques. At the same time, one can consider 
human evolution as a natural solution to combining the methods with the 
maximum efficiency dictated by selection.
 
     (08)

However, one area the article highlighted was the difficulty of combining 
hypotheses, whereby a meta-analysis must integrate different protocols, 
methodologies and vocabularies. When one attempts to statistically combine a 
set of experiments that are each based on testing a null hypothesis, it is 
important that the vocabularies, methodologies and even the hypotheses in each 
experimental set-up be correctly aligned. This point was a takeaway for me, 
when they wrote:     (09)

[quote] For one thing, all the studies conducted on the drug must be included, 
published and unpublished. And all the studies should have been performed in a 
similar way, using the same protocols, definitions, types of patients and 
doses. When combining studies with differences, it is necessary first to show 
that those differences would not affect the analysis, Goodman notes, but that 
seldom happens. “That’s not a formal part of most meta-analyses,” he says. 
...
“Across the trials, there was no standard method for identifying or 
validating outcomes; events ... may have been missed or misclassified,” Bruce 
Psaty and Curt Furberg wrote in an editorial accompanying the New England 
Journal report. “A few events either way might have changed the findings.”    (010)

So there's the issue of clearly understanding the implications of each 
experiment and dataset individually, and then trying to tease out in what 
ways these accounts can be combined. They also make a point that logicians 
know well - unpublished results or failures are just as important when trying 
to determine the models that satisfy a theory.     (011)
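
To make the pooling problem concrete, here is a minimal fixed-effect 
meta-analysis sketch (my own illustration; the study numbers are invented) 
using inverse-variance weighting plus Cochran's Q as a crude homogeneity 
check - the formal counterpart of Goodman's complaint that differences across 
studies are seldom shown to be harmless:

    # Toy inverse-variance (fixed-effect) pooling of study effect sizes,
    # with Cochran's Q as a heterogeneity check. All numbers are invented.
    effects   = [0.30, 0.10, 0.45, -0.05]   # per-study effect estimates
    variances = [0.02, 0.01, 0.05,  0.03]   # per-study sampling variances

    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = (1.0 / sum(weights)) ** 0.5

    # Cochran's Q: weighted squared deviations from the pooled effect.
    # Under homogeneity, Q ~ chi-squared with k - 1 degrees of freedom.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1

    print(f"pooled = {pooled:.3f} +/- {pooled_se:.3f}; Q = {q:.2f} on {df} df")
    # Q much larger than df says the studies disagree beyond sampling
    # error - i.e. the protocols, definitions or populations likely differ.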

In a very literal (though implicit) way, each lab is deploying its own ontology 
regarding what procedures make sense (and what those commitments entail), but 
also what variables are thought significant, what they tried to control for and 
what was out of the scope of the experiment (and why). It quickly becomes very 
messy when trying to integrate results across two (let alone multiple) labs, 
unless one is directly replicating an experiment.  In some cases, each 
hypothesis may carry implicit assumptions that are reflected in how the 
experiment is ultimately performed.     (012)

In cases where one is not replicating an experiment but is trying to extend a 
hypothesis or is allowing for other factors, these differences across the 
experiments need to be accounted for - at the very least, to a minimum level 
of analysis that reveals their statistical implications.     (013)

To some extent, this is being addressed for certain cultural subgroups in 
various parts of science, with the establishment of a variety of protocol 
libraries or standards. Indeed, in some cases, ontologies are being developed 
to help align protocols [1], [2], [3], [4]. But it's often taken for granted 
that people are correctly interpreting the intent of authors through the papers 
that are being published. IMO, as this article points out, many errors abound, 
and many practitioners perform poor semantic mappings across experiments in 
the same domain.     (014)
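
As a toy illustration of what even a minimal semantic mapping step looks like 
(the vocabularies and counts below are invented), the point is simply that 
the alignment must be explicit, and its gaps surfaced, before any counts are 
pooled:

    # Toy sketch: align two labs' outcome vocabularies to a shared
    # ontology before pooling event counts. All terms/counts are invented.
    LAB_A_TO_SHARED = {
        "myocardial_infarction": "cardiac_event",
        "cardiac_arrest":        "cardiac_event",
        "angina":                None,   # recorded by lab A, out of scope here
    }
    LAB_B_TO_SHARED = {
        "heart_attack":   "cardiac_event",
        "cardiac_arrest": "cardiac_event",
    }

    def to_shared(counts, mapping):
        pooled = {}
        for term, n in counts.items():
            shared = mapping.get(term)
            if shared is None:
                # Unmapped or excluded terms must be surfaced, not silently
                # dropped - this is where missed/misclassified events hide.
                print(f"warning: no mapping for {term!r} ({n} events)")
                continue
            pooled[shared] = pooled.get(shared, 0) + n
        return pooled

    merged = to_shared({"myocardial_infarction": 12, "angina": 7},
                       LAB_A_TO_SHARED)
    for k, v in to_shared({"heart_attack": 9, "arrhythmia": 3},
                          LAB_B_TO_SHARED).items():
        merged[k] = merged.get(k, 0) + v
    print(merged)   # {'cardiac_event': 21}, with two terms flagged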

Lastly, especially in the age of micro-publishing, when novel data are 
published that "break" an existing ontology and/or call for a revision, 
perhaps the ontology update cycle is better served through Bayesian revision. 
This seems applicable on more of a case-by-case basis, but the principle 
seems worth investigating.     (015)
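
As a minimal sketch of what such a Bayesian update cycle might look like for 
a single ontological claim (the prior and the report counts below are 
invented), one could track a Beta posterior over how well published 
observations support an axiom, and trigger revision only when support drops 
past a threshold:

    # Toy Bayesian revision of belief in one ontological claim, e.g.
    # "axiom A holds for instances of class C". We keep a Beta posterior
    # over the fraction of observations consistent with the axiom.
    prior_alpha, prior_beta = 8.0, 2.0   # invented prior: axiom usually holds

    # Each micro-publication reports (consistent, inconsistent) observations;
    # the last one "breaks" the axiom.
    reports = [(10, 0), (7, 1), (2, 6)]

    alpha, beta = prior_alpha, prior_beta
    for consistent, inconsistent in reports:
        alpha += consistent
        beta += inconsistent
        print(f"posterior mean support: {alpha / (alpha + beta):.2f}")

    # The posterior degrades gracefully rather than flipping at the first
    # anomaly; an update to the ontology would be triggered only when the
    # mean support crosses some threshold, case by case.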

[1] Michel Kinsy, Zoé Lacroix, Christophe Legendre, Piotr Wlodarczyk and Nadia 
Yacoubi. ProtocolDB: Storing Scientific Protocols with a Domain Ontology. In 
WISE 2007 Workshops, 2007.
[2] http://bioinformatics.eas.asu.edu/siteProtocolDB/protocolDB.htm
[3] Maccagnan A, Riva M, Feltrin E, Simionati B, Vardanega T, Valle G, Cannata 
N. Combining ontologies and workflows to design formal protocols for biological 
laboratories. Automated Experimentation, 2010.
[4] Larisa N. Soldatova, Wayne Aubrey, Ross D. King and Amanda Clare. The EXACT 
description of biomedical protocols. Bioinformatics, 2008.       (016)

_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/  
Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/  
Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J    (018)
