Hi Doug, (01)
You raise some interesting points; sorry it took me a while to try and do
this justice... (02)
doug foxvog wrote:
> On Mon, May 31, 2010 15:42, Mike Bennett said:
>
>> ... We are
>> developing a formal semantic model of terms in the financial services
>> industry, and to do this we are using the underlying concepts of OWL but
>> restating these in English. In OWL there are classes (with the
>> super-class of owl:Thing), and two types of properties, Object
>> Properties and Datatype Properties. We refer to the OWL Classes as
>> "Things" and properties as "Facts" namely "Relationship Facts" and
>> "Simple Facts" respectively.
>>
>
> Could you clarify? Since OWL Classes are a sub-class of Thing, shouldn't
> the instances of the OWL Classes be "Things"? Likewise, shouldn't "Facts"
> be the assignment of Properties to Things?
>
Using the ODM spec (in an earlier draft than the current one) we have
created a UML model in which certain defined UML base classes are used
to represent certain OWL constructs, identified by suitable UML
stereotypes. Of these, a UML base class of "Class" is used for
owl:class. This has the stereotype of owlClass. However, we don't use
the same words externally as we do internally, since the model is
intended (and used) for presentation of model content to business
subject matter experts. So it's called a class internally but not
described in those terms. (03)
Meanwhile as you say, all the OWL Classes in the model are ultimately
sub-types of the library object "Thing", itself an OWL Class. (04)
We considered using the more accurate term "Entity", as suggested also
by John a while back. However, Entity, like Class, already comes with
its own semantic baggage: a lot of business SMEs have over the years
used both class models and more frequently entity relationship diagrams,
to specify what they fondly believe is a business view of the world,
using a language that the techies understand. In my view, this is poor
management of the "language interface" that should exist between
business conceptual models and design models, but the reality is we live
with the consequences of poor IT management, so Entity as a word was not
in a good place. "Thing" was the only unharmed word I could find, and
even if OWL models did not have Thing at the top of the taxonomic tree,
it's the word I would have used to describe the world of "Things and
facts". (05)
Similarly, I would not want to expose words like Object Property or
Datatype Property to business SMEs. It is futile to expect to educate
business domain experts in a new language in order to get their
participation, and dangerous to assume they have learnt your favourite
new language. To get the maximum confidence that a draft model presented
to SMEs for approval has been understood and validated (or updated)
correctly, you need to know that all business folks have the same
understanding of the same terms. Hence the need to use only terms that
have no history and no part in some formal language. Sadly the English
language is running out of such words. (06)
So when I describe the framework in terms of "Things and Facts" I am not
presenting that explanation to an OWL-savvy audience of ontologists and
I am not asking them to interpret them in the light of what they know
about ontology. I am presenting it as an explanation of how when you
look at the diagrams and the spreadsheets from the model, what you will
see are things and facts. Then I go on to explain how Things are
essentially set theory constructs, and how the facts about them can be
relationship facts (relating one thing to another thing) or simple facts
where the fact is stated in terms of simple stuff like text and dates
and so on. Meanwhile for the ontologist it should be apparent that
relationship facts are Object Properties and that simple facts are
Datatype Properties. And of course that facts are properties. (07)
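As a rough illustration of the mapping just described, here is a minimal Python sketch. It is not part of the EDM Council model: the names `Fact`, `SIMPLE_TYPES` and `owl_construct` are hypothetical, and the set of "simple" value types is an assumption made for the example.

```python
from dataclasses import dataclass

# Assumed set of "simple stuff" a Simple Fact can be stated in.
SIMPLE_TYPES = {"text", "date", "number"}


@dataclass(frozen=True)
class Fact:
    label: str       # business-facing label, e.g. "issued by"
    about: str       # the Thing the fact is about
    value_type: str  # another Thing, or one of SIMPLE_TYPES


def owl_construct(fact: Fact, things: set) -> str:
    """Return the OWL construct a fact corresponds to."""
    if fact.value_type in things:
        # Relationship Fact: relates one Thing to another Thing.
        return "owl:ObjectProperty"
    if fact.value_type in SIMPLE_TYPES:
        # Simple Fact: stated in terms of text, dates and so on.
        return "owl:DatatypeProperty"
    raise ValueError(f"unknown value type: {fact.value_type!r}")


things = {"Bond", "Issuer"}
issued_by = Fact("issued by", "Bond", "Issuer")
maturity = Fact("maturity date", "Bond", "date")
```

Here `owl_construct(issued_by, things)` yields the object property case and `owl_construct(maturity, things)` the datatype property case; the business SME only ever sees the labels.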
There are almost no instances (owl:individual) in the model, except
where specific instances have to be identified for modeling reasons e.g.
the USA as an instance of a Country, ISIN as an instance of security
identifier and so on. I have a feeling that many folks who work more
extensively with OWL models that have both class and individual data
would tend to refer to individuals as "things", maybe I'm wrong about
that. But that's not the language I use or am using here. As it happens
I refer to individuals as "Individuals", one OWL term I have not felt a
need to re-cast for presentation to SMEs. Is this what you mean by
"thing" in your message? But here we are talking about the words we use
not the concepts in the model. (08)
I should add that there are never going to be large numbers of
Individuals in the EDM Council model, since it is not intended to become a
repository of actual securities data; the volume of securities data out
there is too vast for that to even be thinkable. Rather, it is intended
as the thus-far missing business conceptual model against which various
logical data model designs may be built, or to which they may be referenced after
the event. At present the semantics underlying most logical data model
designs in the industry are in the designer's head or on informally
structured spreadsheets, so this is an attempt to improve upon that
broken process using semantic technology. I believe we can do more with
it, but that's the original use case. So if I seem a little ignorant of
the terminology of linked data and of OWL models with individual data,
that's because I don't really need to engage with it for this project. (09)
So instances of OWL classes, where they exist, are "Individuals", just
like they are in OWL. Every class in the model is a class of "Thing",
that is it is asserted to represent a real thing and not a data
construct. Any data model developed from this would have logical classes
of data, and of course instances of those data model classes would be
instances of data. Not Things. (010)
>
>> These are modeled in a UML modeling tool
>> from which we produce both diagrams and spreadsheets, for review via a
>> website (this is at www.hypercube.co.uk/edmcouncil ).
>>
>
>
>> Anyway, each "Thing" and each "Fact" has a label which is a simple
>> textual name for that term, using whatever term business domain experts
>> are most comfortable with.
>>
>
> I'm pleased that you distinguish labels from names at this point.
> However, below you discuss "names", not labels.
>
Loose terminology on my part, I'm afraid, but what is a name if it is
not some kind of label? Some of the ideas you bring up below are things
I should probably have thought of sooner, and will try and implement at
some point in the future. Meanwhile, when I am describing what is there
now, I should clarify that what is there now is a UML model, in which
every UML class, UML association and attribute has a name, identified in
the UML tool as "Name". This name is a label for the class, association
or attribute, in the same way that any name is a label for the thing it
is a name of. (011)
UML also generates UIDs, but I have not made reference to these. (012)
One decision I made when starting this model, was not to create a
separate tagged value in which to create and hand-edit a URI. My
thinking was that at some point, we would want to find a way of
converting the model content into OWL, and that I would far rather that
the OWL URIs are assigned as part of that process, rather than being
maintained by hand with the errors that would introduce. (013)
Something that might have been worth doing, and which I would discuss as
part of any future tightening up of this repository, is to have a new
tagged value for a formal unique label, perhaps structured in accordance
with ISO 11179 Naming and Design Rules. I did think about using ISO
11179 NDR for the formal names in the UML model, but I decided against
this on the basis that business subject matter experts would struggle
with this, and it would re-introduce something techie-looking. My
thinking is that once something looks even remotely techie, business
experts in the financial industry tend to glaze over and assume that the
techies must have got it right (and it will be the techies' fault if it
isn't). This is not good if you want to take the more industrial
approach, as I do, that all technical artefacts must trace back to some
business conceptual model that is understood, reviewed and signed off by
business as being correct. This is a bit of a cultural shift for the
financial securities industry, which is why we often make mistakes that
cost millions of dollars (the counter-example I usually give is the oil
industry but I guess now's not a good time to go there!). So that's why
I didn't use NDR. The same thing applies to anything in camel case, and
anything with punctuation marks you would not find in a novel or a
newspaper. (014)
So a possible future update to this model would be to add an ISO 11179
or other unique label which remains the same no matter what label is
presented to the user. That could also be in camel case, making the
transformation to an owl model a lot simpler since one would not have to
try and collapse the white spaces that inevitably exist in natural
language labels such as our current names. (015)
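To make the white-space point concrete, here is a small Python sketch of how a natural-language label might be collapsed into a camel case internal label. This is only one possible convention, not an ISO 11179 implementation, and the function name `to_camel_case` is hypothetical.

```python
import re


def to_camel_case(label: str) -> str:
    """Collapse a natural-language label into an upper camel case name.

    Splits on anything that is not a letter or digit, capitalises the
    first character of each word and leaves the rest of the word as-is,
    so acronyms such as "MBS" survive unchanged.
    """
    words = re.findall(r"[A-Za-z0-9]+", label)
    return "".join(word[0].upper() + word[1:] for word in words)


print(to_camel_case("MBS Deal"))
print(to_camel_case("Municipal bond"))
```

Note that two labels differing only in spacing or punctuation would collapse to the same name, so a real conversion would also need a uniqueness check.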
Of course in UML tools the only label that can be presented to the user
as the "name" of the thing is the UML "Name" label since this sits at
the top of the box or (for attributes and associations) is the only
visible text on diagrams. To do any of the other things we are
discussing would require a tool which is more than a UML modeling tool. (016)
> A solution to the problems you discuss may lie in distinguishing an
> internal name for a term (which is used by programs for access and
> processing) from labels which identify the terms to users. In other
> areas of computer science, the users of programs have no idea of what
> internal names are used in programs and could care less. Why should
> semantic programs make internal names visible to users?
>
As noted above, this is a very good idea which I should possibly have
thought of at the outset. (017)
At some future point, we hope to put this model content forward as a
possible semantic layer for the financial industry messaging standard
ISO20022. In order to do this, a lot of the things where I have taken a
pragmatic approach will need to be re-done, and of course both OWL and
ODM have been considerably updated since I created the framework for
this model. The UML tool I use (Enterprise Architect) does not seem to
handle changes made to stereotypes on the fly, i.e. if I add a new
tagged value (such as for "Internal Label"), it is not propagated to
existing terms that use that stereotype. I doubt any UML tool does this
but it would be good if they did. (018)
So at some point the whole thing will need to be rebuilt. My agenda over
the next couple of years (including in here and at SemTech) is to
identify all the things that should be added or changed, so that it can
be rebuilt right, once. Meanwhile we build in the existing format, in
which every term has a UML "Name" label and any number of synonyms, but
no immutable internal label (except the UML GUID of course). I wish I'd
thought of it, but it's not preventing me doing anything at present. (019)
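For what it is worth, the structure being discussed (an immutable internal label, one presented label and any number of synonyms) could be sketched in Python roughly as follows. The class `Term` and its method `relabel` are hypothetical names for illustration; nothing like this exists in the current UML repository.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Term:
    internal_name: str  # never changes once assigned
    label: str          # the label currently presented to business users
    synonyms: List[str] = field(default_factory=list)

    def relabel(self, new_label: str) -> None:
        """Change the presented label, keeping the old one as a synonym."""
        if new_label == self.label:
            return
        if new_label in self.synonyms:
            self.synonyms.remove(new_label)
        self.synonyms.append(self.label)
        self.label = new_label


term = Term("MortgageBackedSecurityIssue", "MBS Issue")
term.relabel("MBS Deal")
print(term.label, term.synonyms, term.internal_name)
```

The renaming from "MBS Issue" to "MBS Deal" mentioned in this thread then becomes a change to presentation only; anything that refers to the term by its internal name is unaffected.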
>
>> Other words with precisely the same meaning
>> are identified as "Synonym" using a tag set up for that purpose. For
>> instance the other day I renamed "MBS Issue" to "MBS Deal" since I
>> learnt that's what they call it most often, and put the previous name
>> into the "Synonym" tag.
>>
>
> If EnglishWords and Phrases had denotations as terms and terms had
> preferred phrases (which could vary by context), then renaming would
> not be an issue. Internally, an OWL Class named MortgageBackedSecurityIssue
> could have "MBS Issue" as a preferred denotation. In the given
> case, it would also become a denotation "MBS Deal", which (in the new
> context) would become its preferred phrase.
>
Indeed. What would be interesting would be if there were some way of
associating different denotations with different contexts. In fact,
since the model is partitioned according to the "Independent / Relative
/ Mediating" top level partition set, the terms for the actual contexts
can be defined under "Mediating", and some already are. If a tool (again
not this UML tool) could (020)
> A company in Saginaw, Michigan, could use the same ontology, adding
> a Class named MBSAirportBondIssue, giving that as a denotation of "MBS
> Issue", which might be its preferred phrase.
>
Yes. A lot of people, indeed most people, seem to use terms very locally
and often struggle with the idea that their meaning for a word is not
the only one in the universe. Hence the use of a semantic model at all;
if words were sufficient we would not need to do any of this. (021)
I think if one were to create a framework similar to the EDM Council SR,
but which people could extend and edit locally, then this sort of local
labelling would be a must-have feature. Again, this is not offered by
the UML tool. (022)
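A minimal Python sketch of such context-dependent labelling, under stated assumptions, might look like this. The class `LabelRegistry`, the context names "default" and "saginaw", and the fallback rule (use the shared context when the local one has no entry) are all hypothetical choices for illustration.

```python
from typing import Dict, Tuple

DEFAULT = "default"  # hypothetical name for the shared context


class LabelRegistry:
    """Maps phrases to internal names, and back, per context."""

    def __init__(self) -> None:
        # (context, phrase) -> internal name
        self._denotes: Dict[Tuple[str, str], str] = {}
        # (context, internal name) -> preferred phrase
        self._preferred: Dict[Tuple[str, str], str] = {}

    def add(self, context: str, phrase: str, internal_name: str,
            preferred: bool = False) -> None:
        self._denotes[(context, phrase)] = internal_name
        if preferred:
            self._preferred[(context, internal_name)] = phrase

    def resolve(self, phrase: str, context: str = DEFAULT) -> str:
        """Look up a phrase locally first, then in the shared context."""
        key = (context, phrase)
        if key in self._denotes:
            return self._denotes[key]
        return self._denotes[(DEFAULT, phrase)]  # KeyError if unknown

    def preferred_label(self, internal_name: str,
                        context: str = DEFAULT) -> str:
        """Preferred phrase in this context, else shared, else the name."""
        local = self._preferred.get((context, internal_name))
        if local is not None:
            return local
        return self._preferred.get((DEFAULT, internal_name), internal_name)


reg = LabelRegistry()
reg.add(DEFAULT, "MBS Issue", "MortgageBackedSecurityIssue")
reg.add(DEFAULT, "MBS Deal", "MortgageBackedSecurityIssue", preferred=True)
reg.add("saginaw", "MBS Issue", "MBSAirportBondIssue", preferred=True)
```

With this, "MBS Issue" resolves to the airport bond class in the local context and to the mortgage backed security class everywhere else, while both classes keep stable internal names.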
> How to deal with conflicting inputs would be up to the API, which could
> have context-related rules.
>
>
>> You are right that context is needed to deal with meaning. One can
>> either come up with contextual display arrangements such as hover help,
>> or use semantic modelling, such as OWL, which implements (albeit
>> imperfectly) the fundamentals of logic. Using a semantic notation then
>> allows us to formally define what a term means, both by its position in
>> a taxonomy (so Linnaeus' Taxonomy of Species tells us what kind of thing
>> a lynx is), and by the logical statement of facts about those things.
>> The facts are what distinguishes an ontology from a taxonomy, in most
>> accepted definitions of those two words.
>>
>
>
>
>
>> The difficult bit is keeping that definitional rigour and yet presenting
>> the information in ways that subject matter experts can understand.
>>
>
> Once you separate the internal name from the phrase presented to the
> user, this becomes easier. Hover help can certainly assist. As could
> contextual choice of phrase to produce.
>
>
It sounds like we are putting together the outline specification for a
tool. At present I have to work within the limitations of a tool
configured to do something different. I wonder if anyone would develop
such a tool? There is a lot I would add to its functionality. (023)
>> The
>> logic is every bit as complex as any programming concepts, but has no
>> relation to software development concepts, so it takes a while to
>> communicate this to the subject matter experts, in my experience. Also
>> some business folks are more comfortable looking at spreadsheets whereas
>> others are better looking at diagrams - this is a difference between
>> different people in any walk of life. Hence we represent all the same
>> information in both formats. It still isn't easy, but it means that for
>> every term on the diagram or in the spreadsheet, there are enough
>> qualifying terms around it to precisely disambiguate it from any terms
>> that might have the same or a similar name (heteronyms). It should be
>> possible to take any one term and rename it "banana" and still identify
>> what is meant by banana in that context, if it's modeled right.
>>
>
> I question this. There is a distinction between providing enough infor-
> mation to distinguish homonyms and having enough information to precisely
> define a term.
>
True. There are a couple of interesting questions around this. On the
one hand, you have the pragmatic consideration that it is never
practical to model every fact about a given kind of thing, for a given
application or requirement. So there is the basic decision taken by any
ontologist about what facts are germane to the application they are
developing. This is the well known ontological commitment, and is a
consideration for any developer whether the ontology for their
application is ever modeled using an ontology notation, or is just kept
in their head and reflected in the logical data model design. We all
make ontological commitments for a given application. This is made more
complicated for a standards initiative like this one, because we need to
identify all the facts that might be relevant to any application or
securities processing context (investment, risk management, securities
processing, compliance, and of course systemic risk at
trans-national level, a new requirement). (024)
So you don't need to model the entire DNA sequence of a duck to model
something as being a duck. (025)
That's the easy bit and I've said nothing new here. (026)
Another aspect of this is that in theory, for each new kind of "Thing",
I need only identify one fact, one facet about that thing, that
distinguishes it from other things. So I have a class of things called
Bond, and I create a set of sub-classes called Municipal bond, Sovereign
Bond and Corporate Bond. It works well in that example - the single,
defining feature of a Municipal Bond is that it is issued by a
municipality; the three are distinguished around the facet of "issued
by", which itself is a fact about the parent class, narrowed in the
child classes. (027)
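The single-facet idea can be sketched in a few lines of Python. Only the municipality case is stated above; the issuer kinds "sovereign" and "corporation" and the names `ISSUER_KIND_TO_BOND_CLASS` and `classify_bond` are assumptions made for the example.

```python
# The "issued by" facet, narrowed in each child class of Bond.
# Issuer kinds other than "municipality" are assumed for illustration.
ISSUER_KIND_TO_BOND_CLASS = {
    "municipality": "Municipal Bond",
    "sovereign": "Sovereign Bond",
    "corporation": "Corporate Bond",
}


def classify_bond(issuer_kind: str) -> str:
    """Pick the most specific bond class from the 'issued by' facet.

    Falls back to the parent class when the issuer kind is not one of
    the kinds that define a sub-class.
    """
    return ISSUER_KIND_TO_BOND_CLASS.get(issuer_kind, "Bond")


print(classify_bond("municipality"))
print(classify_bond("supranational"))
```

One facet is enough to tell the three sub-classes apart, which is the point; it says nothing about the other facts that happen to be true of each sub-class.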
There are however a number of facts that only apply to bonds issued by
municipalities. So "issued by" is a necessary fact, whereas those others
are incidental facts. (028)
However, aside from that simple example, in some places in the model
there may be so many unique facts about a given kind of thing that it
becomes difficult to say which are necessary, and impossible to reach
consensus on that question. We could, in theory, introduce a new level
of taxonomic hierarchy for every one fact. However, a typical set of
securities terms has a thousand or more unique terms, so it would be
impractical to branch the hierarchy for every new fact or facet. (029)
Instead what we have done, based on existing industry standards, is to
take the hierarchy of classifications that already exists in the
industry, and put in all the facts for the different kinds of thing that
are already defined. Sometimes we have had to add intermediate classes
that nobody uses (for example, the kind of thing of which Mortgage
Backed Security and Asset Backed Security are both a type). (030)
There was one real difficulty with the pre-existing terms which is that
the most detailed standard (the ISO 20022 FIBIM or Securities Data
Model) defines most properties at the highest level at which they might
apply, and makes everything optional. This means that the business
knowledge which led, for example, to "Redemption" being a property of
Security and not of Debt Security has been lost. It is precisely this
loss of business knowledge that we are trying to address. So where
possible I have moved these optional terms down the taxonomy to the
actual things about which they are a fact, but this has not always been
possible. (031)
Also, the stated aim of the model is to have all the terms you would
find in a data model, not just the ones that prove definitive of a given
kind of thing. This is so each term can have one agreed
industry-reviewed written definition. The terms are essentially our end,
not the means to an end in this case. So we know in advance what terms
we need to define, since they are the terms people need to communicate
about when exchanging information about securities. But that's a scoping
question not a semantics question. (032)
In practice I have not tried to distinguish necessary facts from
incidental facts. Would that be a useful distinction for the future? I
think it might be. Also as I may have noted elsewhere, identifying the
different facets by which different sets of sub-sets are defined would
be useful. (033)
> Using the distinction I raised at the beginning, you are referring to
> relabeling a single term in user output, not internally renaming it. The
> internal name need have no bearing on the output text.
>
>
Correct. At present the only internal name is the UML GUID. I don't
think most of the improvements we are discussing can be achieved within
an existing UML tool, but it would be nice if they could be persuaded to
develop the tool in these directions. Or, maybe we have a specification
for a product on our hands? (034)
Cheers, (035)
Mike
>> They say "meaning is context" and that's sort of true in a trivial way,
>> but all the context should be definable in an ontology if it's set up
>> right.
>>
>
> Agreed.
>
> -- doug foxvog
>
>
>> I hope that is a bit clearer.
>>
>>
>> Mike
>>
>> David Eddy wrote:
>>
>>> Mike -
>>>
>>> On May 31, 2010, at 1:30 PM, Mike Bennett wrote:
>>>
>>>
>>>
>>>> We don't rely on words for meanings, and I see no reason why anyone
>>>> would. Terms are either Things or Facts, and each of these has a label
>>>> which happens to be whichever word business domain experts are most
>>>> comfortable with, and any number of synonyms which are other words
>>>> with
>>>> the same meaning.
>>>>
>>>>
>>> This loses me.
>>>
>>> Best as I've experienced, humans tend to be strongly attached to
>>> terms/words/phrases having meanings. I am NOT in favor of using
>>> numbers to represent meaning to humans.
>>>
>>> Naturally a huge issue here is that I see a word, recognize it & am
>>> comfortable with the implicit meaning. You see the same word, which
>>> evokes a different meaning (say "Table" in context of running a
>>> meeting, not furniture, in American English & UK English). We're
>>> both comfortable with what we assume to be the meaning, but one of us
>>> is wrong.
>>>
>>> What I want to see is a term/word/phrase/acronym plus various
>>> available CONTEXTUAL meanings. In a document where there are
>>> potentially ambiguous terms, there could be "footnotes," tags, or
>>> "hovering help" expressing explicit meaning.
>>>
>>>
>>>
>>> What do you mean by "Terms are either Things or Facts"?
>>>
>>> ___________________
>>> David Eddy
>>> deddy@xxxxxxxxxxxxx
>>>
>>> 781-455-0949
>>>
>
>>> _________________________________________________________________
>>> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
>>> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
>>> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
>>> Shared Files: http://ontolog.cim3.net/file/
>>> Community Wiki: http://ontolog.cim3.net/wiki/
>>> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
>>> To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
>>>
>>>
>>>
>>>
>>>
>> --
>> Mike Bennett
>> Director
>> Hypercube Ltd.
>> 89 Worship Street
>> London EC2A 2BF
>> Tel: +44 (0) 20 7917 9522
>> Mob: +44 (0) 7721 420 730
>> www.hypercube.co.uk
>> Registered in England and Wales No. 2461068
>>
>>
>>
>>
>>
>
>
> =============================================================
> doug foxvog doug@xxxxxxxxxx http://ProgressiveAustin.org
>
> "I speak as an American to the leaders of my own nation. The great
> initiative in this war is ours. The initiative to stop it must be ours."
> - Dr. Martin Luther King Jr.
> =============================================================
>
>
>
>
>
> (036)
--
Mike Bennett
Director
Hypercube Ltd.
89 Worship Street
London EC2A 2BF
Tel: +44 (0) 20 7917 9522
Mob: +44 (0) 7721 420 730
www.hypercube.co.uk
Registered in England and Wales No. 2461068 (037)