Case Study for Ontology Summit 2010
Building an Ontology
Energy-Water Nexus vis-a-vis the Climate
Status Report - February 16, 2010
Much is happening on each of the different aspects of this project.
The first question that always pops up is: What do you want me to do? Right now, the answer is simple:
1) Learners / students and Subject Matter Experts (SMEs) can begin entering their questions into Column 1 of a spreadsheet.
2) Answers go into Column 4 (if known).
Project Management:
In project management terms, this project has subcomponents that are functionally linked and follow an entangled schedule. One way to approach this is from a matrix management perspective, aligning skills with tools and schedules. But first, let’s construct our Input-Output Matrix.
Input: A question
Black box:
Output: An answer
Input Questions:
Just as this project was being conceived, DOE Island in a Virtual World (Second Life = SL) opened its doors in beta mode. Energy Ant, the host of DOE’s Energy Kids Page, is now a bot avatar, positioned close to where SL avatars enter the DOE Career Center. He answers questions through a current version of a highly evolved AI engine. To the best of my knowledge, I have received all the requisite permissions to build my knowledge into this engine via a Q&A style spreadsheet. I’d also like to allow Energy Ant to learn answers to questions from on-line SMEs. This option is mentioned in the AI engine’s documentation, but I have not yet received an answer on whether we can use it.
These questions are envisioned as the inputs. I have requested that these questions be harvested from the bot’s files, so the answers can be constructed offline, assuming a parallel path for injecting knowledge into a dumb bot. If someone has experience with AI tools, please let me know and help us get this part functioning.
The plan is: enter these natural language questions into column 1 of a simple spreadsheet (answers go in column 4), build the spreadsheets incrementally into a master file, then run this concatenated spreadsheet through an AIML generator to create the AIML driver file. The details of this process are still unfolding.
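As a rough illustration of the plan above, here is a minimal Python sketch of what such a spreadsheet-to-AIML step might look like. It is not the generator we will actually use. It assumes each spreadsheet is exported as CSV with no header row, with the question in column 1 and the answer in column 4. The function names (merge_sheets, normalize_pattern, rows_to_aiml) and the sample row are hypothetical. The upper-casing and punctuation stripping follow common AIML pattern conventions and may need adjusting for the actual engine.

```python
import csv
import io
import re
from xml.sax.saxutils import escape


def merge_sheets(csv_texts):
    """Concatenate several CSV exports into one list of rows (the master file)."""
    master = []
    for text in csv_texts:
        master.extend(csv.reader(io.StringIO(text)))
    return master


def normalize_pattern(question):
    """Upper-case the question and replace punctuation with spaces (assumed convention)."""
    words = re.sub(r"[^A-Za-z0-9 ]+", " ", question).upper().split()
    return " ".join(words)


def rows_to_aiml(rows):
    """Build an AIML document: column 1 (index 0) is the question, column 4 (index 3) the answer."""
    categories = []
    for row in rows:
        if len(row) < 4:
            continue  # row is too short to hold an answer column
        pattern = normalize_pattern(row[0])
        answer = row[3].strip()
        if not pattern or not answer:
            continue  # skip questions that have no answer yet
        categories.append(
            "  <category>\n"
            f"    <pattern>{escape(pattern)}</pattern>\n"
            f"    <template>{escape(answer)}</template>\n"
            "  </category>"
        )
    body = "\n".join(categories)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<aiml version="1.0.1">\n'
        f"{body}\n"
        "</aiml>\n"
    )


# Hypothetical sample sheet: one answered question, one still unanswered.
sample_sheet = (
    "What is energy?,,,Energy is the ability to do work.\n"
    "Unanswered question?,,,\n"
)
aiml_text = rows_to_aiml(merge_sheets([sample_sheet]))
print(aiml_text)
```

Unanswered questions are skipped rather than written out with empty templates, so the master file can keep growing while SMEs fill in column 4.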
The approach is simple:
1) Learners / students and Subject Matter Experts (SMEs) can begin entering their questions into Column 1 of a spreadsheet.
2) Answers go into Column 4 (if known).
We may need to copy and paste these into the Energy Ant bot - one at a time. I’m not sure yet how we will pass the answers into the bot, but I have read that this is doable in two ways. One way is to allow each authorized person to enter questions and their answers. Another way is to enter the answer in another column of a spreadsheet. I don’t know the extent of the capabilities of the AI bot. For example: Can it handle Riemann/Vector notation for questions and answers? Can it handle reflexive questions correctly? Can it build out a robust set of nested/iterative sub-niches? Does it learn how to be deductive correctly? Does it learn how to be inductive? Does it learn how to be abductive? Can it take in questions and automatically arrange them into a structure of knowledge that can be output for verification?
A suggestion was made to use a topic map generator to convert the pipeline model into an ontology. The associations between the nodes/bubbles on a pipeline model are tricky to represent, but topic maps seem to be able to handle these. Following this suggestion, I found only one open-source package for building topic maps. I downloaded and unzipped it. A problem: I am composing this status report on a MacBook, and I don’t know how to get this open-source software to run on this platform. I am investigating whether it can be installed on the PC in my office, on a Linux server, etc. If someone has experience with any appropriate tool, please let me know and help us get this part functioning.
Depending on a few factors, the AIML file (mentioned above) could also become an input into the Ontology builder. Exactly how? Well, that remains to be seen, but the knowledge organization tool would need to be based on the same tool used to create the relationships between the different pieces of the Pipeline Model (which is represented by the Physical Fuel Cycle graphic). If someone has experience with specifying and implementing pipeline modeling, please let me know and help us get this part functioning. The current graphic was created with Adobe CS4 as a PSD file with vectors and layers; if anyone knows of an add-on to Photoshop that would output a usable Topic Map file, please let me know.
Output Options and Ontology Verification:
There is a tool (which we have asked for) that computes metrics on ontologies. After a version of the ontology is constructed, we should be able to run this tool to measure the ontology’s functionality. I imagine this to be an iterative process. Building inputs to this tool through the free version of TopBraid has been suggested, and open-source versions of both the tool and TopBraid are en route. The source (the company that pointed this tool out to us) has offered a non-zero (but very limited) hand in bringing this to functionality. If someone wants to lend a hand on this, please get involved.
Separate Inputs: Taxonomies of energy terms, water terms and climate terms need to be input and built out into a single ontology. Glossaries containing these terms exist. What does not exist is a mechanism for describing the relationships between the terms. Starting both from the top (i.e., the Enterprise Architecture Glossary in cim3) and the bottom (i.e., a data dictionary), and working the middle ground (i.e., the various versions of glossaries available in the Department of Energy, Dept. of Interior, etc.), we need to establish the relationships that will drive the answers to the questions posed to Energy Ant.
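To make the idea of "relationships between the terms" concrete, here is a small Python sketch that stores relationships as subject-predicate-object triples and follows them from one term to the others it connects to. This is only one possible representation, not a chosen design. The terms, the relation names (is_a, depends_on, affected_by) and the function names are placeholders invented for illustration; real entries would come from the glossaries and be reviewed by SMEs.

```python
from collections import defaultdict

# Placeholder triples for illustration only (not taken from any glossary).
triples = [
    ("hydroelectric power", "is_a", "energy source"),
    ("hydroelectric power", "depends_on", "river flow"),
    ("river flow", "affected_by", "precipitation"),
    ("precipitation", "is_a", "climate variable"),
]


def build_index(triples):
    """Group the triples by subject so each term lists its outgoing relationships."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def related(index, term, seen=None):
    """Return every term reachable from `term` by following relationships."""
    seen = set() if seen is None else seen
    for _, obj in index.get(term, []):
        if obj not in seen:
            seen.add(obj)
            related(index, obj, seen)
    return seen


index = build_index(triples)
reachable = related(index, "hydroelectric power")
print(sorted(reachable))
```

In this toy data, an energy term reaches a climate term only through a water term, which is the kind of cross-glossary link the single ontology would need to capture.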
Two paths are being pursued in building the ontology: the manual path and an automated path.
Note: If at any point someone wants to contribute to this effort in a way not described here, please feel free to interrupt the structure being described and insert changes, or insert new or more detailed directions, etc.
For example, someone familiar with Common Logic might offer to reconstruct each question into 1st Order Logic syntax. I have no experience in coding in 1st Order Logic syntax, can read it only with difficulty, and would prefer a tool that automates the conversion from natural language to another syntax, such as CL.
Personally, I’d prefer 3rd Order Logic, with Operators (linguistic and mathematical, reflexive and not, with disabled copula-bound structures, with something-like-RDF-triples’ keys) attached functionally to the highest/deepest nth-Order Logic level - but is there any open-source tool for this? If someone has experience with this kind of tool, please let me know and help us get this part functioning.
Back to the manual path:
The notion is to create a strand of knowledge that is responsive to questions all along the strand. Each strand needs to be built manually. A file of bread-crumbs from energy-oriented, water-oriented and climate-oriented web encyclopedias would make fine starting points, too.
Automated Path
The notion is to create a strand of knowledge that is responsive to questions all along the strand. Each strand needs to be built with links between the various developmental levels for each subject matter. If there is an ontology-building tool that supports this build, please let me know and help us get this part functioning.
Thanks and have a good day,
Jim Disbrow
Jim.disbrow@xxxxxxxxxxx
202-586-1868