Hi Ken,
What level of help do you want?
Private solutions, off-the-shelf category engines, or code tricks to
implement in a process model?
There is a discussion going on in KMTech with Dr. Clough (Peircean
semiotics professor, et al) and myself regarding a Word Grokker.
Perhaps our process models (in our heads) share overlapping features.
Regards,
----- Original Message -----
Sent: Friday, February 06, 2004 6:45 AM
Subject: [ontolog-forum] Ontology Word Processor
I'm looking for a lossless compression text format that finds
repeated words or patterns in a text and stores them in a dictionary. In the
body of the text the words/patterns are 'transcluded,' to use a new word, by
reference to the dictionary. In other words, if you have the word 'ontology'
repeated in a text (or web/wiki page/site) 1000 times - you only write
it out once in the dictionary. The text body holds just the ID# of the word
'ontology.' When you read the text, the word is there (transcluded), not the
ID# of course. If you change the word in the dictionary, every location of
that word in the text is changed.
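
A minimal Python sketch of the scheme described above, under some assumptions of my own: the tokenizer, the list-based dictionary, and the names `encode` and `decode` are illustrative only and do not come from any existing tool. Each distinct token (a word or a run of separators) is stored once, the body holds only ID numbers, and decoding rebuilds the text exactly.

```python
import re

# Assumption: a "word" is a run of word characters; everything between
# words (spaces, punctuation) is also kept as a token so nothing is lost.
TOKEN = re.compile(r"\w+|\W+")

def encode(text):
    """Return (dictionary, body): each distinct token once, plus a list of IDs."""
    dictionary = []   # ID -> token
    index = {}        # token -> ID
    body = []
    for token in TOKEN.findall(text):
        if token not in index:
            index[token] = len(dictionary)
            dictionary.append(token)
        body.append(index[token])
    return dictionary, body

def decode(dictionary, body):
    """Rebuild the text by looking each ID up in the dictionary."""
    return "".join(dictionary[i] for i in body)

text = "An ontology is an ontology, not a taxonomy."
dictionary, body = encode(text)
assert decode(dictionary, body) == text      # lossless round trip

# Changing the dictionary entry changes every occurrence in the text.
dictionary[dictionary.index("ontology")] = "schema"
print(decode(dictionary, body))
```

Whether this actually saves space depends on how the IDs are stored; the sketch only shows the dictionary-plus-references structure and the "change it once, change it everywhere" behavior.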
This would be like GIF compression.
Compare the .BMP file format, which is a bitmap. The reason .BMP files are so
big (in file size) is that they literally just store all the information
about every pixel. So pixel by pixel, the .BMP tells the computer what the
color and brightness should be.
The way GIF works is that it looks for
repeating color patterns in your image. It creates its own shorthand for each
type of pattern. Rather than storing all those repeating patterns, it
stores the shorthand version and the table of shorthand
translations.
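
For reference, the compression GIF uses is LZW. Here is a rough Python sketch of plain LZW on a string, assuming Latin-1 input (character codes below 256); the function names are mine, and real GIF encoders add variable-width codes and clear codes that are left out here. One difference from the description above: in LZW the table of shorthand codes is not written to the file, because the decoder rebuilds the same table as it reads the codes.

```python
def lzw_compress(data):
    """Plain LZW: emit one integer code per longest already-seen sequence."""
    # Assumption: every character has a code point below 256.
    table = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in data:
        candidate = current + ch
        if candidate in table:
            current = candidate              # keep extending the match
        else:
            codes.append(table[current])     # emit the longest known match
            table[candidate] = next_code     # learn the new, longer sequence
            next_code += 1
            current = ch
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the text; the table is reconstructed, never transmitted."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the sequence being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)

text = "ontology ontology ontology ontology"
codes = lzw_compress(text)
assert lzw_decompress(codes) == text
print(len(text), "characters ->", len(codes), "codes")
```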
Has anyone heard of something like this?
After the text analysis, the extracted words could be compared to an ontology.
If new words were found, they could be added to the ontology (hierarchical
dictionary). I'm thinking of an app that has the dictionary management tool
built in and exports to an ontology format. Is this a future plugin for
Protege? Cut and paste text into the plugin window, have the text analyzed,
and save the file with its ontology dictionary and encoded body. To read the
file, you have to have a decoder. In fact you don't need to save the dictionary
with the document if you have the new enhanced WordNet on your machine :). The
doc would pick up the words from WordNet. Hey, it's an automatic translator
tool also: just swap in a Chinese WordNet.
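
The comparison step could look roughly like this in Python. Everything here is an assumption for illustration: the ontology is modeled as a plain dict mapping each term to its parent term, new words are filed under a hypothetical "unclassified" parent, and the names `extract_words` and `add_new_terms` are made up. It does not use the Protege or WordNet APIs.

```python
import re

def extract_words(text):
    """Lowercased set of alphabetic words found in the text."""
    return {w.lower() for w in re.findall(r"[A-Za-z]+", text)}

def add_new_terms(ontology, text, parent="unclassified"):
    """Add words missing from the ontology under `parent`; return them sorted."""
    new_words = sorted(extract_words(text) - set(ontology))
    for word in new_words:
        ontology[word] = parent   # placeholder parent until someone files it
    return new_words

# Hypothetical toy ontology: term -> parent term (None marks the root).
ontology = {"thing": None, "ontology": "thing"}
added = add_new_terms(ontology, "Ontology and taxonomy")
print(added)
```

A real tool would still need a person (or a much smarter classifier) to move the new terms from the placeholder parent to their proper place in the hierarchy.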
Sorry for the late night thoughts.
Ken.