
Re: [ontolog-forum] Ontology Word Processor

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>
From: "DonEMitchell" <DonEMitchell@xxxxxxx>
Date: Sat, 7 Feb 2004 18:05:52 -0800
Message-id: <001e01c3ede8$14e826a0$6401a8c0@CCC>
Hi Ken,
What level of help do you want?
Private solutions, off-the-shelf category engines, or code tricks to implement in a process model?
There is a discussion going on in KMTech between Dr. Clough (Peircean semiotics professor, et al.) and myself regarding a Word Grokker.
Perhaps our process models (in our heads) share overlapping features.
----- Original Message -----
Sent: Friday, February 06, 2004 6:45 AM
Subject: [ontolog-forum] Ontology Word Processor

I'm looking for a lossless text compression format that finds repeated words or patterns in a text and stores them in a dictionary. In the body of the text, the words/patterns are 'transcluded,' to use a new word, by reference to the dictionary. In other words, if the word 'ontology' is repeated in a text (or web/wiki page/site) 1000 times, you only write it out once, in the dictionary. The text body holds just the ID# of the word 'ontology.' When you read the text, the word is there (transcluded), not the ID#, of course. If you change the word in the dictionary, every occurrence of that word in the text changes.
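A minimal Python sketch of the dictionary idea described above, not an existing format: the function names `encode` and `decode` and the word/non-word tokenization are assumptions made for illustration. Each distinct token is stored once in a dictionary, and the body holds only integer IDs.

```python
import re

def encode(text):
    """Return (dictionary, body): each distinct token stored once, body as IDs."""
    # Tokens are runs of word characters or runs of anything else, so
    # joining the tokens back together reproduces the text exactly (lossless).
    tokens = re.findall(r"\w+|\W+", text)
    dictionary = []  # ID -> token
    ids = {}         # token -> ID
    body = []
    for tok in tokens:
        if tok not in ids:
            ids[tok] = len(dictionary)
            dictionary.append(tok)
        body.append(ids[tok])
    return dictionary, body

def decode(dictionary, body):
    """Rebuild the text by looking each ID up in the dictionary (transclusion)."""
    return "".join(dictionary[i] for i in body)

text = "An ontology is an ontology."
dictionary, body = encode(text)
assert decode(dictionary, body) == text

# Changing the dictionary entry changes every occurrence in the decoded text.
dictionary[dictionary.index("ontology")] = "taxonomy"
print(decode(dictionary, body))
```

Whether this actually saves space depends on how the IDs are stored; for short words an ID can be as large as the word it replaces.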

This would be like GIF compression.

Compare the .BMP file format, which is a bitmap. The reason .BMP files are so big (in file size) is that they literally just store all the information about every pixel. Pixel by pixel, the .BMP tells the computer what the color and brightness should be.

The way GIF works is that it looks for repeating color patterns in your image. It creates its own shorthand for each type of pattern. Rather than storing all those repeating patterns, it stores the shorthand version and the table of shorthand translations.
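GIF's compression is LZW, and the shorthand idea can be sketched on plain text. The following is a simplified illustration, not the GIF bitstream itself (real GIF adds variable code widths and clear codes); the function names are assumptions, and the input is assumed to contain only characters with code points below 256. One detail worth noting: in LZW the shorthand table is not stored with the data, because the decoder rebuilds it as it reads the codes.

```python
def lzw_compress(data):
    """Return a list of integer codes for a string of Latin-1 characters."""
    table = {chr(i): i for i in range(256)}  # start with single characters
    current = ""
    out = []
    for ch in data:
        candidate = current + ch
        if candidate in table:
            current = candidate            # keep extending a known pattern
        else:
            out.append(table[current])     # emit shorthand for known part
            table[candidate] = len(table)  # learn the new, longer pattern
            current = ch
    if current:
        out.append(table[current])
    return out

def lzw_decompress(codes):
    """Rebuild the string, reconstructing the table while decoding."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the pattern that is being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[0]
        previous = entry
    return "".join(out)

codes = lzw_compress("ABABABA")
assert lzw_decompress(codes) == "ABABABA"
```

Unlike the word dictionary, the patterns here are arbitrary character runs, so they do not correspond to meaningful words that could be looked up in an ontology.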

Has anyone heard of something like this?

After the text analysis, the extracted words could be compared to an ontology. If new words were found, they could be added to the ontology (hierarchical dictionary). I'm thinking of an app that has the dictionary management tool built in and exports to an ontology format. Is this a future plugin for Protege? Cut and paste text into the plugin window, have the text analyzed, and save the file with its ontology dictionary and encoded body. To read the file, you have to have a decoder. In fact, you don't need to save the dictionary with the document if you have the new enhanced WordNet on your machine :). The doc would pick up the words from WordNet. Hey, an automatic translator tool also - Chinese WordNet.
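The comparison step could be sketched as a set difference. This is only an illustration under stated assumptions: the ontology is reduced to a flat set of known terms (no hierarchy), `new_terms` is a hypothetical helper name, and words are matched case-insensitively on ASCII letters only.

```python
import re

def new_terms(text, known_terms):
    """Return the sorted words in text that are absent from known_terms."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
    known = {t.lower() for t in known_terms}
    return sorted(words - known)

known = {"ontology", "an"}
print(new_terms("Ontology tools store an ontology", known))
```

The words returned would be the candidates for adding to the ontology; deciding where each one belongs in the hierarchy is the part this sketch leaves out.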

Sorry for the late night thoughts.