From: Kenneth Fields <ken@xxxxxxxxxxx>
Date: Fri, 6 Feb 2004 22:45:15 +0800
I'm looking for a lossless text compression format that finds repeated words or patterns in a text and stores them in a dictionary. In the body of the text, the words/patterns are 'transcluded,' to use a new word, by reference to the dictionary. In other words, if the word 'ontology' is repeated 1000 times in a text (or web/wiki page/site), you write it out only once, in the dictionary. The text body holds just the ID# of the word 'ontology.' When you read the text, the word is there (transcluded), not the ID#, of course. If you change the word in the dictionary, every occurrence of that word in the text changes.
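A minimal sketch of the idea in Python, assuming a simple tokenizer that splits the text into runs of word characters and runs of everything else. The names `encode`, `decode`, and `TOKEN` are made up for illustration and do not refer to any existing format or tool.

```python
import re

# Tokens are runs of word characters or runs of anything else, so joining
# the decoded tokens reproduces the input exactly (lossless).
TOKEN = re.compile(r"\w+|\W+")

def encode(text):
    """Return (dictionary, body); each distinct token is stored only once."""
    dictionary = []   # ID -> token
    ids = {}          # token -> ID
    body = []         # the text as a sequence of IDs
    for token in TOKEN.findall(text):
        if token not in ids:
            ids[token] = len(dictionary)
            dictionary.append(token)
        body.append(ids[token])
    return dictionary, body

def decode(dictionary, body):
    """Transclude each ID in the body back into its token."""
    return "".join(dictionary[i] for i in body)

text = "The ontology maps the ontology terms."
dictionary, body = encode(text)
restored = decode(dictionary, body)

# Changing one dictionary entry changes every occurrence in the text.
edited = list(dictionary)
edited[edited.index("ontology")] = "vocabulary"
changed = decode(edited, body)
```

Here `restored` equals the original text, and `changed` reads "The vocabulary maps the vocabulary terms." even though only one dictionary entry was touched. Whether this actually saves space depends on how compactly the IDs are stored, which the sketch does not address.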
This would be like GIF compression.
Compare the .BMP (bitmap) file format. The reason .BMP files are so big (in file size) is that they literally just store all the information about every pixel. So, pixel by pixel, the .BMP tells the computer what the color and brightness should be.
The way GIF works is that it looks for repeating color patterns in your image. It creates its own shorthand for each type of pattern. Rather than storing all those repeating patterns, it stores the shorthand version and the table of shorthand translations.
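The pattern-shorthand scheme GIF uses is LZW compression. The following is a simplified sketch of LZW over bytes, not the GIF codec itself: it leaves out GIF's variable-width codes, clear codes, and table size limit. One detail worth noting is that in LZW the decoder rebuilds the shorthand table as it reads the codes, so the table does not have to be written to the file. The function names are illustrative.

```python
def lzw_compress(data):
    """Replace repeated byte sequences with integer codes (simplified LZW)."""
    table = {bytes([i]): i for i in range(256)}   # start with single bytes
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate                   # keep extending the match
        else:
            codes.append(table[current])          # emit shorthand for the match
            table[candidate] = len(table)         # learn the new pattern
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the table while decoding; no table is stored with the data."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the pattern that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)

# A made-up row of "pixels" with a repeating color pattern.
pixels = b"\x01\x01\x01\x01\x02\x02\x02\x02" * 50
codes = lzw_compress(pixels)
```

For repetitive input like `pixels`, `codes` is much shorter than the input, and `lzw_decompress(codes)` returns the original bytes.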
Has anyone heard of something like this?
After the text analysis, the extracted words could be compared to an ontology. If new words were found, they could be added to the ontology (hierarchical dictionary). I'm thinking of an app that has the dictionary management tool built in and can export to an ontology format. Is this a future plugin for Protege? Cut and paste text into the plugin window, have the text analyzed, and save the file with its ontology dictionary and encoded body. To read the file, you have to have a decoder. In fact, you don't need to save the dictionary with the document if you have the new enhanced WordNet on your machine :). The doc would pick up the words from WordNet. Hey, an automatic translation tool also - a Chinese WordNet.
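The comparison step could look roughly like this. It is only a sketch: the `known` set is a hypothetical stand-in for an ontology or WordNet lookup, it is flat rather than hierarchical, and `new_terms` is a made-up helper name.

```python
import re

def new_terms(text, known_terms):
    """Return the words in the text that are not yet in the known vocabulary."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
    return sorted(words - known_terms)

# Hypothetical vocabulary standing in for the ontology / WordNet.
known = {"ontology", "dictionary", "is", "a"}
sample = "An ontology is a hierarchical dictionary."

found = new_terms(sample, known)   # words to review and add
known.update(found)                # after this, the sample has no new words
```

Deciding where each new word belongs in the hierarchy is the part that would need the dictionary management tool.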
Sorry for the late night thoughts.