
RE: [ontolog-forum] Ontology Word Processor

To: "[ontolog-forum] " <ontolog-forum@xxxxxxxxxxxxxxxx>, "DigitalArtOntology" <dao-forum@xxxxxxxxxxxx>
From: "Danny Ayers" <danny666@xxxxxxxxxxx>
Date: Sat, 7 Feb 2004 11:38:13 +0100
Message-id: <BKELLDAGKABIOCHDFDBPIEDLFHAA.danny666@xxxxxxxxxxx>

I'm looking for a lossless compression text format that finds repeated words or patterns in a text and stores them in a dictionary. In the body of the text the words/patterns are 'transcluded,' to use a new word, by reference to the dictionary. In other words, if you have the word 'ontology' repeated in a text (or web/wiki page/site) 1000 times, you only write it out once in the dictionary. The text body holds just the ID# of the word 'ontology.' When you read the text, the word is there (transcluded), not the ID#, of course. If you change the word in the dictionary, every occurrence of that word in the text changes with it.
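To make the idea above concrete, here is a minimal sketch in Python. It is not an existing format: the function names `encode` and `decode` are made up for illustration, and it assumes the simplest possible scheme, where the text is split into runs of word characters and runs of everything else, each distinct run is stored once, and the body is just a list of dictionary IDs.

```python
import re


def encode(text):
    """Return (dictionary, body): each distinct token is stored once,
    and the body refers to tokens by their dictionary ID.

    Assumption: tokens are runs of word characters or runs of non-word
    characters, so joining them back together reproduces the text exactly.
    """
    tokens = re.findall(r"\w+|\W+", text)
    dictionary = []  # ID -> token
    ids = {}         # token -> ID
    body = []        # the text, expressed as a sequence of IDs
    for tok in tokens:
        if tok not in ids:
            ids[tok] = len(dictionary)
            dictionary.append(tok)
        body.append(ids[tok])
    return dictionary, body


def decode(dictionary, body):
    """Rebuild the readable text by transcluding each ID from the dictionary."""
    return "".join(dictionary[i] for i in body)
```

Decoding the output of `encode` gives back the original text, and replacing one entry in `dictionary` before decoding changes every place that entry is referenced. Whether this actually saves space depends on how the IDs are stored, which the sketch does not address.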

I've got something similar on my to-do list, but I was only thinking about pulling out proper nouns (as instances). The background is here:
and elsewhere - links here
I personally think there's loads of machine learning/statistical stuff that could be of benefit in the less scruffy ontology world. ('SemText' is a little side project I've got lined up.)

Sorry for the late night thoughts.
Keep 'em coming!