
[ontolog-forum] Re: [Fwd: [xml-dev] Statistical vs "semantic web" approaches to making sense of the Net]

To: ontolog-forum@xxxxxxxxxxxxxxxx, mp-sofi-dev@xxxxxxxxxxx
From: "Peter P. Yim" <yimpp@xxxxxxxxxxx>
Date: Thu, 24 Apr 2003 12:29:28 -0700
Message-id: <3EA83B18.2010603@xxxxxxxxxxx>
Thank you, Monica, for sharing this.    (01)

Thought-provoking ... and definitely worth the bits and bytes this piece 
is consuming -- although I would take the position that it is not even 
an "either-or" matter in this case.    (02)

I'm passing this on ...    (03)

--    (04)

Monica J. Martin wrote Wed, 23 Apr 2003 22:03:43 -0600:    (05)

> -------- Original Message --------
> Subject: [xml-dev] Statistical vs "semantic web" approaches to making 
> sense of the Net
> Date: Wed, 23 Apr 2003 21:09:48 -0400
> From: Mike Champion <mc@xxxxxxxxxxx>
> To: "xml-dev@xxxxxxxxxxxxx" <xml-dev@xxxxxxxxxxxxx>    (06)

> There was an interesting conjunction of articles on the ACM "technews" 
> page [http://www.acm.org/technews/current/homepage.html] -- one on "AI" 
> approaches to spam filtering  
> http://www.nwfusion.com/news/tech/2003/0414techupdate.html and the other 
> on the Semantic Web 
> http://www.computerworld.com/news/2003/story/0,11280,80479,00.html.    (07)

> What struck me is that the "AI" approach (I'll guess it makes heavy use 
> of pattern matching and statistical techniques such as Bayesian 
> inference) is working with raw text that the authors are deliberately 
> trying to obfuscate the meaning of to get past "keyword" spam filters, 
> and the Semantic Web approach seems to require explicit, honest markup.  
> Given the "metacrap" argument about semantic metadata 
> (http://www.well.com/~doctorow/metacrap.htm) I suspect that in general 
> the only way we're going to see a "Semantic Web" is for 
> statistical/pattern matching software to create the semantic markup and 
> metadata.  That is, if such tools can make useful inferences today about 
> spam that pretends to be something else, they should be very useful in 
> making inferences tomorrow about text written by people who try to say 
> what they mean.    (08)
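[The Bayesian spam scoring Mike guesses at above can be sketched roughly as
follows. This is a minimal naive Bayes classifier in Python; the toy corpus,
tokenization, and add-one smoothing are all illustrative assumptions, not a
description of any particular filter's implementation. -- ed.]

```python
# Minimal naive Bayes sketch of statistical spam scoring.
# The corpus, tokenizer, and smoothing below are illustrative assumptions.
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label) pairs; returns per-label word counts
    and per-label document counts."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in docs:
        for word in text.lower().split():
            counts[label][word] += 1
        totals[label] += 1
    return counts, totals

def score(text, counts, totals):
    """Pick the label with the higher log-posterior, using add-one
    (Laplace) smoothing so unseen words don't zero out a class."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    result = {}
    for label in counts:
        logp = math.log(totals[label] / sum(totals.values()))  # prior
        denom = sum(counts[label].values()) + len(vocab)
        for word in text.lower().split():
            logp += math.log((counts[label][word] + 1) / denom)
        result[label] = logp
    return max(result, key=result.get)

# Tiny illustrative training set.
docs = [("free money now", "spam"), ("cheap free offer", "spam"),
        ("meeting agenda attached", "ham"), ("project status report", "ham")]
counts, totals = train(docs)
print(score("free offer now", counts, totals))  # prints "spam"
```

[The same machinery, pointed at ordinary text instead of deliberately
obfuscated spam, is what the argument suggests could emit semantic labels
rather than a spam/ham verdict. -- ed.]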

> This raises a question, for me anyway:  If it will take a "better Google 
> than Google" (or perhaps an "Autonomy meets RDF") that uses Bayesian or 
> similar statistical techniques to create the markup that the Semantic 
> Web will exploit, what's the point of the semantic markup?  Why won't 
> people just use the "intelligent" software directly?  Wearing my "XML 
> database guy" hat, I hope that the answer is that it will be much more 
> efficient and programmer-friendly to query databases generated by the 
> 'bots containing markup and metadata to find the information one needs.  
> But I must admit that 5-6 years ago I thought the world would need 
> standardized, widely deployed XML markup before we could get the quality 
> of searches that Google allows today using only raw HTML and the 
> PageRank heuristic algorithm.    (09)
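[Mike's "XML database guy" hope -- that bot-generated markup would be
efficient and programmer-friendly to query -- can be illustrated with a toy
subject-predicate-object store. Every name and triple below is an invented
example, not any real Semantic Web store or API. -- ed.]

```python
# Toy sketch of querying bot-extracted metadata held as RDF-style
# (subject, predicate, object) triples. All identifiers are illustrative.
triples = [
    ("msg:123", "dc:creator", "Mike Champion"),
    ("msg:123", "dc:subject", "Semantic Web"),
    ("msg:456", "dc:creator", "Peter Yim"),
    ("msg:456", "dc:subject", "ontologies"),
]

def query(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard,
    as in a single SPARQL-like triple pattern."""
    return [(ts, tp, to) for ts, tp, to in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Who wrote message 123?
print(query(triples, s="msg:123", p="dc:creator"))
# prints [('msg:123', 'dc:creator', 'Mike Champion')]
```

[The point of the sketch: once the metadata exists, answering "who wrote X"
is a cheap pattern match rather than a full-text heuristic search -- which is
one possible answer to "what's the point of the semantic markup?" -- ed.]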

> So, anyone care to pick holes in my assumptions, or reasoning?  If one 
> does accept the hypothesis that it will take smart software to produce 
> the markup that the Semantic Web will exploit, what *is* the case for 
> believing that it will be ontology-based logical inference engines 
> rather than statistically-based heuristic search engines that people 
> will be using in 5-10 years?  Or is this a false dichotomy?  Or is the 
> "metacrap" argument wrong, and people really can be persuaded to create 
> honest, accurate, self-aware, etc. metadata and semantic markup?    (010)

> [please note that my employer, and many colleagues at W3C, may have a 
> very different take on this and please don't blame anyone but me for 
> this blather!]    (011)

Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
Shared Files: http://ontolog.cim3.net/file/
Community Wiki: http://ontolog.cim3.net/wiki/ 
To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx    (012)
