Hi Denise, (01)
Apparently not all of your email messages are being received by my email
service. I will probe for the problem at my end. (02)
I am receiving other emails OK. (03)
The next steps sound excellent to me. (04)
Peter, do you see any opportunities to use CIM3 collaborative services
(specifically VNC)? (05)
Cheers, (06)
Bob
---------------------------------- (07)
>Next steps:
>
> Taxo-thesaurus team advises (a) which items they would like to include
> in the concept extraction run -- bottom-up extraction of concepts to check
> coverage of topics, etc.; and (b) for which items they would like to have
> metadata generated. (08)
> Edited copies of the inventory would be great, or simple instructions would
> work, too.
> Denise will then capture the content on a drive that can be accessed by the
> Teragram tools, and (a) run the concepts and identify clusters; (b) do the
> proper noun (people and institutions) extractions; (c) run country
> profiles.
> Taxo-thesaurus team will review the outputs and update the ontolog
> domains/topics.
> Denise will then build an ontolog domain/topic profile and process the items
> identified by the team. (09)
> Parallel steps: (010)
> Denise will start a second inventory on a wiki feature to generate another
> inventory. (011)
> Also, I caution that at this time we will need to limit the
> processing to English: although I have other language tools
> installed, I will not have the time to build the domain/topic profile in
> languages other than English.
>
> How does this sound to everyone? (012)
-----Original Message-----
From: ontologizing-bounces@xxxxxxxxxxxxxxxx
[mailto:ontologizing-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Lisa
Sent: Monday, June 25, 2007 10:13 PM
To: ontologizing@xxxxxxxxxxxxxxxx
Subject: Re: [ontologizing] Draft Taxo-Thesaurus Facets (013)
Hi, (014)
Reposting to "ontologizing" for visibility into process. (015)
Lisa (016)
>
> I did a quick scan of the URLs to see what could be useful (of the HTML
> pages) for concept-extraction consideration and what could be left out.
>
> Some domains appear to be spam:
>
> blogs.ihola.net
> sapo.pt
> {carinsurance.home.sapo.pt
> hydrocone.home.sapo.pt
> ...}
> directory.planetdns.net
> ephedra.*.*
> {ephedra.eu.gg
> ephedra.guest.de
> ephedra.guests.de
> ephedra.jixx.de
> ephedra.us.gg
> ephedra.jix.net
> ephedra.web.gg }
>
> Perhaps you can add some stopwords like "foreclosure", "carinsurance", "rx"
> and various pharmaceutical names to exclude the spam URLs?
>
> The rest of the URLs appear to be a mix of organization pages (public and
> private), individual pages, standard committees, and information sources
> (Wikipedia) which could be useful.
>
> For the ontolog-specific URLs, these appear to be "useful":
>
> . event planning documents
>   (http://ontolog.cim3.net/file/work/Ontolog-planning/Ontolog_Event_Plan_2006_20060309d.doc)
> . media supporting the website or presentations (gif, mp3 and ppt files)
> . pages generated by some queries which point to "interesting" content
>   (have "wiki.pl" tag; includes things like individual WikiWord pages,
>   projects, Conference Calls)
> . presentation/working files (often marked with metadata/name of file)
>
> and these appear "less useful":
> . pages generated by some queries which point to "non-interesting" content,
>   like DIFF pages (have "wiki.pl" tag but also have "diff" in the URL),
>   edit-mode pages (have "action=edit" in the URL) and login pages (have
>   "action=login" in the URL)
> . time and date files (have "timeanddate.com" in the URL)
> . individual e-mail ("mailto:someone@...") pages
>
> I'm sure I missed some things. This is a first manual pass.
>
> Please forward to <ontologizing> if you think this should be. Thanks!
>
> :) Lisa
> (017)
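
The manual first pass above could be roughed out in code along these lines. This is only a sketch, not what was actually run: the function names (is_spam, classify) and the three labels are made up for illustration, the stopword set holds just the three terms suggested in the message, and "sapo.pt" is treated as covering all of its subdomains. Stopwords are matched as whole tokens so that "rx" does not catch unrelated URLs.

```python
import re
from urllib.parse import urlparse

# Domains flagged as spam in the message; subdomains are treated as spam too (assumption).
SPAM_DOMAINS = ("blogs.ihola.net", "sapo.pt", "directory.planetdns.net")
# Covers the "ephedra.*.*" pattern from the message.
SPAM_HOST_PREFIXES = ("ephedra.",)
# Only the three stopwords named in the message; pharmaceutical names would be added here.
SPAM_STOPWORDS = {"foreclosure", "carinsurance", "rx"}
# Substrings that mark the "less useful" pages listed in the message.
LESS_USEFUL_MARKERS = ("action=edit", "action=login", "timeanddate.com")


def is_spam(url):
    """Return True if the URL matches a flagged domain, prefix, or stopword token."""
    lowered = url.lower()
    host = urlparse(lowered).hostname or ""
    if any(host == d or host.endswith("." + d) for d in SPAM_DOMAINS):
        return True
    if host.startswith(SPAM_HOST_PREFIXES):
        return True
    # Split on non-alphanumerics so stopwords match whole tokens only.
    tokens = set(re.split(r"[^a-z0-9]+", lowered))
    return bool(tokens & SPAM_STOPWORDS)


def classify(url):
    """Label a URL as 'spam', 'less useful', or 'candidate' (hypothetical labels)."""
    lowered = url.lower()
    if lowered.startswith("mailto:"):
        return "less useful"
    if is_spam(lowered):
        return "spam"
    if any(marker in lowered for marker in LESS_USEFUL_MARKERS):
        return "less useful"
    # DIFF pages carry the "wiki.pl" tag plus "diff" somewhere in the URL.
    if "wiki.pl" in lowered and "diff" in lowered:
        return "less useful"
    return "candidate"
```

Anything labelled "candidate" would still need a human look; the rules only encode the patterns noted in this first pass.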
_________________________________________________________________
Msg Archives: http://ontolog.cim3.net/forum/ontologizing/
Subscribe/Unsubscribe/Config:
http://ontolog.cim3.net/mailman/listinfo/ontologizing/
Community Portal: http://ontolog.cim3.net/
Community Files: http://ontolog.cim3.net/file/work/OntologizingOntolog/
Community Wiki: http://ontolog.cim3.net/cgi-bin/wiki.pl?OntologizingOntolog (018)