
Re: [ontolog-forum] Average Daily Word Exposure

To: "[ontolog-forum]" <ontolog-forum@xxxxxxxxxxxxxxxx>
From: John Bottoms <john@xxxxxxxxxxxxxxxxxx>
Date: Thu, 12 Aug 2010 11:38:55 -0400
Message-id: <4C64158F.8070907@xxxxxxxxxxxxxxxxxx>
Ali,

That is an old and classic question. Charles Kay Ogden addressed
it when he was exploring the concepts underpinning "Basic English"
in 1925. He arrived at a multi-tiered solution with a core
vocabulary of 850 words and zero or more technical vocabularies
on top of that.

You probably knew that, but it does point out a number of
issues. Ogden was working with rural settlements at an earlier
time. Common English was not nearly as rich as it is today,
and cultures were not as highly mixed.

Short of replicating his work, your best approach might be to
select a corpus of representative works, remove an appropriate
number of noise words, and see what comes of it. We use 100- and
500-word noise lists, and you can readily find lists on the web.
We have found that 6000-8000 words is fairly typical, but your
mileage may vary. I'm sure the context, which calls for a number
of technical vocabularies, will be the determining factor.
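
As a rough illustration of the corpus approach, here is a minimal
Python sketch. It assumes a plain-text corpus and a noise-word list;
the ten-word NOISE_WORDS set, the vocabulary() helper, and the sample
sentence are all hypothetical stand-ins, not the lists or tooling we
actually use.

```python
import re
from collections import Counter

# Hypothetical, tiny noise-word list for illustration only; a real run
# would load a published 100- or 500-word list instead.
NOISE_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that"}


def vocabulary(text, noise_words=NOISE_WORDS):
    """Count each non-noise word in text, ignoring case."""
    # Words are runs of letters, optionally with one internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(t for t in tokens if t not in noise_words)


# Hypothetical sample standing in for a corpus of representative works.
sample = "The cat sat in the hat, and the hat sat on the mat."
counts = vocabulary(sample)

print(len(counts))            # size of the remaining vocabulary
print(counts.most_common(3))  # most frequent non-noise words
```

The size of the resulting vocabulary depends heavily on the corpus
and on which noise list is subtracted, so any figure it produces is
only as representative as those two choices.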

David Eddy maintains a list of "common" usage terms for computing,
such as ZIP, SS#, and employeeID, and it contains a
brazillion entries.

-John Bottoms
  FirstStar
  Concord, MA USA
  T: 978-505-9878

On 8/12/2010 10:43 AM, Ali Hashemi wrote:
> Hi All,
>
> I've been digging around for the past few days and have hit a dead end.
>
> I'm looking for the average number of words that an average western
> adult is exposed to daily.
>
> Any combination of words:
> spoken
> heard
> read
> seen
> would help.
>
> The main "sources" I've found are via:
> http://www.boston.com/news/globe/ideas/articles/2006/09/24/sex_on_the_brain/
> which focus on words spoken. And as that article notes (and I've
> confirmed for the studies I've been able to track down), almost none of
> those "sources" actually cite or indicate how the word count was derived.
>
> Does anyone have any leads on something a bit more substantive? And on
> anything that goes beyond spoken to also heard + read + seen?
>
> Thanks,
> Ali
>
> --
> www.reseed.ca <http://www.reseed.ca>
> www.pinkarmy.org <http://www.pinkarmy.org>
>
> (`'.(`'.().').') .,.,
>
>
>
>
> _________________________________________________________________
> Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/
> Config Subscr: http://ontolog.cim3.net/mailman/listinfo/ontolog-forum/
> Unsubscribe: mailto:ontolog-forum-leave@xxxxxxxxxxxxxxxx
> Shared Files: http://ontolog.cim3.net/file/
> Community Wiki: http://ontolog.cim3.net/wiki/
> To join: http://ontolog.cim3.net/cgi-bin/wiki.pl?WikiHomePage#nid1J
> To Post: mailto:ontolog-forum@xxxxxxxxxxxxxxxx
>

