Re: [ontolog-forum] History of Machine Translation?

To: ontolog-forum@xxxxxxxxxxxxxxxx
From: "John F. Sowa" <sowa@xxxxxxxxxxx>
Date: Sun, 20 Feb 2011 14:51:01 -0500
Message-id: <4D6170A5.5010803@xxxxxxxxxxx>
David, Ron, and Len,    (01)

The amount of history is enormous.  I suggest that we focus the
discussion on implications for the future of NLP -- including
the implications of IBM Watson.    (02)

I'd like to mention a note by Len Yabloko from Thursday (2/17),
which I had intended to comment on, but got sidetracked:    (03)

> What bothers me [about Watson] is that while opening some great
> possibilities to commercialize the technology (which is necessary
> step to make a real progress), it may close some other maybe
> greater directions of progress.    (04)

That is a good question.  There have been many fads and trends
in every branch of science that have opened up new topics, but
with the unfortunate side effect of diverting attention from
other directions that could be more promising.    (05)

> Unfortunately, working last year at IBM Research as a consultant
> I came to conclusion that IBM made a strategic decision to
> suspend all their research in Semantic Web, especially the work
> on ontology-based approaches. I am not claiming this as a fact,
> just merely as my own observation. This happened roughly in the
> same time frame as Watson project gained momentum.    (06)

I hadn't noticed that point, but it's possible. In any case, one
of my major complaints about the Semantic Web is that its developers
ignored many other developments that had been very promising.  I
discussed that in slides 21 to 23 of the following talk:    (07)

    Integrating Semantic Systems    (08)

While we look at the history of MT and Watson, we should also
consider the bigger picture of where all versions of semantic
systems are, may be, or should be heading.    (09)

> As I dimly remember, there was an IBM demonstration in the Madison
> Ave. office in maybe 1954 which "translated" Russian into English.
> There was evidently a LOT of HUZZAHs!!! about how with just a little
> more effort/money/technology and the machine translation problem
> would be solved.    (010)

1954 was when IBM began a partnership with the GAT project
(Georgetown Automatic Translator), but I didn't think that they
had anything to demonstrate that early.  IBM did have a large
demo at the 1964 World's Fair of their Russian-English translator.    (011)

For a quick survey of the early MT work see the following slides
by Jaime Carbonell:    (012)

http://www.cs.cmu.edu/afs/cs/project/cmt-55/lti/Courses/731/www/HistoryOfMachineTranslation-2007.pdf    (013)

> I use Google Translate a lot to construct French translations from English.
> It is not perfect but it is surprisingly good.    (014)

Google uses statistical methods to "learn" the translation patterns
from one language to another.  But their accuracy depends heavily on
the amount of parallel text that they can find for any pair of
languages.    (015)
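
To make the idea of "learning" from parallel text concrete, here is
a minimal sketch in Python.  It is not Google's method, which is not
described here.  It is a toy, word-level EM estimator in the style of
the early IBM alignment models, and the function name train_model1,
the three-sentence corpus, and the iteration count are all
assumptions made for illustration.

```python
from collections import defaultdict

def train_model1(pairs, iterations=10):
    """Estimate word-translation probabilities t[f][e] with EM.

    pairs: list of (source_words, target_words) sentence pairs.
    Toy sketch only: no NULL word, no smoothing, no alignment model.
    """
    src_vocab = {f for fs, _ in pairs for f in fs}
    tgt_vocab = {e for _, es in pairs for e in es}
    # Start with no knowledge: every target word is equally likely.
    t = {f: {e: 1.0 / len(tgt_vocab) for e in tgt_vocab} for f in src_vocab}

    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        # E-step: share each target word among the source words
        # of the same sentence, in proportion to the current t.
        for fs, es in pairs:
            for e in es:
                norm = sum(t[f][e] for f in fs)
                for f in fs:
                    c = t[f][e] / norm
                    count[f][e] += c
                    total[f] += c
        # M-step: renormalize the expected counts.
        for f in src_vocab:
            for e in tgt_vocab:
                t[f][e] = count[f][e] / total[f]
    return t

# Hypothetical three-sentence parallel corpus (French, English).
corpus = [
    ("la maison".split(), "the house".split()),
    ("la fleur".split(), "the flower".split()),
    ("maison bleue".split(), "blue house".split()),
]
t = train_model1(corpus)
best = {f: max(t[f], key=t[f].get) for f in t}
print(best)
```

With only three sentence pairs the estimates are crude, which is the
point: the quality of the learned table depends on how much parallel
text is available for the language pair.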

The French-English pair happens to be especially good because the
Canadian Parliament requires all proceedings to be available in
both languages.  Fred Jelinek's group at IBM Research in the 1980s
used that material to develop the first statistical translator
between French and English.    (016)

Google has continued along that path, and they have been gathering
as many statistics as they can for every language pair.  But French
and English still have a head start with a large volume of data.    (017)

Research on the GAT project was terminated in 1963, but some of
the original developers used that work as the basis for Systran.
The original code was written for the IBM 704 and 7094, but they
translated Systran to the IBM 360 in the late 1960s and to the PC
in the 1980s.  Systran is still available on the WWW under the
name of Babelfish.    (018)

Depending on the amount of data available, different translators
have different advantages.  The Systran dictionaries were all
developed by hand, mostly for political topics in the European
Union, but its original dictionaries for Russian-English physics
texts are still available.    (019)

If you're using free translators on the WWW, you might compare
the results of Google and Babelfish for different texts in
different languages.    (020)

There's a lot more to say, but I have to run.    (021)

John    (022)
