On 6/7/2015 11:09 PM, John F Sowa wrote:
> The ability to *discover* features automatically is critical.
I received an offline note that raised some questions about the
tradeoffs between hand-coded features (AKA feature engineering)
vs the "deep learning" systems that learn or discover higher-level
features as combinations of lower-level features.
Traditional machine-learning methods (starting with Art Samuel's
checker-playing system around 1959) are based on hand-coded features
that are tailored for the task. For an article about feature
engineering, see
http://machinelearningmastery.com/discover-feature-engineering-how-to-engineer-features-and-how-to-get-good-at-it/
Some excerpts from that article:
> The results you achieve are a factor of the model you choose, the data
> you have available and the features you prepared. Even your framing of
> the problem and objective measures you’re using to estimate accuracy
> play a part. Your results are dependent on many inter-dependent
> properties...
>
> Modern deep learning methods are achieving some success in this area,
> such as autoencoders and restricted Boltzmann machines. They have been
> shown to automatically and in a unsupervised or semi-supervised way,
> learn abstract representations of features (a compressed form), that in
> turn have supported state-of-the-art results in domains such as speech
> recognition, image classification, object recognition and other areas.
>
> We do not have automatic feature extraction or construction, yet, and
> we will probably never have automatic feature engineering.
Geoffrey Hinton and other proponents of deep-learning (multilayer)
neural networks (DNNs) consider that last claim an admission of failure.
(Note his claim that DNNs already learn at the level of a toddler.)
They claim that their methods are on track toward human-level AI
-- i.e., totally automated learning at every level.
For a more technical talk that Hinton presented at Google, see
https://www.youtube.com/watch?v=AyzOUbkUf3M
The following excerpt is from his summary (slide 34), at the 39-minute mark:
> * Restricted Boltzmann Machines provide a simple way to learn a layer
> of features without any supervision.
> * Many layers of representation can be learned by treating the hidden
> states of one RBM as the visible data for training the next RBM.
> * This creates good generative models that can then be fine tuned...
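
To make the second bullet concrete, here is a minimal sketch of greedy
layer-by-layer training. It is not Hinton's code. The class RBM, the
function train_stack, the learning rate, the epoch count, and the toy
data are all assumptions made for illustration. Each RBM is trained with
one-step contrastive divergence (CD-1), and the hidden-unit probabilities
of one RBM (rather than sampled binary states, another simplification)
serve as the visible data for the next.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Tiny binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # w[i][j] connects visible unit i to hidden unit j.
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def train(self, data, epochs=50, lr=0.1):
        # No labels are used anywhere: the update only compares the data
        # with the model's own one-step reconstruction of it.
        for _ in range(epochs):
            for v0 in data:
                h0 = self.hidden_probs(v0)
                v1 = self.visible_probs(self.sample(h0))
                h1 = self.hidden_probs(v1)
                for i in range(len(v0)):
                    for j in range(len(h0)):
                        self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
                    self.b_vis[i] += lr * (v0[i] - v1[i])
                for j in range(len(h0)):
                    self.b_hid[j] += lr * (h0[j] - h1[j])


def train_stack(data, layer_sizes, seed=0):
    """Train one RBM per entry in layer_sizes, bottom layer first."""
    rng = random.Random(seed)
    stack, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        rbm.train(layer_input)
        stack.append(rbm)
        # The hidden activations of this RBM become the "visible data"
        # for the next RBM in the stack.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return stack, layer_input


# Toy data: two repeated 4-bit patterns (made up for this sketch).
data = [[1, 1, 0, 0], [0, 0, 1, 1]] * 4
stack, top_features = train_stack(data, layer_sizes=[3, 2])
```

The list top_features holds the highest-level representation of each
input. The supervised fine-tuning mentioned in the third bullet is not
shown.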
Most of the R & D with DNNs has been done with static patterns, such
as images of handwriting and photos. For speech recognition, time-varying
patterns are critical. The following is a PhD thesis by Navdeep Jaitly,
with Hinton as his thesis advisor: "Exploring deep learning methods
for discovering features in speech signals."
http://www.cs.toronto.edu/~ndjaitly/Jaitly_Navdeep_201411_PhD_thesis.pdf
Since the DNNs were designed for recognizing static patterns, they
aren't very good for time-varying data, such as speech. Jaitly used
DNNs for "discovering features", which he then used in a Hidden
Markov Model (HMM) for recognizing speech. The result is a hybrid
DNN-HMM system for speech recognition. The critical innovation is
to use features discovered by the DNN in the HMM.
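
As a rough illustration of how the two parts can fit together, here is
a sketch of one common hybrid recipe. It is not taken from Jaitly's
thesis, and the function name viterbi and all the numbers are made up
for the example. It assumes a trained network has already produced, for
each frame of speech, a probability for every HMM state. Dividing those
posteriors by the state priors gives scaled likelihoods, which stand in
for the HMM's emission probabilities, and the HMM's transition model
then handles the time-varying part.

```python
import math


def viterbi(posteriors, priors, trans, init):
    """Most likely HMM state path, given per-frame state posteriors.

    posteriors[t][s] is assumed to be a network's estimate of
    P(state s | frame t). All probabilities must be greater than zero.
    """
    n_states = len(priors)
    # Scaled likelihood: p(frame | state) is proportional to
    # P(state | frame) / P(state). Work in logs to avoid underflow.
    emit = [[math.log(p[s] / priors[s]) for s in range(n_states)]
            for p in posteriors]
    score = [math.log(init[s]) + emit[0][s] for s in range(n_states)]
    back = []
    for t in range(1, len(emit)):
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at time t.
            best = max(range(n_states),
                       key=lambda r: prev[r] + math.log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][s]) + emit[t][s])
        back.append(ptr)
    # Trace the best path backward from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Four frames, two states; the posteriors are invented stand-ins for
# network outputs.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
path = viterbi(posteriors,
               priors=[0.5, 0.5],
               trans=[[0.8, 0.2], [0.2, 0.8]],
               init=[0.5, 0.5])
print(path)  # [0, 0, 1, 1]
```

The network scores each frame independently as a static pattern. The
ordering in time comes entirely from the HMM's transition probabilities.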
But as I said in my earlier notes, this kind of pattern recognition
is still very far from what educational psychologists define as truly
"deep" learning by children. They learn to recognize speech several
years before they begin first grade. For the next 16 years, what
they learn in school is based on symbolic methods -- not the kind
of pattern recognition done by DNNs or even DNN-HMM hybrids.
John
_________________________________________________________________
Message Archives: http://ontolog.cim3.net/forum/ontolog-forum/