Tiled Convolutional Neural Networks - cs.stanford.edu

STANFORD

Tiled Convolutional Neural Networks

Quoc V. Le, Jiquan Ngiam, Zhenghao Chen, Daniel Chia, Pang Wei Koh, and Andrew Y. Ng
Abstract
Convolutional neural networks (CNNs) have been successfully
applied to many tasks such as digit and object recognition. Using
convolutional (tied) weights significantly reduces the number of
parameters that have to be learned, and also allows translational
invariance to be hard-coded into the architecture. In this paper,
we consider the problem of learning invariances, rather than
relying on hard-coding. We propose tiled convolutional neural
networks (Tiled CNNs), which use a regular tiled pattern of
tied weights that does not require that adjacent hidden units
share identical weights, but instead requires only that hidden
units k steps away from each other have tied weights. By
pooling over neighboring units, this architecture is able to learn
complex invariances (such as scale and rotational invariance)
beyond translational invariance. Further, it also enjoys many of
the advantages that CNNs gain from a relatively small number of
learned parameters (such as ease of learning and greater scalability). We
provide an efficient learning algorithm for Tiled CNNs based on
Topographic ICA, and show that learning complex invariant
features allows us to achieve highly competitive results for both
the NORB and CIFAR-10 datasets.
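
The tiling rule described in the abstract can be illustrated with a minimal 1D sketch. This is not the authors' implementation: the function names, the 1D setting, the stride of 1, and the square-root-of-sum-of-squares pooling are assumptions chosen for brevity. Hidden unit i uses filter i mod k, so units k steps apart share weights, and k = 1 reduces to an ordinary convolution.

```python
import numpy as np

def tiled_conv1d(x, filters):
    """Apply a 1D tiled convolution (valid mode, stride 1).

    x:       input signal, shape (n,)
    filters: shape (k, r); hidden unit i uses filters[i % k], so units
             k steps apart share weights (hypothetical layout).
    """
    k, r = filters.shape
    n_hidden = x.shape[0] - r + 1
    out = np.empty(n_hidden)
    for i in range(n_hidden):
        # Adjacent units use different filters; unit i and unit i + k are tied.
        out[i] = filters[i % k] @ x[i:i + r]
    return out

def pool_sqrt_sum_squares(h, pool_size):
    """Pool each window of neighboring units (assumed pooling function)."""
    n_pool = h.shape[0] - pool_size + 1
    return np.array([np.sqrt(np.sum(h[i:i + pool_size] ** 2))
                     for i in range(n_pool)])

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
filters = rng.standard_normal((2, 3))  # tile size k = 2, receptive field r = 3
h = tiled_conv1d(x, filters)           # simple units
p = pool_sqrt_sum_squares(h, 3)        # pooling units, pooling size = 3
```

In this sketch the layer learns k * r weights per map, compared with r for a standard convolution and (number of hidden units) * r for fully untied weights, which is the trade-off the abstract refers to.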

Tiled Convolutional Neural Networks

[Figure: three architectures side by side: CNN, Tiled CNN, and Tiled CNN with multiple feature maps (our model). Labels: input, simple units, pooling units, tied weights, weight untying, tile size (k) = 2, pooling size = 3, number of maps = 3.]

Algorithms compared on NORB: Deep Tiled CNNs [this work], CNNs [LeCun et al], 3D Deep Belief Networks [Nair & Hinton], Deep Boltzmann Machines [Salakhutdinov et al], TICA, SVMs

Pretraining with Topographic ICA

[Figure: TICA network. Labels: square, sqrt, V, p1, p2, p3, ..., untied weights, local orthogonalization.]

Motivation

TICA network architecture

Convolutional neural networks [1] work well for many
recognition tasks:
- Local receptive fields for computational reasons
- Weight sharing gives translational invariance

Overcompleteness (multiple maps) can be
achieved by local orthogonalization. We
localize neurons that have identical
receptive fields

TICA can be used to pretrain Tiled CNNs because it can learn invariances even
when trained only on unlabeled data [4, 5].
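
For readers unfamiliar with TICA, the pooling computation can be sketched as follows, using the standard formulation in which each pooling unit takes the square root of a weighted sum of squared simple-unit responses: p_i = sqrt(sum_k V[i, k] * (W[k] . x)^2). The helper names, the small `eps` added for numerical stability, and the 1D ring neighborhood (the poster uses a 2D topography) are assumptions made for illustration.

```python
import numpy as np

def ring_neighborhoods(n_units, pool_size):
    """Fixed 0/1 pooling matrix V on a 1D ring (assumed topography)."""
    V = np.zeros((n_units, n_units))
    for i in range(n_units):
        for d in range(pool_size):
            V[i, (i + d) % n_units] = 1.0
    return V

def tica_pool(W, V, x, eps=1e-8):
    """Pooling activations: sqrt of V-weighted sums of squared responses."""
    simple = (W @ x) ** 2          # squared simple-unit responses
    return np.sqrt(V @ simple + eps)

def tica_objective(W, V, X):
    """Sum of pooling activations over unlabeled examples (rows of X)."""
    return sum(tica_pool(W, V, x).sum() for x in X)

W = np.eye(4)                      # learned weights (identity as a placeholder)
V = ring_neighborhoods(4, 3)       # fixed pooling weights, pooling size 3
x = np.array([1.0, 2.0, 2.0, 0.0])
p = tica_pool(W, V, x)
loss = tica_objective(W, V, x[None, :])
```

Minimizing this sum with respect to W (with V held fixed and W constrained to be orthogonal) encourages sparse pooling units, and no labels enter the computation.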

Algorithm

Optimization for sparsity at pooling units

Local Receptive Fields

TICA Speedup

[Figure: speedup over non-local (fully-connected) TICA.]

Evaluating benefits of convolutional training
Training on 8x8 samples and using these weights in a Tiled CNN obtains only 51.54% on the test set compared to 58.66% using our proposed method.

Visualization:
- Networks learn concepts like edge detectors, corner detectors
- Invariant to translation, rotation and scaling

State-of-the-art results on NORB

Results on the CIFAR-10 dataset

Algorithms                               Accuracy
LCC++ [Yu et al]                         74.5%
Deep Tiled CNNs [this work]              73.1%
LCC [Yu et al]                           72.3%
mcRBMs [Ranzato & Hinton]                71.0%
Best of all RBMs [Krizhevsky et al]      64.8%
TICA                                     56.1%
Raw pixels                               41.1%

[Figure: convolutional neural network with pooling units.]

Pretraining with TICA finds invariant and
discriminative features and works well
with finetuning.

[Figure: TICA first layer filters (2D topography, 25 rows of W).]

Algorithms for pretraining convolutional neural networks [2,3] do not use untied
weights to learn invariances.

However, weight sharing can be restrictive because it prevents
us from learning other kinds of invariances.

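Results on the NORB dataset

Algorithms                                       Accuracy
Deep Tiled CNNs [this work]                      96.1%
CNNs [LeCun et al]                               94.1%
3D Deep Belief Networks [Nair & Hinton]          93.5%
Deep Boltzmann Machines [Salakhutdinov et al]    92.8%
TICA                                             89.6%
SVMs                                             88.4%

[Figure: test set accuracy versus tile size (k).]

Tiled CNNs are more flexible and usually better than fully convolutional neural networks.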

Projection step
Locality, tied-weight, and orthogonality constraints
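
The poster names the three constraints but not how each is enforced, so the following is only one plausible sketch of a projection step. The function names, the use of averaging to tie weights, the SVD-based symmetric orthogonalization, and the array layouts are all assumptions rather than the authors' procedure.

```python
import numpy as np

def apply_locality(W, mask):
    """Zero every weight outside a unit's receptive field (mask is 0/1)."""
    return W * mask

def tie_weights(F, k):
    """Replace filters of units k steps apart by their average.

    F has shape (n_hidden, r): one local filter per hidden unit (assumed layout).
    """
    F = F.copy()
    for t in range(k):
        F[t::k] = F[t::k].mean(axis=0)
    return F

def orthogonalize(A):
    """Symmetric orthogonalization, equal to (A A^T)^(-1/2) A for full row rank A."""
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(0)

# Locality: each of 4 units sees a window of 3 out of 6 inputs.
W = rng.standard_normal((4, 6))
mask = np.zeros((4, 6))
for i in range(4):
    mask[i, i:i + 3] = 1.0
W_local = apply_locality(W, mask)

# Tied weights with tile size k = 2.
F_tied = tie_weights(rng.standard_normal((6, 3)), k=2)

# Orthogonality among 3 filters that share one receptive field of size 5.
A_orth = orthogonalize(rng.standard_normal((3, 5)))
```

Averaging is used here because it is the closest point (in Euclidean distance) that satisfies the tying constraint; orthogonalizing only filters that share a receptive field mirrors the "local orthogonalization" mentioned on the poster.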

References
[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998.
[2] H. Lee, R. Grosse, R. Ranganath, and A.Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In ICML, 2009.
[3] M.A. Ranzato, K. Jarrett, K. Kavukcuoglu, and Y. LeCun. What is the best multi-stage architecture for object recognition? In ICCV, 2009.
[4] A. Hyvarinen and P. Hoyer. Topographic independent component analysis as a model of V1 organization and receptive fields. Neural Computation, 2001.
[5] A. Hyvarinen, J. Hurri, and P. Hoyer. Natural Image Statistics. Springer, 2009.
[6] K. Kavukcuoglu, M. Ranzato, R. Fergus, and Y. LeCun. Learning invariant features through topographic filter maps. In CVPR, 2009.
[7] K. Gregor and Y. LeCun. Emergence of Complex-Like Cells in a Temporal Product Network with Local Receptive Fields. arXiv, 2010.

http://ai.stanford.edu/~quocle/
