Parameter sharing
• Using the same parameter for more than one function in a
model
• Sharing constrains capacity: the model cannot learn a
separate function at every location
• Such a network has tied weights, because the value of the
weight applied to one input is tied to the value of a weight
applied elsewhere
• In a CNN, each member of the kernel is used at every position
of the input
• The convolution operation means that rather than learning a
separate set of parameters for every location, we learn only
one set
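A minimal NumPy sketch (illustrative, not from the slides) of what "one shared set of parameters" means: the same three kernel weights produce every output position.

```python
import numpy as np

def conv1d_valid(x, k):
    """Slide one shared kernel k across input x ("valid" positions only).
    Note: like most deep learning libraries, this is cross-correlation."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
k = np.array([0.25, 0.5, 0.25])   # one shared set of 3 parameters
y = conv1d_valid(x, k)
print(y)   # 4 outputs, all produced by the same 3 weights -> [2. 3. 4. 5.]
```

A dense layer mapping the same 6 inputs to 4 outputs would instead need 24 independent weights.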
Parameter sharing (ctd …)
Convolution is therefore dramatically more
efficient than dense matrix
multiplication in terms of both memory
requirements and statistical efficiency.
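The memory claim can be made concrete with a back-of-the-envelope count (the sizes below are hypothetical, chosen only for illustration):

```python
n = 1000                  # number of inputs and outputs of the layer
k = 3                     # kernel width

dense_params = n * n      # dense layer: every output connects to every input
conv_params = k           # convolution: one shared kernel of k weights

print(dense_params, conv_params)   # 1000000 vs 3
```

Fewer parameters also means fewer quantities to estimate from data, which is the statistical-efficiency part of the claim.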
Equivariance to translation - if the
input changes, the output changes
in the same way.
A function f(x) is equivariant to a
function g if f(g(x)) = g(f(x)).
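The property f(g(x)) = g(f(x)) can be checked numerically. A sketch using circular (wrap-around) cross-correlation, where translation equivariance holds exactly; with zero padding it holds only away from the borders:

```python
import numpy as np

def circ_conv(x, k):
    # circular cross-correlation with a shared kernel k (f in the text)
    n = len(x)
    return np.array([sum(x[(i + m) % n] * k[m] for m in range(len(k)))
                     for i in range(n)])

x = np.arange(8.0)
k = np.array([1.0, 2.0, 1.0])
g = lambda v: np.roll(v, 2)          # g: translate by 2 positions

a = circ_conv(g(x), k)               # f(g(x)): translate, then convolve
b = g(circ_conv(x, k))               # g(f(x)): convolve, then translate
print(np.allclose(a, b))             # True: convolution is equivariant to g
```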
Pooling
• A CNN layer comprises three stages
– Stage 1 performs several convolutions in parallel to
produce a set of linear activations.
– Stage 2 runs each linear activation through a nonlinear
activation function such as ReLU (the detector stage).
– Stage 3 uses a pooling function to modify the output of
the layer.
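The three stages can be sketched end to end in NumPy (a toy single-channel layer under assumed sizes, not a framework implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))          # single-channel input
k = rng.normal(size=(3, 3))          # one shared 3x3 kernel

# Stage 1: convolution ("valid" positions) -> linear activations
z = np.array([[np.sum(x[i:i + 3, j:j + 3] * k) for j in range(6)]
              for i in range(6)])

# Stage 2: detector stage -> elementwise ReLU nonlinearity
a = np.maximum(z, 0.0)

# Stage 3: pooling -> 2x2 max pooling with stride 2
p = a.reshape(3, 2, 3, 2).max(axis=(1, 3))

print(z.shape, a.shape, p.shape)     # (6, 6) (6, 6) (3, 3)
```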
• Pooling reduces the number of units feeding the next (fully)
connected layer
• Neighboring locations produce similar pooled values, making
the representation approximately invariant to small
translations
• Less memory is required for storing the feature map
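The memory saving is easy to quantify; a sketch with an assumed feature-map size (64 channels of 56x56 is hypothetical, chosen only for illustration):

```python
import numpy as np

fmap = np.zeros((64, 56, 56), dtype=np.float32)            # pre-pool feature map
pooled = fmap.reshape(64, 28, 2, 28, 2).max(axis=(2, 4))   # 2x2 max pool, stride 2

print(fmap.nbytes, pooled.nbytes)   # 802816 vs 200704 bytes: 4x less memory
```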
Tiled convolution
• K is a 6-D tensor: rather than learning a separate kernel for
each output location, output locations cycle through a
set of t different choices of kernel stack in each
direction (% is the modulo operation; t = 1 gives standard
convolution, while t equal to the output width gives a
locally connected layer):
Z_{i,j,k} = \sum_{l,m,n} V_{l, j+m-1, k+n-1} K_{i,l,m,n, j%t+1, k%t+1}
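A 1-D analogue of the formula above (a sketch, with the channel indices dropped for readability): output position j uses kernel K[j % t], so t kernels cycle across the output.

```python
import numpy as np

def tiled_conv1d(v, K, t):
    """1D tiled convolution: output position j uses kernel K[j % t].
    t=1 recovers standard convolution (full sharing); t equal to the
    output width gives a locally connected layer (no sharing)."""
    kw = K.shape[1]
    n = len(v) - kw + 1
    return np.array([np.dot(v[j:j + kw], K[j % t]) for j in range(n)])

v = np.arange(6.0)                         # input signal
K = np.stack([np.array([1.0, 0.0, 0.0]),   # kernel used at even outputs
              np.array([0.0, 0.0, 1.0])])  # kernel used at odd outputs
print(tiled_conv1d(v, K, t=2))             # -> [0. 3. 2. 5.]
```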