
Exploring Machine Learning Scenarios

Sanjay
27 July 2023-31 July 2023

Abstract
This article contains my notes on the TinyML course offered by HarvardX. It is
the 7th episode of the series, so it builds on the notes of the previous
episodes; be sure to check them first.

1 Introduction
We will be looking at Convolutions and the limitations of Deep Neural Net-
works. We will also look at image processing, the algorithms associated with
this, and the code that is required for this. We will also look at some special
layers that are famous but are not covered in the TinyML course.
We will also look at the labeling of images and how Models are able to
identify images. We will look at the examples and the code behind them.
We will also look at the math behind convolution in depth, so if you feel
you don't need it, you can skip that part.

2 Going Beyond
This will be a shorter section; it covers a few layers that may look peculiar
to us. So far we have learned only the Deep Neural Network, which uses Dense
layers, and we have also seen Flatten, which reshapes the data. Now we can
look at some layers that go beyond these.

2.1 Recurrent Layer
First, the Recurrent layer. Strictly speaking, this has nothing to do with the
course, but it is a great example of a peculiar layer. Unlike other layers,
the neurons in this layer are connected to each other, which lets them change
the data before sending it to the next layer: there is an intermediate
connection that lets the neurons pass information along the sequence. We can
look at this in more detail below.

Figure 1: Recurrent Layer

The basic idea of the recurrent layer is as follows:

If we have a sequence of values, X0, X1, X2 that we want to learn a
corresponding sequence Y0, Y1, Y2 for, then the idea is the neurons (F)
can not only try to match X0 → Y0 but also pass values along the sequence,
indicated by the right-pointing arrows, so the input to the second neuron
is the output from the first and X1, from which it can learn the parameters
for Y1 and so on.

Let me explain this. Let's call the first neuron F0. F0 gets the input X0
and learns to produce Y0. But not only that: it also passes its output to the
next neuron, say F1, and this affects F1's output. F1 gets input not only from
X1 but also from F0, and only from these together can it learn the parameters
for Y1. This cycle continues along the sequence.
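As a minimal sketch of this idea (assuming TensorFlow/Keras; the layer and
argument names below are Keras', not something defined in the course):

import tensorflow as tf

# Toy model: 3 timesteps (X0, X1, X2), 1 feature each. return_sequences=True
# makes the layer emit an output for every timestep (Y0, Y1, Y2), not only
# the last one.
model = tf.keras.Sequential([
    tf.keras.layers.SimpleRNN(8, return_sequences=True, input_shape=(3, 1)),
    tf.keras.layers.Dense(1),
])
model.summary()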

2.1.1 Long Short Term Memory


We have different types of Recurrent layers, but one of the most important
recurrent layers is Long Short Term Memory.

Figure 2: Long Short Term Memory

Here X0 affects not only Y0 but also Y1, and this influence continues along
the consecutive values of Yn. If X0 can still impact, say, Y99, then we have
Long Short Term Memory. This is achieved through the cell state, which is
carried along through every step of the sequence.
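As another hedged sketch (again Keras names, my own example, not course code):
swapping the recurrent layer for an LSTM is a one-line change, and the cell
state that lets X0 still influence a much later Yn is handled inside the layer.

import tensorflow as tf

# 100 timesteps, 1 feature each; the LSTM's cell state carries information
# from early timesteps forward to much later outputs.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8, return_sequences=True, input_shape=(100, 1)),
    tf.keras.layers.Dense(1),
])
model.summary()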

3 Image Processing
Well, we are moving into really complicated territory. We have to look at
some math to understand all the things that are going on here. So we can
begin.

3.1 Convolution
We have several math works here. So buckle up. Convolution is an operator.
Just like addition and multiplication, you can convolute two functions. It is
represented as f (x) ∗ g(x). This is linear convolution, but this is enough for
us for now. Now the mathematical definition is given as follows:
f ∗ g(t) = ∫_{−∞}^{∞} f(x) g(t − x) dx

Well, that is a bit complicated¹. To understand it, we can look at the
discrete case. We can write the integral as the following sum²:

∑_{k=−∞}^{∞} f(k) g(t − k)

¹ You can check the video by 3b1b.
² Note this is only for understanding purposes.

If we assume the interval to be finite, which it is in our case, then the
expression becomes:

∑_{k=a}^{b} f(k) g(t − k)

We see here that the value of the function f at k is multiplied by the value
of the function g shifted by k. This might sound like a bit too much, but you
will understand it as we move further³.
Take this image:

Figure 3: The process of convoluting two functions

What we know is that we are getting a third function out of these two
functions, but this is the process of obtaining it. The process is here as
follows4 :

1. Initially, we have two functions f(τ) and g(τ).

2. Then g(τ) is flipped to get g(−τ); in our earlier notation the integration
variable was x.

3. Then the flipped function is shifted by t, giving g(t − τ).

4. As t varies, wherever the two functions overlap (are both non-zero) they
are multiplied pointwise and the products are summed (integrated); that value
is the output of the convolution at t. The next image continues the same
process.

This form will be very useful for probability. But what should we do for
image processing? Well, for that we have to look at Discrete Convolution.

³ This portion is explained greatly in 3b1b's video.
⁴ You can look at the GIF.

3.1.1 Discrete Convolution
You might have noticed that we mentioned this equation:
∑_{k=a}^{b} f(k) g(t − k)

That is the Discrete Convolution. On its own, the formula is not enough to
explain everything, so let's look at it in much simpler terms.
Assume you have two tuples: A = (a0 , a1 , a2 ) and B = (b0 , b1 , b2 ). If we
take convolution between A and B, we will have to follow the following steps:

1. First, flip B. So we get B as (b2 , b1 , b0 ).

2. Now assume that A is above this flipped B and we slide it across right.
First we will multiply a0 and b0 . This product is the first term of A ∗ B.
So we have (a0 .b0 ).

3. Now slide it one more step to the right. We will have two multiplications,
a0 with b1 and a1 with b0. Add these products; this is our second term.
So we have (a0.b0, a1.b0 + a0.b1).

4. Similarly, we get the next term as a0.b2 + a1.b1 + a2.b0. Have you noticed
the pattern? In each term, the subscripts of every product add up to the
same number, the index of that term.

5. We have to continue and we will get the final answer as (a0 .b0 , a1 .b0 +
a0 .b1 , a0 .b2 + a1 .b1 + a2 .b0 , a1 .b2 + a2 .b1 , a2 .b2 ).

This is basically it. Let me show you a simple example. If we convolve
(1, 1, 1) with (1, 1, 1), we get (1, 2, 3, 2, 1). If you still have doubts,
try this code:

import numpy as np
print(np.convolve((1, 1, 1), (1, 1, 1)))

As I said during the steps, we have a common pattern in this process, so


we can also define the discrete convolution as follows:
(A ∗ B)_n = ∑_{i+j=n} A_i · B_j
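To check that this index rule gives the same answer as the flip-and-slide
steps above, here is a small sketch (my own, using NumPy; not course code)
that computes (A ∗ B)_n directly from the i + j = n definition:

import numpy as np

def conv_by_index_rule(A, B):
    # Term n of the result collects every product A[i] * B[j] with i + j = n.
    out = np.zeros(len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            out[i + j] += a * b
    return out

print(conv_by_index_rule([1, 1, 1], [1, 1, 1]))   # [1. 2. 3. 2. 1.]
print(np.convolve([1, 1, 1], [1, 1, 1]))          # same result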

Now what do these have to do with image processing? We will get there, but
first let's look at averaging, using a rectangular (square-wave-like) pulse.
Assume you have an array of numbers y_i such that ∑_i y_i = 1; in our case let
it be y = [0.2, 0.2, 0.2, 0.2, 0.2]. And assume the pulse is not zero at its
lowest value but 0.1, so its sample points are the array X =
[0.1, 0.1, 0.1, 0.1, 0.1, 1, 1, 1, 1, 1, 0.1, 0.1, 0.1, 0.1, 0.1]. Do work
this out yourself so that you can follow what comes next. We will get X ∗ y as

[0.02, 0.04, 0.06, 0.08, 0.1, 0.28, 0.46, 0.64, 0.82, 1, 0.82, 0.64, 0.46, 0.28, 0.1, 0.08, 0.06, 0.04, 0.02]

If we plot these, the sharp rectangular pulse has become a smooth, wave-like
curve! How did that happen? It is because, under convolution, each output
value is an average that is affected by the surrounding values of the input.
Please try doing this yourself; it makes the idea much easier to understand.
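If you would rather not work it out by hand, this sketch (my own check, using
np.convolve) reproduces the numbers above and plots the result:

import numpy as np
import matplotlib.pyplot as plt

X = [0.1] * 5 + [1] * 5 + [0.1] * 5   # the rectangular pulse
y = [0.2] * 5                         # the averaging window; entries sum to 1

smoothed = np.convolve(X, y)
print(np.round(smoothed, 2))   # matches the list above

plt.plot(smoothed)             # the sharp corners become smooth ramps
plt.show()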
Now to our main point, how does this affect images? We can see this in
the next subsection.

3.2 Convolution and Blurring


Before getting any further, let's understand how colors are represented on a
computer; specifically, RGB color.
Now we can discuss⁵ the color wheel. Each color in the color wheel has a value
for Red, Green, and Blue. As computer students, we represent this as a vector.
For example, the vector (255, 213, 87) represents the color ffd557. So this is
the key thing to understanding what comes next.
As we said, each color can be represented as a combination of these three
values. And each pixel has a color, so we can represent each pixel as such a
vector; when we do the same averaging process, but in two dimensions, we get
blurring and other effects.
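Before moving on, a quick way to convince yourself that (255, 213, 87) really
is ffd557 (plain Python, nothing course-specific):

r, g, b = 255, 213, 87
print(f"{r:02x}{g:02x}{b:02x}")   # ffd557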
Take a look at this picture:
⁵ Check here.

Figure 4: Picture of Mario and blurring

Every pixel of this image has a vector associated with it, and we do the same
averaging process but with a 3 × 3 array whose values add up to 1. If we take
the pixel Aij, then the pixels A(i−1)(j−1) through A(i+1)(j+1) affect its new
value, so an average color is formed.
Let's see how convolution is done for a single pixel. Assume you have the
following array A:

[ 1/9  1/9  1/9 ]
[ 1/9  1/9  1/9 ]
[ 1/9  1/9  1/9 ]

And you are at the pixel yi,j. The central element A22 is multiplied with
yi,j, and all other elements are multiplied with the pixels they overlap; for
example A11 is multiplied with yi−1,j−1, and so on.
In the following figure, every value in the array is 1/9:

Figure 5: Blurring

So this is how blurring is done.
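As a sketch of the same idea in code (using scipy.ndimage, which is my choice
here; the course does this by hand in the next section):

import numpy as np
from scipy import datasets, ndimage
import matplotlib.pyplot as plt

img = datasets.ascent().astype(float)
kernel = np.full((3, 3), 1 / 9)   # nine entries of 1/9, summing to 1

blurred = ndimage.convolve(img, kernel)   # each pixel becomes a 3x3 average

plt.gray()
plt.imshow(blurred)
plt.show()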

3.2.1 Gaussian Blur


We have seen Gaussian blur in almost every video and image editing
application. Gaussian Blur is done through a special matrix: we take a 5 × 5
matrix in which the central value is much higher than the values toward the
edges. To put it more accurately, the values form a bell curve in 3
dimensions⁶.

⁶ Check the 3b1b video for that.
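A sketch of what such a 5 × 5 kernel can look like (this particular
construction, from binomial coefficients, is my own illustration; real tools
compute the values from the Gaussian function itself):

import numpy as np

row = np.array([1, 4, 6, 4, 1], dtype=float)   # bell-shaped 1-D weights
kernel = np.outer(row, row)                    # 5x5 with a large centre value
kernel /= kernel.sum()                         # normalise so entries sum to 1

print(np.round(kernel, 4))   # centre ~0.1406, corners ~0.0039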

3.3 Convolution and Filtering


Now let's get to the ML aspect of convolution. We usually want to filter out
patterns from the image, and for that we use image processing. As we saw
earlier, we can process images using arrays; we will do the same thing here,
just with different arrays.
[ a  0  −a ]
[ b  0  −b ]
[ a  0  −a ]

We use 3 × 3 arrays like this to brighten the vertical lines in the image.
How does this help us anyway? Assume your model is trying to recognize a
human: we know that every human has legs, and legs are essentially two bunches
of vertical lines. So we can convolute an array like this with the pixels.
[  a   b   a ]
[  0   0   0 ]
[ −a  −b  −a ]

This matrix similarly helps us identify horizontal lines. But how does this
carry over into neural networks? We have a special layer for that, called the
Convolution Layer, and a neural network that uses it is known as a
Convolutional Neural Network.
Now let’s get to the programming aspects of this.

3.4 Convolutional Neural Network


3.4.1 Why is it better than DNN?
To appreciate the power of a CNN, one should understand the limitations of a
DNN. We have already detected digits using a DNN, but the conditions there are
quite restrictive: the input can only be a 28 × 28, grayscale image. And if we
want to identify, say, a shoe, real images contain complicated inputs other
than the shoe. If we train our DNN with Fashion MNIST, the network is limited
to the constraints of that dataset.
Convolution, on the other hand, helps us identify different aspects of the
image through processing. With it, we can more easily devise the rules, and
even on harder examples we can get better output. Note: it won't always be
right, but it is definitely better than a plain DNN.

3.4.2 CNN
Let us do some image processing now. We can use the ascent image, which is
available in the SciPy library; it is a grayscale image of dimensions
512 × 512.
But before getting any further, one small but important concept. We have
weights here too, but they mean something a little different. Say you're
creating a filter and the sum of all its elements doesn't add up to 1 or 0;
then we have to multiply our convolution value by a normalizing factor, and
this normalizing factor is the weight in our case. For example, a filter whose
entries sum to 5 would use a weight of 1/5.
So let's begin by importing the necessary libraries. This also introduces a
new library, cv2 (OpenCV); we won't use it here, but we will in the future.

import cv2
import numpy as np
from scipy import datasets
import matplotlib.pyplot as plt

i = datasets.ascent()

Now that we have assigned the ascent image to a variable, we can check its
dimensions. We will also create a copy to work with, so that we don't disturb
the original image.

i_transformed = np.copy(i)
size_x = i_transformed.shape[0]
size_y = i_transformed.shape[1]
print(size_x)
print(size_y)

Output

512
512

We have imported matplotlib to display the image; we will use its imshow
function for plotting it.

plt.grid(False)
plt.gray()
plt.axis('off')
plt.imshow(i)
plt.show()

Output

So this is the image that we are going to play with. Now let's define the
filter and the weight. Although I have described filters as arrays, in the
code we write them as nested lists (a matrix). We will look at other filters
later; for now, we use the following one:

filter = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
weight = 1

Well now let's look at the code to convolute the image.

for x in range(1, size_x - 1):
    for y in range(1, size_y - 1):
        convolution = 0.0
        convolution = convolution + (i[x - 1, y - 1] * filter[0][0])
        convolution = convolution + (i[x, y - 1] * filter[1][0])
        convolution = convolution + (i[x + 1, y - 1] * filter[2][0])
        convolution = convolution + (i[x - 1, y] * filter[0][1])
        convolution = convolution + (i[x, y] * filter[1][1])
        convolution = convolution + (i[x + 1, y] * filter[2][1])
        convolution = convolution + (i[x - 1, y + 1] * filter[0][2])
        convolution = convolution + (i[x, y + 1] * filter[1][2])
        convolution = convolution + (i[x + 1, y + 1] * filter[2][2])
        convolution = convolution * weight
        if convolution < 0:
            convolution = 0
        if convolution > 255:
            convolution = 255
        i_transformed[x, y] = convolution
Well, that's a lot of code. Believe me, it's not complicated; it is long
because everything is kept very simple. What we are doing here is convoluting
the pixel (x, y). Note the loop variables are x and y, and take another look
at the filter we defined: it is the same convolution process. The first
accumulation line multiplies the pixel value at (x − 1, y − 1) with
filter[0][0]; the next multiplies filter[1][0] with the pixel at (x, y − 1),
and so on. The convolution value cannot exceed 255 or go below 0, so we add
the two if conditions to clamp it. The ranges of the for loops cover the
dimensions of our image, so every interior pixel gets processed. We multiply
the weight with the convolution and finally assign the result to the copy.
Basically, that's it! Now let's see what happened to our image.
plt.gray()
plt.grid(False)
plt.imshow(i_transformed)
plt.show()
Output

What we have here is an edge-detected image. We can obtain different effects
by using different filters. For example:

filter = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

We have already seen this filter: it is the horizontal-line detector, so when
we apply it we get the horizontal lines of the image.
Output

If we use the following filter, we get the vertical lines of the image.

filter = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

Output

Similarly, we can try different filters and weights to play with the image.

3.5 Pooling
Let's look at an important process: pooling. If a network had to analyse a big
image directly, it would consume a lot of time and memory; pooling helps with
that.
Let's look at Max Pooling. This is the pooling in which a square of pixels is
replaced with a single pixel whose value is the highest value within that
square.

Figure 6: Max Pooling

So this basically reduces the size of the image. But what about its quality?
In most cases the important features are preserved, and often even emphasised.
For example, a 512 × 512 image can be reduced to 128 × 128, and because the
maximum pixels replace the dimmer pixels around them, the bright features
stand out.
So let’s look at the code for this.
new_x = int(size_x / 4)
new_y = int(size_y / 4)
newImage = np.zeros((new_x, new_y))
for x in range(0, size_x, 4):
    for y in range(0, size_y, 4):
        pixels = []
        pixels.append(i_transformed[x, y])
        pixels.append(i_transformed[x + 1, y])
        pixels.append(i_transformed[x + 2, y])
        pixels.append(i_transformed[x + 3, y])
        pixels.append(i_transformed[x, y + 1])
        pixels.append(i_transformed[x + 1, y + 1])
        pixels.append(i_transformed[x + 2, y + 1])
        pixels.append(i_transformed[x + 3, y + 1])
        pixels.append(i_transformed[x, y + 2])
        pixels.append(i_transformed[x + 1, y + 2])
        pixels.append(i_transformed[x + 2, y + 2])
        pixels.append(i_transformed[x + 3, y + 2])
        pixels.append(i_transformed[x, y + 3])
        pixels.append(i_transformed[x + 1, y + 3])
        pixels.append(i_transformed[x + 2, y + 3])
        pixels.append(i_transformed[x + 3, y + 3])
        pixels.sort(reverse=True)
        newImage[int(x / 4), int(y / 4)] = pixels[0]

Again, this code is long but simple. We are pooling our 512 × 512 image down
to 128 × 128, so each dimension becomes a quarter of what it was. We take a
pixel square of size 4 × 4, which contains 16 values; the 16 append calls
collect those values into a list. We then sort the list, take the maximum, and
place it at one-quarter of the original coordinates, so that the whole image
is quartered.
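For comparison, here is a more compact NumPy sketch of the same 4 × 4 max
pooling (my own alternative, not the course code):

import numpy as np

def max_pool(img, block=4):
    # Split the image into block x block tiles and keep the maximum of each.
    h = (img.shape[0] // block) * block
    w = (img.shape[1] // block) * block
    tiles = img[:h, :w].reshape(h // block, block, w // block, block)
    return tiles.max(axis=(1, 3))

pooled = max_pool(i_transformed)
print(pooled.shape)   # (128, 128) for a 512 x 512 input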
So let’s look at the image we get out of it.
plt.gray()
plt.grid(False)
plt.imshow(newImage)
plt.show()

Output

As you can see, the vertical lines are brightened, so pooling can be a useful
process.

4 Improving Deep Neural Network
4.1 Kernel
The arrays that we have been using are actually matrices; sorry for mentioning
this only now. Such a matrix is called a kernel, and it is also known as a
mask. Now let's look at different kernels.
The kernel matrix is represented as ω, and we denote the process as:

g(x, y) = ω ∗ f(x, y)

There are several types of kernels, which can be checked here.
The reason we are discussing this here is that the convolutional network we
use does not rely on a single process; instead it applies its own collection
of them. It might use edge detection to find the object and then vertical and
horizontal lines for verification. Keeping this in mind, we can look at the
code.

4.2 The Code


Now let’s go back to our old DNN. But this time let’s allow it to predict
pictures from Fashion MNIST.
import tensorflow as tf

mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (val_images, val_labels) = mnist.load_data()
training_images = training_images / 255.0
val_images = val_images / 255.0
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(20, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, validation_data=(val_images, val_labels), epochs=20)
When we run this we get an accuracy of about 85%. But we can improve on this
with a CNN. Let's do it.
First of all, we have to define the training-image and validation-image
variables.

import tensorflow as tf

mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (val_images, val_labels) = mnist.load_data()
training_images = training_images.reshape(60000, 28, 28, 1)
training_images = training_images / 255.0
val_images = val_images.reshape(10000, 28, 28, 1)
val_images = val_images / 255.0

Why are we reshaping the images into a 60000 × 28 × 28 × 1 tensor? It's
because the convolution layer expects its input in that form: a single 4-D
tensor of (number of images, height, width, colour channels), where a
grayscale image has one channel. So instead of passing 60000 separate tensors,
we send all 60000 images as one tensor.
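Equivalently (a small aside of mine, not from the course), starting from the
raw arrays returned by mnist.load_data(), you can add the channel axis with
NumPy indexing instead of hard-coding the image count in reshape:

import numpy as np

# training_images has shape (60000, 28, 28); adding np.newaxis at the end
# gives (60000, 28, 28, 1) without hard-coding 60000.
training_images = training_images[..., np.newaxis] / 255.0
val_images = val_images[..., np.newaxis] / 255.0
print(training_images.shape, val_images.shape)   # (60000, 28, 28, 1) (10000, 28, 28, 1)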
Now to the next part of the code.
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),

Now we are defining the sequential model (we will see other kinds of models in
the future). Here we define the first two layers: to process the image we use
a convolution layer followed by a max pooling layer. Let's look at the
convolution layer first.
We pass four arguments here, but there are six arguments worth knowing about:

1. Filters: the number of filters applied to the image. In our case it is 64,
so 64 filters are applied.

2. Kernel size: the size of each filter. In our case, we have a 3 × 3 filter.

3. Strides: the step size, i.e. how many pixels the window moves between
applications, horizontally and vertically. With a stride of 1 the filter
slides one pixel at a time along each row and then moves one pixel down to the
next row, covering the whole image.

4. Padding: extra pixels added around the border pixels so that the filter can
also be applied at the edges (in Keras this is 'valid' for no padding or
'same' to keep the output the same size as the input).

5. Activation: the activation function that should be used. We have used ReLU
in our case.

6. input_shape: the shape of the input that is fed in. We have an input shape
of (28, 28, 1).

We also have initializers and regularizers, but we don't need those now.
Then we have the MaxPooling2D layer. We pass it one argument, the pool size:
it describes the shape of the square window, and the layer keeps only the
maximum value inside each window.
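As an illustration (the named arguments and defaults below follow the Keras
documentation; the course passes only some of them), the first convolution
layer could be written out in full roughly like this:

conv = tf.keras.layers.Conv2D(
    filters=64,              # number of filters to learn
    kernel_size=(3, 3),      # size of each filter
    strides=(1, 1),          # how far the window moves per step (the default)
    padding='valid',         # 'valid' = no padding, 'same' = pad to keep size
    activation='relu',       # activation applied to each output
    input_shape=(28, 28, 1)  # 28 x 28 grayscale input
)
pool = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))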
We then repeat the same two layers, which helps our model handle harder
images. So we are doing this:
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, validation_data=(val_images, val_labels), epochs=20)

Then we flatten and continue as before: basically the same infrastructure, but
we have made the job much easier for our DNN.
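If you are curious how the convolution and pooling layers shrink the image
before it reaches the Dense layers, model.summary() (a standard Keras call,
shown here as a usage note) prints the output shape of every layer:

model.summary()
# With this architecture the shapes go roughly:
# Conv2D -> (26, 26, 64), MaxPooling2D -> (13, 13, 64),
# Conv2D -> (11, 11, 64), MaxPooling2D -> (5, 5, 64), Flatten -> 1600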

4.3 Changing factors


Let's see how things change when we vary these factors. In these experiments,
increasing the kernel size decreased the accuracy, while increasing the number
of filters increased it.

5 Labelling and Feature Mapping

This is a different kind of section. Let's see how the computer maps the
image⁷. Assume you are giving training images to the computer, and the
computer immediately blurs each image as soon as it receives it.

⁷ This is exactly what is mentioned in the TinyML course.

Figure 7: Blurred image

Now, assume it applies a filter that essentially anti-blurs (sharpens) the
image wherever it sees something like a cylindrical shape.

Figure 8: Leg

It doesn’t know that those are legs. But it will see that pattern emerge
in the images that we have fed the model with. Similarly, it might notice
something like this.

Figure 9: Results

It might then combine all of this information to declare that the image shows
a human. And not only that: it will also pick up patterns that humans can't
see!

5.1 The Code


Let's look at how we can check what the computer has learned. I'll show the
code in a moment; first, let's look at the results.

Figure 10: How a filter identifies a sole.

Using one of its filters, the model was able to learn the sole part of the
shoe, which helped it predict those images as shoes. Now let's look at the
code used here.
import matplotlib.pyplot as plt

def show_image(img):
    plt.figure()
    plt.imshow(val_images[img].reshape(28, 28))
    plt.grid(False)
    plt.show()

f, axarr = plt.subplots(3, 2)
# By scanning the list above I saw that the 0, 23 and 28 entries are all label 9
FIRST_IMAGE = 0
SECOND_IMAGE = 23
THIRD_IMAGE = 28

# For shoes (0, 23, 28), CONVOLUTION_NUMBER=1 (i.e. the second filter) shows
# the sole being filtered out very clearly
CONVOLUTION_NUMBER = 1

from tensorflow.keras import models
layer_outputs = [layer.output for layer in model.layers]
activation_model = tf.keras.models.Model(inputs=model.input, outputs=layer_outputs)
for x in range(0, 2):
    f1 = activation_model.predict(val_images[FIRST_IMAGE].reshape(1, 28, 28, 1))[x]
    axarr[0, x].imshow(f1[0, :, :, CONVOLUTION_NUMBER], cmap='inferno')
    axarr[0, x].grid(False)
    f2 = activation_model.predict(val_images[SECOND_IMAGE].reshape(1, 28, 28, 1))[x]
    axarr[1, x].imshow(f2[0, :, :, CONVOLUTION_NUMBER], cmap='inferno')
    axarr[1, x].grid(False)
    f3 = activation_model.predict(val_images[THIRD_IMAGE].reshape(1, 28, 28, 1))[x]
    axarr[2, x].imshow(f3[0, :, :, CONVOLUTION_NUMBER], cmap='inferno')
    axarr[2, x].grid(False)

show_image(FIRST_IMAGE)
show_image(SECOND_IMAGE)
show_image(THIRD_IMAGE)

This code is a bit complicated, but we can break it down. Before that, a new
term: feature maps. A feature map is nothing but the image as transformed by
the network while identifying it; the images we saw earlier are feature maps.
Now to the code. First, we define a function that displays an image. Then we
create an axis array with three rows and two columns, just as in the previous
figure: the first row holds the feature maps of the first image, and so on.
Then we choose the images by their labels. We take them from the validation
data of Fashion MNIST and pick the indices of shoes from the list.
from keras.datasets import fashion_mnist

mnist = fashion_mnist
(training_images, training_labels), (val_images, val_labels) = mnist.load_data()
print(val_labels[:100])

Output

[9 2 1 1 6 1 4 6 5 7 4 5 7 3 4 1 2 4 8 0
2 5 7 9 1 4 6 0 9 3 8 8 3 3 8 0 7 5 7 9 6
1 3 7 6 7 2 1 2 2 4 4 5 8 2 2 8 4 8 0 7 7
8 5 1 1 2 3 9 8 7 0 2 6 2 3 1 2 8 4 1 8 5
9 5 0 3 2 0 6 5 3 6 7 1 8 0 1 4 2]

Label 9 indicates a shoe (ankle boot). The indices with label 9 near the start
of the list are 0, 23, and 28, so those are the images we use.
Next, activation_model is built from our trained model: its outputs are the
outputs of every layer, so predicting with it gives us the feature maps. Now
to the for loop; it is simple by now: it just displays the feature maps of
those images in axarr, assigning each picture to the appropriate index of the
array.
Finally, show_image displays the real images. Let's see them.

Figure 11: Images of the shoes that we have used

So this is how a model identifies an object.

6 Links
1. Exploring Convolutions: TinyML

2. Fashion MNIST: TinyML
