What is a Convolutional Neural Network?

A Convolutional Neural Network (CNN) is a class of deep learning models that takes images as input, identifies the edges and features that differentiate one image from another, and outputs a result.

How does a CNN work?

A CNN uses filters and max pooling to identify horizontal and vertical edges and to keep only the required features, i.e. the features that carry relevant information.


The input image is convolved with a filter to extract the edges. There are many filters, for example the Sobel filter and the Scharr filter.
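To make the idea of an edge filter concrete, here is a minimal NumPy sketch of the standard 3x3 Sobel and Scharr kernels. The two 3x3 test patches and the helper name `response` are made up for illustration and are not part of the original walkthrough.

```python
import numpy as np

# Horizontal-gradient kernels: they respond to vertical edges.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
scharr_x = np.array([[-3, 0, 3],
                     [-10, 0, 10],
                     [-3, 0, 3]])

# Transposing gives the vertical-gradient kernels, which respond to horizontal edges.
sobel_y = sobel_x.T
scharr_y = scharr_x.T

# Hypothetical 3x3 patches: a flat region and a vertical edge (dark left, bright right).
flat_patch = np.ones((3, 3), dtype=int)
edge_patch = np.array([[0, 0, 1],
                       [0, 0, 1],
                       [0, 0, 1]])

def response(patch, kernel):
    """Multiply patch and kernel element by element and sum the products."""
    return int(np.sum(patch * kernel))

print(response(flat_patch, sobel_x))   # 0: no edge in a flat region
print(response(edge_patch, sobel_x))   # 4: vertical edge detected
print(response(edge_patch, scharr_x))  # 16: same edge, larger Scharr weights
print(response(edge_patch, sobel_y))   # 0: no horizontal edge in this patch
```

Each kernel's weights sum to zero, which is why a flat region gives no response.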

Consider a 7x7 input image and a 3x3 filter.

Convolutional function

The 1st 3x3 block of the input image is multiplied element by element with the filter (the Feature Detector in the image), and the products are summed to give the 1st value of the feature map. This is shown below:


Similarly, for the 2nd block in the 1st row:


In the same way, the filter moves across the image with a stride of one, a value is calculated for each block, and the resulting feature map represents the edges of the image.
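The block-by-block procedure described above can be sketched in NumPy as follows. This is an illustrative sketch, not the code Keras runs internally: the function name `conv2d_valid`, the test image and the choice of a Sobel kernel are assumptions made for the example. Like deep learning libraries, it slides the filter without flipping it.

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` without padding and return the feature map."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow), dtype=float)
    for r in range(oh):
        for c in range(ow):
            top, left = r * stride, c * stride
            block = image[top:top + kh, left:left + kw]
            # Element-wise product of the block and the filter, then sum.
            out[r, c] = np.sum(block * kernel)
    return out

# Hypothetical 7x7 image: dark on the left, bright from column 4 onwards.
image = np.zeros((7, 7))
image[:, 4:] = 1.0

# Sobel kernel that responds to vertical edges.
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (5, 5)
print(feature_map[0])     # [0. 0. 4. 4. 0.]
```

The non-zero values in the feature map sit exactly where the filter window straddles the dark-to-bright boundary.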

The size of the feature map = size of the input image - size of the filter + 1. In the given example that is 7 - 3 + 1 = 5, so the feature map is 5x5.
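The size rule can be written as a small helper. The function name `feature_map_size` is hypothetical, and the `stride` argument is a generalization beyond the stride-one case used in this example.

```python
def feature_map_size(input_size, filter_size, stride=1):
    """Output width (or height) of a convolution without padding."""
    if filter_size > input_size:
        raise ValueError("filter must not be larger than the input")
    # With stride 1 this reduces to input_size - filter_size + 1.
    return (input_size - filter_size) // stride + 1

print(feature_map_size(7, 3))            # 5
print(feature_map_size(32, 3))           # 30
print(feature_map_size(7, 3, stride=2))  # 3
```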

However, this sometimes creates a problem: because the size is reduced, some features or information are lost. When that information is important, padding is applied to the input image.

After padding with one pixel on each side, the 7x7 matrix becomes 9x9, so the feature map size is 9 - 3 + 1 = 7, i.e. 7x7. The output is the same size as the original input, so no information is lost to shrinking.
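A minimal sketch of this padding step using NumPy's `np.pad`. Zero padding is assumed here because it is the most common choice, and the image values are placeholders.

```python
import numpy as np

# Placeholder 7x7 image with values 0..48.
image = np.arange(49).reshape(7, 7)

# Add one row/column of zeros on every side: 7x7 becomes 9x9.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

filter_size = 3
output_size = padded.shape[0] - filter_size + 1

print(padded.shape)  # (9, 9)
print(output_size)   # 7
```

In Keras, passing `padding='same'` to `Conv2D` applies this kind of padding for you, so with a stride of one the output keeps the input's height and width.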

Maxpooling Layer:

In max pooling, suppose the pooling window is 2x2. The maximum value of each 2x2 block of the input is taken, and these maxima form the output matrix of the max pooling layer. This output keeps the important features, i.e. the features that carry the most information about the input.

MaxPooling Layer
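Max pooling over non-overlapping blocks can be sketched in a few lines of NumPy. The function name `max_pool2d` and the 4x4 input values are made up for illustration. Rows and columns that do not fill a complete block are dropped, which is an assumption chosen to match the shapes in the Keras model further down (13x13 becomes 6x6).

```python
import numpy as np

def max_pool2d(x, size=2):
    """Take the maximum of each non-overlapping size x size block of `x`."""
    h, w = x.shape
    oh, ow = h // size, w // size
    # Drop leftover rows/columns that do not fill a complete block.
    x = x[:oh * size, :ow * size]
    # Split into blocks, then take the maximum inside each block.
    return x.reshape(oh, size, ow, size).max(axis=(1, 3))

# Hypothetical 4x4 feature map.
feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 0],
                        [1, 8, 3, 4]])

pooled = max_pool2d(feature_map, size=2)
print(pooled)
# [[6 4]
#  [8 9]]
```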

An advantage of filters and max pooling is that we get the important features from the input image while the size of the image is also reduced.

Now let’s see the code for filters and max pooling using Keras:

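from tensorflow.keras import layers, models

model = models.Sequential()
# 64 filters of size 3x3 applied to a 32x32 image with 3 channels
model.add(layers.Conv2D(64, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
# Flatten the feature maps into a vector for the fully connected layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
# Output layer: one unit per class (10 classes)
model.add(layers.Dense(10))

model.summary()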


In the above code, the 1st layer contains 64 filters of size 3x3 and the input shape is (32,32,3), which represents a 32x32 image with 3 channels. The output shape of layer 1 is (30,30,64), i.e. one 30x30 feature map per filter. The 2nd layer is a max pooling layer of size 2x2, so its output shape is (15,15,64). Similarly, the 3rd layer has 32 filters of size 3x3 and an output shape of (13,13,32), and the 4th layer has an output shape of (6,6,32). The 5th layer is another convolution with 32 filters of size 3x3 and an output shape of (4,4,32). The last 3 layers are the Flatten, Dense and Output layers, the Output layer having 10 classes.

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_3 (Conv2D)            (None, 30, 30, 64)        1792
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 15, 15, 64)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 13, 13, 32)        18464
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 6, 6, 32)          0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 4, 4, 32)          9248
_________________________________________________________________
flatten (Flatten)            (None, 512)               0
_________________________________________________________________
dense (Dense)                (None, 64)                32832
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650
=================================================================
Total params: 62,986
Trainable params: 62,986
Non-trainable params: 0
_________________________________________________________________
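The output shapes and parameter counts in the summary can be re-derived by hand. The sketch below does that arithmetic in plain Python, without TensorFlow. The helper names `conv_params` and `dense_params` and the `stack` list are hypothetical, and the code assumes 3x3 filters with no padding and 2x2 pooling, as in the model above.

```python
def conv_params(kernel_size, channels_in, filters):
    # Each filter has kernel_size * kernel_size * channels_in weights plus one bias.
    return (kernel_size * kernel_size * channels_in + 1) * filters

def dense_params(units_in, units_out):
    # One weight per input for every output unit, plus one bias per output unit.
    return (units_in + 1) * units_out

size, channels = 32, 3  # 32x32 input image with 3 channels
total = 0
shapes = []

# Convolution and pooling layers in the order they were added to the model.
stack = [("conv", 64), ("pool", None), ("conv", 32), ("pool", None), ("conv", 32)]
for kind, filters in stack:
    if kind == "conv":
        total += conv_params(3, channels, filters)
        size, channels = size - 3 + 1, filters  # 3x3 filter, no padding
    else:
        size = size // 2                        # 2x2 max pooling
    shapes.append((size, size, channels))

flat = size * size * channels                   # Flatten layer
total += dense_params(flat, 64) + dense_params(64, 10)

print(shapes)  # [(30, 30, 64), (15, 15, 64), (13, 13, 32), (6, 6, 32), (4, 4, 32)]
print(flat)    # 512
print(total)   # 62986
```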