Classification of diseased corn leaves

Dataset:
The PlantVillage dataset consists of images of leaves from various
plants. Only the images of tomato leaves are used for this experiment.
The dataset contains 9 kinds of diseased tomato leaves, each treated as
its own class, along with healthy leaf images. The model therefore
classifies a given image into one of 10 categories:
1. Healthy
2. 2
3. 3
4. 4
5. 5
6. 6
7. 7
8. 8
9. 9
10. 10

Data Preparation:
The dataset is split into training, validation, and test sets in the
ratio 0.7:0.2:0.1.
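
The split can be sketched as follows. This is a minimal illustration, not the original preparation script: the function name `split_dataset`, the fixed seed, the placeholder file names, and the plain (non-stratified) shuffle are all assumptions, since the report only states the ratio.

```python
import random

def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle items and cut them into train/validation/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed so the split is repeatable
    n_train = round(ratios[0] * len(items))
    n_val = round(ratios[1] * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# Hypothetical file names standing in for the real image paths.
paths = [f"leaf_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_dataset(paths)
print(len(train), len(val), len(test))
```

If some classes have far fewer images than others, splitting each class separately (a stratified split) would keep the class proportions the same in all three sets.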
Architecture Choice:
A CNN architecture was used to classify the images.
● CNNs are the most commonly used architecture for image data.
● CNNs can learn to extract color gradient details from the images.
As the classification is mainly based on color, CNNs are a good
choice.
● The images are of size (256, 256, 3). CNNs handle inputs of this
size efficiently, as the convolution layers extract features and also
reduce the spatial size before passing the data to the fully connected
(ANN) classifier.
● The exact architecture of each model is shown in the next section.
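
To illustrate how a CNN reduces a 256 x 256 input before the fully connected classifier, the sketch below applies the standard output-size formula for convolution and pooling windows. The layer settings are assumptions made only for illustration (3x3 convolutions with padding 1, each followed by 2x2 max pooling, six stages); the report does not state the kernel sizes or pooling scheme that were actually used.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 256  # input height/width from the report
sizes = []
for stage in range(6):  # assumed: one pooling step per convolution layer
    size = conv_output_size(size, kernel=3, padding=1)  # 3x3 conv keeps the size
    size = conv_output_size(size, kernel=2, stride=2)   # 2x2 pooling halves it
    sizes.append(size)

print(sizes)
```

Under these assumptions the feature map shrinks by half at every stage, so the classifier receives a far smaller spatial grid than the raw image.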

Loss Function, Optimizer, Metrics:


● Loss Function: Cross entropy loss.

● Optimizer: Adam, with a learning rate of 0.01 for all models.


● Metrics: Accuracy, confusion matrix, ROC curve.
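
As a rough sketch of what the loss and two of these metrics compute, the plain-Python helpers below implement softmax cross entropy for one sample, a confusion matrix, and accuracy. The function names and the toy labels are made up for illustration; the original experiments presumably relied on a deep learning framework's built-in versions.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[label])

def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy labels for 3 classes (not taken from the report).
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
loss = cross_entropy([2.0, 0.5, -1.0], label=0)
print(cm, acc, loss)
```

Off-diagonal entries of the matrix show which pairs of classes the model confuses, which is the information the per-model confusion matrices in the next section convey.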

Models:

Model 1 - Baseline model:


1. 6 convolution layers
2. 2 fully connected layers
3. Accuracy on test data: 90.25%.
4. Confusion Matrix:
This is the baseline model, with the fewest layers. Its confusion
matrix shows considerable confusion among classes. A model with more
layers might fit the data better.

Model 2 - More layers than the baseline model:


1. 9 convolution layers
2. 4 fully connected layers
3. Accuracy on test data: 88.87%
4. Confusion Matrix:
As the model depth increased, the weight gradients for the initial
layers likely shrank toward zero (the vanishing gradient problem), so
those layers learned very little. As a result, this model performs
worse than the baseline. This can be addressed by using skip connections.
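
The effect of depth on gradients can be illustrated with a toy chain of scalar sigmoid layers, where the gradient with respect to the input is a product of one factor per layer. This is only an illustration under stated assumptions: the sigmoid activation, the unit weight, and the scalar layers are made up for the example, and the depths 8 and 13 simply mirror the total layer counts of Models 1 and 2. The report does not specify the activations or report measured gradients.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def input_gradient(depth, weight=1.0, x=0.5):
    """d(output)/d(input) for a chain of `depth` scalar sigmoid layers."""
    grad = 1.0
    for _ in range(depth):
        x = sigmoid(weight * x)
        # Chain rule: each layer contributes weight * sigmoid'(z),
        # and sigmoid'(z) = s * (1 - s) is never larger than 0.25.
        grad *= weight * x * (1.0 - x)
    return grad

shallow = input_gradient(depth=8)   # 6 convolution + 2 fully connected layers
deep = input_gradient(depth=13)     # 9 convolution + 4 fully connected layers
print(shallow, deep)
```

Because every factor is at most 0.25 in this toy setting, each extra layer shrinks the gradient reaching the first layer by at least a factor of four.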

Model 3 - Adding skip connections:


1. 12 convolution layers
2. 2 fully connected layers
3. Skip (residual) connections every 4 convolution layers
4. Accuracy on test data: 97%
5. Confusion Matrix:

Adding skip connections makes the model perform very well.
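
One way to see why skip connections help: a residual block computes x + f(x), so its derivative is 1 + f'(x), and the identity term keeps the gradient from collapsing even when f'(x) is tiny. The toy sketch below compares a plain 12-layer chain with 3 residual blocks of 4 layers each, matching the layer counts listed for this model. The scalar sigmoid layers, unit weight, and function names are assumptions for illustration only, not the architecture's actual layers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def branch(x, weight, layers):
    """Run x through `layers` scalar sigmoid layers.

    Returns the output and d(output)/d(x) for the whole branch.
    """
    grad = 1.0
    for _ in range(layers):
        x = sigmoid(weight * x)
        grad *= weight * x * (1.0 - x)
    return x, grad

def plain_gradient(depth=12, weight=1.0, x=0.5):
    """Input gradient of a plain chain with no skip connections."""
    _, grad = branch(x, weight, depth)
    return grad

def residual_gradient(blocks=3, layers_per_block=4, weight=1.0, x=0.5):
    """Input gradient when each block outputs x + branch(x)."""
    total = 1.0
    for _ in range(blocks):
        out, grad = branch(x, weight, layers_per_block)
        total *= 1.0 + grad  # the 1.0 comes from the identity (skip) path
        x = x + out
    return total

print(plain_gradient(), residual_gradient())
```

In this toy setting the plain chain's gradient is vanishingly small, while the residual version stays at or above 1 because every block passes the gradient through its identity path.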


Only images with very mild disease symptoms were misclassified by this
model. The model also performs very well compared to the similar model
used for the corn dataset. This is likely because the tomato dataset
has more images per label (for some labels).