Machine Learning Evaluation
In this study the classifier worked as desired; however, cleaning the data was a bigger
task than feeding it into the classifier and producing the output. I am happy with the
results: the accuracy calculated from the confusion matrix was 95.6% and the precision
was 98.3%.
\[ \text{Accuracy} = \frac{TP + TN}{\text{Total}} = \frac{235 + 4}{250} = 0.956 \]
where TP is the number of true positives and TN is the number of true negatives. From the
above equation it can be seen that the accuracy obtained was 95.6%, so the classifier
labelled about 95.6% of the images correctly.
\[ \text{Precision} = \frac{TP}{\text{predicted yes}} = \frac{235}{239} = 0.983 \]
From the above equation it is evident that the precision is high: when the classifier
predicts a yes, it is correct 98.3% of the time. Table 1 below provides more details on the same.
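The accuracy and precision figures above can be reproduced from the confusion-matrix counts with a few lines of Python. This is only a minimal sketch: the function and variable names are illustrative, and the false-positive and false-negative counts are not reported in the text but derived here on the assumption that the four cells of the matrix sum to the total of 250.

```python
# Counts taken from the accuracy and precision equations in the text.
TP = 235             # true positives
TN = 4               # true negatives
PREDICTED_YES = 239  # all images the classifier labelled "yes"
TOTAL = 250          # all images evaluated

# Derived values (assumption: TP + TN + FP + FN == TOTAL).
FP = PREDICTED_YES - TP    # false positives
FN = TOTAL - TP - TN - FP  # false negatives


def accuracy(tp, tn, total):
    """Share of all images that were labelled correctly."""
    return (tp + tn) / total


def precision(tp, predicted_yes):
    """Share of 'yes' predictions that were actually 'yes'."""
    return tp / predicted_yes


acc = accuracy(TP, TN, TOTAL)
prec = precision(TP, PREDICTED_YES)
print(f"accuracy  = {acc:.3f}")
print(f"precision = {prec:.3f}")
```

Rounded to three decimals, the two values agree with the 0.956 and 0.983 obtained by hand.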
Describe which images your model classifies well and which it classifies badly.
Almost all the images collected from the website were clear, and the model was able
to classify them. However, the images taken while zooming in and out needed some attention.
Only the images of pavement with distresses were taken zoomed in, to capture the
distresses. The images the classifier was not able to classify were those with a slight
reduction or increase in the zoomed area. Examples of misclassified images for the
different distresses are shown below.
Explain why you think it performed well or badly on the images you described in the last
part?
The images of the normal pavement were taken along the length of the pavement.
They show single-lane and two-lane roads, ongoing construction, pedestrians walking,
moving vehicles, footpaths, and road markings. In this study, images of two different
types of pavement were captured: earthen and paved. The classifier failed to detect the
earthen pavement and the road under construction, which had the shortest length, because
the roads used to train the model were all of greater length.
In the second part, images of pavement cracks were collected. The cracks were
photographed zoomed in to capture the distresses. The classifier failed on cracks
captured near road markings, pedestrians, or moving vehicles: these features also appear
in the normal pavement pictures, so they misled the trained model.
The third part deals with the images of potholes (depressions in the pavement).
The potholes were photographed zoomed in. Here, the trained model was not able to
classify some images because it had been trained on potholes with a clear depth of
depression. The images with flat depressions covering a larger area were the ones that
were not classified correctly.
Were there problems with the classifier that you were able to solve? Describe your
strategy for solving the problem
The dataset worked well with the classifier; as such, there were no problems to solve. The
classifier was trained with 70% of the training data to attain the reported accuracy. A
larger dataset would be required for higher precision.
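A 70% training split of the kind mentioned above could be produced along the following lines. This is a sketch under stated assumptions, not the procedure used in the study: the text does not say how the split was made or whether the remaining 30% was held out for evaluation, and the file names and the `split_train_test` helper are hypothetical.

```python
import random


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and split it into training and held-out parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the collected pavement images.
image_names = [f"image_{i:03d}.jpg" for i in range(100)]
train, test = split_train_test(image_names)
print(len(train), len(test))
```

Shuffling before the cut keeps the normal, crack, and pothole images mixed across both parts, provided they were not already stored in random order.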