PES1PG21CA154

PES UNIVERSITY
(Established under Karnataka Act No. 16 of 2013)

100-ft Ring Road, Bengaluru – 560 085, Karnataka, India
3rd Semester Project Report

on
CROP YIELD PREDICTION USING

MACHINE LEARNING
Submitted in partial fulfillment of

the requirements for the award of the degree of
Master of Computer Applications
Submitted by
NARASIMHALU R
(PES1PG21CA154)
September 2022 - January 2023

Under the guidance of
Ms. SAMYUKTA D KUMTA
Asst. Professor
Department of Computer Applications

PES University, Bangalore 560085
PES UNIVERSITY
Faculty of Engineering
Department Of Computer Applications
Certificate
This is to certify that the project entitled
CROP YIELD PREDICTION USING MACHINE LEARNING
is a bonafide work carried out by
NARASIMHALU R
(PES1PG21CA154)
in partial fulfillment for the completion of 3rd semester project work in the Program of
Study MCA under the rules and regulations of PES University, Bengaluru during the
period January 2022 - May 2022. The project report has been approved as it satisfies
the 3rd semester academic requirements.
Ms. SAMYUKTA D KUMTA
Asst. Professor, Dept. of Computer Applications
Date :

Dr. Veena S
Chairperson, Dept. of Computer Applications
Date :

Dr. B K Keshavan
Dean, Faculty of Engineering and Technology
Date :

Declaration
I, NARASIMHALU R, hereby declare that the project entitled CROP YIELD PREDICTION
USING MACHINE LEARNING is an original work done by me under the guidance of
Ms. SAMYUKTA D KUMTA (PES University) and is being submitted in partial fulfillment of the
requirements for completion of 3rd Semester course work in the Program of Study MCA.
All corrections/suggestions indicated for internal assessment have been incorporated in the report.
The plagiarism check has been done for the report and is below the given threshold.
I further declare that the work reported in this project has not been submitted and will not be
submitted, either in part or in full, for the award of any other course.
Place: Bengaluru
Date : January 12, 2023
NARASIMHALU R
PES1PG21CA154
Acknowledgment
I take great pleasure in expressing my sincere gratitude to all those who have guided and supported
me in successfully completing this project.
I would like to express my sincere gratitude to the Vice Chancellor of PES University, Dr. J Suryaprasad,
and Chairperson Dr. Veena S, who gave me the opportunity to go ahead with this project.
I am grateful to my guide, Ms. SAMYUKTA D KUMTA, Asst. Professor, Department of Computer
Applications, who has been my source of inspiration and provided me with guidance, encouragement,
and support during the course of the project.
NARASIMHALU R
PES1PG21CA154
Abstract
Crop yield prediction uses statistical models and algorithms to estimate the quantity of a crop that
will be produced in a particular year. It can draw on a variety of data sources, including weather,
soil, and historical yield data. Crop yield forecasting aims to support farmers, agricultural businesses,
and governments in making well-informed choices on crop management and output.
Machine learning can be used to forecast agricultural output in several ways. One strategy is supervised
learning, in which a model is trained on a labelled collection of historical yield data. The model can
then forecast future crop yields from input information such as weather and soil conditions. An
alternative strategy is unsupervised learning, which finds patterns in the data without being trained
on labelled examples. Reliable forecasts require a considerable amount of high-quality data covering
the range of factors that can affect crop output, including temperature, precipitation, soil type and
quality, and the presence of pests and diseases. After analysing this data, machine learning algorithms
can be used to forecast future crop yields. Overall, machine learning can help farmers, agricultural
businesses, and governments optimise crop management and production, boosting efficiency and
decreasing waste. By helping to ensure there is enough food to meet demand, it can also promote
food security.
Contents
1 Introduction
1.1 Project Description
1.1.1 Problem Statement
1.1.2 Proposed Solution
1.1.3 Purpose
1.1.4 Scope and Limitations
2 Literature Survey
2.1 Domain Survey
2.2 Existing Systems
2.2.1 Comparative study of Existing Systems
2.3 Tools and Technologies
2.3.1 Tools
2.3.2 Technologies
2.4 Feasibility Study
3 Hardware and Software Requirements
3.1 Hardware Specification
3.2 Software Specification
4 Software Requirements Specification
4.1 Users
4.2 Functional Requirements
4.3 Non-Functional Requirements
5 System Design
5.1 Architecture Diagram
5.2 Data Flow Diagram
5.2.1 Context Diagram
5.3 Process Flow Diagram
6 Detailed Design
6.1 Use Case Diagram
6.2 Sequence Diagram
6.3 Class Diagram
7 Implementation
7.1 Screenshots
8 Testing
8.1 Frontend
8.2 Backend
9 Conclusion
10 Future Work
Appendix A: References
Appendix B: User Manual
Appendix C: Plagiarism Report
Appendix D: Paper
Appendix E: Poster
List of Tables
3.1 Hardware Specification
3.2 Software Specification
CROP YIELD PREDICTION USING MACHINE LEARNING Sept 22 - Jan 23
List of Figures
5.1 Architecture Diagram
5.2 Context Diagram
5.3 Process Flow Diagram
6.1 Use Case Diagram
6.2 Sequence Diagram
6.3 Class Diagram
7.1 Screenshot 01
7.2 Screenshot 02
7.3 Screenshot 03
7.4 Screenshot 04
7.5 Screenshot 05
7.6 Screenshot 06
7.7 Screenshot 07
7.8 Screenshot 08
7.9 Screenshot 09
7.10 Screenshot 10
7.11 Screenshot 11
7.12 Screenshot 12
7.13 Screenshot 13
7.14 Screenshot 14
7.15 Screenshot 15
7.16 Screenshot 16
7.17 Screenshot 17
7.18 Screenshot 18
7.19 Screenshot 19
7.20 Screenshot 20
7.21 Screenshot 21
7.22 Screenshot 22
7.23 Screenshot 23
7.24 Screenshot 24
7.25 Screenshot 25
8.1 Test Case
8.2 Test Case
Department of Computer Applications - PES University 1

Chapter 1
Introduction
1.1 Project Description

1.1.1 Problem Statement
The agriculture industry faces serious challenges that raise both economic and environmental costs,
and farmers are under growing pressure to adopt intensive farming methods alongside sustainable
agricultural practices. At present, farmers estimate the demand for grains and vegetables manually,
and errors in these estimates affect them economically. To address this issue, we propose a machine
learning procedure that predicts demand and yield automatically.
1.1.2 Proposed Solution

We propose a machine learning approach that automatically predicts demand and yield. It is intended
to replace the manual estimates of demand for grains and vegetables that farmers make today, which
have an economic impact, and to ease the pressure that rising economic and environmental costs place
on farmers to adopt intensive farming techniques and sustainable agricultural practices.
1.1.3 Purpose
Crop yield forecasting is meant to assist farmers, agricultural businesses, and governments in making
well-informed choices on crop management and output. Choosing what crops to plant, when to plant
them, and how to maintain them to increase production are all examples of this.
Crop yield forecasting may assist farmers in maximising their agricultural methods and making the
most of their available resources and land. To determine which crops are most likely to be productive in
a particular year, for instance, and to modify their planting and management techniques appropriately,
farmers might utilise yield prediction. By doing so, waste may be decreased and crop production
efficiency can be improved. Crop yield forecasting is not only useful for farmers, but it may also be
used by agricultural firms and governments to plan for future food requirements and make sure there is
enough food supply to fulfil demand. This might contribute to increased food stability and security in
both wealthy and developing nations. Overall, crop yield prediction is a crucial instrument for increasing
agricultural production’s efficiency and effectiveness and for ensuring that there is enough food to fulfil
the demands of the world’s expanding population.

1.1.4 Scope and Limitations

Crop yield forecasting is meant to assist farmers, agricultural businesses, and governments in making
well-informed choices on crop management and output. Choosing what crops to plant, when to plant
them, and how to maintain them to increase production are all examples of this.
Crop yield forecasting may assist farmers in maximising their agricultural methods and making the
most of their available resources and land. To determine which crops are most likely to be productive in
a particular year, for instance, and to modify their planting and management techniques appropriately,
farmers might utilise yield prediction. By doing so, waste may be decreased and crop production
efficiency can be improved. Crop yield forecasting is not only useful for farmers, but it may also be
used by agricultural firms and governments to plan for future food requirements and make sure there is
enough food supply to fulfil demand. This might contribute to increased food stability and security in
both wealthy and developing nations.
Overall, crop yield prediction is a crucial instrument for increasing agricultural production’s efficiency
and effectiveness and for ensuring that there is enough food to fulfil the demands of the world’s expanding
population.

Chapter 2
Literature Survey
2.1 Domain Survey

Machine Learning :
Machine learning is a branch of computer science and artificial intelligence (AI) that focuses on
using data and algorithms to imitate the way humans learn, gradually improving in accuracy.
Machine learning algorithms can improve with experience. Conventional algorithm-based computer
processes follow predefined steps and leave no room for error: commands are written to produce a
fixed outcome for a given input. Machine learning differs in that the computer makes decisions based
on the sample data available to it. As a result, computers may make mistakes when making choices
in some situations, just as people do.
Supervised Learning, Unsupervised Learning, and Reinforcement Learning are the three categories
of machine learning.
Supervised Learning
In supervised learning, machines are trained on "labelled" training data and then taught to predict
the result. "Labelled data" is data that has already been tagged with the desired result.
Unsupervised Learning
Unsupervised learning is a method of machine learning that employs training data sets without
actively supervising the models. Instead, models explore the available data to find undiscovered patterns
and insights. It is similar to learning that takes place in the human brain while acquiring new knowledge.
Reinforcement Learning
Reinforcement learning is one of the branches of machine learning. It entails taking actions that
maximise the reward in a given situation, and many software programs and machines use it to decide
what action to take in a certain situation. In supervised learning, the training data provides an
answer key, so the model is trained on answers that are already known. In reinforcement learning,
by contrast, there is no such answer key: the reinforcement agent decides how to complete the task
at hand and, in the absence of a training data set, must learn from its own experience and mistakes.
2.2 Existing Systems

The amount of crops that will be produced in a particular year is predicted by a variety of existing
methods for agricultural yield prediction using machine learning algorithms and statistical models.
Farmers, agricultural businesses, and governments may utilise these tools to make well-informed choices
about crop management and productivity.

The USDA’s Crop Yield Forecast System (CYFS), which forecasts crop yields for important American
crops using a combination of machine learning algorithms and meteorological data, is one example of
a crop yield prediction system. The CYFS offers projections for a variety of crops, including maize,
soybeans, wheat, and cotton, and it is updated often. Another example is the Agricultural Model
Intercomparison and Improvement Project (AgMIP), a global research initiative that utilises machine
learning algorithms and weather data to estimate yields for a variety of crops throughout the world.
In order to better understand how a variety of factors, such as weather,
soil conditions, and management techniques, affect crop yields, AgMIP comprises a number of regional
and global crop yield prediction models. A variety of smaller-scale agricultural yield prediction systems
that are tailored for certain locations or crops exist in addition to these large-scale ones. These systems
may provide forecasts about crop yields in particular places or for certain crops using a range of data
sources and machine learning methods.
Overall, there are a variety of crop yield prediction systems that are now in use that employ statistical
models and machine learning algorithms to assist farmers, agricultural businesses, and governments in
making well-informed decisions on crop production and management. These systems may be a useful
tool for increasing agricultural production efficiency and effectiveness as well as for assisting in ensuring
that there is sufficient food to fulfil the demands of the expanding world population.
2.2.1 Comparative study of Existing Systems

As mentioned above, many systems for crop yield prediction using machine learning already exist.
Compared with these existing systems, the proposed system achieves higher accuracy, which is its
main achievement. The algorithms and data sets used by the existing systems and the proposed
system differ, and the proposed system reaches 99 percent accuracy, which the existing systems
did not achieve.

2.3 Tools and Technologies

2.3.1 Tools
System Processor : i3 4th GEN and Above.
Hard disk : 500GB
RAM : 4GB and Above.
2.3.2 Technologies
Operating System : Windows 8/10 (64 bit OS)
Programming Language : Python 3.9.12
Framework : Anaconda 4.12.0
Libraries : Keras, TensorFlow
IDE : Jupyter Notebook
Frontend : Flask 1.1.2
2.4 Feasibility Study

Technical Feasibility
It is technically possible to estimate agricultural yield using machine learning. On the basis of input
data like state name, season name, crop name, size of the area and rainfall conditions, machine learning
algorithms and statistical models may be trained on past data to produce predictions about future crop
yields.
Crop yield prediction may be done using a variety of machine learning methods, including supervised
learning methods like decision trees, random forests, and neural networks, as well as unsupervised
learning methods like clustering and dimensionality reduction. These algorithms may be used to forecast
future crop yields based on input variables like state name, season name, crop name, size of the area
and rainfall conditions and can be trained on labelled data sets of past crop yield data. Machine
learning requires a substantial amount of high-quality data in order to create correct predictions. After
analysing this data, machine learning algorithms may be used to forecast future crop yields. Overall,
there are many algorithms and methodologies that may be utilised to create precise predictions based
on a variety of data sources, making agricultural production prediction using machine learning very
technically feasible. It’s crucial to understand that these projections have limitations in terms of their
precision and dependability, and that there is always some degree of uncertainty.
Economic Feasibility
The agriculture sector and farmers may benefit greatly from applying machine learning to predict crop
harvests. Making educated decisions about planting and harvesting with the aid of accurate yield
projections may increase productivity and profitability for farmers.
The cost of creating a machine learning model for predicting crop yields varies depending on a
number of variables, including the amount of data available, the resources needed for data collection and
pre-processing, the complexity of the model, and the price of the necessary hardware and software. For
this model, the data was gathered from Kaggle free of cost, no external resources were used for data
collection and pre-processing, and the model was developed in Jupyter Notebook, which is free,
open-source software.
Operational Feasibility
Operational feasibility refers to the practicality and ease of implementing a project. When considering
the operational feasibility of a crop yield prediction system using machine learning, you should consider
the following factors:
Data availability: Developing this model requires data, so the data set was collected from Kaggle.
It contains over 70 thousand records, which is sufficient for both training and testing the model.
Hardware and software requirements: Building this model requires both software and hardware.
The software used is Anaconda Navigator (Jupyter Notebook). The hardware required is a PC with an
i3 4th-generation processor or above, a hard disk with at least 500 GB of space, and 8 GB of RAM
or above.

Chapter 3
Hardware and Software Requirements
3.1 Hardware Specification

Hardware Specification
Specification      Desired Value
Processor          Intel i3 7th Gen
Memory (RAM)       8 GB
Hard Disk          500 GB
Table 3.1: Hardware Specification
3.2 Software Specification

Software Specification
Specification      Desired Value
Operating System   Windows 10
Software           Anaconda Navigator
IDE                Jupyter Notebook
Framework          Flask
Software Type      Application Software
Table 3.2: Software Specification


Chapter 4
Software Requirements Specification
4.1 Users
• Farmers
Predictive models may help farmers optimise their inputs (such as water, fertiliser,
and pesticides) and make better-informed decisions about when to sow and harvest their
crops. As a result, yields may rise and costs may fall.
• Agri-Businesses
With the use of yield prediction models, businesses that create or market agricultural inputs (such
as seeds and fertiliser) may better understand market demand and optimise their production and
distribution.
• Government and Policy Makers

Decisions about food security and agricultural policy can be informed by accurate yield forecasts.
• Researchers and Academics

For researchers and academics interested in comprehending and improving agricultural systems,
yield prediction can be a crucial topic of study.
• Investors
For investors in the agriculture industry, accurate yield forecasts can be helpful since they can
guide investment choices.

4.2 Functional Requirements

FR1. Data Gathering
Machine learning requires two elements to function: data and models. The data
collected must have enough populated features for the learning model to be trained
properly, and enough rows, since, generally speaking, more data gives better results.
The data for this project was collected from Kaggle, an online community platform
that provides data sets and lets data scientists compete to solve data science
challenges. The data set consists of more than 70,000 records with values such as
state name, season name, crop name, size of the area, and rainfall for that
particular season.
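Before training, it can help to confirm that the collected rows really do have "enough filled features". The following is a minimal sketch using only the Python standard library. The sample rows, the column names, and the Production column are hypothetical, since the exact layout of the Kaggle file is not reproduced in this report.

```python
import csv
import io

# Hypothetical sample in the shape described above; the real Kaggle file
# and its exact column names are assumptions made for illustration.
SAMPLE_CSV = """State_Name,Season,Crop,Area,Rainfall,Production
Karnataka,Kharif,Rice,1200,850.5,3100
Karnataka,Rabi,Wheat,300,,450
Kerala,Kharif,Rice,,2100.0,900
"""


def count_missing(csv_text):
    """Return (row_count, {column: number_of_empty_cells})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name, value in row.items():
            # An empty or whitespace-only cell counts as a missing value.
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, missing


rows, missing = count_missing(SAMPLE_CSV)
print(rows, missing)
```

Columns with many empty cells would be candidates for cleaning or removal before the data is passed to the preprocessing step.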
FR2. Data Preprocessing

The preprocessing performed on the data set is label encoding. Label encoding
transforms categorical data, which may be expressed as text, into numerical form.
This is frequently essential because many machine learning methods require
numerical input data.
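Label encoding as described above can be sketched in a few lines of plain Python. This is an illustrative version only. The report does not show which encoder the project actually used, and the season names below are example values.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {category: index for index, category in enumerate(sorted(set(values)))}


def transform(values, mapping):
    """Replace each category with its integer code."""
    return [mapping[value] for value in values]


# Example categorical column (hypothetical values).
seasons = ["Kharif", "Rabi", "Kharif", "Summer"]
season_codes = fit_label_encoder(seasons)
encoded = transform(seasons, season_codes)

print(season_codes)  # {'Kharif': 0, 'Rabi': 1, 'Summer': 2}
print(encoded)       # [0, 1, 0, 2]
```

The same mapping must be reused when encoding new user input at prediction time, otherwise the codes seen by the model would not match the codes it was trained on.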
FR3. Feature Selection

Feature selection is the process of choosing a subset of relevant features to
include in a machine learning model. When choosing features for a crop yield
prediction model, it is crucial to consider which factors are most relevant and
have the best predictive ability. The parameters considered in this machine
learning model are state name, season name, crop name, size of the area, and
rainfall.
FR4. Model Construction and Model Training

Model Construction : Building a crop production forecast system using machine
learning involves several key steps, including model construction and model training.
Model construction is the process of designing and implementing a machine learning
model. It usually involves choosing a model type and the right methods and strategies
for the task at hand. The algorithms and techniques used to develop this model are
the K-Neighbors classifier and K-Means clustering.
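To make the K-Neighbors idea concrete, the sketch below classifies a new row by majority vote among its k nearest training rows. It is a pure-Python illustration, not the project's implementation. The feature layout, the tiny training set, and the "low"/"high" labels are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical encoded rows: [season_code, area, rainfall].
train_X = [[0, 100, 800], [0, 120, 820], [1, 400, 300], [1, 380, 310]]
train_y = ["low", "low", "high", "high"]

print(knn_predict(train_X, train_y, [1, 390, 305]))
```

Because the vote depends on raw distances, features on large scales (such as area) dominate features on small scales (such as an encoded season), so scaling the features first is usually worth considering.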
Model Training : Model fitting to training data is referred to as model training.
The model adjusts its internal parameters to minimise the difference between its
predictions and the real labels when given a set of input data and the associated
accurate output (also known as labels). Until the model achieves an acceptable
degree of accuracy, as measured by an assessment metric like mean squared error or
mean absolute error, this procedure is repeated.
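The two assessment metrics named above can be computed directly from the actual and predicted values. The sketch below uses made-up numbers purely to show the arithmetic. They are not results from this project.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences; penalises large errors more heavily."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of absolute differences, in the same unit as the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

print(mean_squared_error(actual, predicted))   # 1.3125
print(mean_absolute_error(actual, predicted))  # 0.875
```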
FR5. Model Validation and Result Analysis

Model Validation : Once trained, the model may be used to generate predictions
based on fresh data. To determine whether the model can generalise to previously
unseen data, its accuracy can be assessed using a separate test data set. Additional
steps, such as hyperparameter tuning or feature engineering, may be required if
the model's performance is unsatisfactory.
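Holding out a separate test set can be sketched as follows. This is a simplified standard-library version with an assumed 80/20 split and a fixed seed. The report does not state the split ratio that was actually used.

```python
import random


def train_test_split(rows, labels, test_fraction=0.2, seed=42):
    """Shuffle indices reproducibly and split rows and labels into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    test_size = max(1, int(len(rows) * test_fraction))
    test_idx, train_idx = indices[:test_size], indices[test_size:]

    def pick(sequence, chosen):
        return [sequence[i] for i in chosen]

    return (pick(rows, train_idx), pick(rows, test_idx),
            pick(labels, train_idx), pick(labels, test_idx))


def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Toy data: ten single-feature rows with alternating labels.
rows = [[i] for i in range(10)]
labels = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = train_test_split(rows, labels)

print(len(X_train), len(X_test))  # 8 2
```

The model is fitted on `X_train` and `y_train` only, and `accuracy` is then computed on predictions for `X_test`, which the model has not seen during training.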
Result Analysis : Analysing the output of a crop production prediction model
created using machine learning involves several steps. First, calculate
performance measures such as accuracy, precision, recall, and F1 score to
assess the model's performance. Alternative metrics such as mean absolute error
(MAE) or root mean square error (RMSE) may also be used. Second, compare the
performance of the different models that have been constructed to discover
which one performs better. This helps in determining the advantages
and disadvantages of each model.
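Precision, recall, and F1 score for one class can be derived from counts of true positives, false positives, and false negatives. The sketch below uses invented labels to illustrate the calculation. The "high"/"low" classes are an assumption, not the project's actual label set.

```python
def precision_recall_f1(actual, predicted, positive):
    """Compute precision, recall, and F1 for the class `positive`."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative labels only.
actual = ["high", "high", "low", "low", "high"]
predicted = ["high", "low", "low", "high", "high"]

print(precision_recall_f1(actual, predicted, positive="high"))
```

Here two "high" rows are predicted correctly, one "low" row is wrongly predicted as "high", and one "high" row is missed, so precision and recall both equal 2/3.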
FR6. Deploying a Model Using Flask

Making a machine learning model available for use by others is known as deployment.
This often entails placing the model on a server or cloud platform and developing
an interface (for example, a web API or application) that allows users to access the
model and obtain forecasts or suggestions.
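The interface layer mainly has to validate the submitted parameters, call the model, and return the result. The sketch below shows that logic as a framework-independent function that a Flask route could call. The field names, the JSON response shape, and `dummy_predict` are hypothetical stand-ins, since the report's Flask code is not reproduced here.

```python
import json

# Assumed form field names for the five inputs described in this report.
REQUIRED_FIELDS = ("state", "season", "crop", "area", "rainfall")


def handle_prediction(form, predict):
    """Validate submitted fields, call the model, and return (status_code, JSON body)."""
    missing = [name for name in REQUIRED_FIELDS if not str(form.get(name, "")).strip()]
    if missing:
        return 400, json.dumps({"error": "missing fields", "fields": missing})
    try:
        area = float(form["area"])
        rainfall = float(form["rainfall"])
    except (TypeError, ValueError):
        return 400, json.dumps({"error": "area and rainfall must be numbers"})
    result = predict(form["state"], form["season"], form["crop"], area, rainfall)
    return 200, json.dumps({"prediction": result})


def dummy_predict(state, season, crop, area, rainfall):
    """Stand-in for the trained model; not the report's actual model."""
    return "high" if rainfall > 500 else "low"


form = {"state": "Karnataka", "season": "Kharif", "crop": "Rice",
        "area": "120", "rainfall": "850"}
print(handle_prediction(form, dummy_predict))
```

Keeping the validation separate from the web framework makes it possible to test the prediction path without starting a server.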

4.3 Non-Functional Requirements

NFR1. Performance
The model is able to process a high number of queries quickly and generate
predictions in a timely way.
NFR2. Scalability
The model is able to handle an increase in the volume of requests without a
significant drop in performance.
NFR3. Reliability
The model is reliable and available to users around the clock (24/7), with minimal
downtime or errors.
NFR4. Security
The model is secure and protected against unauthorized access or misuse. This may
include measures such as authentication and secure data handling. The application
includes a register and login page, which helps to protect against unauthorized
access.
NFR5. Usability
Usability is the capability of the model to be understood, learned, and used by the
user when used under specified conditions. This model provides a user-friendly
environment. Some aspects of functionality, reliability, and efficiency also affect
usability.
NFR6. Maintainability
Maintainability is the capability of the software to be modified. The proposed model
is easy to modify, whether for corrections, improvements, or adaptation to changes
in environment, requirements, and functional specifications.
NFR7. Portability
Portability is the capability of the model to be transferred from one environment to
another. This model can be transferred from one platform to another, across
organizational, hardware, or software environments. After the model is transferred,
its efficiency remains the same.

Chapter 5
System Design
5.1 Architecture Diagram
Figure 5.1: Architecture Diagram
The architecture diagram above has three main phases: the user interface phase, the model building
phase, and the model evaluation phase. In the user interface, the user can register or log in, enter
the parameters for the prediction, and view the result. In the model building phase, preprocessing,
data splitting, and model building are carried out, producing a model for testing. In the model
evaluation phase, the test data set is given to the trained model to check its accuracy; a model is
selected based on its performance metrics, and the selected model is saved as a pickle file. Later,
the user can give new crop yield data to the saved model, which returns the result to the user.
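Selecting a model by its metric and saving it as a pickle file can be sketched as below. The accuracy values are invented for illustration and are not results from this project. The model object is a plain dictionary standing in for a trained model, and the file name `model.pkl` is an assumption.

```python
import os
import pickle
import tempfile


def select_best(scores):
    """Return the name of the candidate with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical accuracies for two candidates (illustration only).
scores = {"k_neighbors": 0.91, "ridge": 0.84}
best_name = select_best(scores)

# Stand-in for a trained model: any picklable Python object works the same way.
model = {"name": best_name, "k": 3}

# Save the selected model to a pickle file, then load it back as the
# web application would do at start-up.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as handle:
    pickle.dump(model, handle)
with open(path, "rb") as handle:
    restored = pickle.load(handle)

print(best_name, restored == model)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.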

5.2 Data Flow Diagram

5.2.1 Context Diagram
Figure 5.2: Context Diagram
The context diagram above consists of external entities, a process, and data flows. Model and
Users are the external entities, and Crop Yield Prediction is the process, that is, the machine
learning algorithm. The Model provides the data set to the algorithm, and the algorithm returns the
crop yield result and the saved model details to the Model. Later, the user provides the input
parameters, i.e., the crop yield details (state name, season name, crop name, size of the area, and
average rainfall), to the algorithm, which predicts the result and returns it to the user.

5.3 Process Flow Diagram
Figure 5.3: Process Flow Diagram
In the above Process Flow Diagram, the collected crop yield data set is analysed in the data analysis
phase. The analysed data set then goes through feature engineering: label encoding converts the
categorical data into numerical data, data splitting divides the data into two parts for training and
testing the model, and clustering groups the unlabelled data. In the training phase, the training data
set is given to both algorithms (K-Neighbors Classifier and Ridge Classifier). After this phase, the
testing data set is given to both algorithms for testing and validation. The model is selected on the
basis of the performance and accuracy of the two algorithms, and new crop yield data is then given to
the selected model, which provides the estimated crop yield.
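
The training, testing and selection steps of this flow can be sketched as follows. This is a minimal
illustration rather than the project's actual code: it assumes scikit-learn, uses a synthetic data set
from make_classification in place of the encoded crop data, and keeps default hyperparameters. The names
candidates, scores and best_model are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the encoded crop data (assumption: the real data set is not used here).
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# 80:20 split into training and testing parts.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train both candidate algorithms and record their accuracy on the test part.
candidates = {"knn": KNeighborsClassifier(), "ridge": RidgeClassifier()}
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

# Keep the candidate with the highest test accuracy for later predictions.
best_name = max(scores, key=scores.get)
best_model = candidates[best_name]
```

Selecting on a single test split keeps the sketch short; cross-validation would give a more stable
comparison when the two accuracies are close.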

Chapter 6
Detailed Design
6.1 Use Case Diagram
Figure 6.1: Use Case Diagram
In the above use case diagram, User and Model are the actors, and login, logout, input dataset, data
splitting, train model, test model, load model into Flask, user input and view result are the use cases
performed by the actors. First, the model takes the dataset as input; data gathering is mandatory for
this, so the relation is marked as <<include>>. The model then splits the data into a training dataset
and a testing dataset. It is trained with the training dataset, tested with the testing dataset to
measure its accuracy, and then loaded into Flask, i.e. the front end. Later, the user logs in to the
application. The user must register first and can then log in using an email id and password. After a
successful login, the user gives the input, i.e. state name, area, rainfall, season and crop. The model
returns the result for that input, and the user can view the result.
6.2 Sequence Diagram
Figure 6.2: Sequence Diagram
In the above sequence diagram, there are different lifelines and messages. The lifelines are User,
Attributes, Classification, Result, Data set, Login and Admin. First, the Admin uploads the collected
data set (collected from Kaggle); the model is then built and performs classification based on the
previous data. The user logs in to the system by providing a user id and password and enters the
parameters. Based on these parameters, the model classifies the input and provides the result to the user.

6.3 Class Diagram
Figure 6.3: Class Diagram
In the above class diagram, the user has attributes such as user name, email and password, and can
register or log in to the system. The user data is stored in the database, i.e. an Excel sheet created
by the admin. The admin can select one of several algorithms based on performance and accuracy metrics
and use it to build a machine learning model, which can predict the result for new user data. The user
can give new crop yield data to the model for crop yield prediction, and the model provides the result
to the user.

Chapter 7
Implementation
7.1 Screenshots
Flask
Figure 7.1: Screenshot 01
The above screenshot shows the front end, developed using Flask with Python. Here we have imported
the libraries, mapped the classes to their respective values, and opened the models using a 'with'
statement and the 'open' function in 'rb' mode; 'rb' mode opens a file in binary format for reading.
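
The way a pickled model is written and read back can be sketched as follows. The example uses only the
Python standard library; the dictionary stands in for a trained model, and the file name model.pkl is
hypothetical rather than the name used in the project.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model; any picklable object is handled the same way.
model = {"name": "demo-model", "classes": ["Poor", "Average", "Good", "Very Good"]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"  # hypothetical file name

    # 'wb' opens the file for writing in binary format.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # 'rb' opens the file for reading in binary format; 'with' closes it automatically.
    with open(path, "rb") as f:
        loaded = pickle.load(f)
```

Because pickle.load can execute code embedded in the file, only pickle files from a trusted source
should be loaded this way.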

Here we have assigned state-labels, season-labels and crop-labels to state-data, season-data and
crop-data respectively, and created a function for prediction that handles both the GET and POST
methods. A GET request asks the server to return data, and the POST method is used to send HTML form
data to the server. This function takes the input from the user and returns the result to the user.
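
A prediction route of this kind could look roughly like the sketch below. It assumes Flask is installed;
the route name /predict, the form field names, the STATE_CODES table and the fake_predict placeholder are
all hypothetical and stand in for the project's own encodings and its unpickled model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Class names as listed in the report; the state encoding table is hypothetical.
CLASS_NAMES = {0: "Good", 1: "Poor", 2: "Very Good", 3: "Average"}
STATE_CODES = {"Karnataka": 0, "Kerala": 1}

FORM_HTML = (
    "<form method='post'>"
    "<input name='state'><input name='area'><input name='rainfall'>"
    "<button>Predict</button></form>"
)


def fake_predict(state_code, area, rainfall):
    """Placeholder for the loaded model's predict call (assumption)."""
    return 2 if area > 5000 else 1


@app.route("/predict", methods=["GET", "POST"])
def predict():
    if request.method == "GET":
        # GET: the client only asks the server for the page with the input form.
        return FORM_HTML

    # POST: the browser sends the HTML form fields in the request body.
    state = request.form.get("state", "")
    if state not in STATE_CODES:
        return jsonify(error="unknown state"), 400
    try:
        area = float(request.form.get("area", ""))
        rainfall = float(request.form.get("rainfall", ""))
    except ValueError:
        return jsonify(error="area and rainfall must be numbers"), 400

    label = fake_predict(STATE_CODES[state], area, rainfall)
    return jsonify(prediction=CLASS_NAMES[label])
```

Validating the form fields before calling the model keeps malformed input from reaching the classifier
and lets the page report the problem to the user.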
Here we have created a function that predicts the rainfall for a particular state in a particular
season. Using the GET and POST methods, it takes the state name and season name from the user and
predicts the rainfall for that input.

One section of the front end visualizes the data, such as the confusion matrix, the classification
report and other plots. To display these visualizations we have created a function called submit: the
user selects an option from the list, and the visualization for that option is displayed.
In the above screenshot, the same submit function is called. The back function returns the user to the
main dashboard from the graphs section. We have also created a login function that gives the user access
to the model: the user provides a user name and password, which are verified against the Excel sheet.

Here we have defined a function for user registration. Before logging in, the user has to register with
an email, user name and password. Once the user is successfully registered, these details are stored in
the Excel sheet; from then on, the user only needs to enter the user name and password to log in.
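
The registration and login checks can be sketched as follows, using only the Python standard library.
This is an illustration under stated assumptions: an in-memory dictionary stands in for the Excel sheet,
and the passwords are stored as salted PBKDF2 hashes, which is a choice made for this sketch because the
report does not say how passwords are stored. The function names register and login are hypothetical.

```python
import hashlib
import hmac
import os

# In-memory stand-in for the sheet of registered users (the report uses an Excel sheet).
users = {}


def _hash(password, salt):
    # Salted PBKDF2 hash from the standard library (assumption, not the report's method).
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(email, username, password):
    """Store a new user; return False if a field is empty or the user name is taken."""
    if not (email and username and password) or username in users:
        return False
    salt = os.urandom(16)
    users[username] = {"email": email, "salt": salt, "hash": _hash(password, salt)}
    return True


def login(username, password):
    """Return True only when the user exists and the password matches."""
    row = users.get(username)
    if row is None:
        return False
    return hmac.compare_digest(row["hash"], _hash(password, row["salt"]))
```

For example, register("ravi@example.com", "ravi", "secret") followed by login("ravi", "secret") succeeds,
while a wrong password or an unregistered user name is rejected.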
The last function in the front end handles logout. The user can log out from the dashboard by clicking
on logout at the top of the page, which redirects the user to the login page.

In the above screenshot, we have imported the libraries required to develop the crop yield prediction
model, such as pandas, numpy, warnings, seaborn and matplotlib, and loaded the collected data set for
further processing.
Here we have examined the number of records and columns: there are 74975 records with 5 columns.
Next, we checked whether the data set has any null values; fortunately, it has none.
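
The size and null-value checks can be reproduced along these lines. The sketch assumes pandas; the small
DataFrame and its column names are hypothetical and only mirror the five inputs listed in this report.

```python
import pandas as pd

# Tiny hypothetical sample; the real data set has 74975 records and 5 columns.
df = pd.DataFrame({
    "State_Name": ["Karnataka", "Kerala", "Karnataka"],
    "Season": ["Kharif", "Rabi", "Kharif"],
    "Crop": ["Rice", "Maize", "Ragi"],
    "Area": [4443.0, 240.0, 329.0],
    "Rainfall": [1100.0, 2900.0, 900.0],
})

n_rows, n_cols = df.shape            # (number of records, number of columns)
null_counts = df.isnull().sum()      # number of null values in each column
has_nulls = bool(null_counts.any())  # True if any column contains a null value
```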

We have marked the areas that fall below the lower quantile cut-off as lower outliers and the areas
that lie above the 0.90 quantile as higher outliers. Here we have visualised the state counts: the
states occur with different frequencies in the data set, so for easy understanding we have plotted the
number of records for each state as a pie chart.
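
Quantile-based outlier marking can be sketched as below. The sketch assumes pandas and a made-up Area
series; the 0.90 upper cut-off comes from the text, while the 0.10 lower cut-off is an assumption chosen
for symmetry and is not confirmed by the report.

```python
import pandas as pd

# Made-up area values for illustration only.
area = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 1000], name="Area")

low_cut = area.quantile(0.10)   # assumed lower cut-off
high_cut = area.quantile(0.90)  # upper cut-off named in the text

lower_outliers = area[area < low_cut]
higher_outliers = area[area > high_cut]
```

Here the smallest value falls below the lower cut-off and the value 1000 lies above the upper cut-off, so
each tail contains exactly one outlier.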
As before, here we have visualised the season counts: there are six seasons with different counts, and
these counts are plotted as a horizontal bar graph.

Here we have visualised the crop counts: there are different types of crops with different counts, so
we have plotted these crop counts as a bar graph.
In the above screenshot we have done label encoding: state name, season name and crop name hold
categorical values, so we convert them into numerical values. Most machine learning algorithms work
only with numerical data, so this is an important phase in model building.
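
Label encoding can be illustrated with scikit-learn's LabelEncoder. The season values below are sample
values assumed for the sketch, and the mapping dictionary is a hypothetical helper of the kind a front
end could use to translate a selected name into its code.

```python
from sklearn.preprocessing import LabelEncoder

seasons = ["Kharif", "Rabi", "Summer", "Kharif", "Whole Year"]  # sample values (assumed)

encoder = LabelEncoder()
codes = encoder.fit_transform(seasons)

# classes_ holds the sorted category names; each code is an index into that array.
mapping = {name: code for code, name in enumerate(encoder.classes_)}

# inverse_transform turns the numeric codes back into the original names.
decoded = encoder.inverse_transform(codes)
```

The codes carry no ordering meaning; they only give each category a distinct integer.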

Screenshots 13, 14 and 15 are related to clustering, which means identifying groups of similar objects
in a data set and grouping them. Here we made four clusters/groups, identified by the median value of
the area size: the cluster with a median of 4443.0 is class 0, the cluster with a median of 240.0 is
class 1, the cluster with a median of 10838.0 is class 2 and the cluster with a median of 329.0 is
class 3. We have named the classes 0='Good', 1='Poor', 2='Very Good' and 3='Average'.
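
The class names listed above follow the order of the cluster medians, from the smallest area to the
largest. The sketch below derives the names from the medians under that assumption; the ranking rule is
inferred from the numbers in the text and is not stated in the report, and the variable names are
hypothetical.

```python
# Median area per cluster, as reported in the text.
cluster_medians = {0: 4443.0, 1: 240.0, 2: 10838.0, 3: 329.0}

# Names ordered from the smallest to the largest median (ordering rule is an assumption).
ordered_names = ["Poor", "Average", "Good", "Very Good"]

# Sort the cluster ids by their median, then give each rank its name.
clusters_by_median = sorted(cluster_medians, key=cluster_medians.get)
class_names = {cluster: ordered_names[rank] for rank, cluster in enumerate(clusters_by_median)}
```

Deriving the names from the medians avoids hard-coding the cluster ids, which can change between runs
of a clustering algorithm.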

Here we have split the data set into two parts in the ratio 80:20: 80 percent of the data is used for
training the model and 20 percent for testing it. We then built our first model, the K-Neighbors
Classifier, whose accuracy is 99.98 percent.
This screenshot is related to the above model. Here we visualize the confusion matrix and classification
report of the K-Neighbors Classifier. The classification report contains the precision, recall, f1-score
and support for every class, and the confusion matrix tabulates the true labels against the predicted
labels.
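
How the two reports are computed can be seen on a small hand-made example. The sketch assumes
scikit-learn; the label lists are invented for illustration and are not results from this project.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hand-made labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 1]

# Rows are the true classes and columns are the predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Per-class precision, recall, f1-score and support, returned as a dictionary.
report = classification_report(y_true, y_pred, output_dict=True, zero_division=0)
```

The diagonal of the matrix counts the correct predictions, so the accuracy is the sum of the diagonal
divided by the number of samples.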

The confusion matrix for the K-Neighbors Classifier Algorithm.
In the above screenshot we start building our second model, the Ridge Classifier; the accuracy of this
model is 86.76 percent.

As with the first model, we have visualised the confusion matrix and classification report of the Ridge
Classifier.
Confusion matrix for the Ridge Classifier Algorithm and model saving.

In the above screenshot, we have built a model to predict rainfall. We have chosen the state name,
season name and rainfall for this prediction and split the data into a training data set and a test
data set.
Here we start building the rainfall prediction model using a random forest regressor.

Here we have built the rainfall prediction model using a random forest regressor, obtained 100 percent
accuracy, and plotted a graph of the actual values against the predicted values.
Saving the random forest regressor model in pickle (.pkl) format.
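
A regression model of this kind can be sketched as follows. The sketch assumes scikit-learn and NumPy
and uses synthetic data in which rainfall is almost fully determined by the state and season codes; the
formula used to generate the data is invented for illustration. For a regressor, the score method of
scikit-learn returns the R² value rather than a classification accuracy, and a value close to 1 is
expected whenever the target is nearly a fixed function of the inputs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: rainfall depends on the state and season codes plus a little noise.
rng = np.random.default_rng(0)
state = rng.integers(0, 5, size=500)
season = rng.integers(0, 6, size=500)
rainfall = 500 + 300 * state + 100 * season + rng.normal(0, 5, size=500)

X = np.column_stack([state, season])
X_train, X_test, y_train, y_test = train_test_split(X, rainfall, test_size=0.2, random_state=0)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)  # the same value is returned by model.score(X_test, y_test)
```

Reporting an error measure such as the mean absolute error next to R² makes the result easier to
interpret in rainfall units.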


Chapter 8
Testing
Test Case
An ML model's performance is assessed using test cases, which are particular sets of inputs and the
anticipated results. Test cases are used to make sure that the model is producing correct predictions
and operating as intended.
8.1 Frontend
Figure 8.1: Test Case

8.2 Backend
Figure 8.2: Test Case

Chapter 9
Conclusion
In conclusion, this model is mainly useful for farmers who face difficulties while cultivating their
crops and who do not know what to grow, or when to grow it, under given conditions. The model helps
them make better decisions, which can increase productivity and reduce the farmer's economic losses.

Chapter 10
Future Work
Enhancing data collection and pre-processing

Getting high-quality data on a variety of variables that may have an influence on crop growth is one of
the challenges in predicting agricultural output. In the future, it could be feasible to obtain information
on agricultural conditions more effectively and correctly using new technology like satellite images,
sensors, and drones.
Creating new machine learning models

There are a variety of machine learning models that might be used to estimate crop yields, and new
models are always being created. Research could concentrate on creating models that are more precise,
effective, or more suited to certain crop varieties or growing environments.
Analyzing the effects of climate change

It is likely that as the climate changes, so will the variables that affect agricultural growth. Future
research might concentrate on creating models that can forecast how agricultural yields may be
influenced by global patterns such as climate change.
Other applications of machine learning in agriculture

Crop yield prediction is only one of many issues that the agricultural sector must deal with. To increase
the productivity and sustainability of agriculture, machine learning might also be used to solve issues
like insect management, irrigation, and soil management.

