BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
Submitted by
V. Lokesh (19W61A0578)
L. Ganesh (19W61A0546)
Y. Dileep (19W61A0584)
J. Bharath Vamsi (19W61A0532)
E. Chanikya (19W61A0520)
Under the Esteemed Guidance of
Miss K. Sakunthala,
Assistant Professor
Certificate
This is to certify that this project work entitled "CROP
RECOMMENDATION SYSTEM USING MACHINE LEARNING" is the bonafide work
carried out by V. Lokesh (19W61A0578), L. Ganesh (19W61A0546), Y. Dileep (19W61A0584), J. Bharath Vamsi (19W61A0532) and E. Chanikya (19W61A0520), submitted in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology in Computer Science and Engineering, during the year 2019-2023.
External Examiner
ACKNOWLEDGEMENT
It is indeed with a great sense of pleasure and immense gratitude that we acknowledge the help of these individuals. We feel elated in expressing our gratitude to our guide, Miss K. Sakunthala, Assistant Professor in the Department of Computer Science and Engineering, for her valuable guidance. She has been a constant source of inspiration for us, and we are deeply thankful to her for her support and invaluable advice. We would like to thank our Head of the Department, Dr. M. Murali Krishna, for his constructive criticism throughout our project. We are highly indebted to our Principal, Dr. Y. Srinivasa Rao, for the facilities provided to accomplish this project. We are extremely grateful to our departmental staff members, lab technicians and non-teaching staff for their help throughout our project. Finally, we express our heartfelt thanks to all of our friends who helped us in the successful completion of this project.
Project Members
V. Lokesh (19W61A0578)
Y. Dileep (19W61A0584)
L. Ganesh (19W61A0546)
E. Chanikya (19W61A0520)
DECLARATION
We do hereby declare that the work embodied in this project report entitled "CROP RECOMMENDATION SYSTEM USING MACHINE LEARNING" is the outcome of genuine research work carried out by us under the direct supervision of Miss K. Sakunthala, Assistant Professor, Department of Computer Science and Engineering, and is submitted by us to Sri Sivani College of Engineering. The work is original and has not been submitted elsewhere for the award of any other degree or diploma.
Project Members
V. Lokesh (19W61A0578)
L. Ganesh (19W61A0546)
Y. Dileep (19W61A0584)
J. Bharath Vamsi (19W61A0532)
E. Chanikya (19W61A0520)
INDEX
1 INTRODUCTION 1
1.1 OVERVIEW 1
1.2 IDENTIFICATION/NEED 2
2 LITERATURE SURVEY 5
3 SYSTEM ANALYSIS&DESIGN 9
4.2.1 PYTHON 26
5.2.10 SUGGESTIONS 66
5.2.11 SUGGESTIONS 67
6.1 CONCLUSION 75
ABSTRACT
India being an agricultural country, its economy predominantly depends on agricultural yield growth and agro-industry products. Data mining is an emerging research field in crop yield analysis. Yield prediction is a very important problem in agriculture: every farmer is interested in knowing how much yield to expect and which crop is suitable for the land. The system analyzes related attributes such as location and pH value, from which the alkalinity of the soil is determined, along with the percentage of nutrients such as Nitrogen (N), Phosphorous (P) and Potassium (K). Using the location together with third-party applications such as APIs for weather and temperature, the type of soil, the nutrient value of the soil in that region, the amount of rainfall and the soil composition can be determined. All these attributes are analyzed, and the data is trained with suitable machine learning algorithms such as SVM, Random Forest, KNN and a Voting Classifier to create a model. The system aims to be precise and accurate in predicting crop yield and to give the end user proper recommendations about the required fertilizer ratio based on the atmospheric and soil parameters of the land, which helps increase the crop yield and the farmer's revenue. Thus, the proposed system takes data regarding the quality of the soil and weather-related information as input: soil quality attributes such as Nitrogen, Phosphorous, Potassium and pH value, and weather-related information like Rainfall, Temperature and Humidity, to predict the better crop. In our project we take the datasets from the Kaggle website.
CHAPTER 1
INTRODUCTION
1.1 OVERVIEW
Agriculture is one of the most important occupations practiced in our country. It is the broadest economic sector and plays an important role in the overall development of the country. About 60% of the land in the country is used for agriculture in order to meet the needs of 1.2 billion people. Thus, modernization of agriculture is very important and will lead the farmers of our country towards profit. Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Earlier, yield prediction was performed by considering the farmer's experience on a particular field and crop. However, as conditions change very rapidly, farmers are forced to cultivate more and more crops. In the current situation, many of them do not have enough knowledge about the new crops and are not completely aware of the benefits of farming them. Also, farm productivity can be increased by understanding and forecasting crop performance in a variety of environmental conditions. Thus, the proposed system takes data regarding the quality of the soil and weather-related information as input: soil quality attributes such as Nitrogen, Phosphorous, Potassium and pH value, and weather-related information like Rainfall, Temperature and Humidity. In our project we take the datasets from the Kaggle website.
1.2 IDENTIFICATION/NEED
Crop prediction is a widespread problem. During the growing season, a farmer is curious to know how much yield to expect. In earlier times, yield prediction relied on the farmer's long-term experience with specific yields, crops and climatic conditions. With the existing system, the farmer goes directly for yield prediction rather than for crop prediction; but unless the correct crop is predicted, how can the yield be better? Additionally, existing systems do not consider pesticides or the environmental and meteorological parameters related to the crop. Promoting and sustaining agricultural production at a faster pace is one of the essential conditions for agricultural improvement. Any crop's production grows either through an increase in cultivated area, an improvement in yield, or both. In India, the prospect of widening the area under any crop hardly exists, except by increasing cropping intensity or by crop replacement. So, variations in crop productivity continue to trouble the sector and generate serious distress, and there is a need for a good technique for crop prediction in order to overcome the existing problem.
SCOPE
Applying the Naive Bayes data mining technique for crop selection depends on the nature of the Naive Bayes probability model, which can be trained very easily in a supervised learning setting. In many practical applications, parameter estimation for Naive Bayes uses the method of maximum likelihood; in other words, one can work with the Naive Bayes model without committing to Bayesian probability or using any other Bayesian methods.
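The Naive Bayes approach described above can be sketched with scikit-learn's GaussianNB. This is a minimal illustration under assumed data: the feature layout (N, P, K, temperature, humidity, pH, rainfall) follows the attributes described in this report, but the sample rows are made-up values, not the actual dataset.

```python
# Minimal sketch: training a Gaussian Naive Bayes crop selector with scikit-learn.
# Each row holds [N, P, K, temperature, humidity, pH, rainfall]; the values are
# illustrative only, not taken from the real Kaggle dataset.
from sklearn.naive_bayes import GaussianNB

X_train = [
    [90, 42, 43, 20.8, 82.0, 6.5, 202.9],
    [85, 58, 41, 21.7, 80.3, 7.0, 226.6],
    [60, 55, 44, 23.0, 82.3, 7.8, 263.9],
    [40, 72, 77, 17.0, 16.9, 7.4, 88.5],
    [23, 72, 84, 19.0, 17.1, 6.9, 79.9],
]
y_train = ["rice", "rice", "rice", "chickpea", "chickpea"]

model = GaussianNB().fit(X_train, y_train)

# Predict a crop for a new soil/weather sample.
print(model.predict([[80, 50, 40, 22.0, 81.0, 6.8, 210.0]])[0])  # → rice
```

Being trained in a supervised fashion, the model only needs labelled rows; each class's feature likelihoods are estimated independently, which is what makes Naive Bayes so quick to train.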
ADVANTAGES
DISADVANTAGES
OBJECTIVE
Achieving maximum crop yield at minimum cost is one of the goals of agricultural production. Early detection and management of problems associated with crop yield indicators can help increase yield and subsequent profit. By influencing regional weather patterns, large-scale meteorological phenomena can have a significant impact on agricultural production. Predictions could be used by crop managers to minimize losses when unfavorable conditions may occur, and to maximize production when there is potential for favorable growing conditions.
CHAPTER 2
LITERATURE SURVEY
They accomplished significant work for Indian farmers by building an efficient crop recommendation system. They developed the system using classifier models such as the Decision Tree classifier, KNN, and the Naive Bayes classifier. The proposed system can be used to determine the best time for sowing, plant growth, and harvesting. They used different classifiers to achieve better accuracy; for instance, the Decision Tree shows less accuracy when the dataset has more variation, while Naive Bayes gives better accuracy than the Decision Tree on such datasets. The best advantage of the system is that it can easily be adapted and used to test different crops.
They concluded that their paper builds an improvised system for crop yield using supervised machine learning algorithms, with the objectives of providing an easy-to-use user interface, increasing the accuracy of crop yield prediction, and investigating different climatic parameters such as cloud cover, rainfall, and temperature. The proposed system focused on the state of Maharashtra for implementation, and for data gathering they used government sites such as www.data.gov.in. For crop yield prediction they used algorithms such as the Random Forest algorithm, and for convenience they developed a web page so that it is easy for everyone to use. The main advantage of the proposed system is an accuracy rate of more than 75 percent across all the crops and regions selected in the study.
They inferred that their paper surveys the different uses of machine learning in the farming sector, and also helps in selecting a suitable crop, suitable land, and a suitable season using these techniques. The algorithms used are Naive Bayes and K-Nearest Neighbor, and they are evaluated on the accuracy of their predictions.
2.4 AMIT KUMAR ET AL.:
They concluded that their paper helps in predicting crop arrangements, maximizing yield rates, and creating benefits for the farmers. It also applies machine learning in farming for predicting crop diseases, analyzing crop patterns, and designing different irrigation schemes. The algorithms used are artificial neural networks. The serious issue with a neural network is that the particular network which suits the problem best is difficult to find and involves trial and error. The second issue with neural networks is hardware dependence: as the algorithm involves many computations in the backward and forward passes, training needs more resources, and determining a proper network structure requires experience and time. The proposed system also focuses on crop selection using environmental as well as economic factors. The system uses the economic factor, that is, the price of the crop, which plays a significant role when crops have the same yield but different prices. The system additionally uses another technique, crop sequencing, which gives a full sequence of crops which can be grown throughout the season.
They concluded that their paper helps in improving the yield rate of crops by using rule-based mining. The paper uses association rule mining to predict the yield of the crop; the algorithms used are the k-Means algorithm, a clustering technique, and a priori association rule mining. The significant limitation is the use of association rule mining for the prediction of crop yield: association rule mining sometimes generates too many rules, and the accuracy of the prediction decreases. Likewise, the rules tend to vary with the dataset, and so do the results. The proposed system mainly focuses on the problem of yield prediction for a crop, which plays a vital role in crop selection, as a farmer can choose the crop with maximum yield. The system uses association rule mining to discover rules and crops with maximum yield, and centers on the creation of a prediction model which may be used for future prediction of crop yield.
They concluded that their paper helps in improving the yield rate of crops by applying classification techniques and comparing the parameters. The paper explains the use of different algorithms to achieve this; the algorithms proposed are the Bayesian algorithm, the k-Means algorithm, a clustering algorithm, and the Support Vector Machine. The drawback is that no proper accuracy or performance figures for the proposed algorithms are mentioned in the paper; it is a survey paper that only suggests the use of the algorithms, and no implementation evidence is provided. The technique applied in this paper for crop selection centers only on the plants which may be grown according to the season. The proposed approach decides the choice of crop(s) mainly based on the predicted yield price, supported by parameters such as environment, soil type, water density, and crop type. It takes crops, their planting time, plantation days, and the expected yield price for the season as input, and finds a sequence of crops whose production per day is maximum over the season.
This paper describes and details a list of methods used. In India there are many different agricultural crops, and their production depends on several kinds of factors, such as natural science, the economy, and also geographical factors. By applying such procedures and techniques to the historical yields of different crops, it is possible to obtain knowledge which can be helpful to farmers and government organizations for making good decisions and for improving policies which help increase production. In this article, the work is on the use of data mining techniques to extract information from agricultural records in order to estimate better crop yields for the main crops in the principal regions of India. In this work it was found that the accurate prediction of different specified crop yields across various regions will help the farmers of India, so that Indian farmers can plant different crops in different districts.
2.8 VISHNUVARDHAN ET AL.:
They examined how cultivation in India is dealing with the hard problem of making the most of crop productivity, with more than 60 percent of the crop still depending on monsoon rainfall. Current developments in information technology for the agriculture field have opened up an interesting research area for forecasting crop yield. Yield prediction is a significant problem that remains to be addressed based on the available data, and data mining techniques are a good choice for this purpose. Different data mining techniques are used and evaluated in agriculture for approximating the coming year's crop production. This paper presents a brief investigation of crop yield forecasting using the Multiple Linear Regression (MLR) technique and a density-based clustering technique for a specific region, the East Godavari district of Andhra Pradesh in India. An effort is made to obtain a region-specific crop yield analysis by applying both Multiple Linear Regression and density-based clustering. These models were tested with respect to all the districts of Andhra Pradesh, after which the evaluation was narrowed down to just the East Godavari district.
CHAPTER 3
SYSTEM ANALYSIS & DESIGN
3.1 EXISTING SYSTEM
Niketa et al. in 2016 indicated that the yield of the crop depends on the seasonal climate. In India, climate conditions vary unpredictably, and in times of drought farmers face serious problems. Taking this into consideration, they used machine learning algorithms to suggest crops to farmers for a better yield. They took data from previous years to estimate future data, and used the SMO classifier in WEKA to classify the results. The main factors taken into consideration are minimum temperature, maximum temperature, average temperature, and the previous year's crop and yield information. Using the SMO tool they classified the previous data into two classes: high yield and low yield.
Eswari et al. in 2018 indicated that the yield of the crop depends on precipitation and the average, minimum and maximum temperature. Apart from that, they took one more attribute, crop evapotranspiration, which is a function of both the weather and the growth stage of the plant; this attribute is taken into consideration to get a good decision on the yield of the crops. They collected the dataset with these attributes, sent it as input to a Bayesian network, classified it into two classes named true and false, compared the classifications with the observed ones using a confusion matrix, and computed the accuracy. Finally, they concluded that crop yield prediction with Naive Bayes and Bayesian networks gives high accuracy compared to the SMO classifier, and that forecasting crop yield in different climate and cropping scenarios will be beneficial.
The obtained result for crop yield prediction using the SMO classifier gives less accuracy than Naive Bayes, the multilayer perceptron, and the Bayesian network. Previously, yield was predicted on the basis of the farmer's prior experience; but now weather conditions may change drastically, so farmers cannot simply guess the yield.
3.2 PROPOSED SYSTEM
In the proposed system, we develop Prediction of the crop using the efficient
algorithm.
The challenge in it is to build the efficient model to predict the better crop
Here in this project we use machine learning algorithms like Voting classifier which
is nothing but hybrid classification/ensemble of models. In our project the Voting classifier is
an ensemble of models that are obtained from SVM, Random-Forest and KNN. Which can
enhance the accuracy and it can give a better prediction system.
Early detection of problems and management of those problems can help the farmers for
better crop yield.
For the better understanding of the crop yield, we need to study of the huge data with the help
of machine learning algorithm so it will give the accurate prediction of crop and suggest the
farmer for a better crop.
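The ensemble just described can be sketched with scikit-learn's VotingClassifier. This is a minimal illustration, not the project's actual code: synthetic make_classification data stands in for the Kaggle crop dataset, and the hyperparameters are defaults chosen for the example.

```python
# Sketch of the hybrid Voting Classifier: an ensemble of SVM, Random Forest
# and KNN. Synthetic data stands in for the Kaggle crop dataset so the
# example runs on its own; 7 features mirror N, P, K, temperature, humidity,
# pH and rainfall.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=7, n_informative=5,
                           n_classes=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

voting = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    voting="soft",  # average the predicted class probabilities
)
voting.fit(X_train, y_train)
print("test accuracy:", voting.score(X_test, y_test))
```

Soft voting averages each member's class probabilities, so a confident Random Forest can outweigh an uncertain KNN; with voting="hard" each model would instead cast a single vote for its predicted class.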
3.3.1 ECONOMICAL FEASIBILITY:
This study is carried out to check the economic impact the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited, so the expenditures must be justified. The developed system is well within the budget, and this was achieved because most of the technologies used are freely available; only the customized products had to be purchased. Here, in this project, we used limited resources which are well within the limits of our project budget, and hence it is justified that the project is economically feasible.
3.3.2 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. A feasibility study evaluates the project's potential for success. The technologies used, such as Python, are open source and versatile for developing various applications. Hence, it is justified that the project is technically feasible.
3.3.3 SOCIAL FEASIBILITY:
This aspect of the study checks the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system, but must accept it as a necessity. The level of acceptance by the users depends on the methods employed to educate the user about the system and to make him familiar with it. His confidence must be raised so that he is also able to make some constructive criticism, which is welcomed, as he is the final user of the system. The developed system is useful for farmers and other cultivators, which in turn benefits society, and hence it is justified that it is socially feasible.
3.4 SYSTEM REQUIREMENTS
The most common set of requirements defined by any operating system or software application is the physical computer resources, also known as hardware. The minimal hardware requirements are as follows:
1. Processor : Pentium IV, 2.4 GHz
2. Main Memory : 8 GB RAM
3. Hard Disk Drive : 1 TB
4. Keyboard : 104 keys
3.5 SYSTEM ARCHITECTURE
The UML represents a collection of best engineering practices that have proven
successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software and the software development process. The UML uses mostly graphical notations to express the design of software projects.
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so that they can
develop and exchange meaningful models.
2. Provide extensibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of the OO tools market.
6. Support higher level development concepts such as collaborations, frameworks,
patterns and components.
7. Integrate best practices.
Use case diagrams are used to gather the requirements of a system including internal
and external influences. These requirements are mostly design requirements. Hence, when a
system is analyzed to gather its functionalities, use cases are prepared and actors are
identified.
When the initial task is complete, use case diagrams are modelled to present the
outside view.
FIG 3.2 USE CASE DIAGRAM
Fig 3.3 SEQUENCE DIAGRAM
An activity diagram is basically a flowchart representing the flow from one activity to another; an activity can be described as an operation of the system. The control flow is drawn from one operation to another, and this flow can be sequential, branched, or concurrent. Activity diagrams deal with all types of flow control by using different elements such as fork, join, etc.
FIG 3.4 ACTIVITY DIAGRAM
The input design is the link between the information system and the user. It comprises developing the specifications and procedures for data preparation, that is, the steps necessary to put transaction data into a usable form for processing. This can be achieved by having the computer read data from a written or printed document, or by having people key the data directly into the system. The design of input focuses on controlling the amount of input required, controlling errors, avoiding delay, avoiding extra steps and keeping the process simple. The input is designed in such a way that it provides security and ease of use while retaining privacy. Input design considered the following things:
OBJECTIVES
2. It is achieved by creating user-friendly screens for data entry to handle large volumes of data. The goal of designing input is to make data entry easier and free from errors. The data entry screen is designed so that all data manipulations can be performed; it also provides record-viewing facilities.
3. When the data is entered, it is checked for validity. Data can be entered with the help of screens, and appropriate messages are provided when needed so that the user is never left confused. Thus the objective of input design is to create an input layout that is easy to follow.
The design of output is the most important task of any system. During output design,
developers identify the type of outputs needed, and consider the necessary output controls
and prototype report layouts.
Computer output is the most important and most direct source of information to the user. The system is accepted by the user only on the basis of the quality of its output: if the output is not of good quality, the user is likely to reject the system. Therefore, an effective output design is a major criterion for deciding the overall quality of the system.
While designing the output one should try to accomplish the following:
Output Design Objectives
To develop output design that serves the intended purpose and eliminates the
production of unwanted output.
To develop an output design that meets the end user's requirements.
To form the output in appropriate format and direct it to the right person.
In this project an effort is made to predict the better crop and also the fertilizers required to increase its yield. Our project is implemented using a Voting Classifier, which is nothing but an ensemble of models; in our project the Voting Classifier is an ensemble of models obtained from SVM, Random Forest and KNN. By implementing the Voting Classifier and taking inputs regarding both the quality of the soil and the environmental conditions, we obtained better accuracy, because the yield of a crop depends not only on the quality of the soil but also on environmental conditions like Temperature, Humidity and Rainfall. The accuracy of our project is approximately 97%.
CHAPTER 4
IMPLEMENTATION
4.1 MODULE DESCRIPTION
DATA PRE-PROCESSING
Here the raw crop data is cleaned and metadata is appended to it: categorical values are converted to integers so that the data is easy to train. In this pre-processing, we first load the metadata, attach it to the data, and replace the converted values with metadata. The unwanted entries in the list are then removed and the data is divided into train and test sets. For this split we import train_test_split from scikit-learn, which splits the pre-processed data into train and test sets according to the weights given in the code. The division into test and train is done as 0.2 and 0.8, that is, 20 and 80 percent respectively.
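The pre-processing step can be sketched as follows. This is an illustrative example, not the project's actual code: the tiny inline DataFrame stands in for the Kaggle crop dataset, and the column names are assumptions based on the attributes described in this report.

```python
# Sketch of the pre-processing described above: encode the text crop labels
# as integers and split the data 80/20 into train and test sets.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# A tiny stand-in for the Kaggle crop dataset (column names assumed).
df = pd.DataFrame({
    "N": [90, 85, 40, 23], "P": [42, 58, 72, 72], "K": [43, 41, 77, 84],
    "ph": [6.5, 7.0, 7.4, 6.9], "rainfall": [202.9, 226.6, 88.5, 79.9],
    "label": ["rice", "rice", "chickpea", "chickpea"],
})

encoder = LabelEncoder()              # text labels -> integers
y = encoder.fit_transform(df["label"])
X = df.drop(columns=["label"])

# test_size=0.2 gives the 20/80 split mentioned above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(len(X_train), len(X_test))  # → 3 1
```

Keeping the encoder around matters: encoder.inverse_transform turns the model's integer predictions back into crop names for display to the user.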
Model creation:
We create the machine learning model by fitting the chosen algorithms on the training part of the data.
Model evaluation:
We apply the machine learning algorithm to the testing part and get the accuracy of the model.
Prediction:
This module is based on the GUI part. We create a web page using Bootstrap that takes the inputs (Nitrogen, Phosphorous, Potassium, pH value, Humidity, Rainfall, Temperature). We then get the data from the user, compare it with the dataset values, and finally predict the crop to be planted.
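The prediction flow can be sketched as follows. The recommend_crop helper and the tiny inline training set are hypothetical stand-ins for illustration; the real project trains on the full Kaggle dataset and reads these fields from the Bootstrap web form.

```python
# Sketch of the prediction module: values collected from the web form
# (N, P, K, temperature, humidity, pH, rainfall) are mapped onto the feature
# order the model expects and the predicted crop is returned. The inline
# training rows are illustrative only.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[90, 42, 43, 20.8, 82.0, 6.5, 202.9],
           [85, 58, 41, 21.7, 80.3, 7.0, 226.6],
           [40, 72, 77, 17.0, 16.9, 7.4, 88.5],
           [23, 72, 84, 19.0, 17.1, 6.9, 79.9]]
y_train = ["rice", "rice", "chickpea", "chickpea"]
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

def recommend_crop(form):
    """Map the web-form fields onto the feature order the model expects."""
    features = [[form["N"], form["P"], form["K"], form["temperature"],
                 form["humidity"], form["ph"], form["rainfall"]]]
    return model.predict(features)[0]

print(recommend_crop({"N": 80, "P": 50, "K": 40, "temperature": 22.0,
                      "humidity": 81.0, "ph": 6.8, "rainfall": 210.0}))  # → rice
```

In the real application this function would sit behind the web page's submit handler, with the trained model loaded once at startup instead of being fitted on every request.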
Methodology:
The user gives the values of Nitrogen, Phosphorus, Potassium, pH value, Rainfall, Humidity and Temperature. We have already trained on the dataset; the given values are compared with the dataset and finally the result is displayed: which seed should be cultivated in that particular place.
This is the sample dataset used in this project. The data in Table I is used to predict the crop based on 7 factors: Nitrogen, Phosphorous, Potassium, pH value, Rainfall, Humidity, and Temperature. We create and train a machine learning model to predict the crop, and from Table II we predict the fertilizer that should be used to get the proper yield: its input parameters are the quantities of Nitrogen, Phosphorus and Potassium, and the output is the respective fertilizer to be used. In the input parameters, 1, 2, 3, 4, 5, 6, 7 represent the soil quality attributes respectively.
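The fertilizer step of Table II can be sketched as a simple rule. The thresholds and the fertilizer names used here (Urea for N, DAP for P, MOP for K) are illustrative assumptions, not the actual contents of Table II.

```python
# Hedged sketch of the Table II idea: suggest a fertilizer for whichever
# nutrient falls furthest below its ideal level. The ideal levels and the
# fertilizer names (Urea supplies N, DAP supplies P, MOP supplies K) are
# illustrative assumptions, not the real table entries.
def recommend_fertilizer(n, p, k, ideal=(90, 40, 40)):
    deficits = {"Urea": ideal[0] - n, "DAP": ideal[1] - p, "MOP": ideal[2] - k}
    name, gap = max(deficits.items(), key=lambda item: item[1])
    return name if gap > 0 else "no fertilizer needed"

print(recommend_fertilizer(40, 35, 38))  # → Urea
print(recommend_fertilizer(95, 45, 50))  # → no fertilizer needed
```

In the project itself this mapping would come from Table II rather than fixed thresholds, but the shape of the logic, soil N-P-K in, fertilizer name out, is the same.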
Necessary Packages:
1. NumPy
2. Pandas
3. Matplotlib (pyplot)
4. Scikit-learn
5. TensorFlow
6. Jupyter
Sample dataset of crop prediction
Sample dataset for fertilizer prediction
4.2 SOFTWARE ENVIRONMENT:
Python 2.0 was released in 2000, and the 2.x versions were the prevalent releases until December 2008. At that time, the development team decided to release version 3.0, which contained a few relatively small but significant changes that were not backward compatible with the 2.x versions. Python 2 and 3 are very similar, and some features of Python 3 have been backported to Python 2, but in general they remain not quite compatible.
Both Python 2 and 3 have continued to be maintained and developed, with periodic
release updates for both. As of this writing, the most recent versions available are 2.7.15 and
3.6.5. However, an official End of Life date of January 1, 2020 has been established for
Python 2, after which time it will no longer be maintained. If you are a newcomer to Python, it is recommended that you focus on Python 3.
Python is still maintained by a core development team at the Institute, and Guido is
still in charge, having been given the title of BDFL (Benevolent Dictator For Life) by the
Python community. The name Python, by the way, derives not from the snake, but from the
British comedy troupe Monty Python’s Flying Circus, of which Guido was, and presumably
still is, a fan. It is common to find references to Monty Python sketches and movies scattered
throughout the Python documentation.
4.2.1 PYTHON:
Python is a high-level, general-purpose, open-source programming language. It is both object-oriented and procedural. Python is an extremely powerful language, yet it is very easy to learn and is a good choice for most professional programmers.
Python is popular:-
Python has been growing in popularity over the last few years. The 2018 Stack Overflow Developer Survey ranked Python as the 7th most popular technology and the number one most wanted technology of the year. World-class software development companies around the globe use Python every single day. According to research by Dice, Python is also one of the hottest skills to have, and it is the most popular programming language in the world based on the Popularity of Programming Language Index.
Python is interpreted:-
Many languages are compiled, meaning the source code you create needs to be
translated into machine code, the language of your computer’s processor, before it can be run.
Programs written in an interpreted language are passed straight to an interpreter that runs
them directly.
This makes for a quicker development cycle because you just type in your code and
run it, without the intermediate compilation step.
One potential downside to interpreted languages is execution speed. Programs that are
compiled into the native language of the computer processor tend to run more quickly than
interpreted programs. For some applications that are particularly computationally intensive,
like graphics processing or intense number crunching, this can be limiting.
Python is Free:-
The Python interpreter is developed under an OSI-approved open-source license, making it free to install, use, and distribute, even for commercial purposes. A version of the interpreter is available for virtually any platform, including all flavours of Unix, Windows, macOS, smartphones and tablets, and probably anything else you have ever heard of. A version even exists for the half dozen people remaining who use OS/2.
Python is Portable:-
Because Python code is interpreted and not compiled into native machine instructions,
code written for one platform will work on any other platform that has the Python interpreter
installed. (This is true of any interpreted language, not just Python.)
Python is Simple:-
As programming languages go, Python is relatively uncluttered, and the developers
have deliberately kept it that way.
A rough estimate of the complexity of a language can be gleaned from the number of
keywords or reserved words in the language. These are words that are reserved for special
meaning by the compiler or interpreter because they designate specific built-in functionality
of the language.
Python 3 has 33 keywords, and Python 2 has 31. By contrast, C++ has 62, Java has
53, and Visual Basic has more than 120, though these latter examples probably vary
somewhat by implementation or dialect.
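As a quick check of this claim, Python's own keyword module lists the reserved words of whatever interpreter is running (the exact count quoted above varies slightly between Python versions):

```python
import keyword

# The reserved words of the interpreter currently running;
# the text quotes 33 for Python 3, and newer versions add a few more
print(len(keyword.kwlist))
print(keyword.kwlist[:5])
```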
Python code has a simple and clean structure that is easy to learn and easy to read. In
fact, as you will see, the language definition enforces code structure that is easy to read.
For all its syntactical simplicity, Python supports most constructs that would be
expected in a very high-level language, including complex dynamic data types, structured and
functional programming, and object-oriented programming.
It also provides built-in support for common tasks such as string
manipulation or GUI programming. Python accomplishes what many programming
languages don't: the language itself is simply designed, but it is very versatile in terms of
what you can accomplish with it.
Conclusion:-
This section gave an overview of the Python programming language. Python is a great
option, whether you are a beginning programmer looking to learn the basics, an experienced
programmer designing a large application, or anywhere in between. The basics of Python are
easily grasped, and yet its capabilities are vast. Proceed to the next section to learn how to
acquire and install Python on your computer.
Python drew inspiration from other programming languages like C, C++, Java, Perl,
and Lisp.
Python has a very easy-to-read syntax. Some of Python's syntax comes from C,
because the reference implementation of Python is written in C. But Python uses whitespace
to delimit code: spaces or tabs organize code into blocks. This is different from C, where a
semicolon ends each statement and curly braces ({}) group code. Using whitespace to delimit
code makes Python a very easy-to-read language.
Its standard library is made up of many functions that come with Python when it is
installed. Many other libraries available on the Internet extend what the Python language
can do. These libraries make it a powerful language that is used in many different areas:
Web development
Scientific programming
Desktop GUIs
Network programming
Game programming
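As a tiny illustration of this "batteries included" standard library, the snippet below summarises some data and serialises it for the web using only modules that ship with every Python installation (the readings are made-up values):

```python
import json
import statistics

# Hypothetical sensor readings (illustrative values only)
readings = [21.5, 22.5, 23.0, 23.0]

# statistics and json both ship with Python itself - no installs needed
average = statistics.mean(readings)
print(json.dumps({"average_temperature": average}))  # prints {"average_temperature": 22.5}
```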
crop_model.py
import pickle

import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Load the crop-recommendation dataset: features first, crop label in the last column
crop = pd.read_csv('Data/crop_recommendation.csv')
X = crop.iloc[:, :-1].values
Y = crop.iloc[:, -1].values
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)

# Base estimators for the ensemble
models = []
models.append(('rf', RandomForestClassifier(n_estimators=21)))
models.append(('gnb', GaussianNB()))
models.append(('knn1', KNeighborsClassifier(n_neighbors=1)))
models.append(('knn3', KNeighborsClassifier(n_neighbors=3)))
models.append(('knn5', KNeighborsClassifier(n_neighbors=5)))
models.append(('knn7', KNeighborsClassifier(n_neighbors=7)))
models.append(('knn9', KNeighborsClassifier(n_neighbors=9)))

# Combine the base models into a soft-voting ensemble
vot_soft = VotingClassifier(estimators=models, voting='soft')
scores = cross_val_score(vot_soft, X_train, y_train, cv=5)
vot_soft.fit(X_train, y_train)
y_pred = vot_soft.predict(X_test)
print("Accuracy: ", scores.mean())

# Persist the trained ensemble for the Flask app
pkl_filename = 'Crop_Recommendation.pkl'
with open(pkl_filename, 'wb') as Model_pkl:
    pickle.dump(vot_soft, Model_pkl)
app.py
from flask import Flask, render_template, request, Markup
import pandas as pd
import os
import numpy as np
import pickle
from keras.models import load_model
from keras.preprocessing import image

# fertilizer_dict maps N/P/K status keys such as "NHigh" to advice strings;
# it is defined elsewhere in the project (module name assumed here)
from fertilizer_dict import fertilizer_dict

# CNN model for pest identification
classifier = load_model('Trained_model.h5')
classifier._make_predict_function()

# Trained voting-classifier ensemble for crop recommendation
crop_recommendation_model_path = 'Crop_Recommendation.pkl'
with open(crop_recommendation_model_path, 'rb') as model_file:
    crop_recommendation_model = pickle.load(model_file)

app = Flask(__name__)

@app.route('/fertilizer-predict', methods=['POST'])
def fertilizer_recommend():
    crop_name = str(request.form['cropname'])
    N_filled = int(request.form['nitrogen'])
    P_filled = int(request.form['phosphorous'])
    K_filled = int(request.form['potassium'])

    # Look up the ideal N, P, K values for the chosen crop (column names assumed)
    df = pd.read_csv('Data/Crop_NPK.csv')
    N_desired = df[df['Crop'] == crop_name]['N'].iloc[0]
    P_desired = df[df['Crop'] == crop_name]['P'].iloc[0]
    K_desired = df[df['Crop'] == crop_name]['K'].iloc[0]

    n = N_desired - N_filled
    p = P_desired - P_filled
    k = K_desired - K_filled

    if n < 0:
        key1 = "NHigh"
    elif n > 0:
        key1 = "Nlow"
    else:
        key1 = "NNo"

    if p < 0:
        key2 = "PHigh"
    elif p > 0:
        key2 = "Plow"
    else:
        key2 = "PNo"

    if k < 0:
        key3 = "KHigh"
    elif k > 0:
        key3 = "Klow"
    else:
        key3 = "KNo"

    abs_n = abs(n)
    abs_p = abs(p)
    abs_k = abs(k)

    response1 = Markup(str(fertilizer_dict[key1]))
    response2 = Markup(str(fertilizer_dict[key2]))
    response3 = Markup(str(fertilizer_dict[key3]))
    # Template name assumed; the source listing does not show the return statement
    return render_template('Fertilizer-Result.html', recommendation1=response1,
                           recommendation2=response2, recommendation3=response3,
                           diff_n=abs_n, diff_p=abs_p, diff_k=abs_k)

def pred_pest(pest):
    try:
        test_image = image.load_img(pest, target_size=(64, 64))
        test_image = image.img_to_array(test_image)
        test_image = np.expand_dims(test_image, axis=0)
        result = classifier.predict_classes(test_image)
        return result
    except:
        return 'x'

@app.route("/")
@app.route("/index.html")
def index():
    return render_template("index.html")

@app.route("/CropRecommendation.html")
def crop():
    return render_template("CropRecommendation.html")

@app.route("/FertilizerRecommendation.html")
def fertilizer():
    return render_template("FertilizerRecommendation.html")

@app.route("/PesticideRecommendation.html")
def pesticide():
    return render_template("PesticideRecommendation.html")

@app.route("/predict", methods=['POST'])  # route path assumed
def predict():
    if request.method == 'POST':
        file = request.files['image']
        filename = file.filename
        file_path = os.path.join('static/user_uploaded', filename)  # upload folder assumed
        file.save(file_path)
        pred = pred_pest(pest=file_path)
        if pred == 'x':
            return render_template('unaptfile.html')
        if pred[0] == 0:
            pest_identified = 'aphids'
        elif pred[0] == 1:
            pest_identified = 'armyworm'
        elif pred[0] == 2:
            pest_identified = 'beetle'
        elif pred[0] == 3:
            pest_identified = 'bollworm'
        elif pred[0] == 4:
            pest_identified = 'earthworm'
        elif pred[0] == 5:
            pest_identified = 'grasshopper'
        elif pred[0] == 6:
            pest_identified = 'mites'
        elif pred[0] == 7:
            pest_identified = 'mosquito'
        elif pred[0] == 8:
            pest_identified = 'sawfly'
        elif pred[0] == 9:
            pest_identified = 'unknown'  # the class-9 label is missing from the source listing
        return render_template(pest_identified + '.html')

@app.route('/crop_prediction', methods=['POST'])
def crop_prediction():
    if request.method == 'POST':
        N = int(request.form['nitrogen'])
        P = int(request.form['phosphorous'])
        K = int(request.form['potassium'])
        ph = float(request.form['ph'])
        rainfall = float(request.form['rainfall'])
        temperature = float(request.form['temperature'])
        humidity = float(request.form['humidity'])
        # Feature order must match the order used to train the model
        data = np.array([[N, P, K, temperature, humidity, ph, rainfall]])
        my_prediction = crop_recommendation_model.predict(data)
        final_prediction = my_prediction[0]
        # Template name assumed; the source listing does not show the return statement
        return render_template('crop-result.html', prediction=final_prediction)

if __name__ == '__main__':
    app.run(debug=True)
cnn_model.py
import h5py  # needed by Keras to save models in HDF5 format
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.preprocessing.image import ImageDataGenerator

# Part 1 - Building the CNN
classifier = Sequential()

# Step 1 - Convolution (three convolutional layers, as described in the text;
# the filter counts are assumed, since they are missing from the source listing)
classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Conv2D(64, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Conv2D(128, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))

# Step 3 - Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(256, activation='relu'))
classifier.add(Dropout(0.5))
# Output layer: one unit per pest class (10 classes, matching app.py)
classifier.add(Dense(10, activation='softmax'))

classifier.compile(
    optimizer = 'adam',
    loss = 'categorical_crossentropy',
    metrics = ['accuracy'])

# Part 2 - Fitting the CNN to the images
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory(
    'Data/train',
    target_size=(64, 64),
    batch_size=32,
    class_mode='categorical')
test_set = test_datagen.flow_from_directory(
    'Data/test',
    target_size=(64, 64),
    batch_size=32,
    class_mode='categorical')

model = classifier.fit_generator(
    training_set,
    steps_per_epoch=100,
    epochs=100,
    validation_data = test_set,
    validation_steps = 6500)

classifier.save('Trained_Model.h5')

# summarize history for accuracy
print(model.history.keys())
plt.plot(model.history['acc'])
plt.plot(model.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.show()

# summarize history for loss
plt.plot(model.history['loss'])
plt.plot(model.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.show()
An SVM model is a representation of the examples as points in space, mapped so
that the examples of the separate categories are divided by a clear gap that is as wide as
possible. In addition to performing linear classification, SVMs can efficiently perform
non-linear classification by implicitly mapping their inputs into high-dimensional feature
spaces.
Given a set of training examples, each marked as belonging to one of two
categories, an SVM training algorithm builds a model that assigns new examples to one
category or the other, making it a non-probabilistic binary linear classifier.
In our project, for the SVM algorithm we used a polynomial kernel (kernel='poly')
with degree values of 1, 2, 3, 4 and 5 to generate models.
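A minimal sketch of this degree sweep, using scikit-learn's SVC on the bundled iris dataset as a stand-in (the project itself trains on the crop dataset):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in data; the project loads crop_recommendation.csv

# Try polynomial kernels of increasing degree, as in the project
for degree in [1, 2, 3, 4, 5]:
    clf = SVC(kernel='poly', degree=degree)
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"degree={degree}: mean CV accuracy {score:.3f}")
```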
Fig: 4.3. Example of SVM algorithm
In our project, for the random forest algorithm we used 21 estimators (decision
trees) to generate the model.
Fig: 4.4. Example of RANDOM FOREST algorithm
In our project, for the KNN algorithm we used values of n = 1, 3, 5, 7 and 9 to generate models.
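A minimal sketch of this sweep over n, again using scikit-learn and the bundled iris dataset as a stand-in for the crop data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # stand-in data; the project loads crop_recommendation.csv

# Try the same odd neighbour counts as the project
for n in [1, 3, 5, 7, 9]:
    knn = KNeighborsClassifier(n_neighbors=n)
    score = cross_val_score(knn, X, y, cv=5).mean()
    print(f"n_neighbors={n}: mean CV accuracy {score:.3f}")
```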
Fig: 4.5. Example of KNN algorithm
VOTING CLASSIFIER
1. Hard Voting: In hard voting, the predicted output class is the class that
receives the majority of the votes, i.e. the class predicted most often by the
individual classifiers. Suppose three classifiers predict the output classes
(A, A, B); the majority predicted A, so A will be the final prediction.
2. Soft Voting: In soft voting, the output class is the prediction based on the
average of the probabilities given to that class. Suppose, for some input, three
models give prediction probabilities for class A of (0.30, 0.47, 0.53) and for class B
of (0.20, 0.32, 0.40). The average for class A is 0.4333 and for B it is 0.3067; the
winner is class A because it has the highest averaged probability.
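The averaging above can be reproduced directly (the probabilities are the ones quoted in the text):

```python
# Per-classifier probabilities for each class, as quoted above
prob_A = [0.30, 0.47, 0.53]
prob_B = [0.20, 0.32, 0.40]

avg_A = sum(prob_A) / len(prob_A)  # 0.4333...
avg_B = sum(prob_B) / len(prob_B)  # 0.3066...

winner = 'A' if avg_A > avg_B else 'B'
print(f"average A = {avg_A:.4f}, average B = {avg_B:.4f}, winner = {winner}")
```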
Now we combine the models obtained from the three classifiers above: Random Forest,
KNN and SVM.
In our project we used soft voting, which means the output class is predicted based on
the average of the probabilities given to that class.
In our project we use convolutional neural networks to identify the type of
insect when an image of an insect is uploaded, because we can predict the pesticides for an
insect only after knowing which kind of insect it is.
We used 3 convolutional layers in our project.
A convolutional neural network has multiple hidden layers that help in extracting
information from an image.
The four important layers in CNN are:
1. Convolution layer
2. ReLU layer
3. Pooling layer
4. Fully connected layer
Convolution Layer
This is the first step in the process of extracting valuable features from an image. A
convolution layer has several filters that perform the convolution operation. Every image is
considered as a matrix of pixel values.
ReLU layer
ReLU stands for the rectified linear unit. Once the feature maps are extracted, the next
step is to move them to a ReLU layer.
ReLU performs an element-wise operation and sets all the negative pixels to 0. It
introduces non-linearity to the network, and the generated output is a rectified feature map.
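The element-wise rectification is one line in NumPy: every negative entry of the feature map becomes 0, and positives pass through unchanged (the feature-map values here are made up):

```python
import numpy as np

# A small, made-up feature map with some negative activations
feature_map = np.array([[-2.0, 1.5],
                        [0.0, -0.5]])

rectified = np.maximum(feature_map, 0)  # element-wise max(x, 0), i.e. ReLU
print(rectified)
```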
Pooling Layer
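Max pooling, used in this project via MaxPooling2D(pool_size=(2, 2)), down-samples the rectified feature map by keeping only the largest activation in each window. A minimal NumPy sketch on a made-up 4x4 feature map:

```python
import numpy as np

# A small, made-up feature map
fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 1],
                 [0, 1, 5, 2],
                 [2, 2, 3, 4]])

# 2x2 max pooling with stride 2: take the max of each non-overlapping 2x2 window
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled.tolist())  # [[4, 2], [2, 5]]
```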
Fig: Example for CNN algorithm
5. SYSTEM TESTING
&
SCREEN SHOTS
CHAPTER 5.
SYSTEM TESTING
5.1 INTRODUCTION
In a generalized way, we can say that system testing is a type of testing whose
main aim is to make sure that the system performs efficiently and seamlessly. The process of
testing is applied to a program with the main aim of discovering undetected errors, errors
which could otherwise have damaged the future of the software. A test case that brings up a
high possibility of discovering an error is considered successful. Such a successful test helps
to reveal errors that are still unknown.
TESTING METHODOLOGIES
A test plan is a document which describes the approach, scope, resources and
schedule of the intended testing activities. It helps to identify each test item, the features
which are to be tested, the testing tasks, who will perform each task, how independent the
tester is, the environment in which the test takes place, the test design technique, the entry
and exit criteria to be used together with the rationale for their choice, and any risks that
require contingency planning. It can also be referred to as the record of the test-planning
process. Test plans are usually prepared with significant input from test engineers.
5.1.1 UNIT TESTING:
Unit testing involves the design of test cases that help validate the internal
program logic; all decision branches and internal code are validated. It takes place after an
individual unit is completed, before integration. The unit test thus performs a basic-level
test at the component stage and tests a particular business process, system configuration,
etc. It ensures that each unique path of the process performs precisely to the documented
specifications and contains clearly defined inputs with expected results.
5.1.2 INTEGRATION TESTING:
These tests are designed to test integrated software items to determine whether they
really execute as a single program or application. The testing is event driven and is thus
concerned with the basic outcome of screens and fields. Integration tests demonstrate that,
although the components were individually satisfactory (as already shown by successful unit
testing), their combination is also apt and sound. This type of testing is specifically aimed at
exposing the issues that come up when components are combined.
The following are the types of Integration Testing:
In bottom-up testing, each module at the lower levels is tested with higher-level
modules until all modules are tested. The primary purpose of this integration testing is
to test the interfaces among the various modules making up each subsystem. This
integration testing uses test drivers to drive and pass appropriate data to the lower-level
modules.
Advantages:
In bottom-up testing, no stubs are required.
A principal advantage of this integration testing is that several disjoint
subsystems can be tested simultaneously.
Disadvantages:
Driver modules must be produced.
This testing becomes complex when the system is made up of a large number of
small subsystems.
5.1.3 FUNCTIONAL TESTING
The functional tests help in providing a systematic demonstration that the functions
tested are available as specified by the technical requirements, the system documentation and
the user manual.
5.1.4 WHITE BOX TESTING:
White box testing is the type of testing in which the internal components of the
software are open to and can be examined by the tester. It is therefore a complex type of
testing process. All the data structures, components, etc. are tested by the tester to find
possible bugs or errors. It is used in situations in which black box testing is incapable of
finding a bug. It is a complex type of testing which takes more time to apply.
5.1.5 BLACK BOX TESTING:
Black box testing is the type of testing in which the internal components of the
software are hidden, and only the inputs and outputs of the system are available to the tester
for finding a bug. It is therefore a simpler type of testing; a programmer with basic knowledge
can also perform it. It is less time-consuming than white box testing. It is very successful for
software that is less complex and straightforward in nature. It is also less costly than white
box testing.
5.1.6 Acceptance Testing:
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.
TEST CASE
Testing, as explained earlier, is the process of discovering all possible weak points in
the finalized software product. Testing exercises the working of sub-assemblies, components,
assemblies and the complete product. The software is taken through these exercises with the
main aim of making sure that it meets the business requirements and user expectations and
does not fail abruptly. Several types of tests are used today, and each test type addresses a
specific testing requirement.
Advantages (of top-down testing):
Modules are debugged separately.
Few or no drivers are needed.
It is more stable and accurate at the aggregate level.
Disadvantages:
Many stubs are needed.
Modules at lower levels are tested inadequately.
Test case 2 - Accepting pH value. Input: pH = 110. Expected output: value should not be
accepted. Actual output: value not accepted. Result: pass.
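The rejection of pH = 110 in the test case above corresponds to a simple server-side range check. A hypothetical sketch (the function name is an assumption; the bounds follow from the chemical pH scale of 0 to 14):

```python
def is_valid_ph(value):
    """Accept only values on the chemical pH scale (0 to 14)."""
    try:
        ph = float(value)
    except (TypeError, ValueError):
        return False  # reject non-numeric input outright
    return 0.0 <= ph <= 14.0

print(is_valid_ph(6.5))  # True  (a typical soil pH)
print(is_valid_ph(110))  # False (the rejected input from the test case)
```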
Test case - Recommending a fertilizer based on input values. Input: crop you want to
grow = coconut.
5.2 SCREEN SHOTS
5.2.2 HOME PAGE
5.2.3 TAKING INPUTS FOR CROP PREDICTION
5.2.4 CROP PREDICTED
5.2.5 TAKING INPUTS FOR CROP PREDICTION
5.2.6. CROP PREDICTED
5.2.7. TAKING INPUTS FOR CROP PREDICTION
5.2.8. CROP PREDICTED
5.2.9. TAKING INPUT TO PREDICT FERTILIZERS
5.2.10. Suggestions
5.2.11. Suggestions
5.2.12. Information about fertilizers.
5.2.13. Suggestions about fertilizers.
5.2.14: PREDICTING PESTICIDE
5.2.16: IDENTIFIED PESTICIDE
5.2.17: TAKING INPUT TO PREDICT PESTICIDES
5.2.19: IDENTIFIED PEST
Enter Required values for manual analysis:
Temperature Requirement:
6. CONCLUSION
&
FUTURE SCOPE
Chapter 6.
CONCLUSION AND FUTURE SCOPE
6.1 CONCLUSION
The proposed work presents a crop prediction framework using a voting classifier,
which is an ensemble of models. In our project the voting classifier ensembles the models
obtained from SVM, Random Forest and KNN, and predicts the crop with greater accuracy.
In this way the framework will help reduce the challenges faced by farmers and discourage
them from attempting suicide. It will act as a medium to give farmers the effective
information needed to get high yields and thereby maximize profits, which in turn will
reduce suicide rates and lessen their hardships.
This will lead to an increase in the country's overall profit. In our project we found
that accurate prediction of the yields of different specified crops across different districts
will help farmers, who can then plant different crops in different districts. In the near future,
geospatial analysis can be added to improve accuracy and to incorporate better geographical
data.
7. BIBLIOGRAPHY
&
REFERENCES
CHAPTER 7.
BIBLIOGRAPHY AND REFERENCES
[1] Mayank Champaneri, Chaitanya Chandvidkar, Darpan Chachpara, Mansing Rathod,
"Crop Yield Prediction Using Machine Learning", International Journal of Science and
Research, April 2020.
[2] Pavan Patil, Virendra Panpatil, Prof. Shrikant Kokate, "Crop Prediction System using
Machine Learning Algorithms", International Research Journal of Engineering and
Technology, Feb 2020.
[3] Ramesh Medar, Shweta, Vijay S. Rajpurohit, "Crop Yield Prediction using Machine
Learning Techniques", 5th International Conference for Convergence in Technology, 2019.
[4] Trupti Bhange, Swati Shekapure, Komal Pawar, Harshada Choudhari, "Survey Paper on
Prediction of Crop Yield and Suitable Crop", International Journal of Innovative Research in
Science, Engineering and Technology, May 2019.
[6] Nishit Jain, Amit Kumar, Sahil Garud, Vishal Pradhan, Prajakta Kulkarni, "Crop
Selection Method Based on Various Environmental Factors Using Machine Learning",
International Research Journal of Engineering and Technology (IRJET), Feb 2017.
[7] Rakesh Kumar, M. P. Singh, Prabhat Kumar, J. P. Singh, "Crop Selection Method to
Maximize Crop Yield Rate using Machine Learning Technique", 2015 International
Conference on Smart Technologies and Management for Computing, Communication,
Controls, Energy and Materials (ICSTM), Vel Tech Rangarajan Dr. Sagunthala R&D Institute
of Science and Technology, Chennai, T.N., India, May 2015.
[8] Rajshekhar Borate, "Applying Data Mining Techniques to Predict Annual Yield of Major
Crops and Recommend Planting Different Crops in Different Districts in India", International
Journal of Novel Research in Computer Science and Software Engineering, Vol. 3, Issue 1,
pp. 34-37, April 2016.
[9] D. Ramesh, B. Vishnu Vardhan, "Analysis of Crop Yield Prediction using Data Mining
Techniques", International Journal of Research in Engineering and Technology (IJRET),
Vol. 4, 2015.
[11] Igor Oliveira, Renato L. F. Cunha, Bruno Silva, Marco A. S. Netto, "A Scalable
Machine Learning System for Pre-Season Agriculture Yield Forecast", IEEE, 2018.
[12] Neha Rale, Raxitkumar Solanki, Doina Bein, James Andro-Vasko, Wolfgang Bein,
"Prediction of Crop Cultivation", IEEE, 2019.
[13] Md. Tahmid Shakoor, Karishma Rahman, Sumaiya Nasrin Rayta, Amitabha
Chakrabarty, "Agricultural Production Output Prediction Using Supervised Machine
Learning Techniques", IEEE, 2017.
[14] G. Srivatsa Sharma, Shah Nawaz Mandal, Shruti Kulkarni, Monica R. Mundada,
Meeradevi, "Predictive Analysis to Improve Crop Yield Using a Neural Network Model",
IEEE, 2018.
[15] Rashmi Priya, Dharavath Ramesh, "Crop Prediction on the Region Belts of India:
A Naïve Bayes MapReduce Precision Agricultural Model", IEEE, 2018.