
Xavier Institute of Engineering

Department of Computer Engineering

(Academic Year: 2022-23 Semester: V)

Class: TE

Subject: Data Warehousing and Mining

Experiment No. 11

Aim Write and present a report on any algorithm in the team

CO CO6
Name of the Student Yashika Gupta

Roll No./ Batch 26/B

Date of Performance

Date of submission

Marks Awarded

Signature of the Teacher:

Rubrics used for Laboratory Evaluation:

                                        Below Expectations   Average   Good
Knowledge (4)                           2                    3         4
Performance (5)                         2                    3         5
Neatness of Documentation (3)           1                    2         3
Punctuality & Submission on Time (3)    1                    2         3

Table of Contents
ABSTRACT
1. INTRODUCTION
2. PROBLEM STATEMENT
3. PROPOSED SYSTEM
DATA MINING IN AGRICULTURE
3.1 Classification Techniques
3.2 Bayesian Network
4. CONTRIBUTION
5. CONCLUSION
6. REFERENCES
ABSTRACT

Agriculture is the most fundamental activity for meeting the world's demand for
food; it is a backbone of the economy, particularly in developing countries like
India. Current technologies can improve decision-making so that farmers can
produce more. Data mining plays a primary role in decision-making in agricultural
domains. Python is used to analyze the agricultural dataset, and PyCharm is the
development environment used to build the models that predict agricultural
production. The parameters included in the dataset are soil, crop, irrigation,
fertilizer, temperature, and rainfall. Data mining techniques such as K-means
clustering and Bayesian network algorithms are considered to be accurate algorithms.
1. INTRODUCTION

Today, data mining is used in many areas, and many data mining tools, techniques, and
procedures are available. Data mining application software is widely accessible, but
data mining on agricultural datasets is a relatively new field of research. Nowadays,
data mining concepts and techniques are being used to solve agricultural problems.
Globally, the demand for food increases day by day; therefore, agricultural scientists,
farmers, governments, and researchers are applying numerous techniques in agriculture to
improve production. As a result, the amount of data generated in the field of agriculture
grows day by day. As the volume of data increases, there is a need for an automatic way
to extract and analyze this data when needed. Data mining can analyze versatile data;
there are no restrictions on the data type. Data mining is the process of analyzing
hidden patterns in data from various perspectives and classifying them into relevant
information, with the data organized in a common area such as a data repository.
Efficient analysis using data mining techniques helps farmers make decisions. This
information helps them reduce costs and increase the production rate. The data mining
process includes the following steps: extracting, transforming, and loading data into a
repository, and managing the data in multidimensional databases.

2. PROBLEM STATEMENT

Existing systems typically consider only one to three parameters and base their yield
forecast on those parameters alone. Many existing systems only provide information on
crops and do not predict crop yields at all. Therefore, a system needs to be developed
that not only provides farmers with information about crops but also predicts the
best-suited crop for a particular piece of land.
3. PROPOSED SYSTEM

Agriculture is the broadest economic sector and is the main source of income for many
people in India. Agriculture depends on many climatic and economic factors. These
factors include soil, climate, cultivation, irrigation, fertilizers, temperature, and
rainfall. The problem in today's farming is the lack of knowledge about climatic trends
and patterns and the lack of scientific knowledge about the soil, both of which affect
crop yield. The optimum crop prediction system aims to take all these factors into
consideration and provide a central platform where farmers can analyze optimum solutions
for their crop yield. This system will focus on analyzing the agriculture data and
finding optimal parameters to maximize crop production using the ID3 algorithm.
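
To make the ID3 step concrete, the following is a minimal sketch of how ID3 picks the
attribute to split on: it computes the information gain (entropy reduction) of each
attribute and recurses on the best one. The records, the attribute names (soil, rainfall),
and the target name (yield) are hypothetical toy values chosen for illustration, not the
dataset used in this report.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def id3(rows, attributes, target):
    """Build a decision tree as nested dicts; leaves are class labels."""
    labels = [row[target] for row in rows]
    # Stop when the node is pure or no attributes are left: return the majority label.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: id3([r for r in rows if r[best] == value], remaining, target)
            for value in sorted({r[best] for r in rows})
        }
    }


# Hypothetical toy records (assumption: not taken from the report's dataset).
FIELDS = [
    {"soil": "loamy", "rainfall": "high", "yield": "good"},
    {"soil": "loamy", "rainfall": "low", "yield": "poor"},
    {"soil": "sandy", "rainfall": "high", "yield": "good"},
    {"soil": "sandy", "rainfall": "low", "yield": "poor"},
]

tree = id3(FIELDS, ["soil", "rainfall"], "yield")
print(tree)
```

In this toy data the yield label follows rainfall exactly, so rainfall has the highest
information gain and becomes the root of the tree, while soil carries no information.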

DATA MINING IN AGRICULTURE


Data mining is the process of discovering new patterns in large datasets. It offers a
great advantage in agriculture for disease detection, problem prediction, and
optimization of pesticide use. Recent technologies used in agricultural activities
generate a lot of information, and data mining techniques are applied to this data for
the recognition and detection of diseases. In data mining, agricultural data can be
presented in the form of a data mart. Reliable and timely crop production forecasts are
required for various decisions on marketing, pricing, storage, distribution, and
import-export. Crop productivity depends mainly on diseases, parasites, weather
conditions, and crop planning, so predictions based on these factors are very useful in
agricultural domains. Data mining techniques make such forecasts available before
harvest. For example, by applying data mining techniques, the government can take full
advantage of the data on farmers' buying patterns and also gain an understanding of their
land, in order to protect farmers and help them earn more profit. Data mining is also
known as knowledge discovery in databases (KDD).
3.1 Classification Techniques:
SVM Classification: The Support Vector Machine algorithm is an important data analysis
methodology and is used for both classification and regression. Each data item is plotted
as a point in n-dimensional space (where n is the number of features), with the value of
each feature as the value of one coordinate. The classification is done by finding the
hyperplane that best separates the classes.
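
As an illustration of "finding the hyperplane", here is a minimal sketch of a linear SVM
trained by sub-gradient descent on the regularised hinge loss. This is a simplification
chosen for readability, not the method used in this report; in practice a library
implementation would normally be used. The two features are assumed to be standardized
values (for example rainfall and temperature), and the data points, learning rate `lr`,
regularisation strength `lam`, and number of epochs are all hypothetical.

```python
def train_linear_svm(points, labels, lr=0.1, lam=0.01, epochs=100):
    """Fit w and b of the hyperplane w.x + b = 0. Labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is misclassified or inside the margin:
                # move the hyperplane so that it classifies this point better.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Point is safely classified: only apply the regularisation shrinkage.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical standardized samples: +1 = suitable for the crop, -1 = not suitable.
TRAIN_POINTS = [(2.0, 2.0), (-2.0, -2.0), (3.0, 2.0), (-3.0, -2.0), (2.0, 3.0), (-2.0, -3.0)]
TRAIN_LABELS = [1, -1, 1, -1, 1, -1]

w, b = train_linear_svm(TRAIN_POINTS, TRAIN_LABELS)
print(predict(w, b, (4.0, 4.0)), predict(w, b, (-4.0, -4.0)))
```

The learned weights `w` define the orientation of the separating hyperplane and `b` its
offset; a new sample is classified by the sign of `w.x + b`.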

3.2 Bayesian Network


A Bayesian network is a directed acyclic graph consisting of nodes and arcs, in which
each node represents a unique random variable and each directed edge represents a
conditional dependency. It is a probabilistic graphical model that uses Bayesian
inference for its calculations.

KNN (K-Nearest Neighbours) Classifier:
K-Nearest Neighbors is one of the classification algorithms in Machine Learning. It is a
supervised learning model, and it is also called a lazy algorithm because it is
instance-based: it stores the training data instead of building a model in advance. It is
used for various applications such as pattern recognition, data mining, and intrusion
detection. The implementation is simple for small data sets, and the algorithm does not
require any prior knowledge of the structure of the data. The disadvantage of this
classifier is the cost of finding the closest neighbors for each sample. A lot of space
is needed when the training data is large, and the distance between each test sample and
every training sample must be calculated, so testing takes a long time.

K-Means Clustering:
K-means clustering is a clustering method that partitions a set of data points into a
small number of clusters. For example, the items in a shopping mall are grouped into
several categories (dresses are grouped by size into medium, large, and XL). Grouping by
category in this way is a qualitative way of dividing the data; K-means instead takes a
quantitative approach, using the measured characteristics of the items. The data points
must be divided into k clusters, and the goal of the method is to assign a cluster to
each data point. The K-means algorithm aims to find the positions of the cluster centres
that minimize the distance from the data points to the centre of their cluster.
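
The K-means procedure described above can be sketched in a few lines of plain Python.
This is a minimal illustration, assuming Euclidean distance and using the first k points
as the initial cluster centres so that the run is reproducible; real implementations
usually choose the initial centres randomly or with a smarter seeding scheme. The sample
points are hypothetical two-feature measurements, not data from this report.

```python
from math import dist


def kmeans(points, k, max_iter=100):
    """Assign each point to one of k clusters and return (centres, assignment)."""
    # Assumption: the first k points serve as the initial cluster centres.
    centres = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest centre.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centres[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the algorithm has converged
        assignment = new_assignment
        # Update step: move each centre to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centres, assignment


# Hypothetical samples with two numeric features each.
SAMPLES = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

centres, assignment = kmeans(SAMPLES, k=2)
print(assignment)
print(centres)
```

The two alternating steps, assigning points to the nearest centre and then moving each
centre to the mean of its points, are repeated until the assignment stops changing.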
4. CONTRIBUTION

Everyone did the required amount of research and later split the different parts to create
this report.
Abstract - Moses Fernandes
Problem statement - Shivam Goswami
Solution to problem - John Baby
Conclusion - Yashika Gupta
References - Chris George

5. CONCLUSION

In this case study, we saw how some common data mining techniques, such as K-means
clustering and Bayesian classification, are used in agriculture. Data mining in
agriculture is an emerging field of research. Efficient techniques have been developed,
and more can be developed, to solve complex agricultural problems through data mining.
The future improvement of this agricultural analysis is to predict crop yields using
these techniques, which is useful for crop decision-making by farmers and government
organizations. In the future, the ANN and NN classification approaches can be used to
improve the classification performance of crop yield prediction. Understanding the
complex annual and seasonal weather patterns that determine yields helps farmers and
other decision-makers predict the effects of drought and other climatic conditions.
6. REFERENCES
[1]. Ramesh D, Vishnu Vardhan B. Data mining techniques and applications to
agricultural yield data. International Journal of Advanced Research in Computer
and Communication Engineering, Vol. 2, Issue 9, September 2013.
[2]. Suhas M Patil, Sakkaravarthi R. Internet of things based Smart agriculture
system using predictive analytics. Asian Journal of Pharmaceutical and Clinical
Research. Received: 23 January 2017, Revised and Accepted: 03 March 2017.
[3]. Saima Humma, Anil Kumar Mishra. Review on Analysis of Agriculture Data
Using Data Mining Techniques. International Journal for Research in Applied
Science & Engineering Technology (IJRASET), ISSN: 2321-9653; IC Value: 45.98;
SJ Impact Factor: 6.887, Volume 6, Issue IV, April 2018.
[4]. B. Milović and V. Radojević. Application of Data Mining in Agriculture.
Bulgarian Journal of Agricultural Science, 21 (No 1), 2015, 26-34. Agricultural
Academy.
[5]. Vanitha CN, Archana N, Sowmiya R. Agriculture Analysis Using Data Mining
And. 2019 5th International Conference on Advanced Computing & Communication
Systems (ICACCS).
