Assignment – 1
Introduction:
Ratnagiri is a port city on the Arabian Sea coast in southwestern Maharashtra, India.
The district is part of Maharashtra's Konkan region. Rice is the primary crop grown in
Ratnagiri and plays a significant role in the local population's daily diet. Because of their
proximity to the Sahyadri ranges, which receive ample rainfall, the talukas of Khed,
Rajapur, Sangameshwar, Chiplun, and others grow rice widely. The analysis below examines the
crop production pattern from 1997 to 2014 and forecasts future crop production for
Ratnagiri District, Maharashtra.
Forecasting Model:
ARIMA model:
ARIMA is an acronym for autoregressive integrated moving average. It is a
statistical analysis model that uses time series data either to forecast future trends
or to provide a better understanding of the current data set.
The ARIMA model is typically written as ARIMA(p, d, q), and the definitions of
p, d, and q are as follows:
p: the lag order, i.e. the number of lagged observations in the autoregressive model AR(p)
d: the degree of differencing, i.e. the number of times past values have been
subtracted from the data
q: the order of the moving average model MA(q), i.e. the number of lagged forecast errors
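To make the roles of p, d, and q concrete, the following is a minimal sketch that uses only base R. The simulated series and its coefficients (0.5 and 0.3) are made-up values chosen for illustration and are not taken from the Ratnagiri data.

```r
# Toy illustration of the p, d, q orders using only base R (stats package).
set.seed(42)  # reproducible simulated data

# Simulate from an ARIMA(1,1,1) process with made-up coefficients.
y <- arima.sim(model = list(order = c(1, 1, 1), ar = 0.5, ma = 0.3), n = 200)

# d = 1: differencing once replaces each value with its change
# from the previous value, y[t] - y[t-1].
dy <- diff(y, differences = 1)

# Fit ARIMA(p = 1, d = 1, q = 1) to the undifferenced series;
# arima() applies the differencing internally.
fit <- arima(y, order = c(1, 1, 1))
print(coef(fit))  # estimated AR(1) and MA(1) coefficients
```

Here the fitted model has one autoregressive term (p = 1), one round of differencing (d = 1), and one moving average term (q = 1).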
Objective:
Data Collection:
The raw data on state-wise crop production in India was collected from
public data-hosting websites such as GitHub and Kaggle.
The data contains state-wise and district-wise information about crop production.
It also includes yearly production figures for different crops, such as
fruits, bajra, rice, coconut, and sugarcane.
Crop production is also recorded for each season: Kharif, Rabi, and
summer.
Since our analysis concerns the Ratnagiri District of Maharashtra, we
filtered the data in Excel and, after data cleaning, carried out further analysis only
on the required data.
The final data for Ratnagiri District in Maharashtra was imported into RStudio for
data analysis.
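The filtering step was done in Excel, but a similar filter could be sketched in R as below. The column names (State_Name, District_Name, Crop_Year, Season, Crop, Production) and all values in this toy table are assumptions made for illustration; the actual file may be laid out differently.

```r
# Hypothetical column names and made-up values, for illustration only.
raw <- data.frame(
  State_Name    = c("Maharashtra", "Maharashtra", "Maharashtra", "Kerala"),
  District_Name = c("RATNAGIRI", "RATNAGIRI", "PUNE", "KOLLAM"),
  Crop_Year     = c(1997, 1997, 1997, 1997),
  Season        = c("Kharif", "Rabi", "Kharif", "Kharif"),
  Crop          = c("Rice", "Rice", "Rice", "Coconut"),
  Production    = c(100, 20, 50, 70),
  stringsAsFactors = FALSE
)

# Keep only the rice records for Ratnagiri.
ratnagiri_rice <- subset(raw, District_Name == "RATNAGIRI" & Crop == "Rice")

# Basic cleaning: drop rows with missing production, then sort by year.
ratnagiri_rice <- ratnagiri_rice[!is.na(ratnagiri_rice$Production), ]
ratnagiri_rice <- ratnagiri_rice[order(ratnagiri_rice$Crop_Year), ]
print(ratnagiri_rice)
```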
R code for predicting the rice production for the next 12 periods:
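# Load packages up front: forecast for auto.arima()/forecast(), ggplot2 for autoplot()
library(forecast)
library(ggplot2)
CP=Crop_Production   # imported Ratnagiri data; must contain a `Crop Production` column
CP
plot(CP)
CP=log(CP)           # log transform; assumes every column is numeric and positive
plot(CP)
# Half-yearly series (two seasons per year); assign the result so it can be plotted
CP_ts=ts(CP$`Crop Production`,frequency = 2,start = c(1997,1),end = c(2014,1))
plot(CP_ts)
# The model is fitted to the plain log-scale column
model=auto.arima(CP$`Crop Production`)
model
# Residual diagnostics
acf(model$residuals, main="Correlogram")
pacf(model$residuals, main="Partial Correlogram")
FC=forecast(model,12)   # 12 half-yearly periods = 6 years
FC
autoplot(FC)
accuracy(FC)
fc=as.data.frame(FC)
ds1=exp(fc$`Point Forecast`)   # back-transform from the log scale
ds1
ts(ds1,frequency = 2,start=c(2015,1))
# Forecast values tabulated by season
year=c(2015,2016,2017,2018,2019,2020)
Kharif=c(2462.263,7057.439,10654.286,12516.761,13330.907,13663.581)
Rabi=c(40944.759,21192.705,16379.974,14809.969,14237.635,14019.833)
FP=data.frame(year,Kharif,Rabi)
FP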
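The back-transformation and the per-season table can also be built programmatically rather than typed in by hand. The sketch below uses made-up forecast values and assumes that the half-yearly periods simply alternate between two seasons, here called season_a and season_b (hypothetical names); which season comes first depends on how the input data is ordered.

```r
# Made-up log-scale forecasts for 4 half-yearly periods (2 years).
log_fc <- log(c(100, 400, 150, 350))

# Undo the log transform to get production in the original units.
fc_values <- exp(log_fc)

# Assumption: periods alternate season A, season B, starting with season A.
# Filling a 2-column matrix row by row puts each season in its own column.
by_season <- matrix(fc_values, ncol = 2, byrow = TRUE)
fp <- data.frame(year = 2015:2016,
                 season_a = by_season[, 1],
                 season_b = by_season[, 2])
print(fp)
```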
Autocorrelation measures the degree of similarity between a time series and a lagged
version of itself over successive time intervals.
On the above plot, the only significant correlation is at lag 0, which always equals 1 because a
series is perfectly correlated with itself; the correlations at later lags are not
significant.
The partial autocorrelations of the residuals lie between about -0.3 and 0.3, which suggests that the model is a good fit.
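As a rough illustration of what the correlogram reports, the sketch below computes the lag-1 autocorrelation by hand, compares it with acf(), and calculates the approximate 95% bounds that R draws as dashed lines. The series x is random noise standing in for model residuals, not the actual residuals; the Ljung-Box test is an additional check not used in the analysis above.

```r
set.seed(1)
x <- rnorm(40)   # made-up stand-in for model residuals
n <- length(x)
xc <- x - mean(x)

# Lag-1 autocorrelation by hand: lagged cross-products over the sum of squares.
r1_manual <- sum(xc[-1] * xc[-n]) / sum(xc^2)

# Same quantity from acf(); element 1 is lag 0 (always 1), element 2 is lag 1.
r_acf <- acf(x, plot = FALSE)$acf
r1_acf <- r_acf[2]

# Approximate 95% significance bounds shown on the correlogram.
bound <- qnorm(0.975) / sqrt(n)

# Ljung-Box test: a large p-value is consistent with uncorrelated residuals.
lb <- Box.test(x, lag = 10, type = "Ljung-Box")
print(c(manual = r1_manual, acf = r1_acf, bound = bound, p_value = lb$p.value))
```

Spikes that stay inside the plus or minus bound are not distinguishable from zero at roughly the 5% level, which is the sense in which the later lags are "not significant".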
Although production dropped during one period, its variance was stable
afterwards.
From the above plot, we can see little variation in rice production in the most recent
seasons in Ratnagiri, Maharashtra.
Final Forecasted Values for both seasons: