
3.1 Data source and Collection
3.2 Statistical Analysis
3.3 Time series Analysis
3.4 Econometric Analysis

‘Everyone has a will to win, but very few have the will to prepare to win’

--------------- Vince Lombardi


MATERIALS and METHODS
Regions under study

[Map: Indo-Gangetic Plains]

In the present study, the Northern plain - the plain of the Ganga-Indus river
basin - has been chosen for investigation. It is the most fertile land for
agricultural production and is known as the rice bowl of India. It includes the
states of Punjab, Haryana, Delhi, U.P., Bihar, West Bengal and Assam.

Vegetables under study


There are six different types of vegetable crops: i) Solanaceous crops, ii) Bulb
crop, iii) Root crops, iv) Leguminous crops, v) Leafy vegetables & vi)
Cruciferous vegetables.

In the course of the investigation, some of the common vegetables that are
affordable and popular among the masses have been considered. The selected
vegetables under study were:

i) Solanaceous vegetables:

a. Brinjal

b. Tomato

c. Chilli

ii) Bulb crops : Onion

iii) Root crops: Potato

iv) Leguminous crops: Peas

v) Okra and Cole crops like cabbage and cauliflower


Data source

The study has been based primarily on secondary data collected from various
national sources (National Horticulture Board¹,², Department of Economics and
Statistics - IASRI³,⁴, ITC⁵, DAC / Directorate of Economics and Statistics⁶,
Ministry of Agriculture⁷, APEDA⁸, NHRDF⁹, and Planning Commission, Govt.
of India¹⁰) and international sources like FAO¹¹, AVRDC¹² and ADB¹³, pertaining to

1 NHB. 2006. Indian Horticulture Database - 2005. National Horticulture Board,
Ministry of Agriculture, Govt. of India. 361 pp. http://www.nhb.gov.in

2 NHB. 2009. Indian Horticulture Database - 2008. National Horticulture Board,
Ministry of Agriculture, Govt. of India. 298 pp. http://www.nhb.gov.in

3 IASRI. 2004. Agricultural Research Data Book 2004. New Delhi: Indian Agricultural
Statistics Research Institute, Indian Council of Agricultural Research.
http://www.iasri.res.in/agridata/04data%5Cchapter%206%5Cdb2004tb6_11.htm

4 IASRI. 2006. Agricultural Research Data Book 2006. New Delhi: Indian Agricultural
Statistics Research Institute, Indian Council of Agricultural Research.
http://www.iasri.res.in/agridata/HOME.HTML

5 ITC. 2007. International Trade Statistics by Country and Product Group. International
Trade Centre. http://www.intracen.org/tradstat/sitc3-3d/indexre.htm

6 Department of Agriculture Co-operation, Ministry of Agriculture, Govt. of India.
http://agricoop.nic.in/Agristatistics.htm

7 Directorate of Economics & Statistics, Department of Agriculture Co-operation,
Govt. of India. http://dacnet.nic.in/eands/latest_2006.htm

8 APEDA. 2007. Export statistics for agro-food products, India 2005-2006. Agricultural
and Processed Food Products Export Development Authority. 610 pp.
http://apeda.com/apedawebsite/

9 National Horticultural Research and Development Foundation, Nasik.
http://www.nhrdf.com/

10 Planning Commission. 2007. National 5-Year Plans: 11th plan proposals. Planning
Commission, Govt. of India. http://planningcommission.nic.in/plans/planrel/app11_16jan.pdf

11 FAOSTAT. 2007. FAOSTAT On-line. Rome: United Nations Food and Agriculture
Organization. http://faostat.fao.org/default.aspx

area, production, productivity and marketing of vegetables in India and export
of vegetables in the world market. Time series analysis has been carried out to
evaluate the pattern in the data series, and extrapolation of that pattern was
used to inform future planning by policy makers.

The data have been collected from secondary sources and analyzed graphically.
A time plot has been made for each series, and the analysis revealed trends over
time in area, production, productivity and marketing behavior, together with
other systematic features relevant to planned strategy in the vegetable sector.

Although a linear trend is commonly used in practice, it rarely gave the best
fit to the production data; therefore, other trend patterns have also been
attempted. The rate of growth or decline is not constant throughout but varies
considerably over time and between vegetable crops.

Time series analysis has been used to understand the rate of growth in area,
production and productivity of different vegetable crops grown in the northern
plain zones of India. In order to determine the type and nature of the
variations and disparity in the vegetable sector across different states, the
collected data were processed and analyzed.

Besides, appropriate statistical tools (MS-EXCEL, COSTAT) have been used to
compare the actual current performance with the time series data on production,
trade and marketing issues of the vegetable sector, in order to understand the
causes of such variations, if any, in different regions of the country. The
analysis would help to understand and forecast the behavior of the vegetable
sector in future years, which is needed by policy makers, administrative
planners, and research and development managers.

12 AVRDC - The World Vegetable Center. http://www.avrdc.org/index.php?id=28

13 ADB. 2007. India's Economic Growth to Moderate in 2007, ADB Says. ADB News
Release, 27 March 2007. http://www.adb.org/Media/Articles/2007/11664-indian-developmentsoutlooks/

Time series analysis
Time Series Analysis refers to a collection of specialized regression methods that
use integrated moving averages and other smoothing techniques and have
different assumptions about the error structure of the data.

Trend analysis

Trend analysis uses a technique called least squares to fit a trend line to a set of
time series data and then project the line into the future for a forecast. Trend
analysis is a special case of regression analysis where the dependent variable is
the variable to be forecasted and the independent variable is time. While a moving
average model limits the forecast to one period in the future, trend analysis is a
technique for making forecasts further than one period into the future.

Regardless of whether statistical techniques will be used for analyzing data over
time, the most straightforward and intuitive first step in assessing a trend is to plot
the actual observed data by year (or some other time period deemed appropriate).
In addition, the data should be examined in tabular form. These initial steps are
indispensable for understanding the general shape of the trend and for identifying
any outliers in the data. Inspection of the data provides the basis for making
subsequent analysis choices and should never be bypassed. Visual inspection of
the data may indicate that use of statistical procedures is inappropriate.

Statistical Procedures

In order to test whether there is a statistically significant trend or whether two or
more trends are statistically different, several approaches are available.

Regression procedures will generate estimates of future rates as well as estimates
of average annual percent change. Moreover, graphs of the predicted and
projected values as well as the confidence level can be plotted.

A forecast is calculated by inserting a time value into the regression equation.
The regression equation is determined from the time-series data using the “least
squares method”. The least squares method determines the values for a and b so
that the resulting line is the best-fit line through the set of historical data.
After a and b have been determined, the equation can be used to forecast future
values.

The general equation for a trend line is F = a + bt, where F is the forecast, t is
the time value, a is the y intercept and b is the slope of the line. This data
pattern is linear in nature and fits the straight line equation y = mx + c, where
y is the predicted / dependent variable, x is the independent variable, c is the
intercept and m is the slope of the line.
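
The least squares fit of F = a + bt described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the five-point production series are hypothetical and do not come from the study, which used the trendline facility of MS-EXCEL.

```python
from typing import Sequence, Tuple


def fit_linear_trend(t: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (a, b) of the least-squares trend line F = a + b*t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared time deviations.
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    b = sxy / sxx
    a = y_mean - b * t_mean  # the fitted line passes through the two means
    return a, b


def forecast(a: float, b: float, t_future: float) -> float:
    """Insert a time value into the fitted trend equation."""
    return a + b * t_future


# Hypothetical series: production (arbitrary units) in periods 1..5.
periods = [1, 2, 3, 4, 5]
production = [10.0, 12.0, 14.0, 16.0, 18.0]
a, b = fit_linear_trend(periods, production)
print(a, b, forecast(a, b, 7))
```

Because the hypothetical series is exactly linear, the fit recovers the intercept and slope that generated it; real production data would leave residuals around the line.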

There should be a sufficient correlation between the time parameter and the
values of the time-series data. More specifically, the higher the coefficient of
determination ($R^2$) given by the trend line equation, the more accurate the
prediction of the dependent variable from a given value of the independent
variable.

The Coefficient of Determination

The coefficient of determination, $R^2$, measures the percentage of variation in the
dependent variable that is explained by the regression or trend line. It has a value
between zero and one, with a high value indicating a good fit.
Goodness of fit: Determination Coefficient RSQ (Range: [0, 1]; RSQ = 1 means
best fit; RSQ = 0 means worst fit)
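
As a rough illustration of the coefficient of determination, the sketch below computes it as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values are assumptions made for this example; for a least-squares line with an intercept the result agrees with the value a spreadsheet reports as RSQ.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], fitted: Sequence[float]) -> float:
    """Share of the variation in the observed values explained by the fitted values."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))  # unexplained
    ss_tot = sum((o - mean) ** 2 for o in observed)               # total
    return 1.0 - ss_res / ss_tot


# Hypothetical values: a perfect fit, and a "fit" that only predicts the mean.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```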

Choosing the trend that fits best

1) Roughly: visually, by comparing the data pattern to one of the 5 trends
(linear, logarithmic, polynomial, power, exponential)
2) In a detailed way: by means of the determination coefficient, e.g., trends in
area, production, and yield of various vegetable crops would be quantified
econometrically by plotting the time curve and adding a trendline through the
chart option in an MS-EXCEL worksheet.

MS-Excel provides a number of functions that allow users to perform regression
analysis of time series / trend data. Excel may be used in 6 different ways for
regression analysis, from simple to advanced, viz., i) add a trend line to the
chart with display of the regression equation and $R^2$ ii) prepare manual
calculations iii) use Excel functions (INTERCEPT, SLOPE, RSQ) iv) use the Excel
function LINEST v) use Excel's Analysis ToolPak add-in vi) use a combination of
Excel statistical functions to get a report equivalent to that of the Analysis
ToolPak.

The Add Trendline option provides six different trend / regression types, i.e.,
Linear, Logarithmic, Polynomial, Power, Exponential and Moving Average.
Choosing and highlighting any one option at a time delivers the trend graph.
On highlighting the option bar, the EXCEL window displays:

Plate - Shape and notations of five different types of trend line
(Source: www.excelforum.com)

Trendline name: Automatic (marked by default once any one of the trend /
regression types, i.e., Linear, Logarithmic, Polynomial, Power, Exponential or
Moving Average, has been chosen)

Forecast: Forward and Backward (putting options for desired period will deliver
the predicted value as per the trendline equation)

Set intercept: generally this option should not be selected, because selecting
it fixes the intercept at the entered value (0 by default); however, if the
intercept is known for a given set of variables, it may be entered to obtain a
better-fit equation

Display equation on chart: the check box may be clicked to select this option
and show the best-fit equation on the time graph

Display R-squared value on chart: the check box may be clicked to select this
option and show $R^2$ alongside the best-fit equation on the time graph. Based on
the highest $R^2$ value, one can choose the trend equation that fits a given set
of data best.

Regression analysis with other trend functions

As shown in the above figure, many different types of trendline are possible.
Each reflects a different relationship between the independent and the dependent
variables. Some trend functions of a single variable - other than a linear or a
polynomial trend - are listed below in tabular form.

Logarithmic                        $Y = a \ln(x) + b$
Power                              $Y = a x^{b}$
Exponential function to base b     $Y = a b^{x}$
Natural exponential function       $Y = a e^{bx}$

The logarithmic trendline function, $Y = a \ln(x) + b$, is already in a linear form
(y = mx + c) once $\ln(x)$ is taken as the independent variable. Hence, given the
x and y values, the a (slope) and b (intercept) values were calculated.

The power function $Y = a x^{b}$ can be transformed into $\ln(Y) = \ln(a) + b \ln(x)$.
As with the log trendline, regressing $\ln(Y)$ on $\ln(x)$ for the given x and y
values yields $\ln(a)$ as the intercept and b as the slope.

The exponential function to base b, $Y = a b^{x}$, can be transformed into
$\ln(Y) = \ln(a) + x \ln(b)$, with $\ln(a)$ as the intercept and $\ln(b)$ as the slope.

Finally, the natural exponential function $Y = a e^{bx}$ can be transformed into
$\ln(Y) = \ln(a) + bx$; the given x and y values yield $\ln(a)$ as the intercept and b
as the slope.
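
The linearizing transformations above can be sketched as follows: each trend type is reduced to an ordinary straight-line fit on transformed data, and the parameters are then mapped back. All function names are hypothetical, and the data are generated from a known power function purely to show that the parameters are recovered; all x and y values must be positive wherever a logarithm is taken.

```python
import math
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (slope, intercept) of y = slope*x + intercept."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def fit_logarithmic(x, y):
    """Y = a*ln(x) + b: regress y on ln(x); returns (a, b)."""
    return fit_line([math.log(xi) for xi in x], y)


def fit_power(x, y):
    """Y = a*x**b: regress ln(y) on ln(x); returns (a, b)."""
    slope, intercept = fit_line([math.log(xi) for xi in x],
                                [math.log(yi) for yi in y])
    return math.exp(intercept), slope


def fit_exponential_base(x, y):
    """Y = a*b**x: regress ln(y) on x; returns (a, b)."""
    slope, intercept = fit_line(x, [math.log(yi) for yi in y])
    return math.exp(intercept), math.exp(slope)


def fit_natural_exponential(x, y):
    """Y = a*e**(b*x): regress ln(y) on x; returns (a, b)."""
    slope, intercept = fit_line(x, [math.log(yi) for yi in y])
    return math.exp(intercept), slope


# Hypothetical data generated from Y = 2 * x**1.5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * xi ** 1.5 for xi in xs]
print(fit_power(xs, ys))
```

Note that a fit on log-transformed values minimizes squared errors in ln(Y), not in Y itself, so the back-transformed curve is not identical to a direct nonlinear least-squares fit.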

Linear regression with multiple variables

The above case, with its single independent variable, can easily be extended to
include multiple independent variables. When the dependent variable is a
function of multiple independent variables, the problem is called multiple
regression. Hence, the regression equation would be
$y = m_4 x_4 + m_3 x_3 + m_2 x_2 + m_1 x_1 + m_0$.

Regression with polynomials

In a regression analysis with polynomials such as
$y = a_3 x^3 + a_2 x^2 + a_1 x + a_0$, each term is like a different variable, i.e.
$y = a_3 z_3 + a_2 z_2 + a_1 z_1 + a_0$ with $z_k = x^k$. Written in this form, it
becomes apparent that a polynomial regression is no different from a multiple
regression.
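
The equivalence between polynomial and multiple regression can be sketched as below. This is a minimal illustration, assuming the coefficients are obtained from the normal equations solved by Gaussian elimination; spreadsheet functions such as LINEST may use a different numerical method internally. All function names and the sample data are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def multiple_regression(Z: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Coefficients [m0, m1, ..., mk] of y = m0 + m1*z1 + ... + mk*zk."""
    rows = [[1.0] + list(z) for z in Z]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) m = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def polynomial_regression(x: Sequence[float], y: Sequence[float], degree: int = 3) -> List[float]:
    """Fit a polynomial by treating x, x^2, ..., x^degree as separate variables."""
    Z = [[xi ** p for p in range(1, degree + 1)] for xi in x]
    return multiple_regression(Z, y)


# Hypothetical data generated from y = 1 + 2x - x^2 + 0.5x^3.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * v - v ** 2 + 0.5 * v ** 3 for v in xs]
print(polynomial_regression(xs, ys))  # coefficients ordered a0, a1, a2, a3
```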

Descriptive Statistics

MEAN

Returns the average (arithmetic mean) of the arguments.

• The arguments must either be numbers or be names, arrays, or references
that contain numbers.
• If an array or reference argument contains text, logical values, or empty
cells, those values are ignored; however, cells with the value zero are
included.

COVARIANCE

Returns covariance, the average of the products of deviations for each data point
pair. Use covariance to determine the relationship between two data sets. For
example, you can examine whether greater income accompanies greater levels of
education.

• The covariance is:

$\mathrm{Cov}(X, Y) = \frac{1}{n} \sum (x - \bar{x})(y - \bar{y})$

Where, $\bar{x}$ and $\bar{y}$ are the sample means AVERAGE(array1) and
AVERAGE(array2), and n is the sample size.
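
A minimal sketch of the covariance as the average of the products of deviations, dividing by the sample size n as the definition above indicates. The function name and the two short series are hypothetical.

```python
from typing import Sequence


def covariance(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Average of the products of deviations from the two means (divides by n)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n


# Hypothetical paired observations.
print(covariance([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

A positive value indicates that the two series tend to move together; the magnitude depends on the units of both series.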

CORRELATION COEFFICIENT

Returns the square of the Pearson product moment correlation coefficient
through data points in known_y's and known_x's. The r-squared value can be
interpreted as the proportion of the variance in y attributable to the variance in x.

• The equation for the Pearson product moment correlation coefficient, r, is:

$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$

Where, $\bar{x}$ and $\bar{y}$ are the sample means AVERAGE(known_x's) and
AVERAGE(known_y's). RSQ returns $r^2$, which is the square of this correlation
coefficient.
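
The Pearson correlation coefficient and its square can be sketched as follows. The function names `pearson_r` and `rsq` and the five data pairs are assumptions made for illustration only.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product moment correlation coefficient."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rsq(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Square of the correlation coefficient."""
    return pearson_r(xs, ys) ** 2


# Hypothetical paired observations.
print(rsq([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```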

COEFFICIENT OF VARIATION

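AVEDEV returns the average of the absolute deviations of data points from their
mean. AVEDEV is a measure of the variability in a data set.

• The equation for average deviation is:

$\mathrm{AVEDEV} = \frac{1}{n} \sum |x - \bar{x}|$

AVEDEV is influenced by the unit of measurement in the input data. The
coefficient of variation, by contrast, is unit-free: it expresses the standard
deviation as a percentage of the mean, $CV = (s / \bar{x}) \times 100$.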
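
The average deviation and its dependence on the unit of measurement can be illustrated with a short sketch. The function name `avedev` mirrors the spreadsheet function, but the implementation and the data are hypothetical.

```python
from typing import Sequence


def avedev(values: Sequence[float]) -> float:
    """Average of the absolute deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


# Hypothetical yields in tonnes, then the same yields expressed in kg.
yields_t = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
yields_kg = [v * 1000 for v in yields_t]
print(avedev(yields_t), avedev(yields_kg))
```

Changing the unit from tonnes to kg multiplies the average deviation by the same factor of 1000, which is why a unit-free measure is preferable when series in different units are compared.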

LEVEL OF CONFIDENCE

The confidence interval is a range of values. The sample mean, $\bar{x}$, is at the
center of this range and the range is $\bar{x}$ ± CONFIDENCE. For example, if
$\bar{x}$ is the sample mean of delivery times for products ordered through the
mail, $\bar{x}$ ± CONFIDENCE is a range of population means. For any population
mean, $\mu_0$, in this range, the probability of obtaining a sample mean further
from $\mu_0$ than $\bar{x}$ is greater than alpha; for any population mean, $\mu_0$,
not in this range, the probability of obtaining a sample mean further from $\mu_0$
than $\bar{x}$ is less than alpha.

CONFIDENCE(alpha, standard_dev, size)

Alpha is the significance level used to compute the confidence level. The
confidence level equals 100*(1 - alpha) %, or in other words, an alpha of 0.05
indicates a 95 percent confidence level.

• If we assume alpha equals 0.05, we need to calculate the area under the
standard normal curve that equals (1 - alpha), or 95 percent. This value is
± 1.96. The confidence interval is therefore:

$\bar{x} \pm 1.96 \left( \sigma / \sqrt{n} \right)$
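
The half-width of the confidence interval can be sketched with the Python standard library, which provides the inverse of the standard normal distribution. The function name `confidence_half_width` and the delivery-time figures are hypothetical; the sketch assumes a known population standard deviation, as the formula above does.

```python
import math
from statistics import NormalDist


def confidence_half_width(alpha: float, standard_dev: float, size: int) -> float:
    """Half-width of the (1 - alpha) confidence interval for a population mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z * standard_dev / math.sqrt(size)


# Hypothetical example: sample of 50, standard deviation 2.5, sample mean 30.
half = confidence_half_width(0.05, 2.5, 50)
sample_mean = 30.0
print(sample_mean - half, sample_mean + half)
```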

Econometric analysis

Benefit-cost ratio (B: C)

This has been estimated as net return (gross revenue less the cost of all
variable inputs) divided by all variable costs and multiplied by one hundred.
The costs of all inputs including family owned resources, except land, have
been treated as variable cost in this case.

Cost per unit of output


This has been estimated as total cost per ha divided by per ha yield. It provided
the relative value of different vegetables in a country, and has been used to
compare output efficiency in vegetable production.

Partial input productivity (PIP)

In estimating the partial input productivity of variable inputs, for example
fertilizer, seeds, agro-techniques or water, the cost of all other inputs has been
assumed to be fixed, and only the cost of that particular input has been
considered as variable. In the present study, the economic efficiency of
fertilizer, seed, agro-techniques and irrigation was estimated. This was done by
determining the difference in output, say the additional yield realized by
application of the target input (whose partial productivity has to be
determined). The additional output has been converted into net return and
divided by the cost of that particular input to give the partial input
productivity.

The equation developed for calculation of PIP is:

$PIP = \delta NR / VC$

Where, $\delta NR$ is the difference in net revenue (NR), i.e., before and after
use of the specific input, and VC is the per ha variable input cost.
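
The three ratios defined in this section (benefit-cost ratio, cost per unit of output and partial input productivity) can be sketched as simple functions. The function names and all monetary figures are hypothetical and serve only to show the arithmetic.

```python
def benefit_cost_ratio(net_return: float, variable_cost: float) -> float:
    """Net return divided by all variable costs, multiplied by one hundred."""
    return net_return / variable_cost * 100


def cost_per_unit_output(total_cost_per_ha: float, yield_per_ha: float) -> float:
    """Total cost per ha divided by per ha yield."""
    return total_cost_per_ha / yield_per_ha


def partial_input_productivity(nr_without: float, nr_with: float,
                               input_cost_per_ha: float) -> float:
    """Difference in net revenue before and after using the input, per unit of its cost."""
    return (nr_with - nr_without) / input_cost_per_ha


# Hypothetical per ha figures in rupees; yield in quintals.
print(benefit_cost_ratio(30000, 20000))
print(cost_per_unit_output(40000, 200))
print(partial_input_productivity(25000, 31000, 2000))
```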

Individual input costs

The individual input cost included not only market price, but also its
transportation and spreading cost. The irrigation cost included the cost of water,
if any, in terms of water tax by the government or purchase cost from the
neighboring farmers, irrigation labor cost, plus depreciation cost of irrigation
equipment. In case the source of water was tube well, the irrigation cost
additionally included the cost of maintenance, depreciation, and operation of the
tube well.

Total and cash costs

Total production cost for each crop has been estimated by adding individual cost
items. Cash cost has been estimated as the total cost less the value of family
labor and family-produced manure and seeds. The interest on cash cost has
also been included in the total cost at the rate of 10% per crop season. The share
of each cost item (factor share) in the total cost was estimated in percentage
terms. The factor shares for labor, seed, fertilizer, manure, irrigation, pesticide,
and others (staking and mulching) have been reported. In estimating these shares,
the cost of the labor used to apply an input has been taken out of the input cost
and aggregated into the labor cost.

Gross revenue

Gross revenue has been estimated as outputs (main and by-products) produced in
one planting period multiplied by market price of the output.

Economic efficiency in production

Net returns have been estimated as gross revenue less cost of all variable inputs.
All inputs including family labor and other farm-owned resources, except land
and management labor, have been considered as variable inputs. Higher net
returns, therefore, indicate efficiency of land and input management (seed,
fertilizers etc.) combined.
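
The cost and revenue definitions above can be put together in a small sketch. It rests on one interpretive assumption: the cash cost is taken as the sum of the cost items less the family-supplied items, and the 10% interest on that cash cost is then added to obtain the total cost. The item names, function names and figures are hypothetical.

```python
from typing import Dict, Iterable, Tuple

INTEREST_RATE = 0.10  # per crop season, as stated in the text


def cash_cost(costs: Dict[str, float], family_items: Iterable[str]) -> float:
    """Sum of the cost items less the value of family-supplied items."""
    family_value = sum(costs[name] for name in family_items)
    return sum(costs.values()) - family_value


def total_cost(costs: Dict[str, float], family_items: Iterable[str]) -> float:
    """Sum of all cost items plus interest on the cash cost (assumed reading)."""
    return sum(costs.values()) + INTEREST_RATE * cash_cost(costs, family_items)


def gross_revenue(outputs: Iterable[Tuple[float, float]]) -> float:
    """outputs: (quantity, market price) pairs for main and by-products."""
    return sum(quantity * price for quantity, price in outputs)


def net_return(gross: float, variable_cost: float) -> float:
    """Gross revenue less the cost of all variable inputs."""
    return gross - variable_cost


# Hypothetical per ha cost items in rupees.
costs = {"hired_labor": 5000, "family_labor": 3000, "purchased_seed": 2000,
         "family_seed": 500, "fertilizer": 4000, "family_manure": 1500}
family = ["family_labor", "family_seed", "family_manure"]
revenue = gross_revenue([(200, 150), (10, 50)])  # main product and by-product
print(cash_cost(costs, family), total_cost(costs, family),
      net_return(revenue, total_cost(costs, family)))
```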

Efficiency of Technology

Technical efficiency of vegetable production has been estimated by using a
production function on the combined yield data for different treatments / regions
over a time scale, following the procedure of Ali (2000)¹⁴. The following
production function has been specified, as an example, for this purpose.

Y = f(S, F, M, L, I, T, C) ........... (1)

Where, Y is yield of any vegetable crop under study (output in quintals or tonnes),
S is seed quantity / type in kg, F is fertilizer nutrient quantity in kg, M is farm
manure in kg, L is labor days, I is irrigation status, T is intervening technology
(say, staking in 3 tiers for a tomato crop by the vegetable farmer = case 1, no
staking by the vegetable farmer = case 2), and C is cultivar type, i.e., varieties
may be hybrid and / or open pollinated. All quantitative input and output
variables have been transformed to a per ha basis.

The production function in equation (1) was specified through best-fit trend line
equations. The contribution of individual input components to the yield
function was estimated statistically. The significant differences in yield
contributed by different inputs were estimated by standard statistical designs of
experiments, mostly the Randomized Block Design (RBD) in the present study. The
significant difference in yield attributable to any individual input in
equation (1) represents the extent of difference in technical efficiency at the
given level of that input.

14 Ali, M. (2000). Dynamics of vegetable production, distribution and consumption in
Asia. AVRDC Publication No. 00-498

In the present study, the yield data have been considered as the dependent
variable, and factors like seed (genotype), fertilizers and technical
interventions were treated as independent variables. The standard statistical
package COSTAT has been used for analysis. Further, yield has been considered
as a function of the input variables, and multiple regression analysis was used
to derive the partial coefficient values. The partial coefficient values,
representing the influence of each factor, were converted into the per cent
contribution of that factor to yield.

Market integration

Indian markets across states are not well integrated, as evidenced by wide
variability in the seasonality of a vegetable across markets. For example, prices
of brinjal in Calcutta may be high in October, while being in the low range in the
Delhi and Madras markets. A similar situation may be seen to prevail for other
vegetables across different major markets in India. Therefore, integrating
markets by providing information on market arrivals and prices can help to reduce
seasonality in Indian markets.

The seasonal variation in prices and market arrivals of different vegetables
has been estimated following the process of Ali (2005)¹⁵ from the monthly
market arrival and price data collected from different major markets in India.
Monthly prices were converted into indices (with January as the base) separately
for every year. The actual price of a particular vegetable in a particular market in
the month of January was converted to 100 by multiplying by a factor; thereafter,
the same conversion factor was used to convert the actual prices into price
indices for the other months. Then the average of the three years’ monthly
indices was calculated. Months with missing prices were excluded from the
estimation. If January prices were not available (indicated by -) for all of the

15 Ali, S. (2005). Total Factor Productivity Growth and Agricultural Research and
Extension: An Empirical Analysis for Pakistan’s Agriculture, 1960-96. Pakistan
Development Review. 44: 4 Part II, Pp. 729-746.

three years, June was taken as the base. Weighted average prices of a vegetable
were calculated by weighting the individual monthly prices by the share of each
market in the total monthly arrival of that vegetable in all India. Similarly, the
weighted average price index of all vegetables in a month in India was calculated
by weighting by the relative share of each vegetable in that month. The same
procedure was followed to obtain the monthly arrival indices for different
vegetables in major markets in India.
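
The index construction described above can be sketched as follows. The sketch assumes one reading of the base-month rule: January is the base unless it is missing in every year, in which case June is used, and a year that lacks the chosen base price is left out. Function names and prices are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
YearPrices = Dict[str, Optional[float]]  # month -> price, None if missing


def choose_base_month(years: List[YearPrices]) -> str:
    """January unless it is missing in all years, then June (assumed reading)."""
    has_january = any(y.get("Jan") is not None for y in years)
    return "Jan" if has_january else "Jun"


def to_indices(prices: YearPrices, base: str) -> Dict[str, float]:
    """Scale one year's prices so that the base month equals 100."""
    base_price = prices.get(base)
    if base_price is None:
        return {}  # this year cannot be indexed on the chosen base
    factor = 100.0 / base_price
    return {m: p * factor for m, p in prices.items() if p is not None}


def average_indices(years: List[YearPrices]) -> Dict[str, float]:
    """Average the yearly indices month by month, skipping missing months."""
    base = choose_base_month(years)
    per_year = [to_indices(y, base) for y in years]
    averages = {}
    for month in MONTHS:
        values = [idx[month] for idx in per_year if month in idx]
        if values:
            averages[month] = sum(values) / len(values)
    return averages


def weighted_average_price(prices: Sequence[float], arrivals: Sequence[float]) -> float:
    """Market prices weighted by each market's share in total arrivals."""
    return sum(p * a for p, a in zip(prices, arrivals)) / sum(arrivals)


# Hypothetical prices for two years (only three months shown).
years = [{"Jan": 10.0, "Feb": 12.0, "Mar": None},
         {"Jan": 20.0, "Feb": 30.0, "Mar": 10.0}]
print(average_indices(years))
print(weighted_average_price([10.0, 20.0], [30.0, 10.0]))
```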

Marketing efficiency

The present study covered secondary data¹⁶ from wholesale vegetable market
yards in Ahmedabad city, viz., the Sardar Patel Market (SP Market) and
Chimanbhai Jivabhai Patel Market (CJP Market). From Chennai city, two
wholesale markets, namely Koyambedu Fruits and Vegetable Wholesale Market
(KFVWM) and Ambattur Farmer's Market (AUS), also known as Ambattur
Ezhawar Sandhai, were selected. From Kolkata city, the markets selected for
this study were S.S. Hogg market, Posta market and Mechua Fal Patty
market. While the market officials and records were consulted for collecting
relevant data on the functioning of the markets, physical infrastructure etc. of the
markets, structured questionnaires were used to collect information from the
market intermediaries such as wholesalers / commission agents, retailers and the
producer farmers. The sample respondents from the markets comprised
commission agents, retailers and producers.

The selection of different vegetables from these markets was based on their
importance in terms of volume of sale in the respective markets. The vegetables
selected for the study were potato, onion, tomato, cabbage, cauliflower, brinjal,
green-pea and lady's finger.

The administrative set-up of the selected markets differs. While the markets
selected from Ahmedabad were regulated, the other markets were not.

16 Gandhi, Vasant P. and N. V. Namboodiri. 2002. Marketing of Fruits and Vegetables in
India: A Study of the Wholesale Markets in Ahmedabad Area. Centre for
Management in Agriculture, Indian Institute of Management, Ahmedabad.

The Agricultural Produce Marketing Committee (APMC) controls and administers
the selected regulated markets in Ahmedabad. The market management
committee headed by the chief administrative officer of the Chennai
Metropolitan Development Authority (CMDA) and the Kancheepuram market
committee control the KFVWM and the AUS markets in Chennai, respectively.
As mentioned above, in Kolkata City there are no regulated markets for the sale
of fruits and vegetables, and all the markets are controlled by local political
leaders, the Municipal Corporation and the Government of West Bengal.

Market variation

The three-year average price index (PI) and market arrival (MA) data for a
particular vegetable crop in a particular market over the twelve months were
sorted to obtain the maximum, minimum and mean price index values as well as
market arrival values. The per cent variation was calculated by subtracting
the minimum from the maximum value, as under:

Variation (%) in price = (PI max - PI min)

Variation (%) in market arrival = (MA max - MA min)

Seasonality

Seasonality in vegetable supplies was estimated from average monthly data on
prices and availability. The average price and market arrival indices were
estimated from the monthly market arrival and price data for the major markets
in India. Monthly prices were converted into indices (with January as the base)
separately for every year. The actual price of January for any crop in any market
was converted to 100 by using a suitable conversion factor, and accordingly price
indices were calculated in MS-EXCEL for all the data sets. Then the average of
the three years’ monthly indices was calculated.

Similarly, the actual market arrival of January for any crop in any market was
converted to 100 by using a suitable conversion factor, and accordingly market
arrival indices were calculated in MS-EXCEL for all the data sets.

Seasonality has been estimated following the equations:

Seasonality (%) in price = (PI max - PI min) / PI mean * 100

Seasonality (%) in market arrival = (MA max - MA min) / MA mean * 100
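
The variation and seasonality measures can be sketched directly from their definitions. The function names and the three-month index series are hypothetical; in practice the input would be the twelve monthly averages of the price or market arrival indices.

```python
from typing import Sequence


def variation(indices: Sequence[float]) -> float:
    """Maximum index minus minimum index."""
    return max(indices) - min(indices)


def seasonality(indices: Sequence[float]) -> float:
    """(max - min) / mean * 100, as in the seasonality equations."""
    mean = sum(indices) / len(indices)
    return (max(indices) - min(indices)) / mean * 100


# Hypothetical average monthly price indices (January = 100).
price_indices = [100.0, 150.0, 50.0]
print(variation(price_indices), seasonality(price_indices))
```

The same two functions apply unchanged to the market arrival indices.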

Export performance ratio (EPR)

To study the trend in trade, the export performance ratio (EPR) was estimated to
examine the comparative advantage of India in export of major vegetables, using
the method suggested by Balassa (1965)¹⁷. The EPR of India in potato and
tomato was estimated by the equation:

$EPR = S_{it} / S_{wt}$ ...(1)

where,
$S_{it}$ = Share of the reference individual commodity in India’s total export, and
$S_{wt}$ = Share of that commodity in the total world export.

Since EPR is based on the observed pattern of trade flows, it is also called
Revealed Comparative Advantage (RCA).

If EPR or RCA is greater than unity, the country has a comparative advantage
in export of the concerned commodity, and vice versa.
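
A minimal sketch of the export performance ratio, computed as the ratio of the two shares defined above. The function name, argument names and export figures are hypothetical.

```python
def export_performance_ratio(commodity_export_india: float,
                             total_export_india: float,
                             commodity_export_world: float,
                             total_export_world: float) -> float:
    """Share of the commodity in India's exports over its share in world exports."""
    s_it = commodity_export_india / total_export_india
    s_wt = commodity_export_world / total_export_world
    return s_it / s_wt


# Hypothetical export values in a common currency unit.
epr = export_performance_ratio(50.0, 1000.0, 200.0, 10000.0)
print(epr, "comparative advantage" if epr > 1 else "no comparative advantage")
```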

17 Balassa, B. 1965. Trade Liberalization and Revealed Comparative Advantage,
Manchester School of Economics and Social Studies, 33 (2): 99-124.

Revealed Symmetric Comparative Advantage (RSCA)

As suggested by Laursen (1998)¹⁸, RCA was made symmetric by obtaining the
index as (RCA - 1) / (RCA + 1). This index is known as Revealed Symmetric
Comparative Advantage (RSCA) and varies from -1 to +1.

A negative value of RSCA indicates that the commodity is not competitive in
export. A positive value, and especially a value nearer to 1, indicates that the
commodity is highly competitive in export.
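
The symmetric index can be sketched in one function. The name `rsca` is an assumption for this example; the input is any non-negative RCA (or EPR) value.

```python
def rsca(rca: float) -> float:
    """Revealed Symmetric Comparative Advantage: (RCA - 1) / (RCA + 1)."""
    return (rca - 1) / (rca + 1)


# RCA of 1 (neutral) maps to 0; values above 1 map into (0, 1), below 1 into [-1, 0).
for value in (0.0, 0.5, 1.0, 3.0):
    print(value, rsca(value))
```

The transformation removes the asymmetry of RCA, which is bounded below by 0 but unbounded above.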

To study variability, the per cent coefficient of variation was used as an index of
instability. The sustainability in export of tomato, potato and onion was
estimated by computing the coefficient of variation as suggested by Kumar et
al. (2005)¹⁹.

To examine instability in export from India, the coefficient of variation (CV) in
export of the commodity has been estimated. Comparing the CV values in export
of the commodity over time, say in the post-WTO period against the pre-WTO
period, helps to predict the effect of agricultural policy on export of that
commodity from India.

A high CV value in the case of export from India indicates a high degree of
instability in the market, which may be due to many bottlenecks and the
involvement of many factors. Similarly, low CV values indicate a high degree of
stability in the export market.
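
The comparison of instability between two periods can be sketched as below. The coefficient of variation is taken here as the standard deviation expressed as a percentage of the mean; whether the sample or the population standard deviation was used in the study is not stated, so the `sample` flag is an assumption. The export series are hypothetical.

```python
from statistics import mean, pstdev, stdev
from typing import Sequence


def cv_percent(values: Sequence[float], sample: bool = True) -> float:
    """Coefficient of variation: standard deviation as a percentage of the mean."""
    sd = stdev(values) if sample else pstdev(values)
    return sd / mean(values) * 100


# Hypothetical annual export quantities for two periods.
pre_wto = [10.0, 12.0, 11.0, 13.0]
post_wto = [10.0, 20.0, 5.0, 25.0]
print(cv_percent(pre_wto), cv_percent(post_wto))
```

In this made-up example the second period shows the larger CV, which under the interpretation given above would indicate greater instability in exports.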

18 Laursen, K. (1998). Revealed Comparative Advantage and the alternatives as
measures of International Specialization, DRUID Working Paper No. 98-30,
Copenhagen, IVS, Copenhagen Business School.

19 Kumar, N. R., Singh, B.P., Paul Khurana, S.M. and Pandey, N.K. (2005). Impact of
WTO on Potato Export from India. Agricultural Economics Research
Review, 18: 291-304
