
COLLEGE OF DEFENCE MANAGEMENT

DATATHON-2023
INDIVIDUAL DETAILS

SERVICE NUMBER 29390-R

RANK WG CDR

NAME ASHOK RAWAT

MOBILE NUMBER 8197058283/8095052346

EMAIL ID ashok.su30@gmail.com

Experience in Data Analytics(Yrs) Minimal Exposure

UNIT DIRECTORATE OF OPERATIONAL NETWORK/ DTE IACCS

FORMATION AIR HEADQUARTERS(VB)

ADDRESS DEFENCE OFFICES COMPLEX A COMPLEX 8TH FLOOR KASTURBA GANDHI MARG NEW DELHI-110001

PLATFORM/SOFTWARE USED PYTHON AND PYCHARM IDE

ANY OTHER INFO

INDIA’S TRADE TRAJECTORY- GEO-STRATEGIC
OPPORTUNITIES

WG CDR ASHOK RAWAT 29390-R


November 5, 2023

Abstract
This research project employs data analytics techniques and algorithms to examine and
predict India’s trade patterns by analyzing historical data. It uses a regression model
to explore the relationship between exports, imports, and Gross Domestic Product (GDP),
which represents the total market value of goods and services produced within a country’s
borders during a specific time frame and serves as a comprehensive indicator of the
country’s economic well-being. The ARIMA model is used to assess and forecast trends in
India’s exports, imports, government spending, investments, and GDP. A regression model
is employed to determine how GDP depends on independent variables such as exports,
imports, consumption, and expenditure, and the forecasts obtained from ARIMA and
regression are compared. K-Means clustering is used to group commodities that have a
significant impact on imports and exports, as well as their contributions to the total trade
volume. The study also examines Preferential Trade Agreements, which encompass not only
trade but also additional policy areas such as investment flows, labor, intellectual
property rights, and environmental protection. Despite being referred to as trade agreements,
these agreements aim for deep integration beyond trade. The project examines various
provisions of deep trade agreements and employs Principal Component Analysis (PCA) to
reduce high-dimensional datasets to lower dimensions, identifying crucial provisions for
trade agreements without significant information loss. The Python programming language is
used for data preprocessing, data exploration, visualization, and the implementation of
data analytics techniques such as K-Means clustering, Principal Component Analysis, and
ARIMA. The analysis is conducted using datasets spanning 62 years (1960-2022) at annual
intervals, gathered from international organizations such as the World Bank, International
Monetary Fund (IMF), World Trade Organization (WTO), UNCTAD and CDM. The insights derived
from this study will aid India’s decision-makers in formulating trade strategies based on
the observed trends and forecast analyses conducted as part of this project.

Keywords: GDP, ARIMA, Regression model, Exports, Imports, Government expenditure,
Investment, Consumption, Preferential Trade Agreements, K-Means Clustering, Principal
Component Analysis (PCA), World Bank, IMF, WTO, CDM.

1 INTRODUCTION
International trade involves the exchange of goods and services across borders or territories,
providing consumers and nations access to products not locally available. Virtually every com-
modity, ranging from food, clothing, and spare parts to oil, jewelry, wine, stocks, currencies,
and water, can be found in the global market. Additionally, services like tourism, banking,
consulting, and transportation are subject to international trade. This exchange results in the
flow of money, both inflow and outflow, whenever goods and services are traded. In essence, ex-
ports, imports, and Gross Domestic Product (GDP) play pivotal roles in analyzing a country’s
economic situation. Delving into the intricate details, we can gain a deeper understanding of
the relationships among various entities involved in international trade.

$$\mathrm{GDP} = f(\text{Consumption}, \text{Investment}, \text{Expenditure}, \text{Exports}, \text{Imports}) \tag{1}$$

Equation 1 expresses GDP as a function whose coefficients and constants are not yet known.
A goal of this project is to estimate these coefficients in order to understand the relationship
among the variables. The analysis and prediction of the interplay between GDP, Exports,
and Imports are crucial for recognizing potential trends and anticipating future impacts on a
country’s economy.

In today’s world, global organizations like the World Trade Organization (WTO) play a crucial
role in facilitating trade among nations with varying economic conditions. These entities are
committed to promoting and advancing international trade. International trade can be broadly
classified into three main types: Export Trade, Import Trade, and Entrepot Trade. Entrepot
Trade, also known as Re-exports, involves the process of importing goods from one country,
enhancing their value, and then exporting these improved products to another country. For
example, India imports raw materials like gold from China, refines and transforms them into
jewelry, and subsequently exports the finished products to various nations.

This project’s primary focus is on the application of regression and forecasting models to
analyze India’s current trade data, with the main objective of predicting the future
trajectory of India’s trade. Moving beyond this national perspective, we examine the
increasing significance of Preferential Trade Agreements (PTAs) within the global trading
system. The number of PTAs has grown from around 50 in the early 1990s to over 350 by 2023.
Every member of the World Trade Organization (WTO) is now party to at least one PTA, and
often to several simultaneously.

A significant development is the expansion of the scope of PTAs. In the 1950s, a typical
PTA covered just eight policy areas. However, in contemporary times, these agreements have
considerably broadened, encompassing an average of 18 policy areas mandated by the WTO.
This shift highlights the extensive impact and influence that PTAs have on the complex land-
scape of international trade. As a result, this project aims to provide a detailed understanding
and deeper insights into the dynamics of Preferential Trade Agreements, particularly as they
relate to India’s trade landscape. Furthermore, we identify the provisions that are essential for
PTAs and must be included in any trade agreements.

2 PROBLEM STATEMENT
In this project we try to answer the following questions, based on the available trade data for
India, using several data analytics techniques, methodologies, and algorithms:

(a) What are the historical trends in India’s Import and Export activities?

(b) What are the historical patterns of Imports and Exports with nations engaged in trade
relations with India?

(c) Who are the significant trade partners of India in terms of both Imports and Exports?

(d) What are the primary products that India has traditionally exported?

(e) What are the key products that India has historically imported?

(f) What is the regional breakdown of India’s Trade share concerning Imports and Exports?

(g) Which product categories contribute significantly to both Export and Import activities,
as identified by K-Means clustering?

(h) Under the expenditure approach to GDP calculation, which macroeconomic variables
exhibit a correlation with GDP?

(j) How do Imports and Exports influence the overall GDP of India?

(k) Can we anticipate the future trajectory of India’s trade using forecasting techniques
such as regression and ARIMA models for Imports, Exports, Government expenditure,
Consumption, Investment, and GDP?

(l) Which provisions have been incorporated in preferential trade agreements since 1950,
and which provisions should India consider to protect its trade interests?

3 PROJECT AIM
In this project, various data analytics techniques and algorithms were used to analyze and
predict India’s trade trends for the upcoming years using historical data sets (1960-2022).
These data sets contain GDP, the value of imports and exports, total government expenditure,
consumption expenditure, investment, and the exports and imports of all countries, both as
a whole and by product. Further, the PTA data contain trade agreements between countries,
the provisions contained within each agreement, and their legal implications.

We analyzed the collected data using regression models to quantify the relationships between
GDP and key independent variables, including Exports and Imports, among others. The aim was
to obtain the best fit for GDP across the macroeconomic factors considered.

Subsequently, we employed the AutoRegressive Integrated Moving Average (ARIMA) model to
identify potential trends in India’s future trade. We forecast trade directions and trends
for the upcoming years (2023-2027) across a range of macroeconomic variables: expenditure,
imports, exports, investment, consumption and, most importantly, GDP. The GDP values
forecast using ARIMA were then compared with those calculated through regression analysis.

This project also offers a comparative assessment of trade dynamics involving two major
players, the USA and China, highlighting India’s trade interactions on a global scale.
Furthermore, we identified India’s foremost trade partners and its top traded products,
together with the directional trends associated with these commodities, to provide a broad
understanding of India’s trade landscape.

3.1 AIMS AND OBJECTIVES


(a) Apply techniques, algorithms, and models to analyze India’s trade.

(b) Implement a regression analysis model to analyze the relationship between GDP and
Exports, Imports and other macroeconomic variables.

(c) Implement time-series techniques to predict the trends of Exports, Imports and other
macroeconomic variables, and apply the forecasting model to these variables for the
coming years.

(d) Analyse the various PTAs and the provisions they contain, using Principal Component
Analysis to identify the provisions that are important for any trade agreement.

3.2 RESEARCH METHODOLOGY


To examine the correlation among GDP, Exports, Imports and other macroeconomic variables in
the context of India’s trade, and to forecast the movement of exports and imports of India’s
goods and services, it is necessary to follow a systematic data mining methodology that
covers the entire process from raw data to knowledge. In this study, we adopt the
Cross-Industry Standard Process for Data Mining (CRISP-DM), utilizing various models.
CRISP-DM has demonstrated its efficacy and offers a comprehensive framework for the planning
and execution of this study.

3.3 LIMITATIONS OF THE STUDY


During this study, we identified the following limitations that need to be taken into consideration
for assessment of the results and should be considered by future work.

(a) Limited and small data sets.

(b) Limited knowledge of econometric methods and theories.

(c) Limited knowledge of trade theories and factors that impact trade between countries.

This analysis is based solely on historical trade data and does not take into account other
factors influencing international trade, such as pandemics, geopolitical conditions, wars, and
unforeseen events. It’s important to acknowledge that this project does not aim to provide a
comprehensive trade analysis, as a thorough examination and forecasting in this domain require
the integration of trade and economic theories, extensive datasets, and consideration of various
factors, including political circumstances. Due to time limitations, the scope of this project was
constrained, resulting in a limitation on the number of data analytics techniques, algorithms,
and economic methodologies explored.

Figure 1: CRISP-DM Methodology

4 STUDY AND LITERATURE SURVEY


The literature review concentrates on studies pertinent to examining the correlation between a
nation’s GDP and its Imports and Exports. Additionally, we delved into research that explored
the repercussions of the global crisis in 2008, providing insights into how it influenced a country’s
Exports and Imports. The concluding theme addressed in this literature review pertains to
research efforts aimed at forecasting trade directions, with a specific focus on the application of
the ARIMA model.

4.1 TRADE RELATIONSHIPS: GDP, IMPORTS AND EXPORTS


Stanko and Stanić [6] applied a multiple regression analysis model in macroeconomic research
focused on Bosnia and Herzegovina, spanning the period from 2005 to 2018. In this study, the
researchers examined six distinct independent variables: Foreign Direct Investments, Imports,
Exports, Growth Rate, Unemployment, and Inflation. The objective was to assess the impacts of
these macroeconomic factors, treated as independent variables, on Gross Domestic Product,
serving as the dependent variable. The researchers applied the following multiple regression
formula in their analysis:

$$Y = \alpha_0 + \alpha_1 X_1 + \alpha_2 X_2 + \alpha_3 X_3 + \alpha_4 X_4 + \alpha_5 X_5 + \alpha_6 X_6 \tag{2}$$

In this equation, Y denotes the dependent variable, and X1 through X6 signify the six
independent variables. The coefficients α0 through α6 quantify the impact of each
corresponding independent variable on the dependent variable, as outlined by Stanko and
Stanić [6]. Their multiple linear regression model includes the following components:

(a) The dependent variable, exemplified as GDP in their analysis.

(b) The independent variables, illustrated as FDI, Imports, Exports, Growth rate, Unemploy-
ment, and Inflation in their example.

(c) The constant term.


(d) The error term, encapsulating all influences on the dependent variable (GDP) not orig-
inating from the specified independent variables (FDI, Imports, Exports, Growth rate,
Unemployment, Inflation).

In their study, the authors utilized the multiple linear regression model to quantify the re-
lationship between macroeconomic factors and GDP. Their findings revealed that a substantial
93.2% of variations in GDP could be explained by the specified independent variables: FDI,
Imports, Exports, Growth rate, Unemployment, and Inflation. Furthermore, the authors con-
cluded that among all the examined independent variables, Imports had the most significant
impact on GDP, followed by FDI and Exports.
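
A multiple linear regression of the form in equation 2, together with the share of variance explained (R²), can be sketched as follows. This is a minimal illustration rather than the cited authors' procedure: the data are synthetic, the helper name `fit_multiple_regression` is hypothetical, and NumPy's least-squares solver is used in place of any particular statistics package.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Fit y = a0 + a1*x1 + ... + ak*xk by ordinary least squares.

    Returns the coefficient vector [a0, a1, ..., ak] and R-squared.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that a0 acts as the constant term.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = float(np.sum((y - fitted) ** 2))    # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2))  # total variation
    return coef, 1.0 - ss_res / ss_tot

# Synthetic data (assumption): three predictors with known coefficients plus small noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
true_coef = np.array([2.0, 1.5, -0.5, 0.8])
y_demo = true_coef[0] + X_demo @ true_coef[1:] + rng.normal(scale=0.01, size=50)

coef_demo, r2_demo = fit_multiple_regression(X_demo, y_demo)
print(coef_demo, r2_demo)
```

An R² of 0.932, as reported in the study above, would mean that 93.2% of the variation in the dependent variable is captured by the fitted combination of predictors.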

Goyal, D. A., and Vajid [2] examined the bilateral trade dynamics spanning the years 2006
to 2016 between India and the UAE. The findings underscored the robust trade partnership
between the two nations, indicating a particularly strong and intensive trade relationship
compared to India’s engagements with other trade partners. The authors employed the Trade
Intensity Index, comprising both the Exports Intensity Index (EII) and Imports Intensity Index
(III), as metrics to gauge the intensity of trade interactions between India and the UAE. The
Exports Intensity Index (EII) is defined as:

$$EII = \frac{X_{IG} / X_{I}}{M_{G} / (M_{W} - M_{I})} \tag{3}$$

where:
XIG = India’s Export to UAE,
XI = India’s Total Export,
MG = Total Import of UAE,
MW = Total World Import,
MI = Total Import to India.
III is given by:

$$III = \frac{M_{IG} / M_{I}}{X_{G} / (X_{W} - X_{I})} \tag{4}$$

where,
MIG = Import of India from UAE
MI = Total Import of India
XG = Total Export of UAE
XW = Total World Export
XI = Total Export of India
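
The two intensity indices can be computed directly from the quantities defined above. The sketch below is illustrative only: the function names are hypothetical and the trade figures are made-up round numbers, not values from the cited study.

```python
def export_intensity_index(x_ig, x_i, m_g, m_w, m_i):
    """EII = (X_IG / X_I) / (M_G / (M_W - M_I))."""
    return (x_ig / x_i) / (m_g / (m_w - m_i))

def import_intensity_index(m_ig, m_i, x_g, x_w, x_i):
    """III = (M_IG / M_I) / (X_G / (X_W - X_I))."""
    return (m_ig / m_i) / (x_g / (x_w - x_i))

# Hypothetical figures in billion USD (assumption, for illustration only).
eii = export_intensity_index(x_ig=30.0, x_i=300.0, m_g=250.0, m_w=18000.0, m_i=500.0)
iii = import_intensity_index(m_ig=25.0, m_i=500.0, x_g=300.0, x_w=18000.0, x_i=300.0)

print(round(eii, 2))  # 7.0
print(round(iii, 2))  # 2.95
```

A value above 1 indicates that the bilateral flow is larger than the partner's share of world trade alone would suggest.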

In Mukit’s study [4], the causal connection between Bangladesh’s GDP and its exports,
imports, and inflation is demonstrated, emphasizing the pivotal role of these factors in the
nation’s economic development. This research thoroughly explores the intricate relationship
among exports, imports, inflation, and GDP, providing a conclusive perspective on whether
prioritizing exports is beneficial for the country. The results underscore a noticeable correlation
among exports, imports, inflation, and GDP. The research reveals that exports have a positive
yet marginal impact on Bangladesh’s GDP. Additionally, it highlights that while the effects of
exports and imports are minimal, inflation plays a crucial role in fostering economic growth in
Bangladesh.

4.2 ECONOMICS CRISIS


The researchers in [1] investigated the repercussions of the 2008 global economic crisis on
international trade. According to their findings, the trade balance (X-M), a distinctive
feature of international trade, changed notably between the pre- and post-recession periods.
The study indicates that the 2008 recession had a detrimental impact on both exports and
imports, resulting in a substantial decline in 2009. Specifically, world exports decreased
from 17.83 trillion USD to 13.99 trillion USD, while India’s exports declined from 0.29
trillion USD to 0.26 trillion USD. A similar decline is evident during Covid-19 in 2020.
These trends are depicted in Figures 2, 3, 4 and 5.
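
As a quick check of the magnitudes involved, the relative declines implied by the export figures quoted above can be computed as follows. The helper name is hypothetical, and because the quoted India figures are rounded to two decimals the second percentage is only approximate.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Export values quoted in the text, in trillion USD.
larger_series_drop = percent_change(17.83, 13.99)
india_drop = percent_change(0.29, 0.26)

print(f"{larger_series_drop:.1f}%")  # -21.5%
print(f"{india_drop:.1f}%")          # -10.3%
```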

Figure 2: World Export Trends

Figure 3: India’s Export trends

The research further highlights the adverse effects of the crisis on oil trade, particularly
impacting the economies of the Arab World. The economic downturn resulted in a decreased
demand for oil, leading to a substantial decline in oil prices from over 100 USD to less than
50 USD per barrel. This drop in prices was a consequence of reduced global economic activity.
Consequently, all Gulf countries experienced significant declines in their trade figures during the
years 2008 and 2009. Simultaneously, the decrease in oil prices proved to be advantageous for
major oil-importing nations such as India. The fall in oil prices contributed to economic benefits
for countries that heavily rely on oil imports, offsetting the negative impacts experienced by
oil-exporting nations in the Arab World. Similar trends were observed during the Ukraine-Russia war.

Figure 4: World Import trends

Figure 5: India Import trends

In the study conducted by Lu [3], an analysis and forecast of U.S. total textile and apparel
exports spanning the period 1989-2025 were carried out using regression and ARIMA models.
R-based statistical software was leveraged for ARIMA forecasting. Through a comparison of
the results obtained from both forecasting methods, the researcher observed that the regression
and ARIMA models produced nearly identical outcomes. The projection indicated that the U.S.
total textile and apparel market is expected to reach 29.5 billion in 2025, reflecting a significant
21 percent increase compared to the figures recorded in 2014.

Pengliang [5] employed the ARIMA model to analyze the time series data representing China’s
service trade volume spanning from 1982 to 2015, utilizing information from the China Statis-
tical Yearbook 2015 and the Ministry of Commerce, Department of Trade and Services. Data
from the years 1982 to 2014 served as modeling samples, while the trade data for 2015 was
used as a posterior sample to assess the validity of the model’s estimation. Utilizing the ARIMA
model, the authors made forecasts for the imports and exports of service trade in 2016. The
projected value for 2016 amounted to 749.9989 billion, aligning with the authors’ expectations
of stable development in China’s trade in services.
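
The hold-out check described above (fit on all but the last year, then compare the forecast with the held-out observation) can be sketched as follows. This is only an illustration under stated assumptions: the series is synthetic, the helper `fit_ar1` is hypothetical, and a simple AR(1) model with a constant, fitted by least squares, stands in for the ARIMA model used in the cited study.

```python
import numpy as np

def fit_ar1(series):
    """Least-squares fit of y_t = c + b * y_(t-1); returns (c, b)."""
    y = np.asarray(series, dtype=float)
    design = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (c, b), *_ = np.linalg.lstsq(design, y[1:], rcond=None)
    return c, b

# Synthetic annual series with 34 points, mirroring the length of 1982-2015
# (hypothetical values, not the China service-trade data).
series = 10.0 + 2.0 * np.arange(34)
train, actual = series[:-1], series[-1]   # hold out the final year

c, b = fit_ar1(train)
forecast = c + b * train[-1]              # one-step-ahead forecast
abs_error = abs(forecast - actual)
print(forecast, actual, abs_error)
```

A small error on the held-out year lends some support to the fitted model before it is used to forecast years for which no observation exists.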

5 UNDERSTANDING DATA FOR ANALYTICS


In this project, we utilized the data sets outlined in Section 7, gathered from various
sources, to examine and predict India’s trade trends. Our approach followed the Cross-Industry
Standard Process for Data Mining (CRISP-DM) methodology. This methodology commenced with
a comprehensive understanding of the project’s challenges, objectives, and goals. Subsequently,
we studied the attributes of the data and their descriptions. Following this, data
pre-processing was applied to clean the data and make it suitable for analysis and modeling.
Finally, we applied the following analytics algorithms and techniques to achieve our objectives:
(a) Regression Modeling
(b) ARIMA.
(c) Principal Components Analysis

5.1 REGRESSION MODEL


A multiple regression model was used to quantify the relationship between the dependent
variable (GDP) and the independent variables, that is, the correlation between GDP and other
economic factors. GDP was modelled using the equation below.

$$\mathrm{GDP} = f(\text{Consumption}, \text{Investment}, \text{Expenditure}, \text{Exports}, \text{Imports}) \tag{5}$$

where GDP (Gross Domestic Product) is a function of:
G = Government Expenditure.
C = Final Consumption.
I = Gross fixed capital formation (Investment).
E = Value of Exports of goods and services.
M = Value of Imports of goods and services.
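
For orientation, the textbook expenditure identity combines these components with fixed coefficients, GDP = C + I + G + (E − M). The sketch below assumes that identity and uses hypothetical figures; the regression in this project instead estimates the coefficients from the data.

```python
def gdp_expenditure(consumption, investment, government, exports, imports):
    """GDP by the expenditure approach: C + I + G + (E - M)."""
    return consumption + investment + government + (exports - imports)

# Hypothetical values in billion USD (illustrative only, not taken from the data sets).
gdp = gdp_expenditure(consumption=1800.0, investment=800.0, government=300.0,
                      exports=600.0, imports=700.0)
print(gdp)  # 2800.0
```

Under this identity a fitted regression would be expected to give coefficients near +1 for C, I, G and E and near −1 for M, which offers a rough sanity check on estimated values.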

5.2 ARIMA MODEL
ARIMA stands for Autoregressive Integrated Moving Average Model. It belongs to
a class of models that explains a given time series based on its own past values i.e. its own
lags and the lagged forecast errors. The equation can be used to forecast future values. Any
‘non-seasonal’ time series that exhibits patterns and is not a random white noise can be modeled
with ARIMA models. ARIMA Models are specified by three order parameters: (p, d, q) where,
p is the order of the AR (Auto Regression) term
d is the number of differencing operations required to make the time series stationary
q is the order of the MA (Moving Average) term
(a) AR(p) Autoregression– a regression model that utilizes the dependent relationship be-
tween a current observation and observations over a previous period. An auto regressive
(AR(p)) component refers to the use of past values in the regression equation for the
time series.
(b) I(d) Integration – uses differencing of observations (subtracting an observation from obser-
vation at the previous time step) in order to make the time series stationary. Differencing
involves the subtraction of the current values of a series with its previous values d number
of times.
(c) MA(q) Moving Average – a model that uses the dependency between an observation and
a residual error from a moving average model applied to lagged observations. A moving
average component depicts the error of the model as a combination of previous error terms.
The order q represents the number of terms to be included in the model.
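
The integration step I(d) can be illustrated with a short sketch that applies differencing d times. The series below is a hypothetical example with a curved trend, and the helper name `difference` is an assumption introduced for illustration.

```python
import numpy as np

def difference(series, d=1):
    """Apply differencing (y_t - y_(t-1)) d times."""
    return np.diff(np.asarray(series, dtype=float), n=d)

quadratic = [1, 4, 9, 16, 25, 36]      # hypothetical series with a curved trend
first = difference(quadratic, d=1)     # 3, 5, 7, 9, 11: still trending
second = difference(quadratic, d=2)    # 2, 2, 2, 2: constant level

print(first.tolist())
print(second.tolist())
```

In this toy case two rounds of differencing remove the trend, so d = 2 would be the natural choice; each round also shortens the series by one observation.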

5.2.1 The meaning of p, d and q in ARIMA model


(a) The meaning of p: p is the order of the Auto Regressive (AR) term. It refers to the
number of lags of Yt to be used as predictors, where Yt is the time series.
(b) The meaning of d: The term ‘Auto Regressive’ in ARIMA means it is a linear regression
model that uses its own lags as predictors. Linear regression models work best when the
predictors are not correlated and are independent of each other, so we need to make the
time series stationary. The most common approach is to difference it, that is, to subtract
the previous value from the current value. Depending on the complexity of the series,
more than one round of differencing may be needed. The value of d, therefore, is the
minimum number of differencing operations needed to make the series stationary. If the
time series is already stationary, then d = 0. A stationary series is one whose mean and
variance are constant over time.
(c) The meaning of q: q is the order of the Moving Average (MA) term. It refers to
the number of lagged forecast errors that should go into the ARIMA Model. An ARIMA
model is one where the time series was differenced at least once to make it stationary and
we combine the AR and the MA terms. So the equation of an ARIMA model becomes:

$$Y_t = \alpha + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \beta_3 Y_{t-3} + \dots + \beta_p Y_{t-p} + \epsilon_t + \phi_1 \epsilon_{t-1} + \phi_2 \epsilon_{t-2} + \dots + \phi_q \epsilon_{t-q} \tag{6}$$

ARIMA model in words: Predicted Yt = Constant + Linear combination of lags of Y (up to
p lags) + Linear combination of lagged forecast errors (up to q lags). The ARIMA model
is used extensively in later sections of this project to forecast the future values
of Imports, Exports, GDP, Consumption and Govt. Expenditure.
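
Equation 6 can be evaluated directly once the coefficients are known. The sketch below assumes hypothetical coefficient values and a series that has already been differenced; in practice the coefficients are estimated by a fitting library rather than chosen by hand. The function name `arma_one_step` is an assumption, and the unknown current error ϵt is set to its expected value of zero.

```python
def arma_one_step(history, errors, alpha, betas, phis):
    """One-step prediction: alpha + sum(beta_i * Y_(t-i)) + sum(phi_j * eps_(t-j)).

    history: past values, most recent last (needs at least len(betas) entries)
    errors:  past forecast errors, most recent last (at least len(phis) entries)
    """
    ar_part = sum(b * history[-i] for i, b in enumerate(betas, start=1))
    ma_part = sum(p * errors[-j] for j, p in enumerate(phis, start=1))
    return alpha + ar_part + ma_part

# Hypothetical inputs for an ARMA(2, 1) prediction (illustrative values only).
prediction = arma_one_step(history=[1.0, 2.0, 3.0], errors=[0.1, -0.2],
                           alpha=0.5, betas=[0.6, 0.3], phis=[0.4])
print(round(prediction, 2))  # 2.82
```

Here the prediction is 0.5 + 0.6·3.0 + 0.3·2.0 + 0.4·(−0.2) = 2.82, which is the "constant plus lags plus lagged errors" structure stated in words above.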

6 TOOLS USED FOR DATA ANALYTICS
In this research project, the Python programming language (Python 3.8) played a central role
in conducting comprehensive data analysis, with the PyCharm integrated development environ-
ment (IDE) serving as the primary tool. Python offers an extensive array of libraries that are
highly beneficial for analytical purposes. Throughout our project, we leveraged various libraries,
as illustrated below:
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt  # visualization library
import seaborn as sns  # statistical data visualization
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from raceplotly.plots import barplot
from prettytable import PrettyTable
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import statsmodels.api as sm
from statsmodels.graphics.tsaplots import plot_acf  # autocorrelation plots
from statsmodels.graphics.tsaplots import plot_pacf  # partial autocorrelation plots
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA
import pmdarima as pm

7 DATA ANALYSIS
This chapter analyzes the data sourced from key institutions, including the International
Monetary Fund (IMF), the World Bank (WB), the World Trade Organization (WTO), UNCTAD and
CDM. It describes the data collected from these sources and employs models such as ARIMA
and regression analysis to forecast and examine the correlations between GDP and key
economic indicators like imports, exports, investment, and consumption. The chapter also
includes forecasts of future values for all these indicators.

7.1 BUSINESS UNDERSTANDING


In 2020, India achieved the remarkable milestone of becoming the fifth-largest economy globally,
with a substantial GDP of USD 2.87 trillion (Trading Economics, 2021). This significant ascent
in the global economic hierarchy reflects India’s sustained economic growth and increasing in-
fluence on the world stage. The country’s economic prowess has not only solidified its standing
in South Asia but has also garnered recognition on the international front.

India’s strategic positioning as a key player in global trade is evident in its extensive
network of imports and exports. China, USA, UAE, Saudi Arabia, Germany, Switzerland, Iraq,
Indonesia, South Korea, and Japan are key contributors to India’s imports, while the United
States, UAE, China, United Kingdom, Germany, France, Singapore, Bangladesh, Canada, and
Australia feature prominently as export destinations (World’s Top Exports, 2021). This
diverse and widespread trade connectivity underscores India’s pivotal role in the global
economic landscape. In this research, we aim to provide a comprehensive analysis of India’s
trade dynamics, employing methods such as GDP regression analysis and ARIMA forecasting.
Through this, we seek to gain deeper insights into India’s economic trajectory and
contribute to a nuanced understanding of its role as a major player in the world economy.

7.2 DATA UNDERSTANDING


After establishing the goals and objectives of our study, we gathered all the necessary datasets to
analyze and predict trends in India’s trade. Leveraging the Python programming language, we
implemented various models to gain a comprehensive understanding. We began with exploratory
data analysis (EDA) to unravel relationships between different variables. Python libraries such
as Plotly, Matplotlib, and Seaborn were employed to visualize and comprehend the insights
derived from the data. Utilizing regression and time series forecasting methods, we delved into
understanding trends and exploring the connections among key economic factors like GDP,
exports, imports, and other macroeconomic variables. Our analysis focused on deciphering the
trends and relationships between variables, aiming to uncover what lies ahead for India’s trade
in the upcoming years.

7.3 DATA COLLECTION


The data sets were collected from CDM, the World Bank (WB), the IMF, UNCTAD and the World
Trade Organization (WTO). As the figure below shows, we extracted 13 different data sets that
contain historical data on India’s trade, such as imports, exports, GDP, government
expenditure, consumption and investment. All these data sets were used to forecast India’s
trade trends and to understand the relationship between GDP, imports, exports and other
macroeconomic variables.

Figure 6: Data File Used for Analysis

8 A CLOSER LOOK AT THE DATA


The collected data sets are stored as Microsoft Excel files. There are 13 data sets in total,
each containing a different number of rows. The table below gives an overview of the
attributes and their descriptions, based on World Bank descriptions. Only the attributes
used in the analysis are described.

| Data set Name | Attributes | Descriptions |
|---|---|---|
| Export HS2 2010 2021 | HSCode | Two-digit HS code for commodities |
| | Commodities | Description of the commodities exported |
| | Value | Value of the commodity in million USD |
| | Country | Name of the country for export |
| | Year | Year, from 2010 to 2021 |
| Import HS2 2010 2021 | HSCode | Two-digit HS code for commodities |
| | Commodities | Description of the commodities imported |
| | Value | Value of the commodity in million USD |
| | Country | Name of the country for import |
| | Year | Year, from 2010 to 2021 |
| india consumption 1960 2021 | Country Name | Name of the country |
| | Year columns (contain India’s consumption in million USD) | Columns of years from 1960 to 2022 |
| india gdp 2010 2022 | Date | In DDMMYYYY format |
| | GDP | GDP in billion USD |
| india export 2010 2022 | Date | In DDMMYYYY format |
| | Export | Export in billion USD |
| india import 2010 2022 | Date | In DDMMYYYY format |
| | Import | Import in billion USD |
| india general govt expenditure 1960 2021 | Country Name | Name of the country |
| | Year columns (contain India’s govt. expenditure in million USD) | Columns of years from 1960 to 2022 |
| india capital investment 1960 2022 | Country Name | Name of the country |
| | Year columns (contain India’s capital investment in million USD) | Columns of years from 1960 to 2022 |
| forecast dependent variables | year | 1960 to 2022 |
| | Govt. Expenditure | Government expenditure on final goods and services, including the wages of public employees, in million USD |
| | Consumption | Private expenditures for the consumption of durable and non-durable goods and services; it includes all purchases made by consumers, in million USD |
| | Investment | Gross capital formation: investment in new non-financial assets or equipment, such as new housing, machinery, or software, in million USD |
| | Import | Imports of goods and services represent the value of all goods and other market services received from the rest of the world, in million USD |
| | Export | Exports of goods and services represent the value of all goods and other market services provided to the rest of the world, in million USD |
| | Trade Deficit | Export minus Import, in million USD |
| | GDP | Gross domestic product, calculated using the expenditure approach as the sum of the final consumption expenditures of households, government, and nonprofit institutions serving households; gross capital formation; and net exports (exports minus imports) of goods and services, in million USD |
pta agreements Agreement Name of the Agreements
worksheet (WTO+AC) and Year Year of signing of the agreements
(WTO+LE) have same attributes
columns. See remarks below
Provisions falling under the current mandate Provision Type 0 or 1 or 2
of the WTO and already subject to some form (FTAIndustrial)
of commitment in WTO agreements
0 if the provision is not mentioned (or too gen- Provision Type 0 or 1 or 2
erally mentioned) in the agreement 1 if the (FTAAgriculture)
provision is mentioned in the agreement
worksheet (WTO+LE See remarks be- Provision Type 0 or 1 or 2
low) (Customs)
Provisions falling under the current mandate Provision Type 0 or 1 or 2
of the WTO and already subject to some form (ExportTaxes)
of commitment in WTO agreements - when
legally enforceable
0 if the provision is not mentioned in the agree- Provision Type 0 or 1 or 2
ment or not legally enforceable 1 if the provi- (SPS)
sion is mentioned, legally enforceable but ex-
plicitly excluded by dispute settlement provi-
sion 2 if the provision is mentioned and legally
enforceable
Provision Type 0 or 1 or 2
(TBT)
Provision Type 0 or 1 or 2
(STE)
Provision Type 0 or 1 or 2
(AD)
Provision Type 0 or 1 or 2
(CVM)

Provision Type 0 or 1 or 2
(StateAid)
Provision Type 0 or 1 or 2
(PublicProcure-
ment)
Provision Type 0 or 1 or 2
(TRIMs)
Provision Type 0 or 1 or 2
(GATS)
Provision Type 0 or 1 or 2
(TRIPs)
worksheet (WTO-X AC) See remarks Agreement Name of the Agreements
below
Obligations that are outside the current man- Year Year of signing of the agreements
date of the WTO
0 if the provision is not mentioned (or too gen- Provision Type 0 or 1 or 2
erally mentioned) in the agreement 1 if the (AntiCorruption)
provision is mentioned in the agreement
worksheet (WTO-X LE) See remarks Provision Type 0 or 1 or 2
below (CompetitionPol-
icy)
Obligations that are outside the current man- Provision Type 0 or 1 or 2
date of the WTO - when legally enforceable (Environmental
Laws)
0 if the provision is not mentioned in the agree- Provision Type 0 or 1 or 2
ment or not legally enforceable 1 if the provi- (IPR )
sion is mentioned, legally enforceable but ex-
plicitly excluded by dispute settlement provi-
sion 2 if the provision is mentioned and legally
enforceable
Provision Type (In- 0 or 1 or 2
vestment )
Provision Type 0 or 1 or 2
(Labour Market
Regulation )
Provision Type 0 or 1 or 2
(Movement of
Capital)
Provision Type 0 or 1 or 2
(Consumer Protec-
tion)
Provision Type 0 or 1 or 2
(Agriculture)
Provision Type 0 or 1 or 2
(Approximation of
Legislation)
Provision Type 0 or 1 or 2
(Audio Visual)

Provision Type 0 or 1 or 2
(Civil Protection)
Provision Type (In- 0 or 1 or 2
novation Policies)
Provision Type 0 or 1 or 2
(Cultural Coopera-
tion)
Provision Type 0 or 1 or 2
(Economic Policy
Dialogue)
Provision Type 0 or 1 or 2
(Economic Policy
Dialogue)
Provision Type 0 or 1 or 2
(Education and
Training)
Provision Type 0 or 1 or 2
(Energy)
Provision Type 0 or 1 or 2
(Financial Assis-
tance)
Provision Type 0 or 1 or 2
(Health)
Provision Type 0 or 1 or 2
(Human Rights)
Provision Type 0 or 1 or 2
(Illegal Immigra-
tion)
Provision Type 0 or 1 or 2
(Illicit Drugs)
Provision Type 0 or 1 or 2
(Industrial Cooper-
ation)
Provision Type 0 or 1 or 2
(Information Soci-
ety)
Provision Type 0 or 1 or 2
(Mining)
Provision Type 0 or 1 or 2
(Money Launder-
ing)
Provision Type 0 or 1 or 2
(Nuclear Safety)
Provision Type 0 or 1 or 2
(Political Dialogue)
Provision Type 0 or 1 or 2
(Public Adminis-
tration)

Provision Type 0 or 1 or 2
(Regional Coopera-
tion)
Provision Type 0 or 1 or 2
(Research and
Technology)
Provision Type 0 or 1 or 2
(SME)
Provision Type 0 or 1 or 2
(Social Matters)
Provision Type 0 or 1 or 2
(Nuclear Safety)
Provision Type 0 or 1 or 2
(Statistics)
Provision Type 0 or 1 or 2
(Taxation)
Provision Type 0 or 1 or 2
(Terrorism)
Provision Type 0 or 1 or 2
(Visa and Asylum)
Table 2: Data Set Description

The above datasets were obtained from the UNCTAD, WTO, Microtrends, IMF, World Bank and
CDM websites.

8.1 DATA PURIFICATION AND EXPLORATION


In this step, the data was explored further using Excel and Python to study and understand the
values and attributes.

(a) Column names in all datasets were adjusted to eliminate spaces, ensuring they are easily
comprehensible and manageable during the preprocessing phase. Some data underwent
manual adjustments using tools such as Excel.
(b) All attribute (variable) values are denominated in USD, but the units differ: a few
values are expressed in USD, while others are in million USD. During data processing in
the dataframe, appropriate conversions were applied to maintain consistency.
(c) Missing values in the datasets were identified using the isnull() function in Python.
Numerous attributes contained NaN (not a number) values, which were subsequently
replaced with 0 using the fillna() method.
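The unit conversion mentioned in (b) can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the project: the column names `Value` and `Unit`, the unit labels, the helper `to_million_usd` and the sample figures are all hypothetical.

```python
import pandas as pd

# Multipliers that convert each unit into million USD.
# The unit labels are assumptions made for this sketch.
TO_MILLION_USD = {"USD": 1e-6, "Million USD": 1.0, "Billion USD": 1e3}


def to_million_usd(df, value_col="Value", unit_col="Unit"):
    """Return a copy of df with value_col expressed in million USD."""
    out = df.copy()
    factors = out[unit_col].map(TO_MILLION_USD)
    if factors.isnull().any():
        raise ValueError("unknown unit in column " + unit_col)
    out[value_col] = out[value_col] * factors
    out[unit_col] = "Million USD"
    return out


# Hypothetical rows, one per unit.
sample = pd.DataFrame(
    {
        "Value": [2_500_000.0, 40.0, 3.2],
        "Unit": ["USD", "Million USD", "Billion USD"],
    }
)
converted = to_million_usd(sample)
print(converted)
```

Keeping the multipliers in one dictionary means an unexpected unit label raises an error instead of silently passing through unconverted.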

Figure 7: Data Preprocessing to check for NaN values

In Figure 7, the dataset was checked for NaN values in the dataframe. True against a column
name indicates the presence of NaN values in that column. These values need to be filled with 0
so that no column contains NaN values. dataframe.count() gives the number of non-null entries
in each column. In Figure 8, the Value attribute has fewer entries than the other columns,
indicating that many fields in the Value column are NaN.

Figure 8: Data Preprocessing to check for the number of NaN values

Our data sets were cleaned to:

(a) Ensure that there are no duplicate rows. Comparing the row counts before and after
calling the drop_duplicates() function shows whether a data set has duplicate rows. All
data sets were checked, and no duplicate rows were found. If a data set does contain
duplicates, they are dropped by drop_duplicates().
(b) Remove NaN values by using the df.replace(np.nan, 0) function. The figure below shows
an example of replacing NaN with 0 values.

The df.info() function was used to show a concise summary of the DataFrame in Python, as
shown in Figure 9. This function summarizes the data and shows the number of columns,
column labels, column data types, memory usage, range index and the number of non-null
cells in each column.

Figure 9: DataFrame after replacing NaN values with 0

The figure also shows the output: the range index is 184755 entries, and it clearly shows that
all values were filled. Two columns are of type int64, one is of type float64 and two are of type
object. The Python code shown below was used to preprocess all the data sets prior to use.
import pandas as pd

# Set display options so that all columns and rows are printed
pd.options.display.max_columns = None
pd.options.display.max_rows = None


# Function to read and clean Excel data before using it
def read_and_clean_data(file_path):
    df = pd.read_excel(file_path)
    print(df.isnull().any())  # True marks columns that contain NaN
    print(df.count())  # non-null count per column before cleaning
    df = df.drop_duplicates()
    print(df.count())  # counts again after dropping duplicate rows
    df = df.fillna(0)  # replace NaN values with 0
    df.info()  # prints the summary itself and returns None
    return df


export_file_path = 'Export_HS2_2010_2021.xlsx'
df_export = read_and_clean_data(export_file_path)

9 DATA VISUALIZATION AND EXPLORATION
During this phase, we delved into the data sets, employing various visualization techniques to
extract valuable insights. The data analysis and exploration aimed to provide a comprehensive
understanding of India’s trade directions, shedding light on whether India’s exports and imports
exhibit a substantial impact and correlation with the country’s GDP.

9.1 VISUALIZATION OF INDIA’S EXPORT


The diagram illustrates the historical trajectory of India’s annual exports from 1988 to 2021.
India’s export landscape from 2010 to 2021 reveals a dynamic and evolving trajectory, marked
by significant shifts in the composition, destination, and overall volume of exported goods and
services.

Figure 10: India’s Export Trends

This period encapsulates key economic developments, global trade dynamics, and policy
changes that have collectively influenced India’s export patterns. India’s export sector faced
challenges, including global economic slowdowns, trade tensions, and disruptions caused by
events like the COVID-19 pandemic. The sector demonstrated resilience by adapting to changing
circumstances, exploring new markets, and diversifying product offerings.

Figure 11: India’s Export Trends Another representation

Figure 12: India’s Export percentage line chart representation

Figure 10 provides a visual representation of the annual variations, showcasing years of both
export growth and contraction. Despite fluctuations, the overarching trend reflects a general
upward trajectory, indicating the overall expansion of India’s export activities during this
period. The impact of the 2008 recession and of Covid in 2020 on the export trend is clearly
visible in Figure 10.

Further, between 2014 and 2016 exports show a downward trend due to the fall in prices of
crude oil and petroleum-based products, which are among India’s major exports. Figure 11
and Figure 12 show other representations of India’s export patterns. In 2020-2021, i.e. post
Covid, there is a 44.62% jump in exports.
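A year-on-year jump such as the one quoted above is a percentage change relative to the previous year. The sketch below shows one way to compute it with pandas; the export figures are hypothetical placeholders, not the project data.

```python
import pandas as pd

# Hypothetical annual export values in billion USD, indexed by year.
exports = pd.Series([100.0, 80.0, 120.0], index=[2019, 2020, 2021], name="Export")

# Percentage change relative to the previous year; the first year has no
# predecessor and is therefore NaN.
yoy_pct = exports.pct_change() * 100
print(yoy_pct.round(2))
```

Because each change is measured against the previous year, a sharp fall followed by a recovery produces a large positive figure even when the level only returns to trend.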

9.2 VISUALIZATION OF INDIA EXPORT W.R.T OTHER COUNTRIES

Figure 13: India’s Export Trends vs Other Countries

Figure 13 above shows a sharp contrast between India and both China and the USA in terms of
exports. Even the quantum of exports by Hong Kong, China has been substantially more than
India’s in the last decade. The export gap between India and both China and the USA has been
increasing steadily over the last decades, and the export gap between China and the USA has
also been increasing steadily over the last decade.

9.3 VISUALIZATION OF INDIA’S EXPORT TO OTHER COUNTRIES
Figure 14 shows the share of India’s exports among its top 10 destination countries from 2010
to 2021. The USA and the UAE receive a substantial share, approximately 29.49% and 19.77%
respectively of India’s exports to these top 10 countries, followed by Hong Kong, Singapore
and the UK.

Figure 14: India’s Export to Other Countries percentage share

Figure 15 below shows the value of India’s exports from 2010 to 2021 as an inverse histogram.

Figure 15: India’s Export top 10 Countries

9.4 VISUALIZATION OF EXPORT BASED ON PRODUCT


The data set provided to us uses the HSCODE (Harmonized Commodity Description and Coding
System) classification, numbered from 1 to 99. The HSCODE classification is divided into 21
sections, each containing several chapters. The total number of chapters is 99. This is shown
in Table 3 below:

HSCODE SECTION   HSCODE CHAPTER   CLASSIFICATION
SECTION-1        HSC:1-5          Animals and Animal Products
SECTION-2        HSC:6-14         Vegetable Products
SECTION-3        HSC:15           Animal Or Vegetable Fats
SECTION-4        HSC:16-24        Prepared Foodstuffs
SECTION-5        HSC:25-27        Mineral Products
SECTION-6        HSC:28-38        Chemical Products
SECTION-7        HSC:39-40        Plastics and Rubber
SECTION-8        HSC:41-43        Hides and Skins
SECTION-9        HSC:44-46        Wood and Wood Products
SECTION-10       HSC:47-49        Wood Pulp Products
SECTION-11       HSC:50-63        Textiles and Textile Articles
SECTION-12       HSC:64-67        Footwear and Headgear
SECTION-13       HSC:68-70        Articles Of Stone, Plaster, Cement, Asbestos
SECTION-14       HSC:71-71        Pearls, Precious Or Semi-Precious Stones, Metals
SECTION-15       HSC:72-83        Base Metals and Articles Thereof
SECTION-16       HSC:84-85        Machinery and Mechanical Appliances
SECTION-17       HSC:86-89        Transportation Equipment
SECTION-18       HSC:90-92        Instruments - Measuring, Musical
SECTION-19       HSC:93-93        Arms and Ammunition
SECTION-20       HSC:94-96        Miscellaneous
SECTION-21       HSC:97-99        Works Of Art
Table 3: HSCODE based Classification

In Table 3 above, the commodities are grouped by HSCode into chapters. Several chapters are
clubbed together to form a specific section. The section specifies the generic classification
of the product and the chapter provides a detailed description of the products. The absolute
HSCode-based product export from 2010 to 2021 is shown in Figure 16 as a pie chart giving
the percentage distribution among the products.
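The chapter-to-section grouping of Table 3 can be expressed as a small lookup. The sketch below is illustrative only: the helper name `hs_section` is hypothetical, and the boundaries are transcribed from Table 3 rather than taken from the project code.

```python
# (last chapter of the section, section label), transcribed from Table 3.
HS_SECTIONS = [
    (5, "HSC:1-5 Animals and Animal Products"),
    (14, "HSC:6-14 Vegetable Products"),
    (15, "HSC:15 Animal Or Vegetable Fats"),
    (24, "HSC:16-24 Prepared Foodstuffs"),
    (27, "HSC:25-27 Mineral Products"),
    (38, "HSC:28-38 Chemical Products"),
    (40, "HSC:39-40 Plastics and Rubber"),
    (43, "HSC:41-43 Hides and Skins"),
    (46, "HSC:44-46 Wood and Wood Products"),
    (49, "HSC:47-49 Wood Pulp Products"),
    (63, "HSC:50-63 Textiles and Textile Articles"),
    (67, "HSC:64-67 Footwear and Headgear"),
    (70, "HSC:68-70 Articles Of Stone, Plaster, Cement, Asbestos"),
    (71, "HSC:71-71 Pearls, Precious Or Semi-Precious Stones, Metals"),
    (83, "HSC:72-83 Base Metals and Articles Thereof"),
    (85, "HSC:84-85 Machinery and Mechanical Appliances"),
    (89, "HSC:86-89 Transportation Equipment"),
    (92, "HSC:90-92 Instruments - Measuring, Musical"),
    (93, "HSC:93-93 Arms and Ammunition"),
    (96, "HSC:94-96 Miscellaneous"),
    (99, "HSC:97-99 Works Of Art"),
]


def hs_section(chapter):
    """Return the Table 3 section label for a two-digit HS chapter (1-99)."""
    if not 1 <= chapter <= 99:
        raise ValueError("HS chapter must be between 1 and 99")
    # Sections are listed in ascending order, so the first upper bound that
    # is not exceeded identifies the section.
    for last_chapter, label in HS_SECTIONS:
        if chapter <= last_chapter:
            return label


print(hs_section(27))
print(hs_section(71))
```

Applying such a function to the two-digit HSCode column yields the section label needed for the section-wise charts.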

Figure 16: India’s Top 10 Export Commodities based on HSCode

Figure 17 below shows the same commodities in billion USD as an inverse histogram.
HSC-27 and HSC-71 top the chart, followed by other commodities.

Figure 17: India’s Top 10 Export Commodities in Billion USD from 2010-2021

Figure 18 below shows the top 10 sections by export, based on the grouping of HSCode.
HSCode:25-27 forms the major export item section-wise, followed by HSCode-71.

Figure 18: India’s Top 10 Export HSCODE Sectionwise in Billion USD from 2010-
2021

Figure 19 below shows the top 10 sections by import, based on the grouping of HSCode.
Again, HSCode:25-27 forms the major import item section-wise. This shows that India imports
mineral products in bulk, adds value to them and exports them: India exports approximately
one-third of the imported HSCode:25-27 products after adding value.

Figure 19: India’s Top 10 Import HSCODE Sectionwise in Billion USD from 2010-
2021

Figure 20: India’s Top 5 Export and Top 10 Import Commodities HSCODE Billion
USD from 2010-2021 in term of profit and loss

Figure 20 depicts the top 5 trade surplus HSCode sections and the top 10 trade deficit HSCode
sections. Again, compared to other HSCode sections, trade is dominated by HSCode:25-27
Mineral Products. The other major export sections, where India has a major trade surplus, are
HSCode:50-63 Textiles and Textile Articles, HSCode:6-14, HSCode:1-5, HSCode:86-89 and
HSCode:16-24.
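A surplus or deficit per HSCode section is the section’s export total minus its import total. The following sketch ranks sections on that basis; the totals are hypothetical and the variable names are illustrative, not those of the project.

```python
import pandas as pd

# Hypothetical totals in billion USD, indexed by HSCode section.
exports = pd.Series({"HSC:25-27": 500.0, "HSC:50-63": 400.0, "HSC:84-85": 250.0})
imports = pd.Series({"HSC:25-27": 1500.0, "HSC:50-63": 60.0, "HSC:84-85": 900.0})

# Balance = export minus import; a section missing on one side counts as 0.
balance = exports.sub(imports, fill_value=0).sort_values()

# Most negative balances first (largest deficits).
top_deficit = balance.head(2)
# Positive balances only, largest surplus first.
top_surplus = balance[balance > 0].sort_values(ascending=False).head(2)

print(top_deficit)
print(top_surplus)
```

Using `sub` with `fill_value=0` keeps sections that appear only in the export data or only in the import data, instead of turning them into NaN.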

9.5 VISUALIZATION OF EXPORT BASED ON REGION WISE


Table 4 below shows how countries are grouped into regions (source: Ministry of Commerce
website).

REGION COUNTRIES
EUROPEAN UNION AUSTRIA, BELGIUM, BULGARIA, CROA-
TIA, CYPRUS, CZECH REPUBLIC, DEN-
MARK, ESTONIA, FINLAND, FRANCE,
GERMANY, GREECE, HUNGARY, IRE-
LAND, ITALY, LATVIA, LITHUANIA,
LUXEMBOURG, MALTA, NETHERLAND,
POLAND, PORTUGAL, ROMANIA, SLO-
VAK REP, SLOVENIA, SPAIN
EFTA ICELAND, LIECHTENSTEIN, NORWAY,
SWITZERLAND
OTHER EU COUNTRIES ALBANIA, BOSNIA-HRZGOVIN, MACEDO-
NIA, SERBIA, MONTENEGRO, TURKEY, U
K
SACU COUNTRIES BOTSWANA, LESOTHO, NAMIBIA, SOUTH
AFRICA, SWAZILAND
OTHER SOUTH AFRICAN COUNTRIES ANGOLA, MOZAMBIQUE, ZAMBIA, ZIM-
BABWE
WEST AFRICAN COUNTRIES BENIN, BURKINA FASO, CAMEROON,
CAPE VERDE IS, CONGO P REP, EQUTL
GUINEA, GABON, GAMBIA, GHANA,
GUINEA, GUINEA BISSAU, COTE D’
IVOIRE, LIBERIA, MALI, MAURITANIA,
NIGER, NIGERIA, SAO TOME, SENEGAL,
SIERRA LEONE, ST HELENA, TOGO
CENTRAL AFRICAN COUNTRIES BURUNDI, C AFRI REP, CHAD, MALAWI,
RWANDA, UGANDA, CONGO D. REP.
EAST AFRICAN COUNTRIES COMOROS, DJIBOUTI, ETHIOPIA, KENYA,
MADAGASCAR, MAURITIUS, REUNION,
SEYCHELLES, SOMALIA, TANZANIA REP
NORTH AFRICAN COUNTRIES ALGERIA, EGYPT A RP, LIBYA, MO-
ROCCO, SUDAN, TUNISIA
NORTH AMERICAN COUNTRIES CANADA, MEXICO, U S A

LATIN AMERICAN COUNTRIES ANTIGUA, ARGENTINA, BAHAMAS, BAR-
BADOS, BELIZE, BERMUDA, BOLIVIA,
BRAZIL, BR VIRGN IS, CAYMAN IS,
CHILE, COLOMBIA, COSTA RICA, CUBA,
DOMINIC REP, DOMINICA, ECUADOR, EL
SALVADOR, FALKLAND IS, FR GUIANA,
GRENADA, GUADELOUPE, GUATEMALA,
GUYANA, HAITI, HONDURAS, JAMAICA,
MARTINIQUE, MONTSERRAT, NETHER-
LANDANTIL, NICARAGUA, PANAMA
REPUBLIC, PARAGUAY, PERU, ST KITT N
A, ST LUCIA, ST VINCENT, SURINAME,
TRINIDAD,TURKS C IS, URUGUAY,
VENEZUELA, VIRGIN IS US
EAST ASIA OCEANIA COUNTRIES AUSTRALIA, FIJI IS, KIRIBATI REP,
NAURU RP, NEW ZEALAND,PAPUA
N GNA, TIMOR LESTE, SOLOMON IS,
TONGA,TUVALU, VANUATU REP,SAMOA
ASEAN COUNTRIES BRUNEI, CAMBODIA,INDONESIA, LAO PD
RP, MALAYSIA, MYANMAR, PHILIPPINES,
SINGAPORE, THAILAND, VIETNAM SOC
REP
WEST ASIAN GCC COUNTRIES BAHARAIN IS, KUWAIT, OMAN, QATAR,
SAUDI ARAB, U ARAB EMTS
OTHER WEST ASIAN COUNTRIES IRAN, IRAQ, ISRAEL,JORDAN, LEBANON,
SYRIA, YEMEN REPUBLC
NORTH EAST ASIAN COUNTRIES TAIWAN, CHINA P RP, HONG KONG,
JAPAN, KOREA DP RP, KOREA RP,
MACAO, MONGOLIA
SOUTH ASIAN COUNTRIES AFGHANISTAN,BANGLADESH PR,
BHUTAN, MALDIVES, NEPAL, PAKISTAN
IR,SRI LANKA DSR
CAR COUNTRIES KAZAKHSTAN , KYRGHYZSTAN, TAJIK-
ISTAN, TURKMENISTAN, UZBEKISTAN
CIS COUNTRIES ARMENIA, AZERBAIJAN, BELARUS,
GEORGIA, MOLDOVA, RUSSIA, UKRAINE

UNSPECIFIED COUNTRIES AMERI SAMOA, ANDORRA, ANGUILLA,
ANTARTICA,ARUBA, MAYOTTE, COCOS
IS, COOK IS, ERITREA, FAROE IS., FR
POLYNESIA, GIBRALTAR, GREENLAND,
GUAM, STATE OF PALEST, VATICAN
CITY, MARSHALL ISLAND, MICRONESIA,
MONACO, NEW CALEDONIA, NIUE IS,N.
MARIANA IS., NORFOLK IS, PACIFIC IS,
PALAU, PITCAIRN IS., PUERTO RICO,SAN
MARINO, US MINOR OUTLYING ISLANDS,
WALLIS F IS, UNSPECIFIED, ST PIERRE,
CHANNEL IS, CHRISTMAS IS., FR S ANT
TR, GUERNSEY, HEARD MACDONALD,
INSTALLATIONS IN INTERNATIONAL WA-
TERS, JERSEY, PANAMA C Z, SWEDEN,
TOKELAU IS, UNION OF SERBIA AND
MONTENEGRO,CANARY IS, CURACAO,
NEUTRAL ZONE , SAHARWI A.DM RP,
SINT MAARTEN (DUTCH PART), SVALL-
BARD AND J, SOUTH SUDAN
Table 4: Region Wise Countries Distribution

Figure 21 depicts the region-wise distribution of exports. North America and the European
Union have the largest shares of India’s exports: North America constitutes 16.87% of India’s
total exports between 2010 and 2021, followed by the European Union with 14.39%. Figure 22
depicts the region-wise export distribution in billion USD. Figure 23 shows a region-wise and
year-wise line chart; India’s exports to North American and European Union countries are
quite significant. The heatmap in Figure 24 shows that India’s exports to the USA are quite
significant, followed by China.
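Region-wise shares of this kind follow from mapping each country to its Table 4 region and summing the values. The sketch below assumes a frame with `Country`, `Year` and `Value` columns, as in the data set description; the rows are hypothetical and only four countries from Table 4 are mapped.

```python
import pandas as pd

# A few country-to-region pairs transcribed from Table 4.
REGION_OF = {
    "U S A": "NORTH AMERICAN COUNTRIES",
    "CANADA": "NORTH AMERICAN COUNTRIES",
    "GERMANY": "EUROPEAN UNION",
    "FRANCE": "EUROPEAN UNION",
}

# Hypothetical export rows (values in billion USD).
df = pd.DataFrame(
    {
        "Country": ["U S A", "CANADA", "GERMANY", "FRANCE", "U S A"],
        "Year": [2020, 2020, 2020, 2020, 2021],
        "Value": [50.0, 20.0, 15.0, 5.0, 10.0],
    }
)

# Countries without a mapping fall into the catch-all region of Table 4.
df["Region"] = df["Country"].map(REGION_OF).fillna("UNSPECIFIED COUNTRIES")

region_total = df.groupby("Region")["Value"].sum()
region_share = (region_total / region_total.sum() * 100).round(2)
print(region_share.sort_values(ascending=False))
```

Grouping by both `Region` and `Year` instead of `Region` alone gives the series needed for a region-wise, year-wise line chart.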

Figure 21: India’s export region wise percentage

Figure 22: India’s Export region wise in Billion USD from 2010-2021

Figure 23: Region Wise and Yearwise India’s Export

Figure 24: Country Wise India’s Export Heatmap

9.6 VISUALIZATION OF INDIA’S IMPORT


India’s import landscape from 2010 to 2021 has undergone a dynamic evolution, influenced by
various factors such as economic changes, shifts in global trade dynamics, and policy alterations.
This period encompasses crucial developments in international trade and significant events like
the global economic slowdowns, trade tensions, and the disruptions caused by the COVID-19
pandemic. Despite these challenges, India’s import sector has exhibited adaptability and re-
silience.

The import diagram presented below outlines the historical trajectory of India’s annual
imports from 2010 to 2021. It visually captures the yearly variations, highlighting periods
of both import growth and contraction. While facing fluctuations, the overall trend suggests
a general upward movement, signifying the expansion of India’s import activities over this
timeframe.

Figure 25: India’s import Trends

This import analysis underscores the diverse and changing nature of India’s trade relationships,
shedding light on the country’s import dynamics throughout the specified period. In Figure 25
above, the impact of the 2008 recession and of Covid in 2020 on the import trend is clearly
visible. Figure 26 and Figure 27 below show other representations of India’s import patterns.
In 2020-2021, i.e. post Covid, there is a 55.43% jump in imports.

Figure 26: India’s Import Trends Another representation

9.7 VISUALIZATION OF INDIA IMPORT W.R.T OTHER COUNTRIES
Figure 28 shows a sharp contrast with China and the USA in terms of imports. Even the
quantum of imports by Singapore and the UAE has been approximately equal in the last decade.

Figure 27: India’s Import percentage line chart representation

9.8 VISUALIZATION OF INDIA’S IMPORT FROM OTHER COUNTRIES
Figure 29 shows the share of India’s imports among its top 10 source countries from 2010 to
2021. China and the UAE account for a substantial share, approximately 26.27% and 12.42%
respectively, of India’s imports from these top 10 countries.

Figure 28: India’s Import Trends vs Other Countries

Figure 29: India’s Import from Other Countries percentage share

Figure 30 below shows the value of India’s imports from 2010 to 2021 as an inverse
histogram.

Figure 30: India’s Import top 10 Countries

9.9 VISUALIZATION OF IMPORT BASED ON PRODUCT


The data set provided to us uses the HSCODE classification, numbered from 1 to 99. The HSCODE
classification is divided into 21 sections, each containing several chapters, as shown in Table
3 above. The commodities are grouped by HSCode into chapters, and the chapters into sections.
The absolute HSCode-based product import from 2010 to 2021 is shown in Figure 31 as a pie
chart giving the percentage distribution among the products.

Figure 31: India’s Top 10 Import Commodities

Figure 32 below shows the same commodities in billion USD as an inverse histogram.
HSC-27 and HSC-71 top the chart, followed by other commodities.

9.10 VISUALIZATION OF IMPORT BASED ON REGION WISE


Figure 33 depicts the region-wise distribution of imports. The North East Asian and West
Asian GCC regions have the largest shares of India’s imports. North East Asia constitutes
23.27% of India’s total imports between 2010 and 2021, followed by the West Asian GCC
region with 18.27%. Figure 34 depicts the region-wise import distribution in billion USD.

Figure 32: India’s Top 10 Import Commodities in Billion USD from 2010-2021

Figure 33: India’s region wise Import

Figure 34: India’s region wise Import in Billion USD from 2010-2021

Figure 35: India’s Region wise Import between 2010-2021

Figure 35 shows a region-wise and year-wise line chart of imports. India’s imports from
North East Asia and the West Asian GCC region are quite significant.

Figure 36: Country Wise India’s Region Import Heatmap

The heatmap in Figure 36 shows that India’s imports from China are quite significant, followed
by the USA.

9.11 VISUALIZATION OF INDIA’S TRADE DEFICIT


The year-wise trade deficit for India is shown in Figure 37 and Figure 38. Significant trade
deficits occurred in 2011, 2012, 2018 and 2021. However, across these years the trade deficit
has remained roughly constant although imports and exports have increased significantly.
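The data set description defines the trade deficit as export minus import, so a negative value indicates a deficit. The sketch below computes it per year and picks out the year with the largest deficit; the figures are hypothetical and the column name `Trade_Deficit` is chosen for illustration.

```python
import pandas as pd

# Hypothetical annual totals in billion USD.
trade = pd.DataFrame(
    {
        "Year": [2019, 2020, 2021],
        "Export": [320.0, 280.0, 400.0],
        "Import": [480.0, 370.0, 570.0],
    }
).set_index("Year")

# Export minus import: negative values are deficits.
trade["Trade_Deficit"] = trade["Export"] - trade["Import"]

# Year with the most negative balance, i.e. the largest deficit.
worst_year = trade["Trade_Deficit"].idxmin()

print(trade)
print(worst_year)
```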

Figure 37: India’s trade deficit between 2010-2021

Figure 38: India’s trade deficit between 2010-2021 line chart

Figure 39: India’s trade deficit percentage wise 2010-2021

Figure 40 shows the percentage change in imports and exports. Significant increases in imports
occurred in 2011, 2018 and 2021. More or less, the export percentage-change line has followed
the import line.

Figure 40: India’s Import and Export percent change variations

Figure 41 depicts India’s top 5 trade surplus and top 5 trade deficit partners.

Figure 41: India’s top 5 Trade Deficit and Surplus Trade partner

9.12 TRADE CLASSIFICATION BASED ON HSCODE INDIA’S EXPORT
In this section we classify India’s exports based on HSCode. We employed K-Means clustering
to group the products into distinct classes. This helps us to identify the HSCodes which
contribute significantly towards India’s exports. The commodity-wise HSCode is plotted against
the value of goods in the scatter plot shown in Figure 42.

Figure 42: HScode based Scatter Plot

The next step is to find the number of clusters, which is done using the Elbow method. With the
Elbow method we can find the number of clusters required to label the classes distinctly. In our
case of the export scatter plot, the number of clusters is 3, as shown in Figure 43.
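The Elbow method runs K-Means for several values of k and records the within-cluster sum of squares (inertia) for each; the k after which the curve flattens is chosen. The sketch below is an assumption-laden illustration: it uses a small NumPy implementation of Lloyd’s algorithm on one-dimensional data (the helper `kmeans_1d` is hypothetical), not whichever library routine the project used, and the export values are invented.

```python
import numpy as np


def kmeans_1d(values, k, n_iter=100):
    """Lloyd's algorithm on 1-D data; returns (labels, centroids, inertia)."""
    x = np.asarray(values, dtype=float)
    # Deterministic start: k centroids spread evenly over the sorted values.
    start = np.linspace(0, len(x) - 1, k).astype(int)
    centroids = np.sort(x)[start]
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
        # Update step: each centroid moves to the mean of its points.
        updated = np.array([
            x[labels == j].mean() if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
    inertia = float(((x - centroids[labels]) ** 2).sum())
    return labels, centroids, inertia


# Hypothetical export values per HSCode in billion USD.
values = [5, 8, 12, 20, 25, 30, 110, 130, 320, 340]

# Elbow method: inertia (within-cluster sum of squares) for k = 1..5.
inertias = [kmeans_1d(values, k)[2] for k in range(1, 6)]
for k, inertia in zip(range(1, 6), inertias):
    print(k, round(inertia, 1))

# Class labels for the chosen number of clusters.
labels, centroids, _ = kmeans_1d(values, 3)
print(labels)
```

Plotting `inertias` against k gives the elbow curve; summing the values within each label gives the per-class totals.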

Figure 43: HScode K-Mean Elbow method for classification

With k=3 we get a distinct scatter plot with the HSCodes classified into three classes.
Class 0 has the maximum export value of 1400 billion USD. Class 1 and Class 2 stand at 1050
billion USD and 1190 billion USD respectively, as shown in Figure 45.

Figure 44: HScode K-Mean Classification

Figure 45: K-Mean classes based on HSCode

Figure 46 shows the class-wise distribution of HSCode sections. Figure 47 shows the
class-wise distribution of HSCode chapters.

Figure 46: K-Mean classes based on HSCode Number

Figure 47: K-Mean classes based on HSCode Chapter

Figure 48 and Figure 49 show the exact contents of each class. Class 0 notably contains the
largest number of HSCode sections/chapters. Class 1 contains two elements, which contribute
the maximum to exports, i.e. HSC: 25-27: Mineral Products and HSC: 71-71: Pearls,
Precious or Semi-Precious Stones, Metals.

Figure 48: K-Mean class 0 HSCode Section

9.13 TRADE CLASSIFICATION BASED ON HSCODE INDIA’S IMPORT
Import classification was studied in the same way as described for exports in Sub-Section 9.12,
and similar results were obtained. The import values in billion USD of Class 0, Class 1 and
Class 2 are depicted in Figure 52.

Figure 49: K-Mean class 1 and 2 HSCode Section

Figure 50: HScode based Scatter Plot Import

The next step is to find the number of clusters, which is done using the Elbow method. With the
Elbow method we can find the number of clusters required to label the classes distinctly. In our
case of the import scatter plot, the number of clusters is 3, as shown in Figure 51. With k=3
we get a distinct scatter plot with the HSCodes classified into three classes. Class 0 has the
maximum import value of 2400 billion USD. Class 1 and Class 2 stand at 1700 and 1250 billion
USD respectively, as shown in Figure 53.

Figure 51: HScode K-Mean Elbow method for classification Import

Figure 52: HScode K-Mean Classification Import value

Figure 53: K-Mean classes based on HSCode Import

Figure 54 shows the class-wise distribution of HSCode sections.

Figure 54: K-Mean classes based on HSCode Number Import

Figure 55 shows the class-wise distribution of HSCode chapters.

Figure 55: K-Mean classes based on HSCode Section Import

Figure 56 and Figure 57 show the exact contents of each class. Class 0 notably contains the
largest number of HSCode sections/chapters. Class 1 contains a single element, which
contributes the maximum to imports, i.e. HSC: 25-27: Mineral Products.

Figure 56: K-Mean class 0 HSCode Section Import

Figure 57: K-Mean class 1 and 2 HSCode Section Import

10 UNDERSTANDING TRADE AGREEMENTS
The last 25 years have seen a notable increase in Preferential Trade Agreements (PTAs). In
1990, there were 50 active and notified agreements to the WTO, a number that surged to 279
by the end of 2015 and 317 by 2019. This surge has sparked discussions on the reasons behind
PTAs, their impact on trade dynamics, economic growth, and the well-being of member and
non-member nations. Researchers and policymakers are also exploring their connection to the
overall framework of global trade governance.

The increase in PTAs has prompted inquiries. Questions include whether PTAs help mem-
ber nations strengthen trade policies and manage cross-border spillover effects, whether they
promote trade creation or diversion, and whether PTAs contribute positively to the develop-
ment of the multilateral trading system or pose obstacles. This study delves into the provisions
of trade agreements to unravel the intricate features of international trade agreements.

The principal objective of this study is to conduct a comprehensive analysis of the content
of Preferential Trade Agreements (PTAs) to contribute substantively to the discourse on their
rationale and impact. Utilizing the methodology established by Horn, Mavroidis, and Sapir
(2010), we compiled data on all PTAs in effect and notified to the WTO in 2015. The database
encompasses details on the inclusion of 52 policy areas and their legal enforceability within 279
PTAs involving 189 countries. This database stands out as one of the most extensive datasets
available, considering the number of trade agreements, participating countries, and covered pol-
icy areas.

The analysis of new PTA data reveals significant patterns. Notably, more than half of the
PTAs studied go beyond tariff reductions, incorporating legally enforceable regulations in areas
falling under the WTO mandate. These ”WTO plus” or ”WTO+” provisions cover customs
regulations, export taxes, anti-dumping measures, countervailing measures, technical barriers to
trade (TBT), sanitary and phytosanitary standards (SPS), among others. Additionally, a sub-
stantial portion of PTAs, over one-third, includes legally enforceable provisions in areas outside
the WTO mandate, termed ”WTO extra” or ”WTO-X,” spanning diverse policy areas from
investment to environmental laws and nuclear safety.

The study categorizes PTA provisions in distinct ways based on the specific research ques-
tion. This categorization includes ”core” versus ”non-core,” border versus non-border, and
preferential versus non-discriminatory provisions. Core provisions, identified as economically
significant in the literature, include all WTO+ provisions and four WTO-X areas (competition
policy, investment, movement of capital, and intellectual property rights protection). A key
finding of this research is that, alongside WTO+ provisions, these four core WTO-X policy
areas are frequently present in PTAs. Nearly 90 percent of agreements incorporate at least one
”core” WTO-X provision, and one-third of PTAs include all ”core” WTO-X provisions.

10.1 POLICY AREAS COVERED IN PREFERENTIAL TRADE AGREEMENTS
In 2010, Horn, Mavroidis, and Sapir (referred to as “HMS”) conducted a study on the contents of
preferential trade agreements. HMS developed a methodology to categorize the provisions within
PTAs and evaluate their legal enforceability. They specifically identified a set of 52 recurring
policy areas in PTAs, further classifying them into WTO plus or ”WTO+” and WTO extra
or ”WTO-X” provisions. According to HMS, WTO+ encompasses policy areas falling under
the current mandate of the WTO, while WTO-X pertains to obligations outside the WTO’s
mandate. Additionally, HMS classified provisions as legally enforceable if the legal language
is sufficiently clear, and the use of dispute settlement under the PTA has not been excluded.
Conversely, a provision with no reference to dispute settlement procedures under the Agreement
or with weak legal language is considered not legally enforceable.

In this research we studied the 52 policy areas, which are subsequently classified into two
groups: 14 WTO ”plus” or WTO+ and 38 WTO ”extra” or WTO-X areas. WTO+ provisions
within PTAs reaffirm existing commitments and, in certain instances, introduce additional
obligations. In contrast, WTO-X provisions pertain to policy areas that currently lack regulation
by the WTO. A policy area is deemed ”covered” by an agreement if it includes an article, chapter,
or provision that outlines some form of commitment in that particular field. Table 5 below
provides the categorization of WTO+ and WTO-X provisions.

WTO+ Provisions             WTO-X Provisions
Tariffs Industrial goods    Anti-corruption            Visa and Asylum
Tariffs agricultural goods  Competition policy         Environmental laws
Customs administration      IPR                        Investment measures
Export taxes                Labour market regulation   Movement of capital
SPS measures                Consumer protection        Data protection
State trading enterprises   Agriculture                Approximation of legislation
TBT measures                Audiovisual                Civil protection
Countervailing measures     Innovation policies        Cultural cooperation
Anti-dumping                Economic policy dialogue   Education and training
State aid                   Energy                     Financial assistance
Public procurement          Health                     Human Rights
TRIMS measures              Illegal immigration        Illicit drugs
GATS                        Industrial cooperation     Information society
TRIPS                       Mining                     Money laundering
                            Nuclear safety             Political dialogue
                            Public administration      Regional cooperation
                            Research and technology    SMEs
                            Social Matters             Statistics
                            Taxation                   Terrorism
Table 5: WTO+ and WTO-X Provisions

10.2 THE CONTENTS OF PTAs


The extensive set of provisions meticulously identified and coded in the dataset enables a de-
tailed analysis of the content within each Preferential Trade Agreement (PTA). These provisions
can be categorized to illuminate various aspects of the disciplines encompassed in PTAs. From
a legal standpoint, provisions can be classified into two main categories: those that are already
covered by the World Trade Organization (WTO+) and those that extend beyond the current
WTO mandate (WTO-X).

Additionally, some provisions hold greater economic significance than others, constituting a
set of "core" rules that play a crucial role in market access and the effective operation of global
value chains. These "core" provisions can be scrutinized through two distinct lenses. The first
lens categorizes core provisions based on whether they are implemented at the border or behind
the border. Alternatively, some core provisions exhibit intrinsic discriminatory (or preferential)
characteristics, while others cannot be applied bilaterally and are, therefore, applied on a non-
discriminatory (or Most Favored Nation, MFN) basis.

A substantial number of PTAs encompass policy areas falling within the existing mandate
of the WTO that go beyond mere tariff reductions, as shown in Figure 58. As of 2015, all active
PTAs include tariff reductions on manufactured goods. Over 200 PTAs incorporate provisions
related to customs, export taxes, and anti-dumping measures. Furthermore, all WTO+ provi-
sions, except for Trade-Related Investment Measures (TRIMS), are present in more than half
of the PTAs. Notably, customs, export taxes, anti-dumping measures, state aid provisions, and
countervailing measures are legally enforceable in more than 160 PTAs. Technical barriers to
trade (TBT) and sanitary and phytosanitary standards (SPS) are legally enforceable in 181 and
176 PTAs, respectively.

Figure 58: WTO+ Provisions

Only a select few WTO-X provisions, such as competition policy, movement of capital, in-
vestment rules, and intellectual property rights (IPR), are incorporated and legally enforceable
in a notable number of trade agreements, as illustrated in Figure 59. Competition policies, for
instance, are included in over 233 PTAs and are enforceable in 204 of them. The inclusion of
movement of capital and investment rules is observed in more than 150 PTAs, though the legal
enforceability varies. Nearly all provisions on the movement of capital are legally enforceable
in 172 PTAs, whereas fewer PTAs have legally enforceable investment provisions, totaling 125.
IPR is enforceable in 130 PTAs.

By contrast, all other WTO-X provisions are legally enforceable in fewer than a quarter of
PTAs. Among these, the policy areas that are most frequently covered and legally enforceable
include environmental laws, disciplines related to visa and asylum, and regulations pertaining
to labor markets.

Figure 59: WTO-X Provisions

Among the 52 provisions in our database, many go beyond trade-related matters. Identifying
"core" provisions involves subjective judgment. We classify as core provisions those within the
WTO mandate (WTO+ provisions) and four additional WTO-X provisions: competition policy,
investment, movement of capital, and intellectual property rights. These 18 provisions form a
fundamental set of rules that govern market access and ensure the smooth functioning of global
value chains.

From an economic theory perspective, not only are "core" provisions significant, but they are
also commonly incorporated into trade agreements. Figure 60 outlines the 18 "core" policy
areas, indicating the percentage of agreements covering them with and without legally enforceable
provisions. One-third of active PTAs include legally enforceable provisions covering all core
policy areas. Predictably, the most common provisions are tariff reductions on manufactured
and agricultural goods. Except for state trading enterprises (STE), public procurement,
intellectual property rights (IPR), investment, Trade-Related Investment Measures (TRIMS),
and movement of capital, all core provisions are included and legally enforceable in
at least half of the PTAs.

Certain provisions, such as customs procedures, export taxes, anti-dumping, and competi-
tion policies, hold particular significance, being legally enforceable in at least two-thirds of the
PTAs. Alongside WTO+ provisions, the four core WTO-X policy areas (competition policy,
investment, movement of capital, and intellectual property rights) are crucial components of
comprehensive PTAs.

Figure 60: Percentage coverage core provision in PTA

Figure 61 shows the percentage of core provisions covered across all agreements, along with
their legal enforceability.

Figure 61: Percentage coverage core provision legally in PTA

Core provisions in our data set can also be classified as border and behind-the-border
provisions, depending on whether the policy that the provision regulates is applied at the border
or not. Provisions on tariff reduction in manufacturing and agriculture, anti-dumping, counter-
vailing measures, TRIMS, TRIPS, customs, export taxes, SPS, TBT and movement of capital
are mostly border provisions. State enterprise, state aid, competition policy, IPR, investment,
public procurement and GATS are to a larger extent behind the border provisions.

The depth of PTAs increased thanks to the inclusion of a higher number of both border and
behind the border measures. Figure 62 and Figure 63 show the evolution of the average
number of border and behind the border provisions included in PTAs. The average number of
both border and behind the border provisions remained roughly constant until the end of the
1990s. Around 5 border and 2 behind the border provisions were included on average in PTAs
signed before 2000. These numbers steadily increased to 7 and almost 3 respectively in the last
fifteen years.

Figure 62: Average Number of Border Provisions in PTA

Figure 63: Average Number of Non-Border Provisions in PTA

An alternative way to categorize core provisions is based on whether they are preferential
or non-discriminatory. Preferential provisions apply exclusively to the countries that are party
to the PTA and include tariffs on manufacturing and agricultural goods, public procurement,
export taxes, anti-dumping measures, and countervailing measures. Conversely, other provi-
sions are primarily non-discriminatory in nature, implying that when included in a PTA, they
are presumed to be applicable on a Most Favored Nation (MFN) basis. Customs, SPS, TBT,
TRIMS, GATS, TRIPS, movement of capital, state-owned enterprises, state aid, competition
policy, IPR, and investment fall into the MFN provisions category.

Over time, PTAs have evolved to encompass more policy areas that are both MFN and
preferential. Figure 64 and Figure 65 illustrate a notable increase in the number of both types
of provisions, particularly after the year 2000. The average count of MFN provisions increased
from period 1980-1984 to 2010-2015, rising from 7 to over 11. A similar trend is observed for
legally enforceable MFN provisions. The count of preferential provisions remained relatively
stable around 6 for periods before 2000, experiencing a gradual increase in the last fifteen years
to slightly more than 11 provisions on average. The legal enforceability of preferential provisions
appears robust, following a pattern similar to expectations.

Figure 64: Average Number of MFN Provisions in PTA

Figure 65: Average Number of Preferential Provisions in PTA

10.3 PRINCIPAL COMPONENTS ANALYSIS (PCA) DEPTH OF PTA
Statistical methods can be used to analyze the content of PTAs and construct indexes of depth.
Principal component analysis (PCA) is one of these methods. We use PCA to reduce the
dimensionality of our data set from 14 (WTO+) and 52 (WTO+ and WTO-X) variables to
one index that accounts for as much of the variability in the data as possible. In essence, PCA
transforms the 14 WTO+ provisions and 52 WTO+ and WTO-X provisions into a set of
orthogonal variables called components. The first component is a weighted average of the
provisions that takes into account around 37 percent of the variation in the data. We then define
PCA depth as the weighted average of provisions using the coefficients of the first component
as weights $\omega_k$:

$$\mathrm{PCA\ Depth} = \sum_{k=1}^{52} \omega_k \,\mathrm{provision}_k \qquad (7)$$

The principal component analysis (PCA) is a procedure that uses an orthogonal transfor-
mation to convert a set of observations of possibly correlated variables into a set of values of
linearly uncorrelated variables called principal components. The transformation is defined such
that the first principal component has the largest possible variance and each succeeding
component has the highest variance conditional on being orthogonal to the preceding components. We
perform a PCA of the covariance matrix of 14 (WTO+) and 52 (WTO+, WTO-X) provisions
on a sample of 317 trade agreements. We decided to use the covariance matrix since all
variables are measured on the same scale. Figure 66 shows the share of variation in the data taken
into account by the components. The blue line indicates that the first component accounts for
around 37 percent of the variation in the data while the second component accounts for only 9
percent. The cumulative variation in red indicates that the first 4 components account for more
than 60 percent of the variation and the first 7 components account for 80 percent of the variation.
One advantage of PCA is that it reduces a large number of variables to a smaller subset of
orthogonal components. However, there is no well-defined objective methodology to select
the number of important components; therefore, we present the results of different selection
methods. An intuitive method for choosing the number of relevant components is to look at
the cumulative percentage of total variation. After defining a threshold, the smallest number of
principal components that reaches that threshold is retained. A threshold between 70 and
90 percent is usually chosen but there may be cases in which a lower threshold is more suitable.
Such threshold is generally smaller as the number of variables in the data set or the number
of observations increases. In our case, 80 percent of total variation would retain 7 components
while a cutoff of 50 percent would further shrink the number of components to 3. For our
purposes and given the large number of variables in the data set, we opt for an even smaller
threshold of one-third of the variation, which leaves us with only the first component.
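
The threshold rule and the depth index of Equation (7) can be sketched as follows. This is a minimal illustration, assuming scikit-learn and a randomly generated 317 x 52 binary matrix as a hypothetical stand-in for the coded provisions; the names `provisions`, `n_components_for` and `pca_depth` are illustrative and are not taken from the project code.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for the coded data: 317 agreements x 52 binary provisions.
rng = np.random.default_rng(0)
latent = rng.random((317, 1))  # a common "depth" factor per agreement
provisions = (rng.random((317, 52)) < 0.2 + 0.6 * latent).astype(float)

# scikit-learn's PCA centres the columns but does not rescale them,
# so this is a PCA of the covariance matrix (not the correlation matrix).
pca = PCA().fit(provisions)
share = pca.explained_variance_ratio_   # share of variation per component
cumulative = np.cumsum(share)           # cumulative share of variation


def n_components_for(threshold):
    """Smallest number of components whose cumulative share reaches `threshold`."""
    return int(np.searchsorted(cumulative, threshold) + 1)


# Loadings of the first component play the role of the weights omega_k.
weights = pca.components_[0]
if weights.sum() < 0:   # the sign of a component is arbitrary; orient it positively
    weights = -weights

# Equation (7): weighted sum of the provisions of each agreement.
pca_depth = provisions @ weights

print("components needed for 50% / 80%:", n_components_for(0.5), n_components_for(0.8))
print("share explained by first component:", round(float(share[0]), 3))
```

On the real data, `share[0]` would correspond to the roughly 37 percent reported above; the synthetic matrix will give a different figure.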

Figure 66: Finding Number of Components

Figure 67: PCA Depth of WTO+ Provisions

In our sample, PCA depth ranges from 0 to almost 5. Figure 67 shows the evolution of
PCA depth associated with each agreement over time. The vertical axis reports the depth of a
PTA, i.e., the value of the first component of the PCA. The horizontal axis reports the year in
which a PTA was signed. The linear fit in the figure clearly illustrates that PTAs signed more
recently tend to be deeper than early agreements. On average, PCA depth decreases from 3.5
in the 1990s to 2.85 in 2021.

The grouping in Figure 68 and Figure 69 was performed with three PCA components; the
k-means clustering of the features is shown in Figure 69.

Figure 68: PCA Depth based grouping

Figure 69: PCA Depth based k-Mean

Figure 70: Finding Number of PCA Components

A similar process was followed for all 52 WTO+ and WTO-X provisions, and the results
are depicted in Figure 71 and Figure 72.

Figure 71: PCA Depth of WTO+ and WTO-X Provisions

Figure 72: PCA Depth based grouping WTO+ and WTO-X Provisions

The 52 provisions of WTO+ and WTO-X are classified based on three PCA components.
The resulting clusters are shown in Figures 73, 74 and 75.
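
One way such a classification could be obtained is sketched below. The report does not state which representation of the provisions was clustered, so this sketch assumes that each provision is described by its loadings on the first three components and that k-means with three clusters is applied to those loadings. The data are randomly generated and all names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Hypothetical stand-in for the coded data: 317 agreements x 52 binary provisions.
rng = np.random.default_rng(0)
provisions = (rng.random((317, 52)) < 0.4).astype(float)

# Keep three components; each provision gets one loading per component.
pca = PCA(n_components=3).fit(provisions)
loadings = pca.components_.T            # shape (52, 3): one row per provision

# Assumption: provisions are grouped by k-means on their loadings.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(loadings)
labels = kmeans.labels_

# Indices of the provisions that fall into each class (0, 1, 2).
clusters = {c: np.flatnonzero(labels == c).tolist() for c in range(3)}
for c, members in clusters.items():
    print(f"class {c}: {len(members)} provisions")
```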

Figure 73: Class 0 Cluster Elements

Figure 74: Class 1 Cluster Elements

Figure 75: Class 2 Cluster Elements

11 INDIA’S TRADE TRAJECTORIES
In this section we examine the trends of macroeconomic variables such as GDP, imports, exports,
consumption, investment, trade deficit and government expenditure, and find the correlation
between them. In the following subsections we plot the trends of GDP, exports, imports,
expenditure, investment, consumption and trade deficit. We then train our model (refer to
Equation 1) on various independent variables, namely exports, imports, expenditure, investment,
consumption and trade deficit, and identify the model that gives the most appropriate
result. In the subsequent subsection we model the patterns in GDP, exports, imports, consumption,
government expenditure, etc. to predict their future values. Finally, the GDP values predicted
by the regression and ARIMA models for the years 2023 to 2027 are compared.

11.1 VISUALIZATION OF CORRELATION BETWEEN GDP AND OTHER ECONOMIC VARIABLES
The heatmap in Figure 76 shows the correlation between GDP and other economic
variables. The darker the blue color in a cell, the stronger the positive linear correlation
between the two variables. In our study, we tried to understand the correlation between
GDP and exports, imports, government expenditure, etc.
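
The correlation matrix behind such a heatmap can be computed as in the minimal sketch below. It assumes pandas and uses randomly generated, upward-trending series as a hypothetical stand-in for the project data; the column names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical annual series for 2000-2022 that share a common upward trend.
rng = np.random.default_rng(1)
years = np.arange(2000, 2023)
trend = np.linspace(1.0, 3.0, len(years))
df = pd.DataFrame(
    {
        "GDP": 500 * trend + rng.normal(0, 10, len(years)),
        "Exports": 100 * trend + rng.normal(0, 8, len(years)),
        "Imports": 120 * trend + rng.normal(0, 8, len(years)),
        "Govt Expenditure": 60 * trend + rng.normal(0, 5, len(years)),
    },
    index=years,
)

# Pearson correlation coefficient between every pair of variables.
corr = df.corr(method="pearson")

# Correlation of each variable with GDP, strongest first.
gdp_corr = corr["GDP"].drop("GDP").sort_values(ascending=False)
print(corr.round(3))
print(gdp_corr.round(3))
```

Series that share a trend are strongly correlated almost by construction, which is one reason the regression later shows high multicollinearity.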

11.2 VISUALIZATION OF THE TRENDS FOR GDP AND OTHER ECONOMIC VARIABLES

Figure 76: Pearson Correlation Coefficient for Economics Variable

The figures below show the trends of consumption, GDP, exports, imports, government
expenditure, and investment.

Figure 77: Trends Consumptions

Figure 78: Trends Expenditure

Figure 79: Trends Export

Figure 80: Trends Import

11.3 MODELLING

Figure 81: Trends Investment

After preparing the data sets, regression and time series ARIMA models were applied
for the project. In the first phase, the regression model was applied to analyze and understand
the correlation between GDP and several economic variables. In the second phase, we applied
the ARIMA model to analyze and forecast India’s trade trends for economic variables including
GDP and the other macroeconomic variables. In the last step, we compared the GDP forecast
obtained from the ARIMA model with the GDP value obtained by applying the regression model
to the forecast values of the independent variables.

11.4 REGRESSION
We employed a regression algorithm to assess the association between the dependent variable
(Gross Domestic Product or GDP) and various independent variables. This analysis aimed to
examine the relationship between GDP and key economic factors. The expenditure approach
was utilized to compute GDP, allowing us to investigate correlations among different factors in
GDP analysis.

The hypotheses tested are as follows:

(a) Null Hypothesis (H0): There is no significant relationship between the independent vari-
ables (Government expenditure, Consumption, Investment, Exports, and Imports) and
the dependent variable (GDP).
(b) Alternative Hypothesis (H1): There is a significant relationship between the independent
variables (Government expenditure, Consumption, Investment, Exports, and Imports)
and the dependent variable (GDP).
The steps below describe our regression analysis:
(a) Step 1: GDP is defined as the dependent variable, and all the remaining features are
independent variables.

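import pandas as pd

"""
1. REGRESSION ANALYSIS WITH ALL THE INDEPENDENT VARIABLES
"""
# Years covered by the historical data (x1) and by the forecast horizon (x2)
x1 = list(range(2000, 2023))
x2 = list(range(2023, 2028))
Final_exploitable_data_df = pd.read_excel('Final_exploitable_data.xlsx')
# Drop the year, the target and the derived columns; the remaining columns are the regressors
X_independent_variable = Final_exploitable_data_df.drop(
    ['year', ' GDP', '% Change GDP', 'Trade Deficit'], axis=1)
Y_dependent_variable = Final_exploitable_data_df[[' GDP']]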

(b) Step 2: We split the data set into training and testing sets using the train_test_split()
function to evaluate the performance of the regression model. The first set is the training
set, which is used to fit the model. The second set is the testing set, which is used to
evaluate the fitted model. The split used was 70% for training and 30% for testing. Because
shuffle=False, the rows keep their original order.
from sklearn.model_selection import train_test_split

"""
CREATING A TESTING AND TRAINING SET IN THE RATIO OF 30 : 70
"""
x_train, x_test, y_train, y_test = train_test_split(
    X_independent_variable, Y_dependent_variable, test_size=0.3, shuffle=False)

(c) Step 3: We built the regression model by fitting it on our training data set and
evaluating it on the testing data set.
from sklearn.linear_model import LinearRegression

"""
HERE WE BUILT REGRESSION MODEL
"""
# Fit on the training rows, then predict GDP for the held-out rows
linear_regression_model = LinearRegression()
linear_regression_model.fit(x_train, y_train)
y_predict = linear_regression_model.predict(x_test)

(d) Step 4: The Ordinary Least Squares (OLS) method was used to fit the multiple regression
on our data set and to estimate the parameters β of the linear models. We ran multiple
iterations of the regression analysis, changing the list of independent variables each time.
We start the analysis below by considering all independent variables first, then add and
remove independent variables one by one based on their p-values. The following are the OLS
results:
Regression between GDP and all independent variables (Consumption, Government,
Investment, Exports, and Imports). Figure 82 shows the result of applying the OLS
method to all independent variables. The regression formula is:

$$\mathrm{GDP} = \beta_0 + \beta_1 \cdot \mathrm{Government} + \beta_2 \cdot \mathrm{Consumption} + \beta_3 \cdot \mathrm{Investment} + \beta_4 \cdot \mathrm{Exports} + \beta_5 \cdot \mathrm{Imports} \qquad (8)$$

The coefficients β can be read from the Coefficient column in Figure 82. The R-squared
value is 100%, which means that 100% of the variation in the dependent variable (GDP) is
explained by the independent variables (Government expenditure, Consumption, Investment,
Exports and Imports).

Figure 82: OLS Results

Since the p-values are less than 0.055 for all variables except Investment, we rejected the
null hypothesis. In other words, the independent variables have a jointly statistically
significant relationship with the dependent variable (GDP). There is strong multicollinearity
among the independent variables. We checked the multicollinearity between all the variables
using the correlations calculated in subsection 11.1.

11.5 GDP RELATION WITH OTHER ECONOMICS VARIABLES

Figure 83: GDP Function of Export Import Consumption Investment Expenditure

Figure 84: GDP Function of Export

Figure 85: GDP Function of Import

Figure 86: GDP Function of Export and Import

Figure 87: GDP Function of Export Import Consumption

Figure 88: GDP Function of Export Import Investment

Figure 89: GDP Function of Export Import Govt. Expenditure

All the figures above, based on the OLS method, suggest that there is a close relationship between
GDP and other macroeconomic variables such as imports, exports, government expenditure and consumption.

11.6 ARIMA MODELLING FOR TIME SERIES FORECASTING


The goal of applying time series modeling and forecasting is to carefully collect and rigorously
study the past observations of a time series to develop an appropriate model which describes the
inherent structure of the series. This model is then used to generate future values for the series,
i.e., to make forecasts. Time series forecasting thus can be termed as the act of predicting the
future by understanding the past. Although time series forecasting is done here only for GDP,
the same approach may be replicated for any economic variable. Two ways of obtaining suitable
values of the parameters (p,d,q) are shown below:

(a) Step 1: Check whether our series is stationary. A series is stationary when its mean, variance
and autocovariance are constant over time. The Augmented Dickey-Fuller (ADF) statistical test
was applied to check whether India’s GDP is stationary and therefore whether it requires differencing.

from statsmodels.tsa.stattools import adfuller

# india_GDP_1960_2022: DataFrame with a GDP column (defined elsewhere in the project)
result = adfuller(india_GDP_1960_2022.GDP.dropna())
print('ADF Statistic: %f' % result[0])
print('p-value: %f' % result[1])

Figure 90: ADF Test

The p-value is 0.9999, which is greater than 0.05, so the null hypothesis of a unit root cannot
be rejected and the series is treated as non-stationary.

(b) Step 2: A more convenient way of finding (p,d,q) is to use a built-in library function, as
shown below:
import pmdarima as pm

model = pm.auto_arima(india_GDP_1960_2022.GDP, start_p=1, start_q=1,
                      test='adf',             # use the ADF test to find the optimal 'd'
                      max_p=5, max_q=5,       # maximum p and q
                      m=1,                    # frequency of series
                      d=None,                 # let the model determine 'd'
                      seasonal=False,         # no seasonality
                      start_P=0,
                      D=0,
                      trace=True,
                      error_action='ignore',
                      suppress_warnings=True,
                      stepwise=True)
print(model.summary())

The library function above selects the values of (p,d,q) by minimizing the AIC, which is an easy
way of finding (p,d,q).

The GDP forecast obtained with ARIMA is shown in Figure 92. Figure 93 compares the predictions
obtained with ARIMA and with regression. The future GDP values given by the two methods are comparable.

Figure 91: Finding p,d,q using library function

Figure 92: GDP Forecasting using ARIMA

Figure 93: GDP Forecasting using Regression and ARIMA

Figure 94: Import Forecasting using ARIMA

Figure 95: Export Forecasting using ARIMA

12 CONCLUSION
In this project, we employed various data analytics techniques to glean insights into India’s trade
dynamics. A fundamental comparative analysis was conducted, focusing on India and its trade
interactions with other nations. We pinpointed the primary imports and exports, identifying
both the top products traded by India and the leading countries from which India imports and
to which it exports.

In the realm of global partnerships, we observed a noteworthy pattern where certain
countries serve as key players in both exports and imports. However, a distinctive trend emerged:
a significant proportion of India’s top export destinations are Western countries and members
of the European Union (EU). Conversely, the majority of the top countries from
which India imports goods are situated in northern and eastern regions, primarily within Asia.

We utilized a regression model to analyze India’s GDP and its correlation with various
economic indicators, including Government Expenditure, Consumption, Gross Fixed Capital
Formation, and the values of exports and imports of goods and services. Our findings revealed
significant correlations among certain variables. To address potential issues arising from mul-
ticollinearity, we conducted tests to identify and exclude highly correlated variables from the
regression equation.

The conclusive analysis revealed a positive relationship between GDP and both exports and
imports. Furthermore, we employed the ARIMA model to forecast India’s imports, exports, and
GDP. Among the various ARIMA models tested, ARIMA (0,2,1) exhibited the minimum AIC
for GDP. Notably, the values of GDP derived from ARIMA closely aligned with those obtained
through regression analysis, indicating minimal differences between the two methodologies.

References
[1] G. Ahmed, A. S. Al-Gasaymeh, and T. Mehmood. The Global Financial Crisis and Inter-
national Trade. Asian Economic and Financial Review, 2017.
[2] D. A. Goyal and A. Vajid. An analysis of india’s trade intensity with uae. Journal of
Commerce and Trade, pages 27–31, 2018.
[3] J. Lu. Forecasting of u.s. total textiles and apparel exports to the world in next 10 years
(2015-2025). JTATM Journal of Textile and Apparel, Technology and Management, pages
1–8, 2015.
[4] M. M. Mukit. An econometric analysis of the macroeconomic determinants impact of gross
domestic product (gdp) in bangladesh. 2020.
[5] W. Pengliang. Time series analysis of china’s service trade based on arima model. In
International Conference on Economics, Finance and Statistics, pages 382–386. Atlantis
Press, 2017.

[6] Ž. V. Stanko Stanić. Analysis of macroeconomic factors effect to gross domestic product of
bosnia and herzegovina using the multiple linear regression models. pages 91–97, 2019.
