
CSE 1902 : Industrial Internship

The Global Academic Internship Programme (GAIP)


In association with the National University of Singapore and Hewlett Packard
Enterprise

Submitted by :
S Dhanush
20BCE1811
Introduction to the internship

The Global Academic Internship Programme with the National University of Singapore and Hewlett
Packard Enterprise gives a holistic experience at the NUS campus. In this internship we learnt the
concepts of Data Analytics using Machine Learning and Deep Learning and completed a research
project based on our learnings within the stipulated time.
My team and I came up with a project called "Economic Freedom Index Prediction". We presented
our project to the NUS and HPE faculty and were evaluated on our idea proposal
and presentation.
Introduction to the Project

In this project we used eight classification models to predict the classes assigned to different countries
based on their economic freedom index score. The dataset is publicly available from The Heritage
Foundation and consists of multiple features for different countries, along with their respective rankings, for the
period 2013-2022.

The Index of Economic Freedom is a helpful tool for a variety of audiences, including academics, policymakers,
journalists, students, teachers, and those in business and finance. It is an excellent objective tool for analysing 184
economies throughout the world and each country page is a resource for in-depth analysis of a country’s political
and economic developments.
FLOWCHART OF OUR PROJECT
How is the data pre-processed ?
a. Removing countries with no rank
b. Converting objects to float data type
c. Describing data statistically to visualise it better
d. Finding max and min score of index
e. Dividing entire dataset to continuous variable
f. Thresholding values
g. Label encoding on index scores
h. Dropping columns having country name and ID
i. One hot encoding on region values
j. Splitting data into x (input) and y (output)
k. Oversampling minority classes (SMOTE)
l. Dropping columns related to GDP
m. Chi square test
n. Removing outliers through boxplots
o. Correlation matrix
p. Keeping columns with correlation less than 0.6
q. Splitting data into train and test samples
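Steps f, g and i above (thresholding, label encoding and one hot encoding) can be sketched in plain Python as follows. This is a minimal illustration and not the project's actual code: the band edges in `BANDS` are an assumption based on the categories the Heritage Foundation publishes, and the sample rows and the function names `threshold`, `label_encode` and `one_hot` are hypothetical.

```python
# Illustrative sketch only: band edges and sample rows are assumptions,
# not values taken from the project.
BANDS = [  # (lower bound, class name), checked from highest to lowest
    (80.0, "Free"),
    (70.0, "Mostly Free"),
    (60.0, "Moderately Free"),
    (50.0, "Mostly Unfree"),
    (0.0, "Repressed"),
]


def threshold(score):
    """Map a continuous index score to a named class."""
    for lower, name in BANDS:
        if score >= lower:
            return name
    raise ValueError(f"score out of range: {score}")


def label_encode(values):
    """Give each distinct value an integer code, in sorted order."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def one_hot(values):
    """Build one 0/1 column per distinct value, in sorted order."""
    categories = sorted(set(values))
    matrix = [[1 if v == c else 0 for c in categories] for v in values]
    return matrix, categories


rows = [  # hypothetical (region, score) pairs
    ("Asia-Pacific", 84.4),
    ("Europe", 72.1),
    ("Americas", 48.0),
    ("Europe", 61.5),
]

classes = [threshold(score) for _, score in rows]
y, class_codes = label_encode(classes)
region_matrix, region_names = one_hot([region for region, _ in rows])
```

Label encoding suits the target because each class needs a single integer, while one hot encoding suits the region feature because regions have no natural order.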
What models have been applied ?

a. Logistic Regression
b. Decision Tree
c. XGBoost
d. Gradient Boost
e. Random Forest
f. SVM
g. Naive Bayes
h. ANN


Technology Used :

We have used the following technology to build our model:

- OpenCV
- Google Colaboratory
- Python
- Anaconda Navigator
- Keras
- Streamlit
- TensorFlow

Concept covered :

We have covered the following concepts in the model which we have built:

- Artificial neural network
- Data Visualisation
- Classification methods
- Model building
- Streamlit
Logistic Regression
● Estimates the probability of an event occurring.
● A logit transformation is applied on the traditional regression equation.
Accuracy : 81 %

Decision Tree
● It is a tree-structured classifier, where internal nodes represent the features of a dataset.
● Branches represent the decision rules and each leaf node represents the outcome.
Accuracy : 84 %
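The logit transformation mentioned for Logistic Regression can be sketched as below. This is a minimal binary example under stated assumptions: the weights, bias and feature values are hypothetical, and the project itself predicts more than two classes, which would need a multi-class extension not shown here.

```python
import math


def sigmoid(z):
    """Turn a linear score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log-odds of a probability; the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


def predict_proba(features, weights, bias):
    """Linear regression equation passed through the sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


weights = [0.8, -0.5]  # hypothetical coefficients
bias = 0.1             # hypothetical intercept
p = predict_proba([2.0, 1.0], weights, bias)

# Applying logit to the probability recovers the linear score,
# here 0.1 + 0.8 * 2.0 - 0.5 * 1.0 = 1.2.
linear_score = logit(p)
```

The model is linear in the log-odds, which is why the coefficients can be read as changes in log-odds per unit of each feature.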
Naive Bayes
● Probabilistic classifier.
● Based on probability models that incorporate strong independence assumptions.
● The independence assumptions often do not hold in reality.
● Therefore they are considered as naive.
Accuracy : 59 %

Support Vector Machines
● SVM works by mapping data to a high-dimensional feature space so that data points can be categorized.
● A separator between the categories is found, then the data are transformed in such a way that the separator could be drawn as a hyperplane.
Accuracy : 85 %
Gradient Boosting
● Gradient boosting is a type of machine learning boosting.
● It relies on the intuition that the best possible next model, when combined with previous models, minimizes the overall prediction error.
● The key idea is to set the target outcomes for this next model in order to minimize the error.
Accuracy : 86 %

Random Forest
● Contains a number of decision trees on various subsets of the given dataset.
● Takes the average to improve the predictive accuracy of that dataset.
Accuracy : 91 %
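The Gradient Boosting idea of setting the targets of each next model so that the combined error shrinks can be sketched as follows. This is a simplified illustration, assuming squared error on a single numeric feature with one-split trees (stumps), rather than the multi-class classifier used in the project. The data and the names `fit_stump` and `boost` are hypothetical.

```python
def fit_stump(xs, targets):
    """Fit a one-split regressor by trying every split point.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for split in sorted(set(xs))[1:]:
        left = [t for x, t in zip(xs, targets) if x < split]
        right = [t for x, t in zip(xs, targets) if x >= split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((t - left_mean) ** 2 for t in left)
        error += sum((t - right_mean) ** 2 for t in right)
        if best is None or error < best[0]:
            best = (error, split, left_mean, right_mean)
    _, split, left_mean, right_mean = best
    return lambda x: left_mean if x < split else right_mean


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Add stumps one at a time, each trained on the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    stumps = []

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    history = []  # mean squared error measured before each round
    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        history.append(sum(r * r for r in residuals) / len(ys))
        # The residuals are the targets for the next model.
        stumps.append(fit_stump(xs, residuals))
    return predict, history


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical feature values
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]  # hypothetical targets
predict, history = boost(xs, ys)
```

Each entry in `history` is no larger than the one before it, which reflects the point that every added model is chosen to reduce the remaining error.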
XGBoost
● Extreme Gradient Boosting
● It is a scalable, distributed gradient-boosted decision tree (GBDT) machine learning library.
● It provides parallel tree boosting and is the leading machine learning library for regression, classification, and ranking problems.
Accuracy : 92 %

ANN
● Uses layers of fully connected nodes to predict classes.
● Mimics the way nerve cells work in the human brain.
Accuracy : 64 %
Accuracy Comparison of Models

Best Performing Model : XGBOOST
Comparison of Precision, Recall and F-Score

Best Performing Model : XGBOOST
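The metrics used in this comparison can all be derived from a confusion matrix. The sketch below shows one way to compute them in plain Python. The counts in `confusion` are hypothetical and are not the project's results, and macro averaging is an assumption, since the slides do not say how the per-class scores were combined.

```python
def per_class_metrics(matrix):
    """Return (precision, recall, f1) per class.

    matrix[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n = len(matrix)
    results = []
    for k in range(n):
        true_positive = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column total
        actual_k = sum(matrix[k])                          # row total
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        results.append((precision, recall, f1))
    return results


def accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


confusion = [  # hypothetical counts for three classes
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

metrics = per_class_metrics(confusion)
macro_f1 = sum(f1 for _, _, f1 in metrics) / len(metrics)
overall_accuracy = accuracy(confusion)
```

Reporting precision and recall alongside accuracy matters when classes are imbalanced, because a model can score high accuracy while missing most of a minority class.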


Results for XGBoost

From comparing the 8 models based on their accuracy, precision, recall and f-score we come to the
conclusion that “XGBoost” is the best performing model.

Figure 1 : Confusion matrix
Figure 2 : F-Score
Figure 3 : Result of XGBoost


Future Enhancements

The 12 economic freedoms and accompanying historical data also provide a comprehensive set of principles and
facts for those who wish to understand the fundamentals of economic growth and prosperity.

We believe our work will be helpful for a variety of audiences including academics, policymakers, big and small
businesses and other entities. It can help organisations make better decisions when they have a stake in
outcomes that depend on a nation's Economic Freedom Index (EFI).
Conclusion

The degree of economic freedom present is influenced by numerous factors. No single statistic will be able to fully capture
all of them and their interrelations. The index presented here captures most of the important elements and provides a
reasonably good measure of cross-country differences in economic freedom. However, something as complex as economic
freedom is difficult to measure with precision. Thus, small differences between countries should be taken with a pinch of
salt.

In a nutshell, this internship has been an excellent and rewarding experience. I learnt a great deal
from my internship at the NUS. I was able to gain practical skills, work in a fantastic environment, and make
connections that will last a lifetime. I could not be more thankful.
Thank You !
