Submitted by :
S Dhanush
20BCE1811
Introduction to the internship
The Global Academic Internship Programme with the National University of Singapore and Hewlett
Packard Enterprise gives a holistic experience at the NUS campus. In this internship we learnt the
concepts of Data Analytics using Machine Learning and Deep Learning and completed a research
project based on our learnings within the stipulated time.
My team and I developed a project called "Economic Freedom Index Prediction". We presented
our project to the NUS and HPE faculty and were evaluated on our proposal and our
presentation.
Introduction to the Project
In this project we used 8-10 classification models to predict the class assigned to each country
based on its economic freedom index score. The dataset is publicly available from The Heritage
Foundation and contains multiple features for each country, along with its ranking, for the
period 2013-2022.
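As an illustration of what "classes based on the index score" can look like, here is a minimal sketch that maps an overall score to a category. It assumes the class labels follow the five bands The Heritage Foundation publishes for the index; the project may have derived its classes differently, and the names `BANDS` and `freedom_class` are hypothetical.

```python
# Category bands as published for the Index of Economic Freedom.
# Assumption: the project's class labels follow these bands.
BANDS = [
    (80.0, "Free"),
    (70.0, "Mostly Free"),
    (60.0, "Moderately Free"),
    (50.0, "Mostly Unfree"),
    (0.0, "Repressed"),
]


def freedom_class(score: float) -> str:
    """Map an overall index score (0-100) to its category label."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"score out of range: {score}")
    # Bands are ordered from highest to lowest, so the first match wins.
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the last band starts at 0")
```

Turning the score into a small set of labels is what makes the task a classification problem rather than a regression problem.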
The Index of Economic Freedom is a helpful tool for a variety of audiences, including academics, policymakers,
journalists, students, teachers, and those in business and finance. It is an excellent objective tool for analysing 184
economies throughout the world and each country page is a resource for in-depth analysis of a country’s political
and economic developments.
FLOWCHART OF OUR PROJECT
How is the data pre-processed?
a. Removing countries with no rank
b. Converting object columns to float data type
d. Finding max and min score of index
h. Dropping columns having country name and ID
i. One-hot encoding on region values
j. Splitting data into x (input) and y (output)
l. Dropping columns related to GDP
p. Keeping columns with correlation less than 0.6
q. Splitting data into train and test samples
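Three of the pre-processing steps listed above (one-hot encoding, the 0.6 correlation filter, and the train/test split) can be sketched in plain Python. This is an illustrative sketch, not the project's code. It assumes the correlation filter compares features pairwise and keeps a feature only if its absolute Pearson correlation with every feature already kept is below 0.6; the slides do not say whether the threshold was applied between features or against the target. All function names and the 20 % test fraction are hypothetical.

```python
import math
import random


def one_hot(values):
    """Return (categories, rows); each row has one 0/1 slot per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear relationship
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.6):
    """Keep a column only if |r| < threshold against every column kept so far.

    Assumption: the filter is pairwise between features (see lead-in).
    """
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

Because `drop_correlated` keeps the first column of each correlated group, the result depends on column order; a real pipeline would need to decide which member of a correlated pair to keep.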
What models have been applied ?
We have used the following technology to build our model:
We have covered the following concepts in the model which we have built:
81 % 84 %
Naive Bayes (accuracy: 59 %)
● Probabilistic classifier.
● Based on probability models that incorporate strong independence assumptions.
● The independence assumptions often do not hold in reality.
● Therefore the classifier is called naive.
Support Vector Machines (accuracy: 85 %)
● SVM works by mapping data to a high-dimensional feature space so that data points can be categorized.
● A separator between the categories is found, then the data are transformed in such a way that the
separator can be drawn as a hyperplane.
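The Naive Bayes points above can be made concrete with a small sketch. It assumes the Gaussian variant (each feature is modelled as normally distributed within a class); the slides do not say which variant the project used, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one row."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        # Independence assumption: per-feature log-likelihoods are summed.
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

The "naive" part is the inner loop: each feature contributes to the score on its own, as if no feature carried information about any other.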
Gradient Boosting (accuracy: 86 %)
● Gradient boosting is a type of machine learning boosting.
● It relies on the intuition that the best possible next model, when combined with previous models,
minimizes the overall prediction error.
● The key idea is to set the target outcomes for this next model in order to minimize the error.
Random Forest (accuracy: 91 %)
● Contains a number of decision trees trained on various subsets of the given dataset.
● Combines the predictions of the individual trees to improve predictive accuracy.
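The gradient boosting idea described above, where each new model is trained on what the previous models still get wrong, can be sketched as follows. This is a simplified illustration, not the project's classifier: it assumes a single numeric feature, a squared-error loss (so the target for each new model is the residual), and one-split decision stumps as weak learners. The function names and default parameters are hypothetical.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression stump that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((t - left_mean) ** 2 for t in left)
        error += sum((t - right_mean) ** 2 for t in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Only one distinct x value: fall back to a constant prediction.
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_gradient_boosting(xs, ys, n_rounds=20, learning_rate=0.5):
    """Return a predict(x) function built from stumps fitted to residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(ys)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual,
        # so the residuals become the target outcomes of the next model.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    return predict
```

The learning rate shrinks each stump's contribution, so the ensemble approaches the training targets gradually instead of in one step.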
XGBoost (accuracy: 92 %)
● Extreme Gradient Boosting.
● It is a scalable, distributed gradient-boosted decision tree (GBDT) machine learning library.
● It provides parallel tree boosting and is a leading machine learning library for regression,
classification, and ranking problems.
ANN (accuracy: 64 %)
● Uses layers of fully connected nodes to predict classes.
● Mimics the way nerve cells work in the human brain.
Accuracy Comparison of Models
XGBOOST
Comparison of Precision, Recall and F-Score
From comparing the 8 models based on their accuracy, precision, recall and f-score we come to the
conclusion that “XGBoost” is the best performing model.
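For reference, the four metrics used in this comparison can be computed from true and predicted labels as in the sketch below. It reports precision, recall and F-score per class; how the project averaged them across classes is not stated in the slides, so no averaging is shown. The function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f_score)} for every label seen."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against division by zero when a class is never predicted
        # or never occurs in the true labels.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_score = 2 * precision * recall / (precision + recall)
        else:
            f_score = 0.0
        metrics[label] = (precision, recall, f_score)
    return metrics
```

Looking at precision and recall per class matters here because a model can reach a high accuracy while still performing poorly on a class that has few countries in it.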
The 12 economic freedoms and accompanying historical data also provide a comprehensive set of principles and
facts for those who wish to understand the fundamentals of economic growth and prosperity.
We believe our work will be helpful for a variety of audiences including academics, policymakers, big and small
businesses and other entities. It can support better decision making in organisations whose outcomes depend
on a nation's Economic Freedom Index (EFI).
Conclusion
The degree of economic freedom present is influenced by numerous factors. No single statistic will be able to fully capture
all of them and their interrelations. The index presented here captures most of the important elements and provides a
reasonably good measure of cross-country differences in economic freedom. However, something as complex as economic
freedom is difficult to measure with precision. Thus, small differences between countries should be taken with a pinch of
salt.
In a nutshell, this internship has been an excellent and rewarding experience. I have learnt a great deal from my
internship at the NUS. I was able to gain practical skills, work in a fantastic environment, and make
connections that will last a lifetime. I could not be more thankful.
Thank You !