
Data Analysis for Managers MCO105-BCN18023

Correlation between Influencer Engagement Rates and GDP per capita

Valerie van der Linden & Shayantani Twisha

Professor Anthony Lawson

December 10, 2023


Table of Contents
Executive summary
Overview of problem to be investigated. Impact of the problem
Question to be investigated
Method
Correlational Analysis (Scatter Plot)
Regression Analysis
Limitations and constraints of the study
Recommendations to managers
Conclusion
Executive summary
The aim of this data analysis project is to investigate the potential correlation between
the mean average engagement of influencers in a country and the country’s GDP per
capita. A curated dataset comprising average engagement of influencers and GDP per
capita from 20 countries served as the foundation for our investigation. After conducting
a correlational analysis and a regression analysis, we found no significant positive
correlation between the variables. This key finding challenges the assumption of a linear
association between the economic prosperity of a country and the average engagement of
that country’s influencers, and with it the belief that influencer engagement can be
explained by a country’s GDP per capita. The landscape of influencer dynamics calls for
a more holistic approach, considering a spectrum of variables beyond economic
indicators.
Overview of problem to be investigated. Impact of the problem.
Looking at influencer engagement data from 2022, specifically on Instagram in 20
different countries, we wanted to investigate the relationship between average
engagement rates and the GDP Per Capita by country. The aim of this research is to
identify if there is a correlation between these two variables (average engagement and
GDP per capita) and see how this could influence or impact how a business could
identify which influencers to use for their marketing campaigns.
We hypothesized that there is a positive relationship between average engagement
rates of influencers and the GDP per capita in the country they reside in. In other words,
the engagement rates will be highest in countries with high GDP per capita numbers.
This hypothesis is based on knowledge that there are more Instagram users in
countries with higher GDP per capita.

Question to be investigated
Our initial question asked whether there is a correlation between influencer engagement on
Instagram and the GDP per capita growth rate in the influencers’ countries of residence. As
we began looking at GDP per capita growth rates in each country, we realized that the
comparison would be unsound because growth rates are measured over time, whereas the
engagement data we gathered covered only the year 2022. Therefore, it wouldn’t
make sense to compare these two variables. Additionally, the growth rates didn’t give
an accurate reflection of the size of the economies, which is the main variable we
wanted to investigate and compare to engagement rates. Therefore, we decided to
use the GDP per capita amounts for each country for the year 2022. This allowed us
to examine the relationship between the two variables over the same 1-year period.
Method
For this project, we collected Instagram influencer data for December 2022 from
Kaggle (https://www.kaggle.com/datasets/ramjasmaurya/top-1000-social-media-channels),
an online platform and community of data scientists and machine learning practitioners,
owned by Google LLC, that hosts free data sets across many fields
(www.kaggle.com, n.d.). We collected GDP per capita (in USD) and the GDP
per capita growth rate of each country from World Bank Open Data
(https://data.worldbank.org/), an open data catalog listing available World Bank
datasets, including databases, pre-formatted tables, reports, and
other resources (World Bank, 2023). We retrieved the mobile penetration
rate of each country from Statista (https://www.statista.com/markets/), a
German online platform that specializes in data gathering and visualization.
Statista also provides exclusive data collected through its own
team’s surveys and analysis (Statista, 2019). All data were retrieved on 7
November 2023.
After data collection, we cleaned and filtered the data based on the
requirements of the study. We excluded the countries that had only one influencer in
the data set. The data set also contained some anomalies; for
example, Cristiano Ronaldo and David Beckham were listed under India. We corrected
these anomalies to obtain a clean data set. We then calculated the mean of the
influencers’ average engagement for each country and compared it with the GDP per
capita of that country.
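The filtering and averaging steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the spreadsheet workflow actually used for the study: the records, the handle and country names, and the function name `mean_engagement_by_country` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (influencer handle, country, average engagement per post).
# These values are invented for illustration and are not taken from the Kaggle data.
records = [
    ("handle_a", "Country X", 120_000.0),
    ("handle_b", "Country X", 80_000.0),
    ("handle_c", "Country Y", 300_000.0),
    ("handle_d", "Country Z", 50_000.0),
    ("handle_e", "Country Z", 70_000.0),
    ("handle_f", "Country Z", 90_000.0),
]


def mean_engagement_by_country(rows, min_influencers=2):
    """Group rows by country, drop countries with fewer than
    `min_influencers` influencers, and return the mean engagement per country."""
    grouped = defaultdict(list)
    for _handle, country, engagement in rows:
        grouped[country].append(engagement)
    return {
        country: mean(values)
        for country, values in grouped.items()
        if len(values) >= min_influencers
    }


country_means = mean_engagement_by_country(records)
for country, value in sorted(country_means.items()):
    print(f"{country}: {value:,.0f}")
```

In this sketch, Country Y is dropped because it has only one influencer, which mirrors the exclusion rule described in the text.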
The analytical method employed in this project involves a mixed-methods approach,
combining both qualitative and quantitative data analysis. The primary focus is on
correlational, regression and descriptive analysis to examine the relationship between
the mean average engagement of the influencers of each country and the country’s
GDP per capita. We used statistical techniques to determine the correlation and
performed regression analysis to explore the strength and character of the relationship
between the dependent variable (the mean average engagement of the influencers of
each country) and the independent variable (the GDP per capita of those countries).
Lastly, we used data visualization (scatter plots) to present our findings graphically.
Purpose: Correlational and Descriptive
Approach: Deductive
Methodologies: Mixed-methods
Sources of Secondary Data:
1. Kaggle: https://www.kaggle.com/datasets/ramjasmaurya/top-1000-social-media-channels
2. World Bank: https://data.worldbank.org/
3. Statista: https://www.statista.com/markets/
Defining Variables
We have both categorical and numerical or quantitative variables for this project. “Data
that can be grouped by specific categories are referred to as categorical data.
Categorical data use either nominal or ordinal scale of measurement. Data that use
numeric values to indicate how much or how many are referred to as quantitative data”
(Anderson, Sweeney and Williams, 2011).
These defining variables encompass both categorical and quantitative aspects,
providing a comprehensive framework for the analysis of influencer dynamics and their
correlation with economic factors of a country.
Categorical variables:
- Influencer Name: The real name of the Instagram influencer included in the
dataset. This variable facilitates the identification and differentiation of
individual influencers.
- Influencer Handle on Instagram: The Instagram username of each influencer,
serving as their digital identity on the platform, which enables the linkage of
influencers to their Instagram profiles for additional context and verification.
- Influencer category: The categorical classification of influencers based on their
content or niche (example: fashion, travel, lifestyle). This variable provides
insights into the thematic focus of influencers, aiding in the grouping and analysis
of similar profiles.
- Country: The categorical variable indicating the country of origin or residence of
each influencer. This allows for the segmentation of influencers based on their
geographical location, contributing to the main analysis of the project.
Numerical variables:
- Follower Count: The numerical representation of the total number of followers
an influencer has on Instagram. This quantifies the reach and popularity of each
influencer.
- Authentic Engagement: The numerical measure of authentic interactions (likes,
comments, shares, saves) received by an influencer per post, which reflects the
quality of the influencer’s audience interaction.
- Average Engagement: The numerical average of engagement metrics (likes,
comments, shares, saves) per post, providing an overall engagement rate. This
offers a consolidated view of the influencer’s audience engagement, contributing
to the main analysis of the project.
- GDP Per Capita (in USD): The numerical measure of a country’s economic
output per person. It serves as a proxy for average income, representing the economic
context of the influencers’ home countries and their audience’s purchasing power.
- GDP Per Capita growth (%): The numerical percentage change in GDP per
capita over a specific period, reflecting the economic growth and dynamism of
the influencers’ home countries.
- Mobile penetration rate (%): The numerical percentage of the population with
mobile phone access in each country, which reflects the extent of mobile
technology adoption and influences the influencers’ reach and audience
accessibility.

Results
Correlational Analysis (Scatter Plot):
Upon conducting a correlational analysis between the mean average engagement of the
influencers (Y-axis) and the GDP per capita of the selected countries (X-axis), the
results indicate a lack of significant correlation between these two variables. The
absence of a significant correlation indicates that variations in the mean average
engagement of influencers are not linearly associated with the economic prosperity of
the countries in question. In practical terms, the level of economic development, as
represented by GDP per capita, does not reliably predict or explain the observed
differences in influencer engagement across these countries. This finding suggests that
influencer engagement dynamics are largely independent of the economic context of a
country. Other factors, such as cultural nuances, industry relevance, and audience
demographics, may play more significant roles in influencing engagement.

           Column 1   Column 2   Column 3
Column 1   1
Column 2   -0.10102   1
Column 3   -0.05238   -0.15615   1
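Each off-diagonal entry in a correlation matrix like the one above is a Pearson correlation coefficient between two columns. A minimal sketch of that calculation follows; the function name `pearson_r` and the five sample values are hypothetical and are not the study’s 20-country data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical values, invented for illustration only.
gdp_per_capita = [2_000.0, 10_000.0, 30_000.0, 50_000.0, 80_000.0]
mean_engagement = [900_000.0, 400_000.0, 1_200_000.0, 300_000.0, 700_000.0]

r = pearson_r(gdp_per_capita, mean_engagement)
print(f"r = {r:.4f}")
```

A value near 0, as reported in the matrix, means that knowing one variable says almost nothing about the other in a linear sense; values near +1 or -1 would indicate a strong linear relationship.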
Regression Analysis
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.052379761
R Square             0.002743639
Adjusted R Square    -0.052659492
Standard Error       654790.9173
Observations         20

ANOVA
             df   SS            MS         F          Significance F
Regression   1    21232347542   2.12E+10   0.049521   0.826403773
Residual     18   7.71752E+12   4.29E+11
Total        19   7.73875E+12

               Coefficients   Standard Error   t Stat     P-value    Lower 95%      Upper 95%     Lower 95.0%    Upper 95.0%
Intercept      889150.9097    200220.5983      4.440856   0.000316   468503.0419    1309798.778   468503.0419    1309798.778
X Variable 1   -1.633461024   7.340276432      -0.22253   0.826404   -17.05480956   13.78788751   -17.05480956   13.78788751

The regression analysis was conducted to examine the strength of the relationship
between the mean average engagement of influencers (Y-axis) and the GDP per capita
of the selected countries (X-axis). Aside from predicting the values of the dependent
variable (mean average engagement of influencers), regression analysis also enables
the identification of the nature of the mathematical relationship between a dependent
variable and an independent variable (GDP per capita of the countries). It allows
quantification of the impact that variations in the independent variable exert on the
dependent variable and facilitates the identification of anomalous observations
(Berenson, Levine and Krehbiel, 2010).
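To make the idea concrete, a simple linear regression with one independent variable can be fitted by ordinary least squares as sketched below. The function names `fit_line` and `residuals` and the sample points are hypothetical; the study itself relied on a spreadsheet regression tool, so this only illustrates the same calculation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.
    Returns (slope, intercept)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Observed minus predicted values; unusually large residuals
    point to anomalous observations."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical points that lie exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
print(residuals(xs, ys, slope, intercept))
```

The slope quantifies how much the dependent variable changes per unit change in the independent variable, and the residuals are what an analyst would inspect to spot outlying countries.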
The Multiple R value, which for a single predictor is the absolute value of the
correlation between the dependent and independent variables, is 0.0524. Because
Multiple R is always reported as non-negative, the direction of the relationship comes
from the slope coefficient, which is negative (-1.63), matching the -0.0524 entry in the
correlation matrix. The relationship between mean average engagement and GDP per
capita is therefore very weak and slightly negative. The R Square value, which
denotes the proportion of the variance in the dependent variable explained by the
independent variable, is 0.00274. This indicates that only a very small fraction (less than
1%) of the variability in mean average engagement can be attributed to variations in
GDP per capita. The Adjusted R Square value, which adjusts the R Square value
based on the number of predictors in the model, is -0.0527. A negative Adjusted R
Square suggests that the independent variable does not effectively contribute to
explaining the variability in the dependent variable.
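The reported statistics can be cross-checked against one another using the sums of squares, coefficient, and standard error from the regression output. The sketch below uses the standard formulas for a single-predictor regression; the input numbers are copied from the output, the variable names are our own, and small differences arise because the sums of squares are rounded in the output.

```python
# Values copied from the regression output (sums of squares are rounded there).
n = 20                    # observations
k = 1                     # number of predictors
ss_regression = 21_232_347_542.0
ss_residual = 7.71752e12
ss_total = 7.73875e12
slope = -1.633461024      # coefficient of X Variable 1
slope_se = 7.340276432    # its standard error

# R Square: share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R Square: penalizes R Square for the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# F statistic: mean square regression over mean square residual.
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

# t statistic for the slope; with one predictor, t squared equals F.
t_stat = slope / slope_se

# Multiple R is non-negative; the sign of the correlation follows the slope.
multiple_r = r_squared ** 0.5
signed_r = -multiple_r if slope < 0 else multiple_r

print(f"R Square          = {r_squared:.6f}")
print(f"Adjusted R Square = {adjusted_r_squared:.6f}")
print(f"F                 = {f_stat:.6f}")
print(f"t Stat (slope)    = {t_stat:.5f}")
print(f"signed r          = {signed_r:.5f}")
```

With only 20 observations and one predictor, the adjustment factor (n - 1) / (n - k - 1) is 19/18, which is enough to push an R Square this close to zero below zero.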
The result of the regression analysis suggests a negligible and practically insignificant
relationship between the mean average engagement of influencers and the GDP per
capita of the selected countries. The low R Square and Adjusted R Square values
indicate that the variability in mean average engagement is not well explained by
variations in GDP per capita. The Significance F and the p-value of the slope (both
0.826) are far above the conventional 0.05 threshold, and the 95% confidence interval
for the slope (-17.05 to 13.79) includes zero, so the slope cannot be distinguished from
zero. In summary, the regression analysis does not support the
presence of a meaningful linear relationship between the dependent and independent
variables.
Limitations and constraints of the study
The first limitation of our study was that we could not collect influencer data
for all 195 countries of the world. This is simply because the data set we retrieved did
not include all the countries of the world and was more focused on listing the top
influencers of the world, regardless of where they are from. Additionally, it is unclear
whether every country in the world has influencers.
The second limitation was that an in-depth analysis could not be done because for
some countries there is a mention of only one influencer. Hence, we have excluded the
countries with just one influencer. We felt that including these countries may not show
an accurate representation of the relationship given that we were identifying the
“average” engagement numbers for each country.
The third limitation, which builds off the second limitation, is that the number of
influencers in some countries is much higher than in other countries. This indicates an
uneven sample size for different countries. Using the average rather than each
individual engagement number can help mitigate the impact of this limitation; however,
it is still worth mentioning when analyzing the results.
The final limitation has to do with the lack of a time series. Since the use of influencers
is new in the marketing field, there is no way to look at engagement rates over a time
period greater than 10 years. Thus, in the future, it will be interesting to revisit the study
of the relationship when sufficient data is available.
Recommendations to managers
After a thorough analysis, it has been observed that there is no significant correlation
between the mean average engagement of influencers in a country and the GDP per
capita of that country. This insight has implications for businesses aiming to leverage
influencer marketing for economic growth and brand promotion. Here are the key
recommendations based on the findings:
1. Engagement rate and reach of the influencers do not depend on a country’s
economy, so when choosing influencers to promote the brand, the company
does not need to take into account how developed a country is. The analysis has
revealed that the engagement rate (likes, comments, etc.) is not significantly
influenced by the economic development of a country. In other words, the level of
economic development, as measured by GDP per capita, does not necessarily
correlate with the level of audience engagement or the extent of an influencer’s
reach. This finding suggests that the effectiveness of influencer marketing
campaigns is not inherently tied to the economic prosperity of the influencer’s
home country. Therefore, businesses need not prioritize influencers based on the
economic development of the region they operate in. While selecting influencers,
businesses can broaden their scope and consider influencers from various
countries, irrespective of their economic status. Focus on influencers with high
engagement rates and broad reach, as these metrics are more indicative of the
influencer’s ability to connect with and impact their audience.
2. Companies should focus more on the influencer’s discipline or category rather
than the indicated purchase power of the people of a certain region. The analysis
underscores that the economic affluence of a region, as measured by GDP per
capita, does not necessarily translate into higher influencer engagement or
effectiveness. Therefore, instead of emphasizing the purchase power of a region,
businesses are encouraged to prioritize the discipline or category of the
influencer’s content. Influencers who align with the brand’s niche or industry are
more likely to resonate with their audience, leading to more authentic and
impactful engagements, irrespective of the economic development level of the
region. Businesses should shift their focus from targeting influencers based on
the perceived purchasing power of their audience, which may vary across
regions. Instead, concentrate on influencers whose content aligns with the
brand’s objectives and resonates with the target audience, fostering stronger
connections and potentially driving better conversion rates.
These recommendations emphasize a shift from a region-centric approach to a content-
centric one in influencer marketing. By focusing on engagement metrics, reach, and the
alignment of influencer content with the brand’s industry, businesses can make more
strategic and impactful decisions in their influencer marketing campaigns.
Conclusion
In conclusion, we were surprised to see that there is no clear relationship between
engagement rates and GDP per capita. Our hypothesis stated that there would be a
positive relationship between the two variables and that countries with the highest GDP
per capita would also have the highest average engagement rates among their influencers.
The results tell us that when a company is looking to use influencers for its
marketing strategies, it should look at the engagement rates of individual influencers
rather than the GDP per capita of their countries. The good thing about Instagram is that
a user in China can follow an influencer in the Netherlands, for example. It’s possible for
influencers to see where the majority of their followers live, so companies can use this
to their advantage.
Reference list

Anderson, D.R., Sweeney, D.J. and Williams, T.A. (2011). Essentials of Modern
Business Statistics with Microsoft Excel. Cengage Learning.

Berenson, M.L., Levine, D.M. and Krehbiel, T.C. (2010). Basic Business Statistics.
Prentice Hall.

Statista (2019). Statista - The Statistics Portal. [online] Available at:
https://www.statista.com/markets/.

World Bank (2023). World bank open data. [online] World Bank. Available at:
https://data.worldbank.org/.

www.kaggle.com (n.d.). Social Media Influencers. [online] Available at:
https://www.kaggle.com/datasets/ramjasmaurya/top-1000-social-media-channels.
