Lukas Foutz - w1516 Statistics Project - 5375624
Directions: You will be working on your own to complete the following statistics project. You can
choose a topic of your interest. See examples below or you can choose a topic of your own (MUST
email Mrs. Fletcher to get approval first). Be sure ALL work is shown, sources are cited (if you
collect data from another source), pictures of graphs are included, and explanations are clear,
concise, and grammatically correct. This will count as a summative assessment. Please refer to the
rubric to see how this project will be graded.
Examples of Topics
1. NBA players’ height vs their weight.
2. NBA players’ points per game (PPG) vs number of years in the league.
3. Number of games a Quarterback wins vs his NFL salary.
4. Salary vs batting average for MLB players.
5. Hours of sleep the night before a test vs test grade.
6. Followers a singer has on Twitter vs number of Spotify downloads they have.
7. Amount a person exercises vs their body fat (BMI).
8. Number of people enlisting in the military vs the nation’s unemployment rate.
9. Years of schooling after High School vs Average salary.
10. Family size (# of immediate family members) vs GPA
11. Family size (# of immediate family members) vs travel experience (outside of US)
12. GPA vs travel experience (outside of US)
Is there a correlation between an artist's number of monthly Spotify listeners (in millions) and
their number of YouTube subscribers (in millions)?
2. Why did you choose this topic? What specific result do you expect to find? What type of
correlation do you expect to see (positive or negative) and how strong do you expect the
correlation to be (weak, moderate, or strong)? Why do you expect this result (your rationale)?
Explain.
I chose this topic because I find it interesting to compare how many people listen to an artist
on Spotify with how many people watch their videos on YouTube. I expect to find a moderate
positive correlation. I expect this result because people who listen to an artist on Spotify
are likely to go to YouTube and subscribe to the artist's channel.
I’m going to collect my data from a website. I will find the 100 most-listened-to artists on
Spotify, then put those artists on a wheel to spin. I will pick 25 of those artists this way,
then look at each artist's YouTube account and record how many subscribers it has.
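The wheel-spin step amounts to drawing a simple random sample of 25 artists from the top 100 without repeats. A minimal Python sketch of that idea follows; the function name `pick_artists` and the placeholder artist names are illustrative assumptions, not part of the project.

```python
import random


def pick_artists(top_artists, k=25, seed=None):
    """Return k distinct artists chosen uniformly at random (no repeats)."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(top_artists, k)


# Placeholder names stand in for the real top-100 list (hypothetical data).
top_100 = [f"Artist {i}" for i in range(1, 101)]
chosen = pick_artists(top_100, k=25, seed=1)
```

Sampling without replacement mirrors removing an artist from the wheel once it has been picked.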
2. Create a table to organize your data. Be sure to label your table appropriately.
Artist    Monthly Spotify listeners (millions)    YouTube subscribers (millions)
Maluma    30.3                                    28
The scatter plot shows a weak positive correlation. This means that artists with more Spotify
listeners tend to have more YouTube subscribers.
3. Use google sheets to calculate the correlation coefficient (r). What do you get? Does the value
of r reinforce the impression conveyed by the scatter plot? Please include an interpretation
of its meaning.
The correlation coefficient is r = 0.710. This reinforces the scatter plot: the points follow the
trend more closely at the low end of the graph and spread out more at the high end. An r of 0.710
indicates a moderate positive correlation. A perfect positive correlation would be 1, and no
correlation would be 0. A perfect negative correlation would be -1, which means that when one
variable increases, the other variable decreases.
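For readers who want to see what the spreadsheet does behind the scenes, the correlation coefficient can be sketched in Python as below. Google Sheets' `CORREL` function computes the same quantity. The function name `pearson_r` and the toy numbers are assumptions for illustration only; they are not the project's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / math.sqrt(sxx * syy)


# Toy values (made up, not the artists' data)
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
toy_r = pearson_r(toy_x, toy_y)  # about 0.77
```

The result always falls between -1 and 1, which is why those two values mark the perfect negative and perfect positive cases.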
4. Graph the regression line and find the equation of the regression line using google sheets.
5. Discuss the slope of the regression line and interpret its meaning in terms of your context.
The slope is 0.706. This means that for every additional 1 million monthly listeners on Spotify,
the predicted number of YouTube subscribers increases by about 0.706 million (706,000).
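As a rough illustration of where a regression slope comes from, the least-squares line can be computed by hand as sketched below. Google Sheets offers `SLOPE` and `INTERCEPT` functions for the same job. The function name `fit_line` and the toy numbers are assumptions for illustration; they are not the project's data.

```python
def fit_line(xs, ys):
    """Least-squares regression line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x-values are equal")
    slope = sxy / sxx                    # change in y per one-unit change in x
    intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)
    return slope, intercept


# Toy values (made up, not the artists' data)
toy_slope, toy_intercept = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

The slope is always read as "change in y per one unit of x", so with x in millions of listeners and y in millions of subscribers it is a number of subscribers gained per extra million listeners.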
6. Use the regression line’s equation to make a prediction in terms of your variables or context
about a y-value given a specific x-value (be sure to pick an x-value that makes sense).
a. What do you want to predict? Explain using your variables or context. Example: I want
to predict how many surfers we can expect at the beach with waves that are 2 feet high.
I want to predict the number of YouTube subscribers an artist would have if they had
100 million monthly listeners on Spotify.
b. What x-value are you going to use? Why did you choose this x-value?
For my x-value I’m going to use 100. Because the x-values are in millions, x = 100 lets me
find the number of subscribers for an artist with 100 million listeners.
Regression Equation:
d. Explain the results of your prediction. What is the purpose of finding this prediction?
How would this information be beneficial? Who would benefit from these findings?
So if an artist has 100,000,000 monthly listeners, the regression line predicts about
68,770,000 YouTube subscribers. The purpose of this prediction is to estimate how many
YouTube subscribers an artist with 100,000,000 monthly listeners would have. This is
beneficial because it shows whether an artist has more or fewer subscribers than
expected for their number of listeners. Justin Bieber has 66.6 million subscribers,
almost as many as this prediction, but he only has 88.6 million monthly listeners.
This means he has more subscribers than expected for his listener count.
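The prediction and the above-or-below-average comparison can be sketched in Python as follows. The slope 0.706 and the prediction of 68.77 at x = 100 come from the project; the intercept is NOT given in the text and is back-calculated here from those two numbers as an assumption, so the spreadsheet's actual value may differ slightly because of rounding. The function names are hypothetical.

```python
SLOPE = 0.706  # from the project (millions of subscribers per million listeners)

# ASSUMPTION: intercept inferred from the stated prediction (68.77 at x = 100),
# not taken from the project's regression equation. Works out to about -1.83.
INTERCEPT = 68.77 - SLOPE * 100


def predict_subscribers(listeners_millions):
    """Predicted YouTube subscribers (millions) for a given listener count (millions)."""
    return SLOPE * listeners_millions + INTERCEPT


def residual(listeners_millions, actual_subscribers_millions):
    """Actual minus predicted subscribers; positive means above the regression line."""
    return actual_subscribers_millions - predict_subscribers(listeners_millions)


prediction_at_100 = predict_subscribers(100)

# Justin Bieber's figures as reported in the project: 88.6 million listeners,
# 66.6 million subscribers.
bieber_residual = residual(88.6, 66.6)
```

Under that inferred intercept, the residual for Justin Bieber comes out positive, which matches the statement that he has more subscribers than the line predicts for his listener count.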
Yes. It does. I believed it would be a moderate positive correlation, and it was. As more
people listen to an artist on Spotify, more people will like the artist. The artist might
have a music video for a song, which could lead listeners to watch the video and subscribe to
the artist's channel on YouTube.
2. What conclusions, if any, do you believe you can draw as a result of your study? Were there
any surprises in the data collected? If the results were not expected, what factors might
explain your results? What did you learn about your variables? (see rubric for possible level
4 expectations)
There was a surprise. One of the outliers (Diplo) had 29 million monthly listeners on
Spotify but only 2.4 million subscribers. So he had a high number of listeners,
but not a high number of subscribers. Another outlier was Justin Bieber, who was very high
in both. He had 88.6 million monthly listeners and 66.6 million subscribers. These are well
above average, which suggests he is one of the most popular artists in my sample.
3. Connect your findings to the real world. Who would benefit from the results of your study?
Explain. (see rubric for possible level 4 expectations)
This would benefit people who want to be in the music industry because they could look at
their number of monthly listeners on Spotify and estimate how many YouTube subscribers to
expect. Another benefit would be seeing how artists compare to each other. You could also
listen to the most popular artists to see what kind of music they make and whether it is
similar. If a certain kind of music is popular among the top artists, then you might want to
make that kind of music to get more Spotify listeners.
3 INTRODUCTION
❏ Topic is described as a research question
❏ Variables are defined
❏ Predictions are described/explained
DATA
❏ Data collection process is explained
❏ Appropriate amount of data is collected
❏ Table of data is included with appropriate labels
ANALYSIS
❏ Scatter plot is included with appropriate labels
❏ Correlation is described and explained
❏ Correlation coefficient is calculated and its interpretation is included
❏ Graph of scatter plot is included with regression line AND equation of regression line displayed
❏ Slope of regression line equation is discussed and interpreted in terms of context
❏ Prediction is made with equation of regression line with work shown and explanation of results
RESULTS/SUMMARY
❏ Results of study are interpreted, explained, and related back to original research question
❏ Conclusion is explained
❏ Connections are made within the context of the topic
0 Project is not turned in OR part (or all) of the project is copied from another student