Example 4
Numerical and graphical summaries
Worldwide Internet and Facebook Use
Picture the Scenario
The number of worldwide Internet users and the number of users of social
networking sites such as Facebook have grown significantly over the past
decade. This growth, though, has not been distributed evenly throughout
the world. Countries such as Australia, Sweden, and the Netherlands have
achieved an Internet penetration of more than 80%, while only 7.1% of
India’s population uses the Internet. The story with Facebook is similar.
More than 40% of the populations of countries such as the United States and
Australia use Facebook, compared to fewer than 3% of the populations of
countries such as China, India, and Russia.
The Internet Use data file on the text CD contains recent data for 33
countries on Internet penetration, Facebook penetration, broadband
subscription percentage, and other variables related to Internet use. In this
example, we’ll investigate the relationship between Internet penetration and
Facebook penetration. Note that we will often say “use” instead of
“penetration” in these two variable names. Table 3.4 displays the values of these two
variables for each of the 33 countries.
100 Chapter 3 Association: Contingency, Correlation, and Regression
UK 70.18% 45.97%
Question to Explore
Use numerical and graphical summaries to describe the shape, center, and
variability of the distributions of Internet penetration and Facebook penetration.
Think It Through
Using MINITAB, we obtain the following numerical measures of center and
variability:
Figure 3.4 portrays the distributions using histograms. We observe that the
distribution of Internet use is unimodal and skewed to the left. The distribution of
Facebook use can be characterized as roughly symmetric with multiple modes.
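The kind of numerical summary described above can be sketched in Python. This is only an illustration under stated assumptions: it uses the three countries whose values are quoted in this section (the United States and Japan as rounded in the text, the UK from the surviving row of Table 3.4), not the full 33-country data file, so its results will not match the MINITAB output. The function names `summarize` and `skew_hint` are hypothetical, and comparing the mean with the median is only a rough hint about skew, not a substitute for looking at the histogram.

```python
from statistics import mean, median, stdev

# Only three countries have values quoted in the text; the full
# 33-country Table 3.4 is NOT reproduced here.
internet_use = {"United States": 77.0, "Japan": 74.0, "UK": 70.18}
facebook_use = {"United States": 47.0, "Japan": 2.0, "UK": 45.97}


def summarize(values):
    """Return measures of center and variability for a list of percentages."""
    ordered = sorted(values)
    return {
        "n": len(ordered),
        "min": ordered[0],
        "median": median(ordered),
        "mean": mean(ordered),
        "max": ordered[-1],
        "stdev": stdev(ordered),  # sample standard deviation (n - 1 divisor)
    }


def skew_hint(summary):
    """Rough shape hint: a mean below the median suggests a longer left tail."""
    if summary["mean"] < summary["median"]:
        return "mean < median (suggests left skew)"
    if summary["mean"] > summary["median"]:
        return "mean > median (suggests right skew)"
    return "mean = median (no skew suggested)"


internet_summary = summarize(list(internet_use.values()))
facebook_summary = summarize(list(facebook_use.values()))

for name, s in [("Internet use", internet_summary),
                ("Facebook use", facebook_summary)]:
    print(name, {k: round(v, 2) for k, v in s.items()}, skew_hint(s))
```

With the complete data file, the same two functions could be applied to each column of 33 values to obtain the center and variability of each distribution.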
Figure 3.4 MINITAB Histograms of Internet Use and Facebook Use for the 33 Countries. Question Which nations, if any,
might be outliers in terms of Internet use? Facebook use? Which graphical display would more clearly identify potential outliers?
Section 3.2 The Association Between Two Quantitative Variables 101
Insight
The histograms portray each variable separately. How can we portray the
association between Internet use and Facebook use on a single display?
We’ll study how to do that next.
Scatterplot
A scatterplot is a graphical display for two quantitative variables using the horizontal (x)
axis for the explanatory variable x and the vertical (y) axis for the response variable y. The
values of x and y for a subject are represented by a point relative to the two axes. The
observations for the n subjects are n points on the scatterplot.
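To make the definition concrete, the following minimal sketch places each subject's (x, y) pair on a character grid, with x running along the horizontal axis and y along the vertical axis. It assumes both variables are percentages between 0 and 100, and it uses only the three countries whose values are quoted in this section. The function name `ascii_scatter` and its grid dimensions are hypothetical choices for illustration; statistical software would draw a proper graph.

```python
def ascii_scatter(points, width=40, height=10, x_max=100.0, y_max=100.0):
    """Render (x, y) points on a character grid.

    x increases from left to right; y increases from bottom to top.
    Assumes 0 <= x <= x_max and 0 <= y <= y_max.
    """
    grid = [[" "] * (width + 1) for _ in range(height + 1)]
    for x, y in points:
        col = round(x / x_max * width)
        row = height - round(y / y_max * height)  # row 0 is the top of the plot
        grid[row][col] = "*"
    lines = ["|" + "".join(row) for row in grid]   # vertical (y) axis
    lines.append("+" + "-" * (width + 1))          # horizontal (x) axis
    return "\n".join(lines)


# (x, y) = (Internet use %, Facebook use %) for the three countries
# quoted in the text: United States, Japan, UK.
points = [(77.0, 47.0), (74.0, 2.0), (70.18, 45.97)]
plot = ascii_scatter(points)
print(plot)
```

Each of the n = 3 observations appears as one `*`, which mirrors the statement that n subjects give n points.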
Example 5
Questions to Explore
a. Display the relationship between Internet use and Facebook use with
a scatterplot.
b. What can we learn about the association by inspecting the scatterplot?
Think It Through
a. The first step is to identify the response variable and the
explanatory variable. We’ll study how Facebook use depends on Internet
use. A temporal relationship exists between when individuals become
Internet users and when they become Facebook users; the former
precedes the latter. We will treat Internet use as the explanatory variable
and Facebook use as the response variable. Thus, we use x to denote
Internet use and y to denote Facebook use. We plot Internet use on
the horizontal axis and Facebook use on the vertical axis. Any
statistical software package can create a scatterplot. Using data such as that
in Table 3.4, place Internet use in one column and Facebook use in
another. Select the variable that plays the role of x and the variable
that plays the role of y. Figure 3.5 shows the scatterplot created with
MINITAB.
Consider the observation for the United States. It has Internet use x = 77
and Facebook use y = 47. Find its point circled in black.
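The column layout described in part a can be sketched as follows. This is an illustration, not the MINITAB procedure: the column names `country`, `internet_use`, and `facebook_use` and the helper `load_xy` are hypothetical, and the data are limited to the three countries whose values are quoted in this section.

```python
import csv
import io

# Hypothetical file layout: one row per country, one column per variable.
# Only the three countries quoted in the text are included.
raw = """country,internet_use,facebook_use
United States,77,47
Japan,74,2
UK,70.18,45.97
"""


def load_xy(text, x_col, y_col, label_col="country"):
    """Return {label: (x, y)}, reading x from x_col and y from y_col."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row[label_col]: (float(row[x_col]), float(row[y_col]))
        for row in reader
    }


# Internet use plays the role of x (explanatory variable);
# Facebook use plays the role of y (response variable).
xy = load_xy(raw, x_col="internet_use", y_col="facebook_use")
print(xy["United States"])  # (77.0, 47.0)
```

Swapping the `x_col` and `y_col` arguments would reverse the roles of the two variables, which is why the explanatory and response variables are identified before plotting.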
Figure 3.5 MINITAB Scatterplot for Internet Use and Facebook Use for 33
Countries. The point for Japan is labeled and has coordinates x = 74 and y = 2.
Question Is there any point that you would identify as standing out in some way? Which
country does it represent, and how is it unusual in the context of these variables?
Insight
Although the points for Japan and, to a lesser extent, the Netherlands, can
be considered atypical, there is a clear overall association. The countries with
lower Internet use tend to have lower Facebook use, and the countries with
high Internet use tend to have high Facebook use.