Exercises

1.
Consider the data set in:

• https://data.world/sdhilip/pizza-datasets

For it:

1. Select the suitable variables for a principal component analysis.
2. Calculate and represent graphically the correlation matrix for the selected variables.
3. Run a principal component analysis using the correlation matrix.
4. Produce a scree plot of the analysis’ eigenvalues. Select a suitable number of principal
components, justifying your answer.
5. Represent the analysis graphically with a plot of individuals and variables.
6. Perform a VARIMAX rotation on the selected principal components.
7. Indicate the most relevant original variables, justifying your answer.
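
A minimal sketch of parts 2 to 4, assuming the selected numeric variables are already in an array X with one row per observation. The data below are synthetic stand-ins, not the pizza data set, and the function name pca_from_correlation is hypothetical. The eigenvalues, sorted in descending order, are the values a scree plot would display.

```python
import numpy as np

def pca_from_correlation(X):
    """PCA by eigendecomposition of the correlation matrix (rows = observations)."""
    R = np.corrcoef(X, rowvar=False)       # p x p correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    return R, eigvals[order], eigvecs[:, order]

# Synthetic stand-in data (assumption): 200 observations of 5 correlated variables
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.3 * rng.normal(size=(200, 5))

R, eigvals, eigvecs = pca_from_correlation(X)
explained = eigvals / eigvals.sum()        # proportion of variance per component
cumulative = np.cumsum(explained)          # cumulative proportion

for k, (ev, cum) in enumerate(zip(eigvals, cumulative), start=1):
    print(f"PC{k}: eigenvalue={ev:.3f}, cumulative={cum:.3f}")
```

Because the analysis uses the correlation matrix, the eigenvalues sum to the number of variables, which is a quick sanity check before choosing how many components to keep.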

2.
For the data set in file FREDtreasure.csv:

1. Run a principal component analysis using the correlation matrix.
2. Produce a scree plot of the analysis’ eigenvalues. Select a suitable number of principal
components, justifying your answer.
3. Represent, with a timeline plot, the projection of all observations onto each principal
component.
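
One possible way to obtain the projections for part 3 is sketched below. It assumes the file holds one row per date and one column per series; that layout is a guess, so the data here are synthetic random walks rather than the contents of FREDtreasure.csv. Each column of scores is the series to draw against time.

```python
import numpy as np

# Synthetic stand-in (assumption): 300 dates, 4 series sharing a common trend
rng = np.random.default_rng(1)
common = np.cumsum(rng.normal(size=(300, 1)), axis=0)
X = common + 0.5 * np.cumsum(rng.normal(size=(300, 4)), axis=0)

# Standardise each column with the sample standard deviation (ddof=1)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# PCA on the correlation matrix, eigenvalues sorted in descending order
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Scores: projection of every observation onto every principal component
scores = Z @ eigvecs
t = np.arange(len(scores))  # placeholder time index; real dates would go here

# scores[:, k] plotted against t gives the timeline of component k + 1
```

With this standardisation the sample variance of each score column equals the corresponding eigenvalue, so the first component's timeline has the widest spread.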

3.
Consider the dataset in movies_social media.csv, regarding data on films from 2013 and 2014
and the social media reaction to them.

1. Standardize all variables.
2. Calculate and represent graphically the correlation matrix for the selected variables.
3. Run a principal component analysis using the correlation matrix.
4. Produce a scree plot of the analysis’ eigenvalues. Select a suitable number of principal
components, justifying your answer.
5. Analyzing the loadings of the selected principal components, comment on their possible
interpretation.
6. Represent the analysis graphically with a plot of individuals and variables.
7. Perform a VARIMAX rotation on the selected principal components and comment on the
results.
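
The VARIMAX step can be sketched as follows. This is a commonly used SVD-based iteration written with NumPy, not necessarily the routine any particular statistics package applies, and the data are synthetic stand-ins for the film variables. The function name varimax and the choice of two retained components are assumptions made for illustration.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal VARIMAX rotation of a (variables x components) loading matrix."""
    p, k = loadings.shape
    T = np.eye(k)                      # rotation matrix, starts as identity
    crit_old = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        # Gradient-like matrix of the varimax criterion
        G = loadings.T @ (L**3 - L @ np.diag((L**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(G)
        T = u @ vt                     # nearest orthogonal matrix
        crit_new = s.sum()
        if crit_new < crit_old * (1 + tol):
            break                      # no meaningful improvement
        crit_old = crit_new
    return loadings @ T, T

# Synthetic stand-in data (assumption): two blocks of three related variables
rng = np.random.default_rng(2)
f = rng.normal(size=(250, 2))
W = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)
X = f @ W + 0.4 * rng.normal(size=(250, 6))

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2                                                # retained components (assumed)
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])     # unrotated loadings
rotated, T = varimax(loadings)
print(np.round(rotated, 2))
```

An orthogonal rotation leaves each variable's communality unchanged; only the way the explained variance is shared among the retained components changes, which is what usually makes the rotated loadings easier to comment on.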

4.
Consider the dataset in world_data.csv, regarding data from various countries worldwide.

1. Standardize all variables.
2. Calculate and represent graphically the correlation matrix for the selected variables.
3. Run a principal component analysis using the correlation matrix.
4. Select a suitable number of principal components using a method other than analysis
of the scree plot.
5. Analyzing the loadings of the selected principal components, comment on their possible
interpretation.
6. Calculate the communalities for each variable on the selected principal components.
Which are the most relevant variables?
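
As an illustration of parts 4 and 6, the sketch below applies two common alternatives to the scree plot: the Kaiser rule (keep eigenvalues above 1) and a cumulative-variance threshold. It then computes the communalities. The 80% threshold is an arbitrary illustrative choice, and the data are synthetic stand-ins rather than the contents of world_data.csv.

```python
import numpy as np

# Synthetic stand-in data (assumption): 150 observations of 6 variables
rng = np.random.default_rng(3)
latent = rng.normal(size=(150, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.5 * rng.normal(size=(150, 6))

# PCA on the correlation matrix, eigenvalues sorted in descending order
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Kaiser rule: retain components whose eigenvalue exceeds 1
k_kaiser = int((eigvals > 1.0).sum())

# Cumulative-variance rule: smallest k reaching at least 80% (threshold assumed)
cumulative = np.cumsum(eigvals) / eigvals.sum()
k_80 = int(np.searchsorted(cumulative, 0.80) + 1)

# Loadings on the retained components and the resulting communalities
loadings = eigvecs[:, :k_kaiser] * np.sqrt(eigvals[:k_kaiser])
communalities = (loadings**2).sum(axis=1)   # one value per variable, between 0 and 1

print("Kaiser rule keeps", k_kaiser, "components; 80% rule keeps", k_80)
print("Communalities:", np.round(communalities, 3))
```

A communality close to 1 means the retained components reproduce almost all of that variable's variance, which is one way to argue which variables are most relevant.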
