APPLIED DATA SCIENCE

MACHINE PROBLEM NO. 1: DATA STRUCTURES

A. The following scores were obtained by a chemical engineering graduate during the recently held licensure
exams:
Day 1: Physical and Chemical Principles 62%
Day 2: Chemical Engineering Principles 81%
Day 3: General Engineering Principles 95%

The examinee’s final rating is a weighted average of the three scores: Day 1, 30%; Day 2, 40%; Day 3, 30%.
Using the variables day1, day2 and day3 for the scores, write R code to compute the examinee’s final rating
(store it in the variable rating). Use the paste() function to produce output of the following form:

The examinee’s final rating is <final rating>.

Determine the data type of the variable rating.
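
One possible approach to Part A is sketched below. It is not an official solution; the variable msg is a hypothetical name introduced here only to hold the formatted sentence, and sep = "" is used so that the closing period sits directly after the number.

```r
# Scores from the problem statement, in percent
day1 <- 62
day2 <- 81
day3 <- 95

# Weighted average with weights 30%, 40%, 30%
rating <- 0.30 * day1 + 0.40 * day2 + 0.30 * day3

# msg is an assumed helper name; paste() builds the required sentence
msg <- paste("The examinee's final rating is ", rating, ".", sep = "")
print(msg)

# Data type of rating
print(class(rating))
```

class() reports the object's class; typeof() could be used instead if the underlying storage type is wanted.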

B. The following profits/losses were recorded by your café for the past week:

Coffee Profits/Losses
Day         Result   Amount
Monday      Profit   14,000
Tuesday     Loss      5,000
Wednesday   Profit    2,000
Thursday    Loss      8,000
Friday      Profit   18,000
Saturday    Profit   23,000

Tea Profits/Losses
Day         Result   Amount
Monday      Loss      8,000
Tuesday     Loss      3,000
Wednesday   Profit   10,000
Thursday    Loss      6,000
Friday      Profit   11,000
Saturday    Profit    5,000

Create separate vectors for coffee and tea, and name the elements of each vector with the days of the week.
Write code to determine and display: (a) the total daily profits/losses; (b) the total profits/losses for the
week; (c) the total profits/losses for both coffee and tea; (d) the profits/losses for coffee on Friday; (e) the
average profits/losses for tea during the midweek (Wednesday and Thursday); (f) the days on which coffee
brought a profit.
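
A minimal sketch of Part B follows, assuming that losses are encoded as negative numbers and that item (c) asks for one weekly total per beverage. All result names (daily_total, week_total, and so on) are hypothetical and chosen only for illustration.

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")

# Assumption: profits are positive, losses are negative
coffee <- c(14000, -5000, 2000, -8000, 18000, 23000)
tea    <- c(-8000, -3000, 10000, -6000, 11000, 5000)
names(coffee) <- days
names(tea)    <- days

daily_total  <- coffee + tea                 # (a) element-wise sum per day
week_total   <- sum(daily_total)             # (b) whole cafe, whole week
coffee_total <- sum(coffee)                  # (c) weekly total for coffee
tea_total    <- sum(tea)                     # (c) weekly total for tea
coffee_friday <- coffee["Friday"]            # (d) select by name
tea_midweek_avg <- mean(tea[c("Wednesday", "Thursday")])  # (e)
coffee_profit_days <- names(coffee[coffee > 0])           # (f) logical selection

print(daily_total)
print(week_total)
print(c(coffee = coffee_total, tea = tea_total))
print(coffee_friday)
print(tea_midweek_avg)
print(coffee_profit_days)
```

Selecting by name rather than by position keeps the code readable and does not depend on the order of the days.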

C. Analyze the box office performance of the Star Wars trilogies. The following table shows the box office
revenues obtained by the first trilogy per region:

Title                  US Revenue, million $   Non-US Revenue, million $
New Hope               461                     315
Empire Strikes Back    290                     248
Return of the Jedi     309                     166

Create a matrix for the data; do not forget to label the rows with the movie title and the columns with the
region revenue.

Determine the total revenue for each of the movies. Add a column for the worldwide box office figures using
the cbind() function.

Create another matrix for the data for the next three movies as follows:

Title                  US Revenue, million $   Non-US Revenue, million $
Phantom Menace         475                     553
Attack of the Clones   311                     339
Revenge of the Sith    381                     469

Combine the two matrices using the rbind() function.

Determine: (a) the total box office revenue per film; (b) the total box office revenue of the entire saga; (c)
the average US revenue for the first two movies; (d) the number of tickets sold per region, assuming that
each ticket costs $5.
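
The matrix steps above might be sketched as follows. This is one interpretation, not an official solution: the column labels "US" and "Non-US", the column name Worldwide, and all result names are assumptions, and the two trilogies are combined with rbind() before any extra column is added so that both matrices have the same number of columns.

```r
regions <- c("US", "Non-US")  # assumed column labels, revenue in million $

first_trilogy <- matrix(
  c(461, 315, 290, 248, 309, 166),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("New Hope", "Empire Strikes Back", "Return of the Jedi"), regions)
)

# Worldwide revenue for the first trilogy, appended as a column with cbind()
first_with_total <- cbind(first_trilogy, Worldwide = rowSums(first_trilogy))

second_trilogy <- matrix(
  c(475, 553, 311, 339, 381, 469),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Phantom Menace", "Attack of the Clones", "Revenge of the Sith"), regions)
)

# Stack the two 3 x 2 matrices into one 6 x 2 matrix
saga <- rbind(first_trilogy, second_trilogy)

per_film         <- rowSums(saga)             # (a) million $ per film
saga_total       <- sum(saga)                 # (b) million $ for all six films
us_avg_first_two <- mean(saga[1:2, "US"])     # (c) million $
tickets          <- colSums(saga) * 1e6 / 5   # (d) tickets per region at $5 each

print(first_with_total)
print(per_film)
print(saga_total)
print(us_avg_first_two)
print(tickets)
```

Because the revenues are stated in millions of dollars, the ticket count multiplies by 1e6 before dividing by the assumed $5 ticket price.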

D. Create a data frame named heroes_df to contain the following data on some graphic novel characters.

Name              Gender   Eye Color   Hair Color   First App   Appearances   Deceased?   Publisher
Aquaman           Male     Blue        Blond        1941        1121          No          DC
Batman            Male     Blue        Black        1939        3093          No          DC
Flash             Male     Blue        Blond        1956        1028          No          DC
Iron Man          Male     Blue        Black        1963        2961          No          Marvel
Jean Grey         Female   Green       Red          1963        1107          Yes         Marvel
Ororo Munroe      Female   Blue        White        1975        1512          No          Marvel
Stephen Strange   Male     Grey        Black        1963        1307          No          Marvel
Superman          Male     Blue        Black        1986        2496          No          DC
Wolverine         Male     Blue        Black        1974        3061          No          Marvel
Wonder Woman      Female   Blue        Black        1941        1231          No          DC

Select and print out the following: (a) complete data for Iron Man; (b) first three values of hair color; (c) data
of deceased hero/es; (d) heroes with blue eyes; (e) all heroes in order of first appearance; (f) Marvel heroes;
(g) heroes with more than 2,000 appearances in the novels.
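
A sketch of Part D is given below under stated assumptions: the column names (Name, Gender, EyeColor, HairColor, FirstApp, Appearances, Deceased, Publisher) are chosen here for illustration, Deceased is kept as the text "Yes"/"No", and the names of the result objects are hypothetical.

```r
heroes_df <- data.frame(
  Name = c("Aquaman", "Batman", "Flash", "Iron Man", "Jean Grey",
           "Ororo Munroe", "Stephen Strange", "Superman", "Wolverine", "Wonder Woman"),
  Gender = c("Male", "Male", "Male", "Male", "Female",
             "Female", "Male", "Male", "Male", "Female"),
  EyeColor = c("Blue", "Blue", "Blue", "Blue", "Green",
               "Blue", "Grey", "Blue", "Blue", "Blue"),
  HairColor = c("Blond", "Black", "Blond", "Black", "Red",
                "White", "Black", "Black", "Black", "Black"),
  FirstApp = c(1941, 1939, 1956, 1963, 1963, 1975, 1963, 1986, 1974, 1941),
  Appearances = c(1121, 3093, 1028, 2961, 1107, 1512, 1307, 2496, 3061, 1231),
  Deceased = c("No", "No", "No", "No", "Yes", "No", "No", "No", "No", "No"),
  Publisher = c("DC", "DC", "DC", "Marvel", "Marvel",
                "Marvel", "Marvel", "DC", "Marvel", "DC"),
  stringsAsFactors = FALSE
)

iron_man    <- heroes_df[heroes_df$Name == "Iron Man", ]      # (a)
first_hair  <- heroes_df$HairColor[1:3]                       # (b)
deceased    <- heroes_df[heroes_df$Deceased == "Yes", ]       # (c)
blue_eyes   <- heroes_df[heroes_df$EyeColor == "Blue", ]      # (d)
by_first    <- heroes_df[order(heroes_df$FirstApp), ]         # (e)
marvel      <- heroes_df[heroes_df$Publisher == "Marvel", ]   # (f)
over_2000   <- heroes_df[heroes_df$Appearances > 2000, ]      # (g)

print(iron_man)
print(first_hair)
print(deceased)
print(blue_eyes)
print(by_first)
print(marvel)
print(over_2000)
```

Each selection uses a logical condition in the row position and leaves the column position empty, which keeps all columns for the matching rows.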

E. Prepare the following variables:

• mov: character string containing the movie title

    The Shining

• act: vector containing the actors’ names

    Jack Nicholson
    Shelley Duvall
    Danny Lloyd
    Scatman Crothers
    Barry Nelson

• rev: data frame containing the following

       Scores   Sources   Comments
    1  4.5      IMDb1     Best horror film I have ever seen.
    2  4.0      IMDb2     A truly brilliant and scary film from Stanley Kubrick
    3  5.0      IMDb3     A masterpiece of psychological horror

Create a list named shining_list that contains all three variables given above. Use the names
moviename, actors, reviews.
a. Print out the vector representing the actors.
b. Print out the second element of the vector representing the actors.
c. Add the year 1980 to the list; print out the contents of the final list.
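
One way Part E could be approached is sketched below. The element name year for the added value is an assumption, since the problem does not specify it; str() is used here as one convenient way to display the final list.

```r
mov <- "The Shining"

act <- c("Jack Nicholson", "Shelley Duvall", "Danny Lloyd",
         "Scatman Crothers", "Barry Nelson")

# Note: the name rev is required by the problem; it shadows base::rev() as a variable only
rev <- data.frame(
  Scores = c(4.5, 4.0, 5.0),
  Sources = c("IMDb1", "IMDb2", "IMDb3"),
  Comments = c("Best horror film I have ever seen.",
               "A truly brilliant and scary film from Stanley Kubrick",
               "A masterpiece of psychological horror"),
  stringsAsFactors = FALSE
)

shining_list <- list(moviename = mov, actors = act, reviews = rev)

print(shining_list$actors)       # a. the actors vector
print(shining_list$actors[2])    # b. second element of the actors vector

shining_list$year <- 1980        # c. "year" is an assumed element name
str(shining_list)                # compact view of the final list
```

The same elements can also be reached with double brackets, for example shining_list[["actors"]][2].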
