
SCL

MTH3409

COMPUTATIONAL
STATISTICS

Department of Mathematics and Statistics


Faculty of Science
Universiti Putra Malaysia

MTH3409 COMPUTATIONAL STATISTICS


MODULE TITLE: SUMMARIZING AND DESCRIBING DATA USING GRAPHS AND NUMERICAL MEASURES

COURSE: MTH3409 COMPUTATIONAL STATISTICS

MODULE CODE/NO: UPM/FS/MTH3409/SCL_SEM1_2022-2023

DURATION: 3 hours (non-face-to-face)

LEARNING RESOURCES:
1. MTH3409 (Lecture Notes).
2. Any related reference book.

PRIOR KNOWLEDGE: Students must have prior knowledge of:
1. Numerical measures for describing data.
2. Types of charts and graphs.
3. Types of loops.

LEARNING OUTCOMES: At the end of this module, students should be able to work in
groups to construct several types of graphs and compute certain numerical
measures using R software (TS).

COURSE CONTENTS AND LEARNING ACTIVITIES:
1. Basic graphics for describing data.
2. Measures of central tendency.
3. For loop and while loop.

SUMMARY: This student-centered learning activity helps students work together in
groups to solve problems related to statistics. In addition, students will
acquire skills for describing data using graphical methods and numerical
measures in R.

ASSIGNMENT & ASSESSMENT: Group work
1. Form a group of 4 - 5 students.
2. You must use R software to answer all the questions.
3. In your respective groups, solve ALL the problems and submit your solution,
i.e., the R code and the output, in a Word or PDF file (softcopy submission via
PutraBlast).
4. Prepare a video presentation for selected questions to be uploaded to
PutraBlast.

Deadline: Friday, 25 November 2022, 5 pm

REFLECTION: Feedback by the instructor.



ASSIGNMENT OF STUDENT-CENTERED LEARNING

1) Create four vectors: v1: a sequence of numbers from 1 to 10 with length = 4; v2: a sequence
of numbers from 11 to 20 with increment = 2.5; v3: 4 repetitions of the number 5; v4: the
numeric vector 50, -50, -100, 100. [4 marks]
a) Combine the four vectors into a 4×4 matrix. Next, change the row names to a,
b, c, d and name the matrix mymatrix. [2 marks]
b) From the result in (a), switch column 2 with column 3 (make sure the column names
are also switched). [2 marks]
c) From the result in (b), sort column 1 from largest to smallest, then exponentiate its
elements. [2 marks]
d) From the result in (c), replace each element of column 4 with its respective row sum.
[2 marks]
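
Hint: the sketch below illustrates the kind of base R calls that question 1 relies on (seq(), rep(), cbind(), rownames(), column reordering, sort(), exp(), rowSums()). It is not a model answer. The vectors x, y, z and the matrix m are hypothetical toy objects with values that differ from those in the question.

```r
# Toy vectors (hypothetical values, not the ones asked for in question 1)
x <- seq(0, 1, length.out = 5)   # 5 equally spaced values from 0 to 1
y <- seq(2, 10, by = 2)          # fixed increment of 2
z <- rep(7, times = 5)           # the number 7 repeated 5 times

# Bind the vectors column-wise; the columns are named x, y, z
m <- cbind(x, y, z)
rownames(m) <- c("r1", "r2", "r3", "r4", "r5")

# Reordering by column index moves the column names with the data
m_swapped <- m[, c(2, 1, 3)]

# Sort one column in decreasing order, then exponentiate each element
x_desc <- sort(m[, "x"], decreasing = TRUE)
x_exp <- exp(x_desc)

# Replace one column with the row sums of the matrix
# (the right-hand side is evaluated before the assignment)
m_sum <- m
m_sum[, "z"] <- rowSums(m)
print(m_sum)
```

Reordering with an index vector such as c(2, 1, 3) keeps the names attached to their columns, which is why no separate renaming step is needed in this sketch.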

2) Using a while loop, write a function in R that takes two arguments. The first argument
is a numeric vector, and the second argument is a number. The function should print out the
elements of the numeric vector and stop once an element is larger than the number given as the
second argument. [5 marks]
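
Hint: a while loop repeats for as long as its condition holds, so a stopping rule can be written directly into the condition. The sketch below shows this on a different task (accumulating a running total). The function name sum_until and its arguments are hypothetical and are not part of the assignment.

```r
# Hypothetical helper: add up the elements of x one by one and stop
# as soon as the running total exceeds `limit` (or x is exhausted).
sum_until <- function(x, limit) {
  i <- 1
  total <- 0
  while (i <= length(x) && total <= limit) {
    total <- total + x[i]
    i <- i + 1
  }
  total
}

print(sum_until(c(3, 4, 5, 6), 10))   # stops after the third element
print(sum_until(c(3, 4, 5, 6), 100))  # limit never reached, all elements used
```

Checking i <= length(x) first matters: without it, x[i] would return NA once the index runs past the end of the vector.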

3) Write a double for loop that prints out the following output: [5 marks]
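
Hint: in a double (nested) for loop, the inner loop runs to completion for every step of the outer loop. The sketch below is a generic illustration that prints a small multiplication table. It does not reproduce the output requested in question 3, and the names n and tab are hypothetical.

```r
n <- 3
tab <- matrix(0, nrow = n, ncol = n)  # stores the same values that are printed

for (i in 1:n) {          # outer loop: one pass per printed line
  for (j in 1:n) {        # inner loop: one pass per value on that line
    tab[i, j] <- i * j
    cat(i * j, "")        # print the value followed by a space
  }
  cat("\n")               # end the line after the inner loop finishes
}
```

Using cat() rather than print() keeps several values on one line, with the line break issued only once per outer iteration.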



4) Import Malaysian COVID-19 data from the following URL into R: [2 marks]
https://raw.githubusercontent.com/MoH-Malaysia/covid19-public/main/epidemic/cases_state.csv

a) Create a new data frame from the imported data by selecting rows 5473 to 5968.
[2 marks]
b) From the data frame in (a), create a new data frame by selecting these variables:
state, cases_new, cases_child, cases_adolescent, cases_adult, and
cases_elderly. Name the data frame as mydata. [2 marks]
c) Based on mydata, write an R script that prints out the cases_new elements that are
≥ 300. [2 marks]
d) Based on mydata, write a loop that prints out the cases_new elements that are ≥ 300.
Check your answer against the results in (c). [3 marks]
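
Hint: the sketch below shows row selection, column selection, a vectorized filter, and an equivalent loop on a small made-up data frame. The data frame toy and all of its values are hypothetical. It is built in memory instead of being read from the URL above, which could be imported with read.csv().

```r
# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  state     = c("A", "B", "C", "D", "E", "F"),
  cases_new = c(120, 450, 300, 80, 999, 299),
  other     = c(1, 2, 3, 4, 5, 6)
)

# Select a range of rows, then a subset of columns by name
sub <- toy[2:5, ]
sel <- sub[, c("state", "cases_new")]

# Vectorized filter: keep the elements that satisfy the condition
vec_result <- sel$cases_new[sel$cases_new >= 300]
print(vec_result)

# The same filter written as a loop
loop_result <- numeric(0)
for (v in sel$cases_new) {
  if (v >= 300) {
    print(v)
    loop_result <- c(loop_result, v)
  }
}

# Confirm that both approaches agree
print(identical(vec_result, loop_result))
```

Collecting the loop output in a vector makes the comparison with the vectorized result a single identical() call instead of a visual check.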

5) Based on mydata obtained in 4(b),
a) Construct a scatter plot as shown below. [4 marks]
b) Using aggregate(), calculate the mean of cases_new, cases_child,
cases_adolescent, cases_adult, and cases_elderly based on state. [3 marks]
c) Calculate the sum of cases_new, cases_child, cases_adolescent, cases_adult,
and cases_elderly based on state. [3 marks]



d) Transform the result that you obtained in (c) into a matrix. The matrix should have five
rows: Kedah, Kelantan, Pahang, Perak, Terengganu, and four columns:
cases_child, cases_adolescent, cases_adult, cases_elderly. [3 marks]
e) Construct a bar graph of the sum (or frequency) of cases_child, cases_adolescent,
cases_adult, and cases_elderly by state, based on the results in (d). See the bar
plot below. [4 marks]
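
Hint: the sketch below walks through the same chain of steps as question 5 (scatter plot, grouped summaries with aggregate(), conversion to a matrix, grouped bar plot) on a made-up data frame. The names toy, group, young and old and all values are hypothetical. Axis labels and layout are assumptions, not the settings of the plots referred to in the question.

```r
# Hypothetical toy data; the values are invented for illustration only
toy <- data.frame(
  group = c("G1", "G1", "G2", "G2", "G3", "G3"),
  young = c(1, 3, 2, 4, 10, 20),
  old   = c(5, 7, 6, 8, 30, 50)
)

# When run as a script, draw to a null device so that no file is written
if (!interactive()) pdf(NULL)

# Scatter plot of one variable against another
plot(toy$young, toy$old, xlab = "young", ylab = "old", main = "Toy scatter plot")

# Grouped summaries: cbind() on the left-hand side summarizes several columns
means <- aggregate(cbind(young, old) ~ group, data = toy, FUN = mean)
sums  <- aggregate(cbind(young, old) ~ group, data = toy, FUN = sum)
print(means)
print(sums)

# Turn the summary into a numeric matrix with the groups as row names
sum_mat <- as.matrix(sums[, c("young", "old")])
rownames(sum_mat) <- as.character(sums$group)

# barplot() draws one cluster of bars per matrix column, so the matrix is
# transposed to obtain one cluster per group
barplot(t(sum_mat), beside = TRUE, legend.text = TRUE,
        xlab = "group", ylab = "sum")
```

Transposing before calling barplot() is a design choice: without t(), the clusters would correspond to the variables and the bars within each cluster to the groups.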

6) Prepare a video presentation explaining how to write the R code and summarizing your
findings for questions 4 and 5. [Marks based on rubric]
