
Wee Kim Wee School of Communication and Information

CS2401
INFORMATION ANALYTICS: TOOLS, TECHNIQUES & TECHNOLOGIES
HOMEWORK – PREPROCESSING DATA

Name (as in matriculation card): _____________________________________

Refer to the Excel file named homework-1.xlsx. This file contains fictitious data which relates a
company's employee performance score with salary.

1. What mean value of the Performance variable should be used to fill in its missing values?

Type in your answer: _____________

2. Smooth the Performance variable by bin means, medians and boundaries using a bin depth
of 3. Insert the smoothed values into their respective variables: Perf_Bin_Mean,
Perf_Bin_Median, Perf_Bin_Bound.

3. Normalize the Salary variable using min-max normalization with a new range of [5, 200]. Insert
the normalized values into the variable Sal_Norm.
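
The three preprocessing steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function names are hypothetical, the sample values are made up and are not taken from homework-1.xlsx, smoothed values are returned in sorted order, and a value equally distant from both bin boundaries is assigned to the lower one (conventions for ties vary).

```python
from statistics import mean, median


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def smooth_by_bins(values, depth, method):
    """Equal-depth binning: sort, split into bins of `depth` values, smooth each bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        if method == "mean":
            smoothed += [mean(bin_values)] * len(bin_values)
        elif method == "median":
            smoothed += [median(bin_values)] * len(bin_values)
        elif method == "boundary":
            low, high = bin_values[0], bin_values[-1]
            # Move each value to the closer boundary; ties go to the lower one (assumption).
            smoothed += [low if v - low <= high - v else high for v in bin_values]
        else:
            raise ValueError(f"unknown method: {method}")
    return smoothed


def min_max_normalize(values, new_min, new_max):
    """v' = (v - min) / (max - min) * (new_max - new_min) + new_min, rounded to 2 d.p."""
    old_min, old_max = min(values), max(values)
    span = old_max - old_min  # assumes the values are not all identical
    return [round((v - old_min) / span * (new_max - new_min) + new_min, 2)
            for v in values]


# Illustrative values only (not the homework data).
sample = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bins(sample, 3, "mean"))
print(smooth_by_bins(sample, 3, "median"))
print(smooth_by_bins(sample, 3, "boundary"))
print(min_max_normalize([30000, 45000, 60000], 5, 200))
```

With a bin depth of 3, the nine sample values fall into three bins, and every value in a bin is replaced by that bin's mean, median, or nearest boundary. The same calculations can be done in Excel; the code is only meant to make the arithmetic explicit.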

SUBMISSION INSTRUCTIONS

1. Give all answers to 2 decimal places.

2. With the exception of the smoothing and normalization questions, type your answers directly
into this file. Then print it out for submission.

3. For the smoothing and normalization questions, type your answers into the Excel file. Then
print it out for submission. Printouts should fit within 1 page.

4. Remember to include your name in both files prior to printing.

5. Submit your printouts at the start of class on 14 February 2023. Answers will be discussed
sometime during that lecture period. Submissions will no longer be accepted once discussions
begin.

31 Nanyang Link, Singapore 637718


Tel: +65 6790 6290, Fax: +65 6791 5214
www.ntu.edu.sg/sci
