CS2401
INFORMATION ANALYTICS: TOOLS, TECHNIQUES & TECHNOLOGIES
HOMEWORK – PREPROCESSING DATA
Refer to the Excel file named homework-1.xlsx. This file contains fictitious data which relates a
company's employee performance score with salary.
1. What mean value of the Performance variable should be used to fill in its missing values?
2. Smooth the Performance variable by bin means, medians and boundaries using a bin depth of 3.
Insert the smoothed values into their respective variables: Perf_Bin_Mean, Perf_Bin_Median,
Perf_Bin_Bound.
3. Normalize the Salary variable using min-max normalization with a new range of [5, 200]. Insert
the normalized values into the variable Sal_Norm.
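
The three techniques named in the questions above (mean imputation, equal-depth bin smoothing, and min-max normalization) can be sketched in Python as follows. This is only an illustrative sketch, not the expected solution method: the function names are hypothetical, the sample numbers are made up and are not taken from homework-1.xlsx, and the tie-breaking rule for boundary smoothing (ties go to the lower boundary) is an assumption.

```python
from statistics import mean, median


def fill_missing_with_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def smooth_by_bins(values, depth, method):
    """Equal-depth binning: sort, cut into bins of `depth` values, smooth each bin.

    The result is returned in sorted order, not in the original row order.
    """
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        if method == "mean":
            smoothed.extend([mean(bin_values)] * len(bin_values))
        elif method == "median":
            smoothed.extend([median(bin_values)] * len(bin_values))
        elif method == "boundary":
            low, high = bin_values[0], bin_values[-1]
            # Each value moves to the closer bin boundary.
            # Assumption: a value exactly in the middle goes to the lower boundary.
            smoothed.extend(
                [low if v - low <= high - v else high for v in bin_values]
            )
        else:
            raise ValueError(f"unknown method: {method}")
    return smoothed


def min_max_normalize(values, new_min, new_max):
    """Rescale values linearly so that min -> new_min and max -> new_max."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        raise ValueError("all values are equal; min-max normalization is undefined")
    return [
        (v - old_min) / (old_max - old_min) * (new_max - new_min) + new_min
        for v in values
    ]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    scores_with_gap = [4, 8, None, 21, 21, 26]
    scores = [4, 8, 15, 21, 21, 24, 25, 28, 34]
    salaries = [2000, 3000, 4000, 6000]

    print(fill_missing_with_mean(scores_with_gap))
    for method in ("mean", "median", "boundary"):
        print(method, smooth_by_bins(scores, 3, method))
    print(min_max_normalize(salaries, 5, 200))
```

Because smooth_by_bins returns its results in sorted order, the smoothed values would still need to be matched back to the original rows before being entered into a spreadsheet column.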
SUBMISSION INSTRUCTIONS
2. With the exception of the smoothing and normalization questions, type in your answer for Part
2, Question 3 directly into this file. Then print it out for submission.
3. For the smoothing and normalization questions, type your answers into the Excel file. Then
print it out for submission. Printouts should fit within 1 page.
5. Submit your printouts at the start of class on 14 February 2023. Answers will be discussed
sometime during that lecture period. Submissions will no longer be accepted once discussions
begin.