
PROJECT 1

DATA PREPROCESSING

You are given a dataset on which to perform data pre-processing. Provide a report for PROJECT 1 that includes the
following:

Title of your report: DATA PREPROCESSING: CASE STUDY ON <<CHOICE OF YOUR DATA>> DATASET

In general, your assignment should contain an abstract, a main document (at least 4 sections, including the
conclusion) and references. (Please refer to the report writing guide (Springer Verlag Manuscript Format) to
standardize your report.)

Content of report should cover the following issues.

1. PRELIMINARY PROBLEM AND DATA DESCRIPTION 


Description of the domain problem (including the Business Goal and Business Question of your project)

PART A. Data Description


a. Source of Data
b. List of attributes with their types (categorical, nominal, continuous). 
c. Related literature on using the dataset or similar (at least 5)
You are required to present the DATA QUALITY REPORT in this section.
Preliminary examination of data (you are encouraged to visualize your data description).
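
As an illustration of what a data quality report for PART A might tabulate, the sketch below summarises each attribute of a small in-memory dataset. The function name data_quality_report, the list of missing-value markers and the sample rows are all assumptions made for this example; they are not prescribed by the assignment, and a real report would be built on your chosen dataset.

```python
from statistics import mean

# Assumed markers for a missing value; adjust to your own dataset.
MISSING = (None, "", "?", "NA")

def data_quality_report(rows):
    """Summarise each attribute of a dataset given as a list of dicts."""
    report = {}
    columns = sorted({key for row in rows for key in row})
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in MISSING]
        n_missing = len(values) - len(present)
        entry = {
            "count": len(values),
            "missing": n_missing,
            "missing_pct": 100.0 * n_missing / len(values),
            "cardinality": len(set(present)),
        }
        # Add min/max/mean only when every observed value is numeric.
        numeric = [v for v in present
                   if isinstance(v, (int, float)) and not isinstance(v, bool)]
        if present and len(numeric) == len(present):
            entry.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
        report[col] = entry
    return report

# Hypothetical toy data, for illustration only.
rows = [
    {"age": 25, "group": "A"},
    {"age": None, "group": "B"},
    {"age": 40, "group": "A"},
    {"age": 31, "group": ""},
]
report = data_quality_report(rows)
for name, stats in report.items():
    print(name, stats)
```

Each entry gives the record count, the number and percentage of missing values, and the cardinality; continuous attributes additionally get a minimum, maximum and mean.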

PART B. DATA CLEANING 


a. Handling Missing Values
b. Handling Noise
c. Handling inconsistent data
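
The three cleaning tasks above can be sketched as follows, using one common technique for each: median imputation for missing values, smoothing by bin means (equal-frequency binning) for noise, and label normalisation for inconsistent data. These are only example choices, not the methods required by the assignment, and the function names, the synonym table and the sample values are hypothetical.

```python
from statistics import mean, median

def impute_median(values):
    """Replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def smooth_by_bin_means(values, n_bins):
    """Equal-frequency binning: replace each value by the mean of its bin."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    size = -(-len(values) // n_bins)  # ceiling division
    smoothed = [None] * len(values)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        bin_mean = mean(values[i] for i in idx)
        for i in idx:
            smoothed[i] = bin_mean
    return smoothed

def normalise_labels(values, synonyms):
    """Map inconsistent spellings of a category to one canonical label."""
    cleaned = []
    for v in values:
        key = v.strip().lower()
        cleaned.append(synonyms.get(key, key))
    return cleaned

# Hypothetical toy data, for illustration only.
filled = impute_median([10, None, 30, 20])
smoothed = smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], 3)
labels = normalise_labels(["Male", " male", "M", "F"],
                          {"m": "male", "f": "female"})

print(filled)    # [10, 20, 30, 20]
print(smoothed)  # [9, 9, 9, 22, 22, 22, 29, 29, 29]
print(labels)    # ['male', 'male', 'male', 'female']
```

Whatever technique you choose, the report should state why it suits the attribute in question, for example why the median was preferred over the mean for a skewed attribute.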

PART C. DATA REDUCTION


a. Apply data cube aggregation (if necessary)
b. Apply data compression (if necessary)
c. Perform Dimensionality Reduction
d. Perform Numerosity Reduction
e. Perform Data Transformation
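
A minimal sketch of some of the reduction and transformation steps listed above, assuming a dataset held as a list of dicts. It shows removal of constant attributes (a very simple form of dimensionality reduction), simple random sampling without replacement (numerosity reduction), min-max normalisation and equal-width discretisation (data transformation). All function names and sample values are hypothetical, and other techniques may be more appropriate for your dataset.

```python
import random

def drop_constant_attributes(rows):
    """Dimensionality reduction: drop attributes with a single distinct value."""
    keep = [k for k in rows[0] if len({r[k] for r in rows}) > 1]
    return [{k: r[k] for k in keep} for r in rows]

def sample_rows(rows, k, seed=0):
    """Numerosity reduction: simple random sample without replacement."""
    return random.Random(seed).sample(rows, k)

def min_max(values, new_min=0.0, new_max=1.0):
    """Transformation: rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]

def equal_width_bins(values, n_bins):
    """Discretisation: assign each value an equal-width bin index."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical toy data, for illustration only.
rows = [
    {"id": 1, "unit": "kg", "weight": 10},
    {"id": 2, "unit": "kg", "weight": 20},
    {"id": 3, "unit": "kg", "weight": 30},
]
reduced = drop_constant_attributes(rows)   # the constant "unit" is dropped
sampled = sample_rows(reduced, 2)
scaled = min_max([r["weight"] for r in rows])
binned = equal_width_bins([0, 5, 10], 2)

print(reduced[0])  # {'id': 1, 'weight': 10}
print(scaled)      # [0.0, 0.5, 1.0]
print(binned)      # [0, 1, 1]
```

Fixing the random seed in sample_rows keeps the sample reproducible, which makes the before and after comparisons in the report repeatable.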

PART D. Results Presentation and Visualization


a. Histogram of data distribution (before and after reduction)
b. Example of raw data, cleaned data, discretized data, transformed data. 
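
To compare the data distribution before and after reduction, one option is to compute equal-width histogram counts for both versions of an attribute and show them side by side. The sketch below prints a plain-text histogram; the helper names and the sample values are assumptions for illustration, and in the report itself a plotted figure would normally take the place of the text output.

```python
def histogram(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant data
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts

def render(counts):
    """Draw one line of '#' marks per bin."""
    return "\n".join(f"bin {i}: {'#' * c}" for i, c in enumerate(counts))

# Hypothetical attribute values; "after" keeps every second record.
before = [4, 8, 15, 21, 21, 24, 25, 28, 34]
after = before[::2]

before_counts = histogram(before, 3)
after_counts = histogram(after, 3)

print("Before reduction")
print(render(before_counts))
print("After reduction")
print(render(after_counts))
```

For this toy data the counts are [2, 3, 4] before and [1, 2, 2] after reduction, so the overall shape of the distribution is roughly preserved.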

Strictly refer to the report writing guide (Springer Writing Format).


Brief Content of your Report (Should have 5 sections)

Section 1 – Introduction 
Section 2 – Related work
Section 3 – Material and Methods (Include PART A, B, C)
Section 4 – Results of data pre-processing (PART D)
Section 5 – Conclusion 
Acknowledgement
References
