Due date: Sunday, February 18th, by 11:30pm. (Closing date: March 10th at
midnight.)
Learning goals:
1. Explore pre-processing techniques on a dataset.
2. Get familiar with the regression approaches available in scikit-learn.
3. Practice applying regression approaches.
4. Practice using cross-validation to select the best performing approach.
Instructions:
In the Brightspace folder for this assignment, there is a dataset available
(A2data.tsv). This dataset consists of 99 numeric inputs and one numeric label for
48 instances. The dataset is given as a tab-delimited text file with one instance
per line and a column header. The first 99 columns are the features and the last
column is the output/label.
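
The file layout described above (header row, tab-delimited, features first, label last) could be read as in the following minimal sketch. The function name load_dataset is hypothetical and the use of NumPy is an assumption; pandas or any other reader would work equally well.

```python
import numpy as np


def load_dataset(path):
    """Load a tab-delimited numeric file that has a single header row.

    Assumes (as the assignment describes) that every column is numeric,
    the last column is the label, and all other columns are features.
    """
    # skiprows=1 skips the column header; ndmin=2 keeps a 2-D array
    # even if the file happens to contain a single data row.
    data = np.loadtxt(path, delimiter="\t", skiprows=1, ndmin=2)
    X = data[:, :-1]  # every column except the last: the features
    y = data[:, -1]   # the last column: the output/label
    return X, y
```

Calling load_dataset("A2data.tsv") should, if the file matches the description, return X with 48 rows and 99 columns and y with 48 values. Checking these shapes is a quick way to confirm the file was parsed as intended.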
Your job is to work with this data to generate a regression model. You are allowed
to use any pre-processing technique, feature selection and regression method
available in scikit-learn. You are required to assess model performance using
cross-validation. These are the steps to complete this assignment:
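
As an illustration of the cross-validation requirement only (it is not one of the assignment's steps and not a prescribed solution), the sketch below compares two candidate pipelines by mean cross-validated RMSE. The function name compare_models is hypothetical, and the choice of Ridge and Lasso, the alpha values, and the 5-fold split are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, n_splits=5, seed=0):
    """Return the mean cross-validated RMSE of each candidate (lower is better)."""
    # Shuffled K-fold with a fixed seed so every candidate sees the same folds.
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)

    # Illustrative candidates only; the alpha values are arbitrary assumptions.
    # Scaling lives inside the pipeline so it is refit on each training fold.
    candidates = {
        "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
        "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1, max_iter=10000)),
    }

    results = {}
    for name, model in candidates.items():
        # scikit-learn reports negated RMSE so that larger is better; flip the sign.
        scores = cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        results[name] = -scores.mean()
    return results
```

Keeping the pre-processing step inside the pipeline means the scaler never sees the held-out fold, which avoids leaking test information into training. With 99 features and only 48 instances, that matters more than usual, because the scores are noisy to begin with.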
Submission:
1. Python code used to complete this assignment. Include instructions on how