Welcome to Scribd!

Default Problem Is Regression. Feel Free To Define Your Own

Uploaded by

0% found this document useful (0 votes)

4 views2 pages

This document provides guidance for students completing a mini-project on an AirBnB dataset from Seattle. It outlines the grading scheme, suggests defining an interesting problem and preparing the data, recommends using machine learning tools like regression and classification, and emphasizes learning something new. Students are warned not to bite off more than they can chew by trying to learn too many new things or tackle overly complex problems and visualizations. The final submission should include a presentation, code, and references.

Original Description:

Original Title

Project Instructions

Copyright

Available Formats

DOCX, PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

4 views2 pages

Default Problem Is Regression. Feel Free To Define Your Own

Uploaded by

Thanh Sang Nguyễn

Copyright:

Available Formats

Download as DOCX, PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 2

Search inside document

Dataset 1 : AirBnB Open Data from Seattle

Source : https://www.kaggle.com/airbnb/seattle
Default problem is Regression. Feel free to define your own.
Primary Data : The dataset posted by AirBnB on Kaggle.
Join the Kaggle platform to obtain the AirBnB Dataset.
This is NOT a Competition. NO participation is needed.

What is the Grading Scheme for the Mini-Project?

10% for coming up with an interesting problem based on the dataset

10% for exploratory data analysis / visualization to understand the data
20% for preparing the dataset to suit your specific problem definition
20% for the use of data science / machine learning to solve the problem
40% for learning something new and/or doing something beyond the course

What is an "interesting problem" based on the dataset?

Choose an interesting problem. It should not be something that you can solve by copy-
pasting the LinearRegression or DecisionTreeClassifier codes from the regular course
material. There should be something beyond that, for which you will have to learn
something new, or apply some new technique. If you are unsure whether your problem
is interesting, ask for the Lab Instructor's advice.
Warning : In quest for "interesting problem", please do not attempt something that you can't
finish in time.
How much of Visualization should be presented?

It's worth only 10%, so do not spend the bulk of time on cool visualization tools. Do
standard exploration of the data, and standard statistical visualizations, as done during
the course, just to understand your data well enough. You DO NOT need to produce
data dashboards and cool web interfaces to do an impressive project.
Warning : In quest for "cool presentation", please do not try something that takes too much time
to learn.
What do you mean by "preparing the dataset"?

The dataset given to you may not be in the proper format to solve the problem you
targeted. Preparing means cleaning the data, resizing/reshaping the data, removing
outliers (if necessary), balancing imbalanced classes (if necessary), grouping the
rows/columns as necessary, etc. This is an important part of any DS/ML project.
How much of DS / ML tools should I use for the project?

This is one of the main parts of your project. You may use any tool and technique that
you have seen during the course, for Regression, Classification, Clustering, Anomaly
Detection. If you want something simple, stick to Scikit-Learn as your DS/ML toolbox.
You may also choose to use new models that you have not seen in the course, like
Random Forest for regression, Naive Bayes for classification, DBSCAN for clustering,
etc.
Warning : In quest for "quick impression", please do not try complex tools that takes too much
time to learn.
What do you mean by "learning something new" beyond the course?

The goal for the mini-project is to make you learn something new. Try to use new
DS/ML model for regression, classification, clustering or anomaly detection, beyond
what we have already covered in the course. That's the quickest way to prove you
learned something new. You may also want to explore a new visualization tool (like
Plotly), or a new technique for data preparation (like resampling), or explore
additional/extra data.
Warning : In quest for "quick impression", please do not try to learn too many new things at the
same time.
What will I finally submit for the Mini-Project?

You will have to submit your PPT/PDF slides used for the Presentation, all your codes
(Jupyter Notebooks, Python codes, Visualization codes, etc.), with reference to the
resources you used during the project. If you build an application, a visualization tool, a
dashboard or a website, you may also submit the link to that.

HP ActualTests HP2-H08 v2011-06-14 by Nowerdays
Document39 pages
HP ActualTests HP2-H08 v2011-06-14 by Nowerdays
Lai Jack
0% (2)
Neonatal Incubator
Document38 pages
Neonatal Incubator
dhamodhar
100% (1)
Tools
Document8 pages
Tools
Akshat
No ratings yet
How To Learn Data Science
Document8 pages
How To Learn Data Science
Atsal
100% (1)
A Complete Guide To Learning Data Analytics in 100 Days-Brittany City
Document7 pages
A Complete Guide To Learning Data Analytics in 100 Days-Brittany City
ox2k7
No ratings yet
M.tech Thesis Labview
Document4 pages
M.tech Thesis Labview
crystalalvarezpasadena
100% (2)
Week5 Modified
Document25 pages
Week5 Modified
turbonstre
No ratings yet
Why Do AI Initiatives Fail
Document5 pages
Why Do AI Initiatives Fail
Md Ahsan Ali
No ratings yet
365 SyllabusDataVisualization
Document3 pages
365 SyllabusDataVisualization
Gokul kapoor
No ratings yet
ML Interview Questions
Document146 pages
ML Interview Questions
IndraneelGhosh
No ratings yet
The Ultimate Learning Path To Become A Data Scientist and Master Machine Learning in 2019
Document12 pages
The Ultimate Learning Path To Become A Data Scientist and Master Machine Learning in 2019
k2sh
No ratings yet
Beginner and Awareness Class: HR Data Science - People Analytics - People Analytics
Document3 pages
Beginner and Awareness Class: HR Data Science - People Analytics - People Analytics
Heru Wiryanto
No ratings yet
Spring 2022 Task List #4
Document6 pages
Spring 2022 Task List #4
Rupal Bilaiya
No ratings yet
10 Things Know Before First Data Science Project
Document8 pages
10 Things Know Before First Data Science Project
shaikkulsum10
No ratings yet
An Introduction To Data Science (2022 Updated Edition)
Document9 pages
An Introduction To Data Science (2022 Updated Edition)
Miguel Silva
No ratings yet
How To Install Tableau in 11 Easy Steps
Document8 pages
How To Install Tableau in 11 Easy Steps
sboothpur
No ratings yet
Data Science Interview Resources
Document12 pages
Data Science Interview Resources
Krutika Sapkal
No ratings yet
Data Analyst
Document16 pages
Data Analyst
Aijaz Khaja
No ratings yet
Final Syllabi Data Analytics
Document3 pages
Final Syllabi Data Analytics
Arun
No ratings yet
65 Free Data Science Resources For Beginners PDF
Document19 pages
65 Free Data Science Resources For Beginners PDF
girishkumar
No ratings yet
Implementing Data Science Projects PDF
Document2 pages
Implementing Data Science Projects PDF
Fouad B
No ratings yet
Tableau Fundamental or Beginners
Document13 pages
Tableau Fundamental or Beginners
Varun Gupta
No ratings yet
Sraw Homework Assignment
Document4 pages
Sraw Homework Assignment
ersc16mz
100% (1)
A Road Map For Data Science. What Is Data Science - by Jared - Towards Data Science PDF
Document6 pages
A Road Map For Data Science. What Is Data Science - by Jared - Towards Data Science PDF
likhith
No ratings yet
Cmis 320 Homework 4
Document6 pages
Cmis 320 Homework 4
afmtgsnad
100% (1)
Data Strategy Guide
Document18 pages
Data Strategy Guide
Amit Jain
No ratings yet
Data Strategy Nov 6
Document35 pages
Data Strategy Nov 6
Amit Jain
No ratings yet
7 Fundamental Steps To Complete A Data Analytics Project
Document6 pages
7 Fundamental Steps To Complete A Data Analytics Project
Adnan Hashmi
No ratings yet
BD Project Document
Document3 pages
BD Project Document
طاهر محمد
No ratings yet
Introduction To Data Science
Document6 pages
Introduction To Data Science
salembalki
No ratings yet
Architecture Thesis Justification
Document7 pages
Architecture Thesis Justification
Amy Cernava
100% (2)
Data Science
Document4 pages
Data Science
Shikhar Choudhary
No ratings yet
Sample Thesis
Document6 pages
Sample Thesis
HelpMeWithMyPaperRochester
100% (2)
Master Microsoft Excel
Document6 pages
Master Microsoft Excel
Mohit Bansal
No ratings yet
如何修改一份完美的DS求职简历（中英文版）
Document8 pages
如何修改一份完美的DS求职简历（中英文版）
xi
No ratings yet
Case Study DSBDA
Document12 pages
Case Study DSBDA
SHREYANSHU KODILKAR
No ratings yet
CCB302 Assessment 2 Task Sheet
Document9 pages
CCB302 Assessment 2 Task Sheet
faith
No ratings yet
How To Spot A Fake Data Scientist - Towards Data Science
Document8 pages
How To Spot A Fake Data Scientist - Towards Data Science
Jorge Francisco
No ratings yet
Master Data Science Essentials 2015-11 SHARPSIGHTLABS
Document11 pages
Master Data Science Essentials 2015-11 SHARPSIGHTLABS
pressorg
No ratings yet
Computer Assisted Learning
Document19 pages
Computer Assisted Learning
Ismail Ekmekci
No ratings yet
T-Shaped Skills Builder Guide in 2020 For End-To-End Data Scientist
Document11 pages
T-Shaped Skills Builder Guide in 2020 For End-To-End Data Scientist
dung
No ratings yet
Term Paper Powerpoint Presentation
Document5 pages
Term Paper Powerpoint Presentation
ea1yd6vn
100% (1)
Product Design Coursework Example Gcse
Document7 pages
Product Design Coursework Example Gcse
irugqgajd
100% (2)
Tablue Req
Document8 pages
Tablue Req
IT DEVE
No ratings yet
CapStone Project
Document4 pages
CapStone Project
Manojay's Directionone
No ratings yet
Capstone Project: Create An AI Product Business Proposal
Document3 pages
Capstone Project: Create An AI Product Business Proposal
Mutati T
No ratings yet
Data Visualization and Communication Syllabus
Document9 pages
Data Visualization and Communication Syllabus
Ashwin Malshe
No ratings yet
Introduction To Business Analytics: II: Dashboards - Answer To What's Happening Now?
Document4 pages
Introduction To Business Analytics: II: Dashboards - Answer To What's Happening Now?
Robert Mythbust
No ratings yet
A2 Ict Coursework Ccea
Document5 pages
A2 Ict Coursework Ccea
f5dct2q8
100% (2)
Project Requirement
Document3 pages
Project Requirement
tszt49dwv4
No ratings yet
Practice Machine Learning With Datasets From The UCI Machine Learning Repository
Document23 pages
Practice Machine Learning With Datasets From The UCI Machine Learning Repository
prediatech
No ratings yet
Executive Overview: What Exists Already
Document5 pages
Executive Overview: What Exists Already
rizzapps
No ratings yet
Key Roles and Life Cycle
Document4 pages
Key Roles and Life Cycle
Aman
No ratings yet
Gcse Graphics Coursework Task Analysis
Document5 pages
Gcse Graphics Coursework Task Analysis
qtcdctzid
100% (2)
Quantitative Methods in Business, Bus 1B Course Syllabus: Updated: 12/5/2013
Document6 pages
Quantitative Methods in Business, Bus 1B Course Syllabus: Updated: 12/5/2013
william1230
No ratings yet
Ebook Data Science
Document48 pages
Ebook Data Science
Ardian Syair
100% (2)
Free Download Practical Statistics For Data Scientists PDF
Document4 pages
Free Download Practical Statistics For Data Scientists PDF
prudvi
No ratings yet
EME 2026: Engineering Design II Design Project (50%) : Select One of The Following Tasks For Your Design Project
Document11 pages
EME 2026: Engineering Design II Design Project (50%) : Select One of The Following Tasks For Your Design Project
Kaviraj Jayaraman
No ratings yet
Thesis Edit Box
Document4 pages
Thesis Edit Box
christydavismobile
100% (1)
Design and Technology A Level Coursework
Document5 pages
Design and Technology A Level Coursework
f5e28dkq
100% (2)
Concept Based Practice Questions for Tableau Desktop Specialist Certification Latest Edition 2023
From Everand
Concept Based Practice Questions for Tableau Desktop Specialist Certification Latest Edition 2023
Exam OG
No ratings yet
How to Write a Technical Manual Fast
From Everand
How to Write a Technical Manual Fast
Regina Clarke
Rating: 5 out of 5 stars
5/5 (2)
COC 3 Setup Computer Server
Document6 pages
COC 3 Setup Computer Server
Robert Coloma
70% (10)
APDL - Chapter 4 - APDL As A Macro Language (UP19980820)
Document18 pages
APDL - Chapter 4 - APDL As A Macro Language (UP19980820)
Nam Vo
No ratings yet
F
Document45 pages
F
Dương Thị Phương Uyên
No ratings yet
Constructor Overloading in Java
Document9 pages
Constructor Overloading in Java
Sayon Gorai
No ratings yet
Presec@80 Logo Design Competition
Document11 pages
Presec@80 Logo Design Competition
Gafaro Mohammed
No ratings yet
Upload Test New
Document7 pages
Upload Test New
Karan
No ratings yet
Sakurairo Maukoro (Sheet Music Piano)
Document6 pages
Sakurairo Maukoro (Sheet Music Piano)
nam_hoang_20
No ratings yet
01 Daily Lesson Log June 07, 2019
Document3 pages
01 Daily Lesson Log June 07, 2019
Ronie Israel Gozon
100% (1)
1 - Artificial Intelligence Introduction
Document30 pages
1 - Artificial Intelligence Introduction
sreem naga venkata shanmukh sai
No ratings yet
So You Want To Teach Digital Technologies
Document22 pages
So You Want To Teach Digital Technologies
antuan136
No ratings yet
Install and Activate Office 2019 For FREE Legally Using Volume License
Document15 pages
Install and Activate Office 2019 For FREE Legally Using Volume License
01666754614
No ratings yet
Arduino SBS Draft Notes May 2015
Document303 pages
Arduino SBS Draft Notes May 2015
Nguyen Tri
No ratings yet
ADL Digital Lean Management
Document4 pages
ADL Digital Lean Management
Duarte CRosa
No ratings yet
How To Create A Signup Form in Benchmark
Document110 pages
How To Create A Signup Form in Benchmark
ElaineGay Echavez
No ratings yet
Key Sequence For Tamil99
Document12 pages
Key Sequence For Tamil99
Karthik
No ratings yet
Unit 5 Programming Assignment Solution
Document6 pages
Unit 5 Programming Assignment Solution
Sunita Lamichhane
No ratings yet
Swiss Eph
Document49 pages
Swiss Eph
Thiago Malaca
No ratings yet
Oopr212 Prelims Reviewer
Document13 pages
Oopr212 Prelims Reviewer
shawn mallari
No ratings yet
Evil-WinRM Shell - Tools - Hack The Box - Forums
Document11 pages
Evil-WinRM Shell - Tools - Hack The Box - Forums
fghfghjfghjvbnyjytj
No ratings yet
Bdu Cde MGT System
Document94 pages
Bdu Cde MGT System
mekonen belete
No ratings yet
Steal This File Sharing Book What They Wont Tell You About File Sharing Wallace Wang 2004
Document228 pages
Steal This File Sharing Book What They Wont Tell You About File Sharing Wallace Wang 2004
Private Infringer
No ratings yet
DSC Govt. School
Document3 pages
DSC Govt. School
KAILASH GURJAR
No ratings yet
Windows Installation Guide
Document70 pages
Windows Installation Guide
Hamid Reza
No ratings yet
Tampugo ES Information Officer Action Plan
Document7 pages
Tampugo ES Information Officer Action Plan
ANTONETTE LAPLANA
No ratings yet
Directive JSP Versi 2 Bab 3
Document2,188 pages
Directive JSP Versi 2 Bab 3
U_arie
No ratings yet
Grid Radio Button Functionality
Document5 pages
Grid Radio Button Functionality
Vijay Bhalerao
No ratings yet
AdminLTE 3 DataTables
Document2 pages
AdminLTE 3 DataTables
Dani Murtadho
No ratings yet