Analysis/Dataset Pre-COVID
By Abhilash & Harshit
Agenda
• Objective
• Background
• Key Insights and Findings
• Conclusion
• Appendix
1. Data sources used
2. Assumptions made in the data model
3. Documentation of the analysis methodology
Objective
1. The analysis was carried out using Python libraries. The data
received contains 48895 rows and 16 columns.
2. The features in the data have the following data types: object (6),
float64 (3), int64 (7).
3. The data was inspected and cleaned by removing missing values
and outliers.
4. Visualizations were made using the seaborn and Plotly libraries to
understand the key performance indicators from the analysis.
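The inspection and cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual code: the slides do not say which outlier rule was used, so the 1.5 × IQR rule on the price column is an assumption, the helper names `summarize` and `clean_listings` are hypothetical, and the toy rows are made up for demonstration.

```python
import pandas as pd


def summarize(df):
    """Return the shape and a count of columns per dtype."""
    return df.shape, df.dtypes.astype(str).value_counts().to_dict()


def clean_listings(df, column="price", k=1.5):
    """Drop rows with missing values, then drop outliers in `column`.

    Assumption: outliers are values outside [Q1 - k*IQR, Q3 + k*IQR].
    The source does not state which outlier rule was applied.
    """
    out = df.dropna()
    q1, q3 = out[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = out[column].between(q1 - k * iqr, q3 + k * iqr)
    return out[keep].reset_index(drop=True)


# Toy rows (illustrative only, not taken from the real dataset).
toy = pd.DataFrame({
    "room_type": ["Private room", "Entire home/apt", "Shared room",
                  "Private room", "Entire home/apt", "Private room"],
    "price": [100, 120, 90, 110, 5000, 105],
    "reviews_per_month": [0.5, None, 1.2, 0.8, 0.1, 2.0],
})

shape, dtype_counts = summarize(toy)
cleaned = clean_listings(toy)
print(shape, dtype_counts)
print(cleaned)
```

On the real data, the same two helpers would be applied to the DataFrame loaded from the source file instead of `toy`.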
Data Preparation and Cleaning (cont…)
The data had the following features:
Categorical Variables:
- room_type
- neighbourhood_group
- neighbourhood
Continuous Variables (Numerical):
- Price
- minimum_nights
- number_of_reviews
- reviews_per_month
- calculated_host_listings_count
- availability_365
- Continuous variables could be binned into groups too
Location Variables:
- latitude
- longitude
Time Variable:
- last_review
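As a rough sketch of how a continuous variable could be binned into groups, the example below uses `pandas.cut` on minimum_nights. The bin edges, the group labels, and the helper name `bin_minimum_nights` are assumptions chosen for illustration; the slides do not specify the bins that were used.

```python
import pandas as pd


def bin_minimum_nights(series):
    """Group minimum_nights into three stay-length buckets.

    Assumption: the edges 5, 30 and 365 are hypothetical. pd.cut uses
    right-closed intervals by default: (0, 5], (5, 30], (30, 365].
    """
    bins = [0, 5, 30, 365]
    labels = ["short", "medium", "long"]
    return pd.cut(series, bins=bins, labels=labels)


# Toy values (illustrative only).
toy = pd.DataFrame({"minimum_nights": [1, 2, 3, 7, 30, 90]})
toy["stay_group"] = bin_minimum_nights(toy["minimum_nights"])
print(toy)
```

The same approach would apply to Price or availability_365 with edges suited to those variables.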
Customer preference of the Three Property Types
• The Manhattan region has the highest number of listings and also the highest
price range.
• Most customers choose stays of fewer than 5 days in the high price range.
• For properties with a high minimum-nights requirement and higher rent, the number
of reviews decreases.
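A comparison like the one in the first finding could be reproduced with a grouped summary along these lines. This is only a sketch: reading "price range" as maximum minus minimum price is an assumption, `summarize_by_group` is a hypothetical helper name, and the toy rows are invented for illustration.

```python
import pandas as pd


def summarize_by_group(df):
    """Listing count and price spread per neighbourhood_group."""
    prices = df.groupby("neighbourhood_group")["price"]
    summary = prices.agg(listings="count", min_price="min", max_price="max")
    # Assumption: "price range" is read as max price minus min price.
    summary["price_range"] = summary["max_price"] - summary["min_price"]
    return summary.sort_values("listings", ascending=False)


# Toy rows (illustrative only, not taken from the real dataset).
toy = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Manhattan", "Manhattan",
                            "Brooklyn", "Brooklyn"],
    "price": [150, 400, 80, 60, 120],
})

summary = summarize_by_group(toy)
print(summary)
```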
Appendix – Data Model Assumptions
1. We assumed the data from before COVID-19 reflects a period in which the desired revenue was being achieved.
2. We assumed the company is not planning to expand into new territories of
NYC.
3. We also assumed that the company's strategies are based on an increase in
travel in the post-COVID period.