
A Training Report

Submitted in Partial Fulfilment of the Requirements for


The Award of Degree of
Bachelor of Technology in Computer Science and
Engineering

(DATA SCIENCE)

Submitted for
LOVELY PROFESSIONAL UNIVERSITY

From 06/05/23

to 06/28/23

SUBMITTED

BY

NAME: Datla Eswarsai Krishnamraju

Registration Number: 12104806

Student Declaration

I, Datla Eswarsai Krishnamraju (Registration Number 12104806), hereby declare that the
work done by me on “Analysis and Prediction of Airbnb Listing Prices” from 5 June 2023 to
28 June 2023 is a record of original work submitted in partial fulfilment of the
requirements for the award of the degree of Bachelor of Technology in Computer Science
and Engineering.

Datla Eswarsai Krishnamraju


Certificate of Completion
Analysis and Prediction of Airbnb Listing Prices in Boston

Abstract
Airbnb listing prices are more than meets the eye: they closely reflect local real
estate trends and also directly influence real estate prices. Across the many studies
of Airbnb's effect on regional real estate trajectories, the positive correlation
between the number of listings and rental and home prices is well established; one
study reports a 0.018% increase in rental prices and a 0.026% increase in home prices
for every 1% increase in total listings.

Focusing on Airbnb in Boston, MA, this analysis has three aims. First, I explore
surface-level insights regarding the quantity of listings, trends in pricing, and guest
traffic in the various neighbourhoods across the city. Second, I plan to validate the
relationship between rental and housing prices and the number of listings within the
timeframe of the Airbnb dataset used. Third, I use a predictive model to develop a
better understanding of the listing features that contribute most to listing price.

Introduction
Airbnb has become the first choice for many travellers, not only when looking for
accommodation across the world but also when exploring a city before getting there.
People want a “home away from home” when thinking about where to stay at their
destination. Airbnb has therefore opened up a unique and inspiring blue ocean in the
market: “Blue Ocean strategy is the simultaneous pursuit of differentiation and low cost
to open up a new market space and create new demand.” In blue oceans, demand is created
rather than fought over. Airbnb provides a win-win situation for both customers and
hosts: customers can get accommodation at lower prices, while hosts can make money by
renting out their properties.

To look further at how Airbnb is used in Boston, I have analysed Boston Airbnb listings.
The datasets used in this analysis were acquired from the Inside Airbnb database under a
Creative Commons CC0 1.0 Universal “Public Domain Dedication” license. The dataset
reports the listing activities of homestays in Boston. It incorporates over 6150
property listings, including but not limited to host information, prices,
neighbourhoods, amenities, cancellation policy, and reviews.

The analysis and its findings are only observational and not the result of a formal
study. General business questions are listed below to guide us through the analysis to
create a model that can predict the rental price based on some features.

Related Work
As the Airbnb platform has become popular over the last few years, several papers have
already addressed the Airbnb price prediction task, benefiting from the publicly available
datasets on InsideAirbnb.com. In this section, literature dealing with Airbnb in general is
presented, followed by a detailed summary of the existing literature on price prediction and
Airbnb price determinants.

Although multiple projects have been carried out on predicting listing prices, none of
them has been performed across different cities. In this work, we focus on the following three
tasks. First, we would like to explore different features via feature extraction and engineering.
Second, we would like to experiment and compare different machine learning techniques in
price prediction. Finally, we want to train a more generalized model and perform transfer
learning.
Dataset
The Boston Airbnb dataset was retrieved from Kaggle and has shape (3585, 95), that is,
3,585 listings and 95 columns. It is a collection of property listings together with
their key features and types, such as property type, host type, neighbourhood, reviews
and much more.

Features
We further compare the features above with the price column to identify relationships
and draw inferences from the results. The dataset was divided into two parts: one
subset containing the room_id, longitude, and latitude of each room, used to derive new
characteristics for each listing, and the rest of the dataset, which is merged back in
once those characteristics have been identified.
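
The split-and-merge step described above can be sketched in pandas as follows. This is a minimal illustration on made-up rows, not the report's actual code: the derived column dist_to_center and the reference coordinates are assumptions introduced only to show a location-based feature being attached and merged back on room_id.

```python
import pandas as pd

# Toy stand-in for the listings table; all values are made up for illustration.
listings = pd.DataFrame({
    "room_id": [101, 102, 103],
    "latitude": [42.36, 42.35, 42.34],
    "longitude": [-71.06, -71.08, -71.10],
    "room_type": ["Entire home/apt", "Private room", "Shared room"],
    "price": [250.0, 95.0, 80.0],
})

# Part 1: the key plus coordinates, used to derive location-based characteristics.
geo = listings[["room_id", "latitude", "longitude"]].copy()

# Part 2: everything else, keeping room_id so the two parts can be rejoined.
rest = listings.drop(columns=["latitude", "longitude"])

# Hypothetical derived feature: straight-line distance (in degrees) to an
# assumed city-centre reference point.
CENTER_LAT, CENTER_LON = 42.3601, -71.0589  # assumed reference point
geo["dist_to_center"] = (
    (geo["latitude"] - CENTER_LAT) ** 2 + (geo["longitude"] - CENTER_LON) ** 2
) ** 0.5

# Merge the enriched coordinate subset back onto the rest of the dataset.
# validate="one_to_one" raises an error if room_id is duplicated on either side.
merged = rest.merge(geo, on="room_id", how="left", validate="one_to_one")
print(merged)
```

Keeping room_id in both parts is what makes the final merge safe; the validate argument guards against accidentally duplicating listings during the join.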

• Label: The ground-truth label is the listing price. As there exist abnormally high prices in
the datasets, we used two approaches, data thresholding and label transformation, to
alleviate this problem. For data thresholding, we cut off data with a price over 500 dollars per
night, which eliminates approximately 1% of the total listings. For label transformation, we
tested different power transformations as well as the logarithmic transformation, and we found
that the square root and logarithmic transformations work well for price prediction.
Figure 1 compares the performance of a regression model with and without label
transformation.

• Continuous features: We identified highly correlated continuous features and eliminated
them from our input, leaving 28 continuous features in total. We filled null values with 0
for price-related features (security deposit and cleaning fee) and with the mean value for the
rest of the features. We then standardized each feature to zero mean and unit variance, so
that the normalized feature x is on the scale of N(0,1). For some of the following
experiments, we also included 2-degree interaction terms of the continuous features.

• Categorical features: For most of the categorical features, we directly performed one-hot
encoding, while for a small fraction of list-valued features, like amenities and host
verifications, we encoded them into vectors via dictionary building and mapping. Altogether,
we obtained 20 encoded features.

• Text features: To utilize the text features, such as summary, transit, and neighbourhood
overview, we computed TF-IDF weights on unigrams and bigrams. We then performed truncated
singular value decomposition (SVD) to reduce the dimension of each text feature to 50, so
that the 12 text features together form a 600-dimensional vector.

• Date features: We have 3 date features (host since, first review, last review), and we
converted them into continuous values by filling null values with the mean date and
subtracting the earliest date value from all date values.
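
The label, continuous-feature, and date-feature steps listed above can be sketched as follows. This is a minimal illustration on made-up rows, not the report's actual pipeline: the column names (price, cleaning_fee, accommodates, host_since) are assumed from the feature names mentioned in the text, and only the 500-dollar threshold is taken from the report.

```python
import numpy as np
import pandas as pd

# Made-up rows; column names are assumptions based on the features named above.
df = pd.DataFrame({
    "price": [80.0, 120.0, 250.0, 900.0, 60.0],
    "cleaning_fee": [20.0, np.nan, 50.0, 100.0, np.nan],
    "accommodates": [2.0, np.nan, 4.0, 6.0, 1.0],
    "host_since": pd.to_datetime(
        ["2012-03-01", None, "2015-07-15", "2014-01-10", "2016-05-05"]
    ),
})

# Data thresholding: drop listings priced above 500 dollars per night.
df = df[df["price"] <= 500].copy()

# Label transformation: logarithmic and square-root versions of the target.
y_log = np.log(df["price"])
y_sqrt = np.sqrt(df["price"])

# Null handling: 0 for price-related features, the column mean for the rest.
df["cleaning_fee"] = df["cleaning_fee"].fillna(0)
df["accommodates"] = df["accommodates"].fillna(df["accommodates"].mean())

# Standardization: zero mean and unit variance per continuous feature.
num_cols = ["cleaning_fee", "accommodates"]
df[num_cols] = (df[num_cols] - df[num_cols].mean()) / df[num_cols].std(ddof=0)

# Date feature: fill nulls with the mean date, then count days since the earliest date.
host_since = df["host_since"].fillna(df["host_since"].mean())
df["host_since_days"] = (host_since - host_since.min()).dt.days

print(df.drop(columns="host_since"))
```

A model trained on y_log or y_sqrt predicts on the transformed scale, so predictions need to be mapped back with np.exp or by squaring before they are read as dollar prices.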
Code
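
The categorical and text encodings described under Features can be sketched as follows. This is a hedged illustration on a made-up four-row table, assuming scikit-learn is available: the brace-delimited amenities format and the column names are assumptions, and the SVD here keeps 2 components only because the toy corpus is tiny, whereas the report reduces each text feature to 50 dimensions.

```python
import pandas as pd
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up rows; the "{a,b,c}" amenities format is an assumption.
df = pd.DataFrame({
    "room_type": ["Entire home/apt", "Private room", "Shared room", "Private room"],
    "amenities": ["{TV,Wifi,Kitchen}", "{Wifi}", "{TV,Heating}", "{Wifi,Kitchen,Heating}"],
    "summary": [
        "Sunny apartment close to the subway and downtown",
        "Quiet private room near the university campus",
        "Shared room in a lively house close to downtown",
        "Cozy private room with kitchen access near the subway",
    ],
})

# Plain categorical feature: one-hot encoding.
room_onehot = pd.get_dummies(df["room_type"], prefix="room_type")

# List-valued feature: build a dictionary of all items, then map each row
# onto a multi-hot vector over that dictionary.
amenity_lists = df["amenities"].str.strip("{}").str.split(",")
vocab = {name: i for i, name in enumerate(sorted({a for row in amenity_lists for a in row}))}
multi_hot = pd.DataFrame(0, index=df.index, columns=list(vocab))
for idx, row in amenity_lists.items():
    multi_hot.loc[idx, row] = 1

# Text feature: TF-IDF on unigrams and bigrams, then truncated SVD.
tfidf = TfidfVectorizer(ngram_range=(1, 2))
X_text = tfidf.fit_transform(df["summary"])
svd = TruncatedSVD(n_components=2, random_state=0)  # 50 in the report; 2 for this toy corpus
text_vec = svd.fit_transform(X_text)

print(room_onehot.shape, multi_hot.shape, text_vec.shape)
```

Repeating the TF-IDF and SVD step per text column and concatenating the results gives one fixed-length vector per listing, which is how 12 text features at 50 dimensions each add up to 600.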
Entire home/apt    1825
Private room       1353
Shared room          76
Name: room_type, dtype: int64

Apartment          2325
House               547
Condominium         220
Townhouse            50
Bed & Breakfast      39
Loft                 32
Other                14
Boat                 12
Villa                 6
Entire Floor          4
Dorm                  2
Guesthouse            1
Name: property_type, dtype: int64

room_type
Entire home/apt    239.097039
Private room        96.356509
Shared room         81.065789
Name: price, dtype: float64
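
The three summaries above have the form of pandas Series output: two frequency counts and one grouped mean. A minimal sketch of calls that produce output of this shape is given below, assuming a DataFrame with room_type, property_type, and price columns. The rows are made up, so the numbers will not match the figures above.

```python
import pandas as pd

# Made-up rows standing in for the listings table.
listings = pd.DataFrame({
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt",
                  "Shared room", "Private room", "Entire home/apt"],
    "property_type": ["Apartment", "House", "Apartment",
                      "Apartment", "Condominium", "House"],
    "price": [250.0, 90.0, 210.0, 70.0, 110.0, 300.0],
})

# Number of listings per room type and per property type, most frequent first.
room_counts = listings["room_type"].value_counts()
property_counts = listings["property_type"].value_counts()

# Average nightly price within each room type.
mean_price_by_room = listings.groupby("room_type")["price"].mean()

print(room_counts)
print(property_counts)
print(mean_price_by_room)
```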
Conclusion
In this analysis, we conducted an exploratory data analysis (EDA) and built
predictive models to understand and predict Airbnb listing prices in Boston. By
analyzing a comprehensive dataset comprising property details, host information,
location, amenities, reviews, and pricing, we gained valuable insights into the factors
influencing listing prices and developed models to estimate prices for new listings.

During the EDA phase, we observed several trends and patterns specific to the
Boston Airbnb market. Factors such as location proximity to popular landmarks, the
number of bedrooms, and the presence of specific amenities emerged as significant
determinants of listing prices. Additionally, we identified seasonal variations in pricing
and observed higher demand during peak tourist seasons.

By employing various machine learning algorithms, including linear regression,
random forests, and gradient boosting, we developed predictive models for
estimating Airbnb listing prices in Boston. These models demonstrated strong
performance, as evidenced by low prediction errors and high accuracy metrics on
the testing dataset. The models effectively captured the complex relationships
between features and listing prices, providing reliable estimates for both hosts and
guests.
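
A comparison of the three model families named above could be set up along the following lines. This is a sketch only, assuming scikit-learn is available: the data are synthetic, the two features (bedrooms and distance to the centre) and the coefficients used to generate prices are invented for illustration, and the printed scores say nothing about the results obtained in this report.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: log-price depends on two invented features plus noise.
rng = np.random.default_rng(0)
n = 400
bedrooms = rng.integers(1, 5, size=n)   # 1 to 4 bedrooms
dist_km = rng.uniform(0, 10, size=n)    # assumed distance to the city centre
log_price = 4.0 + 0.35 * bedrooms - 0.05 * dist_km + rng.normal(0, 0.2, size=n)

X = np.column_stack([bedrooms, dist_km])
X_train, X_test, y_train, y_test = train_test_split(
    X, log_price, test_size=0.25, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        # R^2 on the log scale the model was trained on.
        "r2": r2_score(y_test, pred),
        # Mean absolute error after mapping back to dollars.
        "mae_dollars": mean_absolute_error(np.exp(y_test), np.exp(pred)),
    }

for name, scores in results.items():
    print(name, scores)
```

Evaluating every model on the same held-out split, and reporting the error in dollars as well as on the transformed scale, keeps the comparison readable for hosts and guests.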

The insights derived from this analysis can assist hosts in optimizing their pricing
strategies and maximizing rental income. Hosts can leverage the developed models
to set competitive prices based on property characteristics, location, and market
demand. For guests, the models offer a valuable tool to estimate listing prices,
evaluate affordability, and plan their accommodation budgets accordingly.

It is worth noting that the accuracy and generalizability of the models are subject to
the quality and relevance of the data. Additionally, external factors such as changes
in the economy, local events, or regulatory policies may influence listing prices
beyond the captured features. Therefore, continuous monitoring, retraining, and
updating of the models with fresh data are recommended to ensure their
performance remains robust over time.

In conclusion, this analysis provides valuable insights into the Boston Airbnb market,
offering both hosts and guests a data-driven approach to understanding and
predicting listing prices. The combination of exploratory data analysis and predictive
modeling in R programming enables stakeholders to make informed decisions,
optimize pricing strategies, and enhance the overall Airbnb experience in Boston.

References
• Kaggle. Airbnb price prediction
• Boston Airbnb analysis
• Airbnb. (2020a). About Us. Retrieved 2020-09-27, from https://news.airbnb.com/about-us/
• Airbnb. (2020b). How should I choose my listing's price? Retrieved 2020-09-30, from
https://www.airbnb.com/help/article/52/how-should-i-choose-my-listings-price
