
SF Bay Area Bike Trip Time-Based Exploration

ANLY-500-53

SRIKANTH SHANKAR
Introduction
Bay Area Bike Share is a bike-sharing company that provides on-demand bike
rentals for customers in and around the Bay Area. Customers can rent bikes from the
station of their choice anywhere in the area and return them to any station within the
same city.

This data-analysis-based study of bike trip data is designed to give bike-sharing
operators a better understanding of some of the factors that affect usage, as well
as the patterns and trends seen in bike usage.

We also examine a few scenarios, such as which cities have the most bike rentals,
along with other factors involved in the analysis.
Objective
To do a time-based exploration of bicycle trip data, we perform data analysis to understand
how the frequency of use varies with time. Is the number of bicycle trips increasing or
decreasing over time? Are there patterns in the trips that follow the time of day or the
time of year? R programming is used for this analysis on the pre-collected data.

We organize the analysis into a few key sections:

• trips by calendar date – how usage varies with time, increasing or decreasing?
• total number of trips by day of the week – weekday vs weekend?
• total trips by hour of the day – peak hours, is it consistent across the year?
• number of trips by hour across the year, usage by city.
• customers vs. subscribers usage – who dominates the usage?
Trips by Calendar date
• Total number of trips by calendar date over a two year period
• Depicts how the usage varies throughout the year over a period of time
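
The daily totals described above could be computed roughly as follows. This is a minimal sketch in Python using only the standard library (the study itself used R, and its code is not shown in the slides). The timestamp format, the sample records, and the helper name trips_per_date are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip start timestamps; the real dataset's column names
# and timestamp format are not given in the slides.
trip_starts = [
    "2014-01-06 08:55",
    "2014-01-06 17:10",
    "2014-07-14 08:45",
    "2014-07-14 09:05",
    "2014-07-14 17:20",
    "2014-07-19 12:30",
]


def trips_per_date(timestamps, fmt="%Y-%m-%d %H:%M"):
    """Count how many trips started on each calendar date."""
    dates = (datetime.strptime(ts, fmt).date() for ts in timestamps)
    return Counter(dates)


daily_counts = trips_per_date(trip_starts)
for day in sorted(daily_counts):
    print(day.isoformat(), daily_counts[day])
```

Plotting these counts against the date gives the kind of scatter plot to which a smooth fit can then be applied.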

General expectations / hypotheses

• More trips may be made in summer than in winter
• Booming - usage increasing over the years
• Changes in fuel cost
Usage throughout the year

● The smooth fit shows more trips in July than in January - weather dependent?

Usage over two year period

● Although trips vary over the year, usage has increased over time - booming!

Interesting visualization

● A split in the scatter plot of the data - weekday vs weekend effect?


Trips by day of the week
•Digging one step deeper to understand the pattern

•Usage over a week - weekday vs weekend pattern

•Understanding customer base - customer vs subscribers pattern
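
One way to separate the weekday vs weekend pattern by rider type is sketched below in Python (standard library only; the study's own R code is not shown). The record layout, the labels "Subscriber" and "Customer", and the helper day_type are assumptions for illustration, and the sample trips are made up.

```python
from collections import Counter
from datetime import datetime

# Hypothetical (start time, rider type) records; field layout is assumed.
trips = [
    ("2014-07-14 08:45", "Subscriber"),  # Monday
    ("2014-07-15 17:20", "Subscriber"),  # Tuesday
    ("2014-07-16 09:00", "Subscriber"),  # Wednesday
    ("2014-07-16 12:15", "Customer"),    # Wednesday
    ("2014-07-19 12:30", "Customer"),    # Saturday
    ("2014-07-20 11:00", "Subscriber"),  # Sunday
]


def day_type(timestamp, fmt="%Y-%m-%d %H:%M"):
    """Label a timestamp as 'weekday' or 'weekend'."""
    # weekday() returns 0 for Monday through 6 for Sunday.
    is_weekend = datetime.strptime(timestamp, fmt).weekday() >= 5
    return "weekend" if is_weekend else "weekday"


# Count trips for each (day type, rider type) pair.
counts = Counter((day_type(ts), rider) for ts, rider in trips)
for key in sorted(counts):
    print(key, counts[key])
```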


•Fewer trips on weekends than on weekdays - office commuters!

•Plot with different color coding

•How important is subscribing?


Trips by hour of the day
● Office commuters drive peak demand at
9am and 5pm
● Subscribers rent mostly at these same
times
● A spike in the graph around noon shows
that most travelers or non-subscribers
prefer that time.
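
The hourly peaks noted above can be found by counting trips per start hour. The following is a minimal Python sketch (standard library only, not the study's R code); the sample timestamps are invented so that the peaks fall at 9am and 5pm, and the helper name trips_per_hour is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trip start timestamps, chosen to mimic commuter peaks.
trip_starts = [
    "2014-07-14 09:05",
    "2014-07-14 09:20",
    "2014-07-14 09:40",
    "2014-07-14 12:30",
    "2014-07-14 17:10",
    "2014-07-14 17:45",
]


def trips_per_hour(timestamps, fmt="%Y-%m-%d %H:%M"):
    """Count trips by the hour of day (0-23) in which they started."""
    return Counter(datetime.strptime(ts, fmt).hour for ts in timestamps)


hourly = trips_per_hour(trip_starts)
# The two busiest hours, most frequent first.
peak_hours = [hour for hour, _ in hourly.most_common(2)]
print(peak_hours)
```

Running the same count separately for subscribers and non-subscribers would show whether the noon spike comes mainly from non-subscribers.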
No. of Trips by hour across the year
● More or less similar trend throughout
the year

● The total number goes down in the first
and last quarters of the year because of
weather
Customers vs Subscribers
• Subscribers mostly ride on
weekdays
• On weekends, usage is more or less
even
• SF gets extra customers over the weekend,
whereas Palo Alto and Redwood City each
have balanced utilization
Summary and Conclusion
Our initial dataset proved too sparse for any modeling technique. Expanding the
number of observations as well as the predictor space greatly decreased the
overall mean squared error of our models. Even so, linear regression with
cross-validation did not prove accurate enough to predict the number of
rentals.
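
The slides do not show the models themselves. As an illustration only, the sketch below shows how a cross-validated mean squared error could be computed for a one-predictor linear regression, written in Python with the standard library rather than the R used in the study. The predictor (daily temperature), the data values, and the function names fit_line and cross_validated_mse are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_mse(xs, ys, k=3):
    """Average held-out mean squared error over k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Hold out every k-th point, starting at index `fold`.
        test_idx = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        intercept, slope = fit_line(train_x, train_y)
        errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Made-up daily temperatures (deg C) and rental counts, for illustration.
temps = [10, 12, 15, 18, 20, 22, 25, 27, 30]
rentals = [310, 335, 390, 420, 470, 480, 540, 555, 610]

mse = cross_validated_mse(temps, rentals, k=3)
print(round(mse, 1))
```

Comparing this held-out error across candidate models or predictor sets is one way to judge whether a model is accurate enough to be useful.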
With the predictive model we have, we conclude that this research can help
business operators improve their business models based on customer behavior and
demand. The San Francisco Bay Area is a good place for the business, which is booming
at a good rate, though there is a dependency on the weather.
