
ANALYTICS TAKE-HOME TEST

Darpan Chaudhary
22 Aug, 2021
PROBLEM DEFINITION | Exclude trips with a missing start_station_id from the trip table. From the
remaining trips, keep those whose start_station_id is not present in the station table. What
percentage of these trips end at an end_station_id that is also not present in the station table?

SQL CODE

NO_START_STATION_ID  NO_END_STATION_ID  PERCENTAGE
97,66,305            19,86,936          20.34
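
The query behind this result is not shown here. A minimal sketch of one way to produce the three columns follows, run through Python's sqlite3 on toy data. The table names trip and station, the column station.station_id, and the choice not to count trips with a missing end_station_id are assumptions, not taken from the original. The last line recomputes the reported percentage from the two reported counts (97,66,305 is 9,766,305 in international digit grouping).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema and rows; names are assumed for illustration only.
conn.executescript("""
    CREATE TABLE station (station_id INTEGER);
    CREATE TABLE trip (start_station_id INTEGER, end_station_id INTEGER);
    INSERT INTO station VALUES (1), (2);
    INSERT INTO trip VALUES
        (1, 2), (NULL, 9), (7, 8), (7, 1), (9, NULL), (8, 9);
""")

QUERY = """
SELECT
    COUNT(*) AS no_start_station_id,
    SUM(CASE
            WHEN t.end_station_id IS NOT NULL
             AND NOT EXISTS (SELECT 1 FROM station AS e
                             WHERE e.station_id = t.end_station_id)
            THEN 1 ELSE 0
        END) AS no_end_station_id
FROM trip AS t
WHERE t.start_station_id IS NOT NULL            -- drop missing start ids
  AND NOT EXISTS (SELECT 1 FROM station AS s    -- keep unknown start ids
                  WHERE s.station_id = t.start_station_id)
"""

no_start, no_end = conn.execute(QUERY).fetchone()
percentage = round(100.0 * no_end / no_start, 2)

# Recompute the reported percentage from the two reported counts.
reported_pct = round(100.0 * 1986936 / 9766305, 2)
```

NOT EXISTS is used instead of a join so that duplicate rows in the station table, if any, cannot inflate the counts.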


PROBLEM DEFINITION | Filter the trip table to include only trips with starttime from 2018-01-01
onwards. Combine usertype, birth_year, and gender into a single user identifier. Assume every unique
combination represents one user. Include users with missing usertype, birth_year, or gender.

a. For each month in 2018, how many users belong to each segment?

SQL CODE

MONTH_NUM  INACTIVE  CASUAL  POWER
1          115       68      249
2          87        80      265
3          82        74      276
4          37        87      308
5          21        83      328
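
The segment rules are not stated in this copy, so the following is only a sketch of the general shape such a query could take, again via Python's sqlite3 on toy data. The trip column types, the '|' separator, the 'NA' placeholder for missing values, and above all the threshold CASUAL_MAX_TRIPS (0 trips = inactive, 1 to CASUAL_MAX_TRIPS = casual, more = power) are hypothetical. Restricting to starttime before 2019-01-01 is also an assumption, made so that month numbers refer to 2018 only.

```python
import sqlite3

CASUAL_MAX_TRIPS = 2  # hypothetical threshold; not taken from the source

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trip (starttime TEXT, usertype TEXT,
                       birth_year INTEGER, gender TEXT);
    INSERT INTO trip VALUES
        ('2017-12-31 23:00:00', 'Subscriber', 1990, 'F'),
        ('2018-01-05 08:00:00', 'Subscriber', 1990, 'F'),
        ('2018-01-06 08:00:00', 'Subscriber', 1990, 'F'),
        ('2018-01-07 08:00:00', 'Subscriber', 1990, 'F'),
        ('2018-01-09 10:00:00', 'Customer', NULL, NULL),
        ('2018-02-01 09:00:00', 'Subscriber', 1990, 'F');
""")

QUERY = """
WITH keyed AS (
    -- COALESCE keeps users whose usertype/birth_year/gender is missing.
    SELECT COALESCE(usertype, 'NA') || '|' ||
           COALESCE(CAST(birth_year AS TEXT), 'NA') || '|' ||
           COALESCE(gender, 'NA') AS user_key,
           CAST(strftime('%m', starttime) AS INTEGER) AS month_num
    FROM trip
    WHERE starttime >= '2018-01-01' AND starttime < '2019-01-01'
),
users  AS (SELECT DISTINCT user_key FROM keyed),
months AS (SELECT DISTINCT month_num FROM keyed),
counts AS (
    -- Every user appears in every month, so zero-trip months are kept.
    SELECT u.user_key, m.month_num, COUNT(k.user_key) AS trips
    FROM users AS u
    CROSS JOIN months AS m
    LEFT JOIN keyed AS k
           ON k.user_key = u.user_key AND k.month_num = m.month_num
    GROUP BY u.user_key, m.month_num
)
SELECT month_num,
       SUM(trips = 0)                       AS inactive,
       SUM(trips BETWEEN 1 AND :casual_max) AS casual,
       SUM(trips > :casual_max)             AS power
FROM counts
GROUP BY month_num
ORDER BY month_num
"""

rows = conn.execute(QUERY, {"casual_max": CASUAL_MAX_TRIPS}).fetchall()
```

Cross-joining users with months gives every month the same total number of users, which matches the reported table: each of its rows sums to 432.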

b. For each month in 2018, compute how users move between segments from that month to the next.
For example: from January 2018 to February 2018, how many casual users stayed casual, became
power, or became inactive? Do the same for the other groups and the other months in 2018.
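
As a rough illustration of the counting logic, the sketch below takes one segment label per user per month and tallies the move from each month to the next. The input dictionary, the user names, and the function count_transitions are hypothetical. Each move is attributed to the later month, which is an assumption that appears consistent with the all-zero row for month 1 in the result table.

```python
from collections import Counter

# Hypothetical segment labels keyed by (user_key, month_num).
segments = {
    ("u1", 1): "casual", ("u1", 2): "power",    ("u1", 3): "power",
    ("u2", 1): "power",  ("u2", 2): "inactive", ("u2", 3): "casual",
    ("u3", 1): "casual", ("u3", 2): "casual",   ("u3", 3): "inactive",
}


def count_transitions(segments):
    """Count (month_num, from_segment, to_segment) moves.

    A move from month m-1 to month m is recorded under month m, so the
    first month has no entries.
    """
    moves = Counter()
    for (user, month), current in segments.items():
        previous = segments.get((user, month - 1))
        if previous is not None:
            moves[(month, previous, current)] += 1
    return moves


moves = count_transitions(segments)
for key in sorted(moves):
    print(key, moves[key])
```

In SQL the same idea is usually written as a self-join of the per-month segment table on user and month_num = previous month_num + 1.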

SQL CODE

MONTH_NUM  REMAIN_INACTIVE  CASUAL_TO_INACTIVE  POWER_TO_INACTIVE  INACTIVE_TO_CASUAL  REMAIN_CASUAL  POWER_TO_CASUAL  INACTIVE_TO_POWER  CASUAL_TO_POWER  REMAIN_POWER
1          0                0                   0                  0                   0              0                0                  0                0
2          76               9                   2                  36                  37             7                3                  22               240
3          66               16                  0                  17                  45             12               4                  19               253
4          32               5                   0                  42                  38             7                8                  31               269
5          5                16                  0                  31                  45             7                1                  26               301
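
The two result tables can be cross-checked against each other. The sketch below copies the reported figures and verifies that, for each month from 2 to 5, the moves into each segment add up to that month's segment size from part a, and the moves out of each segment add up to the previous month's size. The variable and function names are illustrative only.

```python
# (INACTIVE, CASUAL, POWER) per month, copied from the part a table.
segment_counts = {
    1: (115, 68, 249),
    2: (87, 80, 265),
    3: (82, 74, 276),
    4: (37, 87, 308),
    5: (21, 83, 328),
}

# Part b rows in column order: three "to inactive" columns, then three
# "to casual", then three "to power"; within each group the source
# segment is inactive, casual, power.
transitions = {
    2: (76, 9, 2, 36, 37, 7, 3, 22, 240),
    3: (66, 16, 0, 17, 45, 12, 4, 19, 253),
    4: (32, 5, 0, 42, 38, 7, 8, 31, 269),
    5: (5, 16, 0, 31, 45, 7, 1, 26, 301),
}


def is_consistent(month):
    """Check inflows against this month and outflows against the last."""
    raw = transitions[month]
    inflow = tuple(sum(raw[3 * to + frm] for frm in range(3))
                   for to in range(3))
    outflow = tuple(sum(raw[3 * to + frm] for to in range(3))
                    for frm in range(3))
    return (inflow == segment_counts[month]
            and outflow == segment_counts[month - 1])


results = {month: is_consistent(month) for month in transitions}
print(results)
```

For example, in month 2 the three "to inactive" values 76 + 9 + 2 give 87, the INACTIVE count reported for month 2.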
PROBLEM DEFINITION | Section 2: Modelling & R/Python

PYTHON CODE
