
AirAsia

Assessment Questions for Data Scientists
Task 1: Question from Business

Problem Statement:
You are working with airasia.com’s homepage product manager. He tells you that the team has
witnessed a ~90% reduction in non-search-based transactions from Sep 2014 to Sep 2015. He
wants you to do a deep dive and come up with insights that explain this observation.

Data:
Attachment <<Search_Transactions_Raw_Data.xlsx>>

Note:
A session is a unique visit (one customer can have multiple sessions).
A session can have a search which then leads to a transaction (conversion).
A transaction can also happen without a search (click suggestions are available).
Some sessions can be identified as coming from a certain location (country).
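The definitions above can be turned into a simple count. The following is a minimal sketch, not a reference solution: the field names `month`, `searched`, and `transacted` are hypothetical and are not taken from the attached spreadsheet, whose actual columns may differ.

```python
from collections import Counter


def non_search_transaction_counts(sessions):
    """Count, per month, the transactions made in sessions with no search.

    Each session is a dict with the hypothetical keys 'month' (str),
    'searched' (bool) and 'transacted' (bool).
    """
    counts = Counter()
    for session in sessions:
        if session["transacted"] and not session["searched"]:
            counts[session["month"]] += 1
    return counts


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0
```

Comparing the counts for the two months of interest with `percent_change` gives the size of the drop; splitting the same counts by country or entry point would be a natural next step.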
Task 2: Data Cleaning

Problem Statement:
1. Recommend how you would clean and standardize the data provided, walk through the steps you
would take in pseudocode.

GB||EN||HFLFQ6|09Nov14||LHR-DXB-LHR||LHR||DXB||YY||2||Return||2014-11-09|20:08:51||2014-12-12||No-PromoCode

2. Apply the logic above to the attached tab file and sum the values found in the field highlighted in red in the sample record above.
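One possible starting point for the cleaning step is sketched below. It rests on assumptions that should be recorded as decisions: both `|` and `||` are treated as field separators (which would collapse genuinely empty fields), and tokens shaped like `09Nov14` are read as day, abbreviated English month, and two-digit year. The meaning of each position is not stated in the task, so the sketch returns positional tokens rather than named fields.

```python
import re
from datetime import datetime

SAMPLE = (
    "GB||EN||HFLFQ6|09Nov14||LHR-DXB-LHR||LHR||DXB||YY||2||Return"
    "||2014-11-09|20:08:51||2014-12-12||No-PromoCode"
)

# Matches tokens such as "09Nov14" (assumed to mean day, month, year).
DDMONYY = re.compile(r"^\d{2}[A-Za-z]{3}\d{2}$")


def clean_record(raw):
    """Split a raw record on runs of '|' and standardize ddMonyy dates to ISO."""
    tokens = [token.strip() for token in re.split(r"\|+", raw.strip())]
    cleaned = []
    for token in tokens:
        if DDMONYY.match(token):
            token = datetime.strptime(token, "%d%b%y").date().isoformat()
        cleaned.append(token)
    return cleaned


tokens = clean_record(SAMPLE)
```

Once the position of the numeric field of interest is confirmed, summing it over the file is a matter of converting that token to `int` for every cleaned record.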

Data:
Attachment <<151006120126_GA OrderData_20150901-20150930.tab>>

Note:
Make a note of the decisions you make.
Task 3: Model Development

Problem Statement:

Every customer who books a flight is a potential buyer of other ancillaries viz. meal, preferred
seat, extra baggage, and travel insurance. A customer can buy these ancillaries anytime before
the flight departure (in case of meals, the current data only includes pre-booked meals and not
the in-flight sale of meals).
Based on each customer’s bookings data (each row represents one booking), predict whether a
customer will buy insurance (variable name: INS_FLAG).

Data Dictionary
Id                    Identifier
PAXCOUNT              Number of customers traveling
SALESCHANNEL          Sales channel the booking was made on
TRIPTYPEDESC          Trip type (Round Trip, One Way, Circle Trip)
PURCHASELEAD          Number of days between travel date and booking date
LENGTHOFSTAY          Number of days spent at destination (derived for one-way trips)
flight_hour           Hour of day of flight departure
flight_day            Day of week of flight departure
ROUTE                 Origin-destination flight route (e.g. KULPEN = Kuala Lumpur to Penang)
geoNetwork_country    Country from where the booking was made
BAGGAGE_CATEGORY      Has bought extra baggage in the booking
SEAT_CATEGORY         Has bought preferred seat in the booking
FNB_CATEGORY          Has bought in-flight meals
INS_FLAG              Has bought insurance? (Target Variable)
flightDuration_hour   Total duration of flight (in hours)
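As one illustration of feature engineering on these variables, the ROUTE field can be split into origin and destination. This is a sketch that assumes every ROUTE value is two concatenated three-letter airport codes, as in the KULPEN example; the helper names and the output keys `origin` and `destination` are hypothetical.

```python
def split_route(route):
    """Split a 6-letter ROUTE code such as 'KULPEN' into (origin, destination)."""
    code = route.strip().upper()
    if len(code) != 6 or not code.isalpha():
        raise ValueError(f"unexpected ROUTE value: {route!r}")
    return code[:3], code[3:]


def add_route_features(booking):
    """Return a copy of a booking dict with origin and destination added."""
    origin, destination = split_route(booking["ROUTE"])
    enriched = dict(booking)
    enriched["origin"] = origin
    enriched["destination"] = destination
    return enriched
```

Separate origin and destination columns let a model learn airport-level effects that a single high-cardinality route code would hide.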

Data:
Attachment <<Dataset-2: AncillaryScoring_insurance.csv>>

Note:
Focus on the following
- EDA
- Feature Engineering
- Choice of Metrics (you choose to optimize)
- Choice of algorithm
- Deployment Design (how you would deploy this solution on the platform, architecture, etc.)
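For the choice of metrics, precision, recall, and F1 on the positive class (INS_FLAG = 1) are common candidates for a binary target. The sketch below shows how they are computed from true and predicted labels; it is an illustration only, and whether these are the right metrics depends on the class balance and business cost, neither of which is stated here.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Precision matters more when offering insurance to the wrong customers is costly; recall matters more when missing a likely buyer is the bigger loss.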
Task 4

Scenario:
AirAsia is not just an airline anymore. AirAsia has multiple other offerings for its customers
ranging from Hotels, Adventurous Activities to Insurance, Digital Wallet and e-commerce. It is
on its path to becoming a Super App.

You are part of the homepage team, focussed on personalization. The team is conceptualizing
the different use-cases of personalization, to be presented to the business teams.

Brainstorm and come up with these use-cases to enable personalization on the airasia.com homepage.

Note:
- Outcome of each use case
- Data you will need for each use case (Example: Browsing Data etc.)
