Data Analytics
at Gojek

Hanum Kumala
Business Intelligence Analyst

Our Journey
A mobile app for daily needs.

The Gojek app offers various services: transport, food delivery, courier logistics, instant shopping, professional massages, payments, and more.

2010 Call center for ojek services
2015 App launched with 3 services
2016 Expansion and new services, first unicorn in Indonesia
2018 International growth
2019 Superapp company

Ojek is the Indonesian term for motorcycle ride hailing.



Our Solution for Every Customer’s Needs



Our Global Footprint


Operating in 207 cities throughout Southeast Asia

● +155m app downloads
● +500k merchants
● +2m drivers
● +60k service providers

Tremendous growth and a maturing product lead us to ...

● Bigger quantity and greater complexity of data sources
● Difficulty getting insight from large amounts of data
● An ongoing need for real-time insight to support business decisions

Business Intelligence at Gojek

● Business Intelligence is the first data team at Gojek
● One of 3 independent data teams, alongside Data Engineering (focusing on data pipelines across the entire company) and Data Science (focusing on AI and future data products)
● Distributes business data throughout the company: makes data available, aligned with the business, and accessible to business users
● Thought partner of Product Owners: delivers insights and tests hypotheses based on data

Business Intelligence Process
5 Decision Making: Deep Knowledge
4 Data Analysis and Reporting: Visualization, Advanced Analytics, KPIs
3 Master Data Management: Stewardship, Data Quality, Cleansing
2 Data Engineering: Pipelines, ETL, EDW, EDL, etc.
1 Data Sources: Historical transaction data

Data Warehouse Journey
When did Gojek need data in one place?

We started building the data warehouse from scratch in Q3 of 2016.

● It started based on the needs of the Business Intelligence team: delivering business insights took labour-intensive work and tons of time
● There was no single place to hold data: a Single Version of Truth (SVOT) of business datasets was needed
● Various backends existed across multiple products: data products needed to be delivered hassle-free, timely and uniform

Gojek’s Data Warehouse Architecture


[Figure: Gojek data architecture as of Q1 2019]

Pipeline: Batch, Streaming
Data Source: API, Clevertap, PostgreSQL, MongoDB, MySQL, other sources
ETL / ELT: Golang, Python
Data Lake: Cloud Storage
DWH: BigQuery
Presentation: Tableau, Metabase
Streaming components: Kafka, PubSub, Dataflow, Spark
Operations and monitoring: Airflow, Grafana, Datadog, Stackdriver, Terraform
Data freshness: batch job (daily, weekly, monthly); near real-time data (minute, hour); real-time data (streaming)


Data Quality Service
Data Quality

● Completeness: the proportion of stored data against the potential of “100% complete”
● Uniqueness: nothing is recorded more than once, based upon how that thing is identified
● Validity: data are valid if they conform to the syntax (format, type, range) of their definition
● Consistency: the absence of difference when comparing two or more representations of the metric
● Accuracy: the degree to which data correctly describe the “real world” object or event
● Timeliness: the degree to which data represent reality from the required point in time
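
Three of the dimensions above (completeness, uniqueness, validity) can be expressed as simple ratios. The following is a minimal sketch only, not Gojek's Data Quality Service: the records, field names, and function names are all hypothetical and chosen for illustration.

```python
# Hypothetical booking records; every field name and value is made up.
records = [
    {"order_id": "A1", "city": "Jakarta", "fare": 12000},
    {"order_id": "A2", "city": None, "fare": 15000},
    {"order_id": "A2", "city": "Bandung", "fare": -500},
    {"order_id": "A3", "city": "Surabaya", "fare": 9000},
    {"order_id": "A4", "city": None, "fare": 20000},
]


def completeness(rows, field):
    """Share of rows in which `field` is populated (not None)."""
    if not rows:
        return 0.0
    return sum(r.get(field) is not None for r in rows) / len(rows)


def uniqueness(rows, key):
    """Distinct values of `key` divided by the number of rows."""
    if not rows:
        return 0.0
    return len({r[key] for r in rows}) / len(rows)


def validity(rows, field, is_valid):
    """Share of populated values of `field` that satisfy the rule `is_valid`."""
    values = [r[field] for r in rows if r.get(field) is not None]
    if not values:
        return 0.0
    return sum(bool(is_valid(v)) for v in values) / len(values)


def non_negative_int(value):
    """Assumed validity rule: a fare must be a non-negative integer."""
    return isinstance(value, int) and value >= 0


print(completeness(records, "city"))
print(uniqueness(records, "order_id"))
print(validity(records, "fare", non_negative_int))
```

Each function returns a score between 0 and 1, so the results can be tracked over time or compared against a threshold agreed with the data owners.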
Data Analysis and Decision Making
Data Analysis Process
1. Hypothesis or Problem Statements
Asking the right question for the analysis by defining the hypothesis or problem statements.
● What happened?
● Why did it happen?
● What will happen?
● How can I make it happen?
● etc.

2. Get the Data
Get the data from the Data Warehouse, usually using a query.
● Data query
● Duplication checking
● Outlier checking
● etc.

3. Data Exploration
Exploration helps us understand the things or trends that happened in the data.
● Visualization
● Statistical analysis
● Modelling
● etc.

4. Interpret Result
Interpreting the data helps to decide whether the hypothesis can be accepted or not.
● Actionable insights
● Decision making
● etc.
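
The duplication and outlier checks named in the process can be sketched in a few lines. This is an illustration under assumptions, not the team's actual tooling: the interquartile-range (IQR) rule is one common outlier heuristic chosen here, and the rows, field names, and function names are hypothetical.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical query result; values are made up for illustration.
rows = [
    {"order_id": "A1", "fare": 10},
    {"order_id": "A2", "fare": 12},
    {"order_id": "A2", "fare": 12},
    {"order_id": "A3", "fare": 95},
]


def find_duplicates(rows, key):
    """Return the sorted key values that appear more than once."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed IQR rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


print(find_duplicates(rows, "order_id"))
print(iqr_outliers([10, 12, 11, 13, 12, 11, 95]))
```

Running both checks before exploration keeps duplicated rows and extreme values from distorting later charts and statistics.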
Product Dashboard
Objective: provide a comprehensive SVOT snapshot of business performance

Problem Statements
1. How much have bookings and transactions increased this month?
2. Is there an increasing trend during the festive season?
* Data has been masked for confidentiality reasons
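
The first problem statement reduces to a month-over-month comparison. The sketch below assumes daily booking counts keyed by ISO date strings; the sample numbers and function names are hypothetical and are not real Gojek data.

```python
from collections import defaultdict

# Made-up daily booking counts as (ISO date, bookings) pairs.
daily_bookings = [
    ("2019-04-28", 100),
    ("2019-04-29", 100),
    ("2019-05-01", 150),
    ("2019-05-02", 100),
]


def monthly_totals(bookings):
    """Sum daily counts into totals keyed by 'YYYY-MM'."""
    totals = defaultdict(int)
    for day, count in bookings:
        totals[day[:7]] += count
    return dict(totals)


def mom_growth(totals, month, previous):
    """Relative change from `previous` to `month` (0.25 means +25%)."""
    return (totals[month] - totals[previous]) / totals[previous]


totals = monthly_totals(daily_bookings)
print(totals)
print(mom_growth(totals, "2019-05", "2019-04"))
```

Comparing the same growth figure across the months around a festive season is one simple way to approach the second problem statement.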

Demand Heatmap
Problem Statements
1. Where does demand come from?
2. Is it permanent or temporary demand?

Main users
● Commercial Expansion
● Merchant Sales
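
One way to answer "where does demand come from" is to bucket request coordinates into grid cells and count requests per cell. This is a sketch under assumptions: the 0.01-degree cell size, the sample coordinates, and the function names are illustrative, and the slide's own heatmap method is not described in the source.

```python
import math
from collections import Counter

# Made-up (latitude, longitude) pairs for illustration only.
sample_points = [
    (-6.2005, 106.8163),
    (-6.2009, 106.8168),
    (-6.1751, 106.8650),
]


def grid_cell(lat, lon, size=0.01):
    """Map a coordinate to an integer grid cell of `size` degrees."""
    return (math.floor(lat / size), math.floor(lon / size))


def demand_by_cell(points, size=0.01):
    """Count how many points fall into each grid cell."""
    return Counter(grid_cell(lat, lon, size) for lat, lon in points)


cells = demand_by_cell(sample_points)
for cell, count in cells.most_common():
    print(cell, count)
```

Computing the same counts for successive weeks and comparing them cell by cell gives a rough signal of whether demand in an area is permanent or temporary.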

Customer Movement
Problem Statements
1. Do our customers stay on the platform?
2. If yes, do they transact often?

Main users
● Strategy
● Product
● Growth
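
Both problem statements can be expressed as small set and count operations. The sketch below is illustrative only: it assumes a set of users active in one month and a list of user IDs, one per transaction, in the next month. The data and function names are hypothetical.

```python
from collections import Counter

# Made-up data: users active in April, and one user ID per May transaction.
april_active = {"u1", "u2", "u3", "u4"}
may_transactions = ["u1", "u1", "u1", "u2", "u5"]


def retained_users(previous, current):
    """Users present in both periods."""
    return set(previous) & set(current)


def retention_rate(previous, current):
    """Share of the previous period's users who are active again."""
    previous = set(previous)
    if not previous:
        return 0.0
    return len(retained_users(previous, current)) / len(previous)


def avg_transactions(transactions, users):
    """Average number of transactions per user in `users`."""
    users = set(users)
    if not users:
        return 0.0
    counts = Counter(transactions)
    return sum(counts[u] for u in users) / len(users)


stayed = retained_users(april_active, may_transactions)
print(retention_rate(april_active, may_transactions))
print(avg_transactions(may_transactions, stayed))
```

The retention rate addresses whether customers stay, and the average transaction count among retained users addresses how often they transact.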

Thank you!
