Big Data Implementation in Fintech
LinkAja 2020 4
Big Data at LinkAja
Oct 2019: before the fun begins...
1. Monolithic architecture
Scalability issues
2. Architecture & network modification was a challenge
6. Scattered databases
Now, talking about the team
Vendor-driven development
1. No internal resources; too much dependency on vendors
2. Communication is a challenge
Information retrieval is challenging
1. Inconsistent data
No extended data products
Scalability issues
Compute engine & networking
• The need for data unification, layering, & democratization
• Talent pool demand to build big data capabilities, such as data engineering, analytics & data governance, and AI/ML
• Critical business activities require big data technology, i.e. business decisions, product development, & crucial operational work
• Requirement to advocate high-quality and secure data
1. Building a single source of truth in the company data lake that is scalable, reliable, and highly available
2. Instilling fit-for-purpose data governance principles across the organisation as a whole
3. Empowering the business with an accessible and secure data platform to extract the benefits of a data-driven culture
4. Building Artificial Intelligence/Machine Learning products that produce high business value
What we do and how we do it
Big Data Group
Data Governance
Raise awareness, build frameworks, and enforce activities of data quality and data security in all data domains across the organisation
How did Big Data start?
Evolution of Big Data
The 5 Vs of Big Data
Types of Data Sources
Common roles in Data Teams
Big Data Architecture
• Lambda Architecture
• Kappa Architecture
Lambda Architecture
Example Lambda Architecture
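As a minimal sketch of the Lambda idea (not the pipeline shown on the original slide), the snippet below keeps a batch view computed from the full event history and a speed view computed from recent events, then merges the two at query time. The event fields `user` and `amount` and the function names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def aggregate(events):
    """Sum amounts per user; shared by both layers in this toy example."""
    totals = Counter()
    for event in events:
        totals[event["user"]] += event["amount"]
    return totals


def batch_view(master_events):
    """Batch layer: recompute totals from the full, immutable history."""
    return aggregate(master_events)


def speed_view(recent_events):
    """Speed layer: cover events that arrived after the last batch run."""
    return aggregate(recent_events)


def serve(batch, speed, user):
    """Serving layer: answer a query by merging the two views."""
    return batch.get(user, 0) + speed.get(user, 0)


# Hypothetical sample data.
history = [
    {"user": "a", "amount": 100},
    {"user": "b", "amount": 50},
    {"user": "a", "amount": 25},
]
recent = [{"user": "a", "amount": 10}]

batch = batch_view(history)
speed = speed_view(recent)
print(serve(batch, speed, "a"))  # 135
print(serve(batch, speed, "b"))  # 50
```

The cost of this design is that the same aggregation has to be maintained in two places (batch and speed), which is the duplication Kappa tries to remove.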
Kappa Architecture
Data Sources → Streaming/Real-time Ingestion → Streaming/Real-time Processing → Analytical Data Store → Analytics & Reporting
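The stages above can be sketched in a few lines of plain Python, assuming an in-memory list stands in for the replayable event log. The key Kappa property illustrated here is that there is a single processing path, and changing the logic means replaying the same log through a new transform rather than running a separate batch job. All names (`ingest`, `process`, `to_store`, the `merchant` and `amount` fields) are hypothetical.

```python
def ingest(source):
    """Ingestion: append every raw event to an ordered, replayable log."""
    return list(source)


def process(log, transform):
    """Processing: one streaming code path, applied event by event."""
    for event in log:
        yield transform(event)


def to_store(results):
    """Analytical data store: accumulate (key, value) pairs per key."""
    store = {}
    for key, value in results:
        store[key] = store.get(key, 0) + value
    return store


def count_per_merchant(event):
    """Version 1 of the logic: count transactions per merchant."""
    return event["merchant"], 1


def amount_per_merchant(event):
    """Version 2 of the logic: sum transaction amounts per merchant."""
    return event["merchant"], event["amount"]


# Hypothetical sample events.
events = [
    {"merchant": "shop", "amount": 10},
    {"merchant": "cafe", "amount": 5},
    {"merchant": "shop", "amount": 20},
]

log = ingest(events)
counts = to_store(process(log, count_per_merchant))
# Reprocessing: replay the same log through the new transform.
amounts = to_store(process(log, amount_per_merchant))
print(counts)   # {'shop': 2, 'cafe': 1}
print(amounts)  # {'shop': 30, 'cafe': 5}
```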
Example Kappa Architecture
Analytical Data Store
• Data Warehouse/Data Marts
• Data Lake
• Lakehouse
Data Warehouse vs Data Lake vs Lakehouse
Data Ingestion
Sources → Datalake → Datamart → Self-service playground
BI, Reporting, AI/ML
Democratization is nonsense without a clear governance framework
• We need to ensure the data is of high quality
• We need to ensure the data is secure
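To make the data quality point concrete, here is a minimal sketch of an automated check, assuming records arrive as dictionaries. It counts missing required fields and duplicate keys; the field names `txn_id` and `amount` are hypothetical and not taken from LinkAja's schema.

```python
def quality_report(rows, required_fields, key_field):
    """Count missing required values and duplicate keys in a batch of rows."""
    missing = {field: 0 for field in required_fields}
    seen = set()
    duplicates = 0
    for row in rows:
        for field in required_fields:
            # Treat None and empty strings as missing values.
            if row.get(field) in (None, ""):
                missing[field] += 1
        key = row.get(key_field)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(rows), "missing": missing, "duplicate_keys": duplicates}


# Hypothetical sample batch with one missing amount and one repeated key.
rows = [
    {"txn_id": "t1", "amount": 1000},
    {"txn_id": "t2", "amount": None},
    {"txn_id": "t1", "amount": 500},
]

report = quality_report(rows, required_fields=["txn_id", "amount"], key_field="txn_id")
print(report)
# {'rows': 3, 'missing': {'txn_id': 0, 'amount': 1}, 'duplicate_keys': 1}
```

A report like this could gate whether a batch is published to downstream users, though the thresholds would depend on each data domain.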
• Tertiary: AI/ML
• Secondary: Reporting, Dashboard, & Analytics
• Primary: Data engineering
Big Data Use Cases at LinkAja
Tech Stacks and Production Components
Data Engineering Technology Ecosystems
• Data Sources: Databases, Excel, Google, Files, API, .....
• Data Pipelines: Connect, Batching, Airflow
• Persistence Layer: Infrastructure, Cloud Storage
• Monitoring: GCP Console, Dashboard
Core Business Intelligence
Roles
• Data evangelist
• Data linkage and data access consultant
• Data-driven business advocate
Visualisation
• Business Intelligence dashboards
Data enablement
• Business-focused enablement
Self-service platform
• Business data dictionary
• Operational data dashboard
• Query optimization how-to guide
Government projects
• Kartu Prakerja disbursement
• Pegadaian Emas integration with LinkAja
Core reporting
• Financial regulators partner
Product Analytics and Experimentation
Performance dashboards
Data Governance
Data quality and security
• Prioritisation of data of high importance and value
• Data quality tools and processes improvements
• Data ingestion flow metrics (work in progress!)
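The slides do not say which ingestion flow metrics are tracked. As one possible illustration, the sketch below derives a few basic metrics (duration, rejected rows, load ratio, throughput) from a single ingestion run. The `IngestionRun` fields, the source name `transactions_db`, and the metric names are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class IngestionRun:
    """One execution of an ingestion job (hypothetical structure)."""
    source: str
    started_at: datetime
    finished_at: datetime
    rows_read: int
    rows_loaded: int


def flow_metrics(run):
    """Derive simple health metrics from a single ingestion run."""
    duration = (run.finished_at - run.started_at).total_seconds()
    return {
        "source": run.source,
        "duration_seconds": duration,
        "rows_rejected": run.rows_read - run.rows_loaded,
        # Guard against division by zero for empty or instantaneous runs.
        "load_ratio": run.rows_loaded / run.rows_read if run.rows_read else None,
        "rows_per_second": run.rows_loaded / duration if duration > 0 else None,
    }


# Hypothetical run: 1000 rows read, 990 loaded, over 50 seconds.
run = IngestionRun(
    source="transactions_db",
    started_at=datetime(2020, 1, 1, 0, 0, 0),
    finished_at=datetime(2020, 1, 1, 0, 0, 50),
    rows_read=1000,
    rows_loaded=990,
)
metrics = flow_metrics(run)
print(metrics)
```

Tracking the load ratio per source over time is one simple way to notice when a pipeline starts silently dropping rows.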
AI/ML Engineering
• Building AI/ML Platform
• Making AI/ML platform adoption easier
• Automation of AI/ML production
AI/ML Scientist