No. of Record Customer Ids Date Data Received Customer Demographic Customer Address Transaction

Uploaded by

Đỗ Minh

Original Title

Task 1

Dear [the client],

I am Minh Do, representing KPMG. Thank you for providing us with the three datasets
from Sprocket Central Pty Ltd. The table below highlights the summary statistics
from the three datasets received. Please let us know if the figures are not aligned
with your understanding.

                        No. of records   Customer IDs   Date Data Received
Customer Demographic    4000             4000           -
Customer Address        3999             4003           -
Transaction             20000            20000          -

The notable data quality issues we encountered, and the methods used to mitigate
them, are set out below. Recommendations are also provided to prevent these
issues from recurring and to improve the accuracy of the underlying data used to
drive business decisions.
• Additional customer_ids in the ‘Transactions’ and ‘Customer Address’ tables
that do not appear in ‘Customer Master (Customer Demographic)’
Mitigation: This indicates that the datasets received may not be in sync with
each other, which may skew the analysis results if records are missing. Please
ensure that all tables cover the same period. Only customers in the Customer
Master list will be used as the training set for our model. Please refer to the
Excel file ‘data_outliers.xlsx’ for the list of outliers between tables.
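
For reference, the cross-table check on customer IDs can be sketched as a set difference. This is a minimal illustration only: the ID values and the helper name `ids_missing_from_master` are hypothetical and are not taken from the datasets or from our actual tooling.

```python
# Hypothetical ID sets; the real values would come from the three datasets.
master_ids = {1, 2, 3, 4}
address_ids = {1, 2, 4, 5}
transaction_ids = {1, 3, 6}


def ids_missing_from_master(master, other):
    """Return the IDs that appear in `other` but not in `master`, sorted."""
    return sorted(set(other) - set(master))


address_outliers = ids_missing_from_master(master_ids, address_ids)
transaction_outliers = ids_missing_from_master(master_ids, transaction_ids)
print(address_outliers)      # [5]
print(transaction_outliers)  # [6]
```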
• Various columns, such as the brand of a purchase or job title, have empty
values in certain records
Mitigation: If only a small number of rows are empty, filter those records out of
the training set entirely. Otherwise, if it is a core field, impute values based on
the distribution in the training dataset. For key datasets, such as transactions,
fewer than 1% of transactions (totalling less than 0.1% of revenue) have missing
fields. These records have been removed from the training dataset.
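
The drop-or-impute rule described above could look roughly like the sketch below. It is an illustration under stated assumptions: the 1% `drop_threshold` default, the function name `handle_missing`, and the sample rows are all hypothetical, and sampling from the observed values is only one way to impute "based on distribution".

```python
import random


def handle_missing(rows, field, drop_threshold=0.01, seed=0):
    """Drop rows missing `field` when they are rare; otherwise impute them.

    drop_threshold is an assumed cut-off, not a figure agreed with the client.
    """
    def is_missing(row):
        return row.get(field) in (None, "")

    observed = [row[field] for row in rows if not is_missing(row)]
    n_missing = len(rows) - len(observed)
    if n_missing == 0:
        return list(rows)
    # Nothing to sample from, or too few gaps to matter: drop the rows.
    if not observed or n_missing / len(rows) <= drop_threshold:
        return [row for row in rows if not is_missing(row)]
    # Sampling from observed values follows their empirical distribution.
    rng = random.Random(seed)
    return [
        {**row, field: rng.choice(observed)} if is_missing(row) else row
        for row in rows
    ]


# Hypothetical sample: one of four rows has no brand.
sample = [{"brand": "A"}, {"brand": "B"}, {"brand": ""}, {"brand": "A"}]
imputed = handle_missing(sample, "brand")                     # 25% missing: impute
dropped = handle_missing(sample, "brand", drop_threshold=0.5)  # treated as rare: drop
```

With the default threshold the empty brand is filled from the observed values; raising the threshold to 0.5 removes the row instead.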
• Inconsistent values for the same attribute
(e.g. Victoria being represented as “V”, “Vic” and “Victoria”)
Mitigation: Use regular expressions to replace extended values with abbreviations
to ensure consistency across addresses.
Recommendation: Enforce a drop-down list for the user entering the data rather
than a free-text field. In order to construct meaningful variables for the model, the
data has been cleaned to avoid multiple representations of the same value.
Additionally, gender records with the value ‘U’ have been replaced based on the
distribution in the training dataset.
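
As an illustration of the regular-expression clean-up, the sketch below maps the Victoria variants mentioned above to a single abbreviation. The target value "VIC", the pattern, and the function name `normalise_state` are assumptions made for this example; other states would need their own patterns.

```python
import re

# Each pattern matches the whole (trimmed) value, ignoring case.
STATE_PATTERNS = [
    (re.compile(r"^v(ic(toria)?)?$", re.IGNORECASE), "VIC"),
]


def normalise_state(value):
    """Map known spellings of a state to its abbreviation."""
    cleaned = value.strip()
    for pattern, abbreviation in STATE_PATTERNS:
        if pattern.match(cleaned):
            return abbreviation
    # Unrecognised values are returned unchanged so they can be reviewed.
    return cleaned


print([normalise_state(v) for v in ["V", "Vic", "Victoria", "NSW"]])
# ['VIC', 'VIC', 'VIC', 'NSW']
```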
• Inconsistent data types for the same attribute
(e.g. numeric values for some records and strings for others)
Mitigation: Convert selected character records to numeric. Remove non-numeric
characters from strings.
Recommendation: Having different data types for a given field makes it difficult to
interpret results at a later stage. Therefore, appropriate data transformations have
been made to ensure consistent data types for a given field.
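
The type conversion can be sketched as follows. This is a minimal example, not the exact transformation applied: the function name `to_number` and the sample inputs are hypothetical, and values that still cannot be parsed are returned as `None` so they can be reviewed rather than guessed.

```python
import re


def to_number(value):
    """Convert a value to float, stripping non-numeric characters from strings."""
    if isinstance(value, (int, float)):
        return float(value)
    # Keep digits, the decimal point and the minus sign; drop everything else.
    stripped = re.sub(r"[^0-9.\-]", "", str(value))
    try:
        return float(stripped)
    except ValueError:
        return None  # not recoverable as a number


print(to_number("$1,234.50"))  # 1234.5
print(to_number(42))           # 42.0
print(to_number("n/a"))        # None
```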
Moving forward, the team will continue with the data cleaning, standardisation and
transformation process for the purpose of model analysis. Questions will be raised
along the way and assumptions documented. After we have completed this, it
would be great to spend some time with your data SME to ensure that all
assumptions are aligned with Sprocket Central’s understanding.
