
Welcome to Data Science

for Product Managers


Niki Kittur
Introductions

Faculty: Niki Kittur
TAs: Victoria Qian, Shreya Tallam
Niki Kittur
Professor, SCS, HCII
Making sense of overwhelming information

Designer / Programmer /
Product manager / Founder

Researcher
Let’s meet our TAs!

Victoria Qian, Shreya Tallam


Office Hours

• The TAs will announce office hours (TBD).


• You can also ask questions on Canvas, email, and Slack (hciforpms2024)
Welcome! (back)
We are glad you are joining us 🎉
What is data science

Data science is the application of


computational and statistical techniques
to address or gain insight into some
problem in the real world

Slide credits: CMU AI, Zico Kolter, Pat Virtue


What is data science

Data science = statistics +


data processing +
machine learning +
scientific inquiry +
visualization +
business analytics +
big data +

Slide credits: CMU AI, Zico Kolter, Pat Virtue


What is data science

Great product
managers =
data scientists
There is more data than ever before

More data is sent across the internet
every second than was stored on the entire
internet 20 years ago
Walmart collects > 2.5 petabytes of
data every hour from customer
transactions
From gut feelings to data-backed decisions

Companies that leverage data-driven


decision making are 5% more
productive and 6% more profitable
than their competitors
– McKinsey
Understand customer needs
Netflix uses data to personalize and create content

There are 33 million different versions


of Netflix
– Joris Evers, Netflix
Understand customer needs
Data-driven user personas and marketing

Companies making decisions based on


data improve marketing return on
investment by 15-20%

– McKinsey
Build what users need
Data-informed product decisions

“…[W]e decided to let our community solve the problem
for us. Using a rich dataset comprised of guest and host
interactions, we built a model that estimated a conditional
probability of booking in a location, given where the
person searched. A search for ‘San Francisco’ would thus
skew toward neighborhoods where people who also search
for San Francisco typically wind up booking, for example
the Mission District or Lower Haight.”

– Riley Newman, former head of data science at Airbnb
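
A minimal sketch of the idea in the quote above: estimating the conditional probability of booking in a neighborhood given the search query, using simple counts. This is not Airbnb's actual model; the data, the neighborhood list, and the function name `booking_probabilities` are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical (search_query, booked_neighborhood) pairs; made-up data.
bookings = [
    ("San Francisco", "Mission District"),
    ("San Francisco", "Mission District"),
    ("San Francisco", "Lower Haight"),
    ("San Francisco", "SoMa"),
    ("Oakland", "Temescal"),
]


def booking_probabilities(pairs):
    """Estimate P(booked neighborhood | search query) from observed counts."""
    counts = defaultdict(Counter)
    for query, neighborhood in pairs:
        counts[query][neighborhood] += 1

    probs = {}
    for query, hood_counts in counts.items():
        total = sum(hood_counts.values())
        probs[query] = {hood: n / total for hood, n in hood_counts.items()}
    return probs


probs = booking_probabilities(bookings)

# Rank neighborhoods for one query, most likely booking first.
ranked = sorted(probs["San Francisco"].items(), key=lambda kv: kv[1], reverse=True)
for hood, p in ranked:
    print(f"{hood}: {p:.2f}")
```

With real data, a search for a city could then be skewed toward the highest-ranked neighborhoods. A production model would likely also need smoothing for rare queries, which this sketch leaves out.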


Build what users need
A/B testing

Which of these led to a 52% increase in conversion rate?
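
A rough sketch of how an A/B test result like this might be checked, assuming a two-proportion z-test (one common choice, not necessarily the method behind the 52% figure). The visitor and conversion counts are made up so that the relative lift works out to 52%; `two_proportion_z_test` is a hypothetical helper name.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare variant B against control A.

    Returns (relative_lift, z, two_sided_p_value).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    relative_lift = (rate_b - rate_a) / rate_a

    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return relative_lift, z, p_value


# Made-up counts: control converts 100 of 2000 (5.0%), variant 152 of 2000 (7.6%).
lift, z, p = two_proportion_z_test(100, 2000, 152, 2000)
print(f"relative lift = {lift:.0%}, z = {z:.2f}, p = {p:.4f}")
```

The lift alone says how big the change is; the p-value helps judge whether a difference that size could plausibly be noise at this sample size.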


Measure key performance indicators
What is the north star of your product?

• Nights booked
• Connected 7 friends


Measure key performance indicators
Retention curves

https://browsee.io/blog/10-magic-metrics-for-product-market-fit-conversions-and-customer-retention/
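
A minimal sketch of how a retention curve can be computed from an activity log, assuming "day-N retention" means the share of users who were active exactly N days after their first active day. The event data and the function name `retention_curve` are hypothetical.

```python
from datetime import date

# Hypothetical activity log of (user_id, active_date) pairs; made-up data.
events = [
    ("u1", date(2024, 1, 1)), ("u1", date(2024, 1, 2)), ("u1", date(2024, 1, 8)),
    ("u2", date(2024, 1, 1)), ("u2", date(2024, 1, 2)),
    ("u3", date(2024, 1, 3)),
    ("u4", date(2024, 1, 3)), ("u4", date(2024, 1, 10)),
]


def retention_curve(events, max_day):
    """Return a list where entry N is the fraction of users active on day N."""
    # First active day per user defines day 0 for that user.
    first_seen = {}
    for user, day in events:
        if user not in first_seen or day < first_seen[user]:
            first_seen[user] = day

    # Days-since-first-activity on which each user was active.
    active_offsets = {user: set() for user in first_seen}
    for user, day in events:
        active_offsets[user].add((day - first_seen[user]).days)

    n_users = len(first_seen)
    return [
        sum(1 for offsets in active_offsets.values() if n in offsets) / n_users
        for n in range(max_day + 1)
    ]


curve = retention_curve(events, max_day=7)
for n, share in enumerate(curve):
    print(f"day {n}: {share:.0%}")
```

This simple version ignores users who joined too recently to have reached day N; with real data those users would usually be excluded from the denominator for that day.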
Course Goals
By the end of this course you should be able to

1. Answer important PM questions using data

• Understand user behavior


• Decide what features to build
• Calculate key metrics

2. Wrangle, explore, and explain datasets using interactive data science tools
• Data management: collecting, wrangling
• Data science tools: Tableau, Jupyter, Python
What we expect from you

• The class will involve programming and debugging. We will start using Python in
class on Wednesday
• There are no exams or required readings

• There will be 3 assignments, together totaling 80% of your final grade

• Assignments have bonus points for those who already have a lot of prior knowledge
and want more advanced opportunities
• Participation will comprise the remaining 20%. We expect you to come to class,
actively participate in the course and learn
Weekly schedule

• Monday: Lecture & theory (Instructor-led)

• Wednesday: Labs & doing (TA-led)


2-week modules

[Not including this intro week, which includes an intro lab on Jupyter & Python]

• Module 1: Understanding user behavior



Data visualization, Tableau, basic statistics, clustering

• Module 2: Deciding what features to build



Survey analysis, NLP, qualitative coding

• Module 3: Calculating key performance indicators



KPIs such as retention, DAU/WAU, LTV:CAC, picking a north star

• The last lab of each module is a homework-help lab
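
Two of the Module 3 KPIs named above, DAU/WAU and LTV:CAC, can be sketched in a few lines. This is an illustration under stated assumptions, not the course's definitions: WAU is taken as unique users over the 7 days ending on the given day, and LTV uses one common simplification (monthly revenue per user × gross margin ÷ monthly churn). All names and numbers are hypothetical.

```python
from datetime import date, timedelta


def dau_wau(events, day):
    """DAU/WAU "stickiness" for `day`.

    `events` is a list of (user_id, active_date) pairs. WAU here is assumed
    to mean unique users active in the 7 days ending on `day`.
    """
    week_start = day - timedelta(days=6)
    dau = {user for user, d in events if d == day}
    wau = {user for user, d in events if week_start <= d <= day}
    return len(dau) / len(wau) if wau else 0.0


def ltv_to_cac(monthly_revenue_per_user, gross_margin, monthly_churn, cac):
    """LTV:CAC using a simplified LTV = revenue * margin / churn (an assumption)."""
    ltv = monthly_revenue_per_user * gross_margin / monthly_churn
    return ltv / cac


# Made-up activity log: 2 users active on Jan 7, 4 unique users that week.
events = [
    ("u1", date(2024, 1, 7)),
    ("u2", date(2024, 1, 7)),
    ("u3", date(2024, 1, 3)),
    ("u4", date(2024, 1, 1)),
]
stickiness = dau_wau(events, date(2024, 1, 7))

# Made-up unit economics: $10/month, 80% margin, 5% monthly churn, $40 CAC.
ratio = ltv_to_cac(10.0, 0.8, 0.05, 40.0)

print(f"DAU/WAU = {stickiness:.2f}, LTV:CAC = {ratio:.1f}")
```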


Analyzing real data

• We will use real data from Skeema, a tab manager Chrome extension that we spun out
of CMU (~10k users, 1M+ tabs saved/resumed)

• The original Skeema is no longer under development, so we can use its data
for analyses and educational purposes

• Data has been anonymized and in some places cleaned up and augmented to fit the
class
Introduction to Skeema

https://www.loom.com/share/9c56fa06bd264f73b3ec33240fd0fae1
This week

• Come to lab on Wednesday with your laptop charged & ready

• Sign into Slack (https://join.slack.com/t/hciforpms2024/shared_invite/zt-2duw7mqpj-NcstNrdkA7oobAeap433wA)

• Any questions, ask on Slack
