• Data Collection
• Understanding data
To research features, price range, target market, competitors, and so on, data has
to be collected from appropriate sources.
The marketing team can conduct various data collection activities such as online
surveys or focus groups.
The survey should ask the right questions about features and pricing, such as
“What are the top 3 features expected from an upcoming product?”,
“How much are you likely to spend on this product?”, or “Which competitors
provide similar products?”
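Once the answers to a question like the "top 3 features" one come back, they can be tallied in a few lines. The following is only a minimal sketch: the responses and the function name `top_features` are made up for illustration and do not come from any real survey.

```python
from collections import Counter

# Hypothetical survey responses: each respondent lists up to three expected features.
responses = [
    ["battery life", "camera", "price"],
    ["camera", "screen size", "battery life"],
    ["price", "battery life", "storage"],
    ["camera", "battery life"],
]

def top_features(responses, n=3):
    """Count how often each feature is mentioned and return the n most common."""
    counts = Counter(feature for answer in responses for feature in answer)
    return counts.most_common(n)

print(top_features(responses))
```

Free-text answers would first need normalizing (spelling, case, synonyms) before a count like this is meaningful.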
Analytical Methodology
Data Collection
Qualitative vs Quantitative
Primary vs Secondary
Online vs Offline
Interview vs Questionnaire
Telephonic vs Personal
Quantitative Data Collection
Understanding the Data
“Without context, data is useless, and any visualization you create with it will also
be useless. Using data without knowing anything about it, other than the values
themselves ...”
You should know the who, what, when, where, why, and how of the data
before you can know what the numbers are actually about.
Who
“A quote in a major newspaper carries more weight than one from a
celebrity gossip site that has a reputation for stretching the truth.”
Similarly, data from a reputable source typically implies better accuracy than a
random online poll.
In addition to who collected the data, who the data is about is also important.
How?
People often skip the methodology because it tends to be complex and written for a
technical audience, but it’s worth getting to know the gist of how the data of
interest was collected.
• Do you trust it right away, or do you investigate?
• Look out for small samples, high margins of error, and unfit assumptions about the
subjects
• Sometimes people generate indices to measure the quality of life in countries, and a
metric like literacy
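The warning about small samples and high margins of error can be made concrete with the standard approximation for a surveyed proportion, z * sqrt(p(1 - p) / n). The sketch below assumes a simple random sample and a 95% confidence level (z = 1.96); the function name and the sample sizes are chosen only for illustration.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    z=1.96 corresponds to a 95% confidence level, and proportion=0.5 is
    the worst case (it gives the widest margin).
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (25, 100, 1000):
    print(f"n={n:>4}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

With 25 respondents the margin is close to 20 percentage points, which is usually too wide to support a pricing or feature decision on its own.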
What?
Ultimately, you want to know what your data is about, but before you
can do that, you should know what surrounds the numbers.
Talk to subject experts and study the accompanying documentation.
When you get to real-world data, the goal shifts to information
gathering.
You shift from, “What is in the numbers?” to “What does the data
represent in the world; does it make sense; and how does this relate to
other data?”
When?
Most data is linked to time in some way in that it might be a time series,
or it’s a snapshot from a specific period.
In both cases, you have to know when the data was collected. An
estimate made decades ago does not equate to one in the present.
This seems obvious, but it’s a common mistake to take old data and pass
it off as new because it’s what’s available.
Things change, people change, and places change, and so naturally, data
changes.
Where?
Things can change across cities, states, and countries just as they do
over time.
For example, it’s best to avoid global generalizations when the data
comes from only a few countries.
The same logic applies to digital locations. Data from websites such as
Twitter or Facebook captures the behavior of their users and doesn’t
necessarily translate to the physical world.
Why?
You must know the reason data was collected, mostly as a
sanity check for bias.
Sometimes data is collected, or even fabricated, to serve an
agenda, and you should be wary of these cases.
What is Data Preparation?
Data preparation is the process of cleaning and transforming raw data prior to
processing and analysis.
It is an important step prior to processing and often involves reformatting data,
correcting data, and combining data sets to enrich the data.
Data preparation is often a lengthy undertaking for data professionals or business
users, but it is essential as a prerequisite to put data in context in order to turn it into
insights and eliminate bias resulting from poor data quality.
For example, the data preparation process usually includes standardizing data
formats, enriching source data, and/or removing outliers.
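Two of the tasks named above, standardizing data formats and removing outliers, can be sketched with the Python standard library. Everything in this example is an assumption made for illustration: the records, the two accepted date formats (ISO and day/month/year), and the use of the common 1.5 x IQR rule as the outlier test.

```python
from datetime import datetime
from statistics import quantiles

# Hypothetical raw records with inconsistent date formats and one suspicious price.
raw = [
    {"date": "2023-01-05", "price": 19.9},
    {"date": "05/02/2023", "price": 21.5},
    {"date": "2023-03-10", "price": 20.4},
    {"date": "12/04/2023", "price": 18.7},
    {"date": "2023-05-01", "price": 950.0},
]

def standardize_date(text):
    """Try each known input format and return an ISO 8601 date string."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")

def remove_outliers(rows, key):
    """Drop rows whose value lies more than 1.5 * IQR outside the quartiles."""
    values = [row[key] for row in rows]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = 1.5 * (q3 - q1)
    return [row for row in rows if q1 - spread <= row[key] <= q3 + spread]

cleaned = [{**row, "date": standardize_date(row["date"])} for row in raw]
cleaned = remove_outliers(cleaned, "price")
for row in cleaned:
    print(row)
```

Whether a flagged value is an error or a genuine extreme is a judgment call that depends on the data, so an automatic rule like this is best used to surface candidates for review.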
Data Preparation
Benefits of Data Preparation
“Most data scientists say that data preparation is the worst part of their job, but
efficient, accurate business decisions can only be made with clean data.”
Data preparation helps:
Fix errors quickly — Data preparation helps catch errors before processing. After
data has been removed from its original source, these errors become more difficult
to understand and correct.
Produce top-quality data — Cleaning and reformatting datasets ensures that all data
used in analysis will be high quality.
Make better business decisions — Higher quality data that can be processed and
analyzed more quickly and efficiently leads to more timely, efficient and high-
quality business decisions.
Data Preparation Steps:
1. Gather data
5. Store data
Data Cleaning?
• Data cleaning is the process of fixing or removing incorrect,
corrupted, incorrectly formatted, duplicate, or incomplete data within a
dataset.
• When combining multiple data sources, there are many opportunities
for data to be duplicated or mislabeled.
• If data is incorrect, outcomes and algorithms are unreliable, even
though they may look correct.
• There is no single prescribed set of steps for data cleaning, because the
process varies from dataset to dataset.
• But it is crucial to establish a template for your data cleaning process
so you know you are doing it the right way every time.
Step 1: Remove duplicate or irrelevant observations
Remove unwanted observations from your dataset, including duplicate or
irrelevant observations.
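Step 1 can be sketched as follows. The rows and the field names (`respondent_id`, `country`, `budget`) are hypothetical, and "irrelevant" is taken here to mean "outside the market being studied", which is only one possible definition.

```python
# Hypothetical survey rows; the field names are illustrative only.
rows = [
    {"respondent_id": 1, "country": "IN", "budget": 300},
    {"respondent_id": 2, "country": "US", "budget": 450},
    {"respondent_id": 1, "country": "IN", "budget": 300},  # exact duplicate
    {"respondent_id": 3, "country": "IN", "budget": 280},
]

def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique

def keep_relevant(rows, country):
    """Keep only observations relevant to the question, here a single market."""
    return [row for row in rows if row["country"] == country]

deduplicated = drop_duplicates(rows)
relevant = keep_relevant(deduplicated, "IN")
print(len(rows), len(deduplicated), len(relevant))
```

Duplicates that differ slightly (for example in spelling or whitespace) would not be caught by an exact comparison like this and need normalizing first.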
Benefits of Data Visualization
• Fast decision-making. Summing up data is easy and fast with graphics, which let
you quickly see that a column or touchpoint is higher than others without looking
through several pages of statistics in Google Sheets or Excel.
• More people involved. Most people are better at perceiving and remembering
information presented visually.
• Higher degree of involvement. Beautiful and bright graphics with clear messages
attract readers’ attention.
• Better understanding. Well-designed reports are clear not only to technical
specialists, analysts, and data scientists but also to CMOs and CEOs, and help
every worker make decisions in their area of responsibility.
Common general types of data visualization:
Charts
Tables
Graphs
Maps
Infographics
Dashboards
More specific examples of methods to visualize data:
Area Chart
Bar Chart
Box-and-whisker Plots
Scatter plot
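Of the methods listed, the box-and-whisker plot is built from a five-number summary (minimum, first quartile, median, third quartile, maximum). The sketch below computes that summary with the standard library; the sample values are invented, and the "inclusive" quartile method is one of several conventions, so other tools may report slightly different quartiles.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return the five values a box-and-whisker plot is drawn from."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}

# Hypothetical prices collected from a survey.
prices = [12, 15, 11, 19, 14, 13, 17, 16, 18]
print(five_number_summary(prices))
```

The box spans q1 to q3 with a line at the median; how far the whiskers extend (to the extremes or to 1.5 x IQR) varies between plotting tools.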