
MISY262: Fundamentals of Business Analytics
Data and Variables

Ali Tosyali, PhD
Assistant Professor of MIS
Data, Information, and Knowledge
● Data are compilations of facts, figures, or other contents, both
numerical and nonnumerical.
● Data that have been organized, analyzed and processed in a
meaningful and purposeful way become information.
● We use a blend of data, contextual information, experience, and
intuition to derive knowledge that can be applied and put into action in
specific situations.
Sample vs. Population
A population consists of all observations or items of interest in an analysis. A
sample is a subset of the population. We examine sample data to make inferences
about the population.
We rely on sampling because:
● Obtaining information on the entire population is expensive.
● It is often impossible, or at least impractical, to examine every member of the population.
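As a minimal sketch of the idea, the snippet below draws a random sample from a synthetic population and compares the sample mean with the population mean. The population (10,000 made-up incomes), the sample size of 200, and the seed are all assumptions chosen for illustration; they are not from the course material.

```python
import random
import statistics

random.seed(262)  # fixed seed so the sketch is repeatable

# Hypothetical population: annual incomes for 10,000 customers (synthetic values).
population = [random.gauss(50_000, 12_000) for _ in range(10_000)]

# A sample is a subset of the population, here drawn at random without replacement.
sample = random.sample(population, k=200)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # our estimate of it

print(f"population mean: {population_mean:,.0f}")
print(f"sample mean:     {sample_mean:,.0f}")
```

The two means should be close but not identical, which is the sense in which a sample supports an inference about the population rather than an exact answer.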
Cross-Sectional and Time Series Data
● Sample data are generally collected in one of two ways.
○ Cross-sectional data: data collected by recording a characteristic of many subjects
at the same point in time, or without regard to differences in time.
○ Time series data: data collected over several time periods, focusing on certain groups
of people, specific events, or objects.
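The distinction can be sketched with one small dataset viewed two ways. The company names, years, and revenue figures below are invented for illustration, and `cross_section` and `time_series` are hypothetical helper names, not part of any library.

```python
# Hypothetical records: (company, year, revenue in $ millions). Values are made up.
records = [
    ("Acme", 2021, 10.0), ("Acme", 2022, 12.5), ("Acme", 2023, 13.1),
    ("Bolt", 2021, 7.2),  ("Bolt", 2022, 6.9),  ("Bolt", 2023, 8.4),
    ("Core", 2021, 3.3),  ("Core", 2022, 4.0),  ("Core", 2023, 4.8),
]

def cross_section(rows, year):
    """Many subjects at one point in time: {company: revenue} for a single year."""
    return {company: revenue for company, y, revenue in rows if y == year}

def time_series(rows, company):
    """One subject over several periods: [(year, revenue), ...] sorted by year."""
    return sorted((y, revenue) for c, y, revenue in rows if c == company)

print(cross_section(records, 2022))  # all companies, one year
print(time_series(records, "Acme"))  # one company, all years
```

Fixing the year and varying the subject gives cross-sectional data; fixing the subject and varying the period gives time series data.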
Structured vs. Unstructured Data
● Generally, structured data reside in a predefined, row-column format.
○ We use spreadsheets or database applications to enter, store, query, and analyze structured data.
○ Examples of structured data include numbers, dates, and groups of words and numbers, typically stored in a tabular format.
● Unlike structured data, unstructured data do not conform to a predefined,
row-column format.
○ They tend to be textual (e.g., written reports, email messages, doctors' notes, or open-ended
survey responses) or have multimedia content (e.g., photographs, videos, and audio data).
● Data from social media such as Twitter, YouTube, Facebook, and blogs are
examples of unstructured data.
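A rough sketch of the practical difference: structured records can be queried directly by column, while free text has to be processed before it can be analyzed. The order records and the survey response below are made-up examples, and word counting is just one crude way to impose structure on text.

```python
from collections import Counter

# Structured: every record has the same predefined columns (hypothetical values).
orders = [
    {"order_id": 1, "date": "2024-01-05", "amount": 25.0},
    {"order_id": 2, "date": "2024-01-06", "amount": 40.0},
    {"order_id": 3, "date": "2024-01-06", "amount": 15.5},
]
# Because the columns are known in advance, a query is a simple filter.
large_orders = [row["order_id"] for row in orders if row["amount"] > 20]

# Unstructured: free text (a made-up survey response) has no columns to query.
response = "Delivery was late, but support was helpful. Late delivery again!"
# One crude way to impose structure: lowercase, strip punctuation, count words.
words = [w.strip(".,!?").lower() for w in response.split()]
word_counts = Counter(words)

print(large_orders)
print(word_counts.most_common(3))
```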
Variables
● In business analytics, we focus on people, firms, or events with particular
characteristics.
○ When a characteristic differs in kind or degree among various observations (records), then the
characteristic can be termed a variable.
● Variables are classified as either categorical or numerical.
○ Categorical (qualitative): marital status, eye color, major, whether or not a person has a credit card, etc.
○ Numerical (quantitative): temperature, income, age, exam score, etc.
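The classification above can be sketched in code. The customer records are invented, and `variable_type` is a hypothetical helper that relies on a simplifying assumption: a variable is treated as numerical only if every value is an `int` or `float`, and yes/no values are treated as categorical.

```python
from collections import Counter
import statistics

# Hypothetical observations (records); each key is a variable.
customers = [
    {"marital_status": "single",  "has_credit_card": True,  "age": 23, "income": 41_000.0},
    {"marital_status": "married", "has_credit_card": False, "age": 35, "income": 58_500.0},
    {"marital_status": "married", "has_credit_card": True,  "age": 41, "income": 72_000.0},
]

def variable_type(values):
    """Label a variable 'numerical' if every value is an int or float, else 'categorical'.

    bool is excluded explicitly because Python treats True/False as ints,
    while a yes/no characteristic is categorical.
    """
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numerical"
    return "categorical"

for name in customers[0]:
    values = [c[name] for c in customers]
    kind = variable_type(values)
    # Categorical variables are summarized by counts, numerical ones by e.g. a mean.
    summary = statistics.mean(values) if kind == "numerical" else dict(Counter(values))
    print(name, kind, summary)
```

A type check like this is only a heuristic: numbers used as labels (such as ZIP codes or ID numbers) are stored as numbers but behave as categorical variables, so the final call depends on what the values mean.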
