Assistant Professor of MIS

Data, Information, and Knowledge
● Data are compilations of facts, figures, or other contents, both numerical and nonnumerical.
● Data that have been organized, analyzed, and processed in a meaningful and purposeful way become information.
● We use a blend of data, contextual information, experience, and intuition to derive knowledge that can be applied and put into action in specific situations.

Sample vs. Population
A population consists of all observations or items of interest in an analysis. A sample is a subset of the population. We examine sample data to make inferences about the population.
We rely on sampling because:
● Obtaining information on the entire population is expensive.
● It is impossible to examine every member of the population.

Cross-Sectional and Time Series Data
● Sample data are generally collected in one of two ways.
○ Cross-sectional data are collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.
○ Time series data are collected over several time periods, focusing on certain groups of people, specific events, or objects.

Structured vs. Unstructured Data
● Generally, structured data reside in a predefined, row-column format.
○ We use spreadsheets or database applications to enter, store, query, and analyze structured data.
○ Examples of structured data include numbers, dates, and groups of words and numbers, typically stored in a tabular format.
● Unlike structured data, unstructured data do not conform to a predefined, row-column format.
○ They tend to be textual (e.g., written reports, email messages, doctors' notes, or open-ended survey responses) or have multimedia contents (e.g., photographs, videos, and audio data).
● Social media data from sources such as Twitter, YouTube, Facebook, and blogs are examples of unstructured data.

Variables
● In business analytics, we focus on people, firms, or events with particular characteristics.
○ When a characteristic differs in kind or degree among various observations (records), the characteristic can be termed a variable.
● Variables are classified as either categorical or numerical.
○ Categorical (qualitative): marital status, eye color, major, whether or not a person has a credit card, etc.
○ Numerical (quantitative): temperature, income, age, exam score, etc.
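
The distinction between categorical and numerical variables can be illustrated with a small Python sketch. It is not part of the original slides: the records are made-up values, and the names `records`, `classify_variable`, and `classify_table` are hypothetical. The rule used (a column is numerical only if every value is a non-boolean number) is a simplifying assumption. It would misclassify categories stored as numeric codes, such as ZIP codes.

```python
from numbers import Number

# Hypothetical structured data (made-up values): each dict is one observation
# (record), and each key is a characteristic that varies across observations.
records = [
    {"major": "Finance", "marital_status": "single", "has_credit_card": True,
     "age": 21, "income": 18500.0},
    {"major": "MIS", "marital_status": "married", "has_credit_card": False,
     "age": 34, "income": 52000.0},
    {"major": "Marketing", "marital_status": "single", "has_credit_card": True,
     "age": 19, "income": 9200.0},
]


def classify_variable(values):
    """Return 'numerical' if every value is a non-boolean number, else 'categorical'.

    Simplifying assumption: booleans (yes/no answers) are treated as categories,
    even though Python stores them as integers.
    """
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "numerical" if is_numeric else "categorical"


def classify_table(rows):
    """Map each column name in a list of records to its variable type."""
    columns = rows[0].keys()
    return {
        name: classify_variable([row[name] for row in rows]) for name in columns
    }


variable_types = classify_table(records)
for name, kind in variable_types.items():
    print(f"{name}: {kind}")
```

In practice, the variable type is a judgment about what the values mean, not only how they are stored, so a rule like this is best treated as a first pass that an analyst then reviews.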