This document provides an introduction to business analytics. It discusses how businesses are now competing on analytics in order to make data-driven decisions and gain competitive advantages. It also describes some of the key analytical approaches used, including descriptive, predictive, and prescriptive analytics. Finally, it discusses some of the facilitating developments for business analytics like technological advances, methodological developments, and increased computing power and data storage capacity.
Business Analytics
- Concerned with data-driven decisions and the application of analytical approaches to decision making
- Data can be collected through electronic means (which is why the amount of data is staggering)
- Organizations are competing on analytics in order to achieve key performance indicators

Companies use data to:
- Boost business process and cost efficiency
- Monitor and improve financial performance
- Drive strategy and change

Decision Making
- Competitive businesses make important decisions
1. Strategic
- Higher-level issues concerned with the overall direction of the organization
- Define the organization's overall goals and aspirations for the future
2. Tactical
- How the organization should achieve the goals and objectives set by its strategy
- Usually the responsibility of mid-level management
3. Operational
- Affect how the firm is run from day to day
- The domain of operations managers, who are the closest to the customer

Notice faults in a business and be the first to take the initiative to resolve them.

Deciding the nature of a business's marketing campaign, and using analytics to understand which strategy to use.

Decision making process:
1. Identify and define the problem
2. Determine the criteria that will be used to evaluate alternative solutions (cost-effectiveness, ease of implementation)
3. Determine the set of alternative solutions
4. Evaluate the alternatives
5. Choose an alternative

Common approaches to making decisions:
1. Tradition
2. Intuition (gut feeling)
3. Rules of thumb (based on your experience)
4. Using the relevant data available

Business Analytics Defined
- Scientific process of transforming data into insight for making better decisions (not all data is beneficial; you have to prepare or clean the data)
- Used for data-driven or fact-based decision making

Tools of business analytics can aid decision making by:
- Creating insights from data (descriptive analytics, ex. reports, statistics)
- Improving our ability to forecast more accurately for planning decisions
- Helping us quantify risk (overshooting or undershooting the amount of inventory)
- Yielding better alternatives through analysis and optimization

Categorization of Analytical Models
1. Descriptive Analytics
- Descriptive statistics, data visualization, descriptive data mining, statistical inference
- Tells us what happened in the past (data queries, reports, descriptive statistics, data visualization, data-mining techniques, basic what-if spreadsheet models)
- Dashboard: collection of tables, charts, maps, and summary statistics
  - Helps management monitor specific aspects of the company's performance
  - Summarize sales by region, current inventory levels
  - View dashboards that contain metrics related to staffing levels, local inventory levels, and short-term sales forecasts
- Data mining: the use of analytical techniques for better understanding patterns and relationships that exist in large data sets
  - Includes cluster analysis, sentiment analysis
2. Predictive Analytics
- Linear regression, time series forecasting, predictive data mining, spreadsheet models
- Models constructed from past data to forecast the future
- Ascertain the impact of one variable on another
- Survey data and past purchase behavior may be used to help predict the market share of a new product
3. Prescriptive Analytics
- Spreadsheet models, Monte Carlo simulation, linear optimization models, integer optimization
- Indicates a best course of action to take (basically has a rule)
  - A forecast or prediction, combined with a rule, becomes a prescriptive model
- Optimization models: models that give the best decision subject to the constraints of the situation
- Simulation optimization: combines the use of probability and statistics to model uncertainty with optimization techniques to find good decisions in highly complex and highly uncertain settings
- Decision analysis: used to develop an optimal strategy when a decision maker is faced with several decision alternatives and an uncertain set of future events

Example
Applying for a loan for the first time.
- The bank wants to know whether you are going to pay.
- They use data such as credit history, financial condition, and disposable income to assess you as a borrower (predictive model)
- The forecast/prediction they produce, combined with a rule, becomes a prescriptive model
- Ex. If we create a rule that when the estimated probability of default is more than 0.6 we should not award the loan, the result is prescriptive analytics
- They rely on a set of rules, which is called a rule-based model.

3 facilitating developments
1. Technological advances (sensors collect data in vast quantities), from which you can gain a competitive advantage
2. Methodological development
a. Ongoing research that includes advances in computational approaches
b. Explores massive amounts of data and visualizes them (to gain insight)
3. Computing power and storage capacity
a. Huge amounts of data: where are you going to store it? Data warehouses.
b. Traditional processing systems can no longer store huge amounts of data
c. Get better software, because it allows you to solve problems faster and more accurately

Big Data
1. Volume
a. Any set of data collected electronically
b. Data must be stored
c. 100 terabytes of storage
2. Velocity
a. How fast data are stored, and how quickly the data are analyzed for decision making
b. Streaming data; milliseconds to seconds to respond
3. Variety
a. More complicated types of data are now available and are of great value to businesses
b. Audio data are collected from service calls
c. Video data: recorded in stores to analyze shopping behavior
d. Structured, unstructured (harder to visualize), text, multimedia
4. Veracity
a. Uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations

Descriptive Statistics

Statistics
- Foundation for making important business decisions

Data Preparation
- The process of cleaning and transforming raw data prior to processing and analysis
- Choose only quality data
- Goals: organize data, efficient analysis, limit errors

Data
- Facts and figures collected, analyzed, and summarized for presentation and interpretation
- Descriptive statistics: summary of important aspects of a data set
- How to summarize data?
  - Frequency tables, charts
  - Central tendency (mean, median, mode)
  - Dispersion/distance between two observations (important for knowing where an error occurred)

Terminologies
1. Elements: the entities on which data are collected
2. Variables: a characteristic or quantity that can take on many possible values (a quality or attribute of an element)
3. Observation: the set of values corresponding to a set of variables
4. Variation: the difference in a variable measured over observations
a. It can have a profound effect on business performance
b. Ex. sales, ROI

Structured Data
- Can be entered in a database

Unstructured Data
- Ex. text, audio, video, images

Types of Data

Population and Sample Data
- Population: represents all elements of interest
- Ex. total number of students in RVRCOB
- Sample: subset of a population (use a representative sample; make sure that there is no bias)
- Sample size: >= 30
- How to make sure there's no bias? Sampling methods (random sampling)

Quantitative and Categorical Data
- Quantitative: data can be measured with numbers (stock prices, # of stocks issued)
  - Continuous (how much) and discrete (how many)
  - Always numeric
  - Ordinary arithmetic operations are meaningful for quantitative data
- Qualitative: non-numerical data, that is, categorical data (symbols, qualities)
  - Labels or names used to identify an attribute of each element
  - Nominal and ordinal
  - Can be either numeric or nonnumeric
  - Appropriate statistical analysis is rather limited
>>> The statistical analysis that is appropriate depends on whether the data for the variable is categorical or quantitative (know which analysis method to use)

Scales of Measurement
1. Nominal (lowest information)
a. Data are labels or names used to identify an attribute of the element
b. A non-numerical label or numeric code may be used
2. Ordinal
a. Has the properties of nominal data
b. The order or rank of the data is meaningful
c. A non-numerical label or numeric code may be used
d. Ex. level of satisfaction (rich, standard, and poor)
3. Interval
a. The data have the properties of both nominal and ordinal data
b. The distance between values is meaningful
c. Interval data are always numerical
d. Ex. temperature, CAT scores
4. Ratio (highest information)
a. The data have all the properties of interval data, and the ratio of two values is meaningful
b. Variables such as distance, height, weight, and time use the ratio scale
c. The scale must contain a zero value that indicates that nothing exists for the variable at the zero point
d. Extent of difference between two values

Data Visualization
- With so much information being collected in a business, you must have a way to visualize or interpret the data
1. What areas to improve on?
2. Which factors affect customer satisfaction/dissatisfaction?
3. Who should be the customers that they have to sell specific products to? (demographics, consumer profile)
- Provides visual context through charts, tables, and maps
- Visual data communicate information in a fast, universal, and effective way
- Managers are not the ones analyzing the data; they are recipients of the data
- Analysts must make it as simple as possible so the data are easier to comprehend
- Monitor aspects of the business (ex. zoo: attendance data, the locations where visitors spend most of their time, which items sell most)
- Highlights the substance of your data, taken from the raw data
- Removes errors

Data Ink Ratio
- Measures the proportion of data-ink to the total amount of ink used in a table or chart
- Data-ink: the ink necessary to convey the meaning of the data to the audience
- Helpful for creating effective tables
- Low data-ink ratio: use of unnecessary vertical/horizontal lines
- High data-ink ratio: minimalist, as simple as possible

When to create a table?
1. The reader needs to refer to specific numerical values
2. The reader needs to make precise comparisons between different values and not just relative comparisons
3. The values being displayed have different units or very different magnitudes

Table Design Principles
1. Avoid using vertical lines in a table unless they are necessary for clarity
2. Horizontal lines are generally necessary only for separating column titles from data values or when indicating that a calculation has taken place

Charts
- Visual methods of displaying data
- Scatter chart: graphical representation of the relationship between two quantitative variables
- Trendline: a line that provides an approximation of the relationship between the variables
- Line chart: a line connects the points in the chart (time series)
- Sparkline: displays only the line of data (minimalist)
- Histograms: quantitative data, variable of interest and frequency measure

Numerical Measures

Mean
- Average value
- Sum of observations / # of observations

Median
- Value in the middle (odd number of observations)
- Average of the two middle values (even number of observations)

Mode
- Value that occurs most frequently
- Multimodal: at least two modes
- Bimodal: exactly two modes

Range
- Largest value minus smallest value

Variance
- Measure of variability that utilizes all the data
- Based on the deviation about the mean (difference between the value of each observation and the mean)
- Spread between numbers in a data set
- How far each number in the set is from the mean

Standard Deviation
- Positive square root of the variance
- Measures how much individual data points vary from the mean of a set of data
- Helps calculate margins of error in customer satisfaction surveys, the volatility of stock prices, and much more

Percentile
- The value of a variable at which a specified percentage of observations are below that value
- Q1 = 25th percentile
- Q2 = 50th percentile (median)
- Q3 = 75th percentile

Z-score
- Measures the relative location of a value
- Helps determine how far a particular value is from the mean relative to the data set's standard deviation
- Standardized value
- A value with a z-score less than -3 or greater than +3 is an outlier

EXCEL CODING
Mean                 =AVERAGE
Median               =MEDIAN
Mode                 =MODE.MULTI
Range                =MAX()-MIN()
Variance             =VAR
Stand. Dev           =STDEV
Coefficient of Var   =STDEV()/AVERAGE()
Q1                   =QUARTILE.EXC(xxx,1)
Q2                   =QUARTILE.EXC(xxx,2)
Q3                   =QUARTILE.EXC(xxx,3)
Z-score              =STANDARDIZE
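
For readers working outside Excel, the numerical measures in these notes can be sketched with Python's standard `statistics` module. This is only an illustration under stated assumptions: the function names `describe`, `z_scores`, and `outliers` and the sample data are hypothetical, the variance and standard deviation are the sample versions, and the quartiles use the module's "exclusive" method, which is assumed here to correspond to QUARTILE.EXC.

```python
import statistics


def describe(values):
    """Return the summary measures from the notes for a list of numbers."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation
    # Three cut points that split the sorted data into four groups.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "mean": mean,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),   # list of all modes
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance
        "stdev": stdev,
        "coef_of_variation": stdev / mean,        # standard deviation / mean
        "q1": q1,
        "q2": q2,
        "q3": q3,
    }


def z_scores(values):
    """Standardize each value: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [(v - mean) / stdev for v in values]


def outliers(values, threshold=3.0):
    """Values whose z-score is below -threshold or above +threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


sample = [2, 4, 4, 5, 7, 8, 12]  # hypothetical data, not from the notes
summary = describe(sample)
```

With very small samples no value can reach a z-score of 3 in absolute value, so the outlier rule only becomes informative once there are enough observations.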
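
The loan example from the prescriptive analytics discussion (a prediction combined with a rule) can also be sketched in a few lines of Python. Everything here is an assumption for illustration: the `Applicant` class and `decide` function are hypothetical names, the default probability is taken as given from some predictive model that is not shown, and the 0.6 figure is read as a threshold that must be exceeded before the loan is refused.

```python
from dataclasses import dataclass

# Rule from the notes: a high estimated probability of default means no loan.
DEFAULT_THRESHOLD = 0.6


@dataclass
class Applicant:
    name: str
    default_probability: float  # output of a predictive model (assumed given)


def decide(applicant, threshold=DEFAULT_THRESHOLD):
    """Prescriptive step: apply the rule to the predicted probability."""
    p = applicant.default_probability
    if not 0.0 <= p <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    return "reject" if p > threshold else "approve"


# Hypothetical applicants, not taken from the notes.
risky = Applicant("first-time borrower A", 0.75)
safe = Applicant("first-time borrower B", 0.20)
decisions = {a.name: decide(a) for a in (risky, safe)}
```

The predictive model only supplies the probability; it is the rule in `decide` that turns the forecast into a recommended action, which is what makes the combination a rule-based prescriptive model.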