
What is operational excellence?

- Most of us have rented cars while on vacation or on a business trip. When you arrive at
your destination airport, you go to the rental car facility. The lines are long, but no
worries, because you're a gold preferred customer. So you head straight to the marquee
board, expecting to find your name and car listed there. Surprise, surprise, your name is
nowhere to be found. So now, you have to go into the rental car office, hoping to speak
to someone at the gold preferred counter. But that counter is closed for the night. So
now, you have to stand in that long line with the regular customers. But wait, your
nightmare is not over yet. When you finally get to the counter after that long wait, you
are told that they ran out of cars and the only vehicles available are luxury SUVs for a
higher rate. What's wrong with this picture? You had a reservation. You're a VIP, a gold
preferred customer. But all that went out the window. You had a difficult and bad
experience, not better. It took longer and it cost more. Operationally, it was far from
excellent. What does it mean to be operationally excellent? In general, what is operational
excellence? Simply put, it is better, faster, and cheaper. Better. Better quality of products
and services, better processes, better user experience, and better value. Better in
whatever's relevant and important. For example, if safety and reliability are
important, then better safety and better reliability. Better means improved performance
on metrics in the quality dimension. Quality in its broadest sense. Faster. Faster service,
faster response, faster processing, and on-time or faster delivery, if customers
prefer. Faster means improved performance on metrics in the time or temporal
dimension. Cheaper. Cheaper to operate, cheaper to process, and cheaper for
customers. Cheaper means being more efficient in the use of resources and improved
performance on metrics in the cost dimension. To achieve operational
excellence, organizations need to have processes that are effective and efficient in
delivering value. Processes need to be well-designed, capable, and consistent, where the
people who perform the work are capable and have the means of knowing what's important,
what to do, when to take action, and what actions to take. They have the necessary
process authority, responsibility, and accountability. Organizations also need to have the
tools and techniques for design, improvement, and control. For example, for design,
there's Design for Six Sigma. For improvement, there's Six Sigma and Lean. For control,
there's Value Stream or Process Management. In addition, operational excellence means
having the mindset and behaviors where everybody wants to and is able to be
operationally excellent. The right mindset is embraced and the right behaviors are
encouraged by the leadership, enabled by targets and metrics, and encouraged
by performance goals, rewards, and recognition. Finally, an organization cannot
achieve operational excellence without a fourth element: enterprise-wide alignment of
strategies, priorities, policies, and decisions. This has to be across different functions and
up and down all levels of the organization to consistently drive the right behaviors and
results. These four elements are essential to achieve operational excellence. To be
better, faster, and cheaper. You want to get to the point when everyone in the
organization is willing and is able to do well, to be operationally excellent
every day. Everyone is engaged, enabled, and empowered to achieve operational
excellence and the desired results for competitive advantage. So the next time you rent
a car, hopefully you have a much better experience, receive faster service at a cheaper
price, and drive off in a car you reserved.

Process stakeholders and SIPOC


- Whether you know it or not, you participate in one or more processes every
day. Whether you work in a restaurant, a call center, a bank, a factory, or a bakery, you
are a process stakeholder. What is a process stakeholder, and what are the roles of a
process stakeholder? In other words, what are your roles in a process? Process
stakeholders are people involved in a process, people who have a stake in a
process who have an impact on how well the process performs. When people think of
process stakeholders they think of people who perform the process, the operators or
processors. Processors perform the work of transformation converting or transforming
inputs to outputs in one or more activities in a process. For example, when you conduct
an analysis for your boss, your work transforms data to useful information. You are a
processor. But there are other roles in a process. In addition to processors there are
suppliers and customers. Suppliers provide inputs to the process or process
activities. Inputs can be materials, parts, documents, data, or information needed for
work to be carried out. Most people think of suppliers as outside suppliers, external to
the process. But there are suppliers internal to the process. You and everyone else in the
process are suppliers. How is that possible? When you receive the work output of
someone upstream in the process, that person is your supplier. Similarly, when you
provide your work output to someone downstream in the process, such as when you
provide the results of your analysis to your boss, you are performing the role of
supplier. Your boss who receives your output is your customer, but you are also a
customer. Remember when you received and used data from IT to prepare that report
for your boss? You are a customer of IT. To summarize, in this example you performed
the roles of processor, supplier, and customer. Processor when you transform inputs to
outputs. In this case you transformed data to useful information. Supplier when you
provide the results of your analysis to your boss, and customer when you receive from
IT the data for your analysis. These three roles were first conceptualized decades ago by
the late quality guru, Dr. Joseph M. Juran as the triple role concept. These process
stakeholder roles come with responsibilities. Responsibilities are prerequisites for
operational excellence. As a supplier you are responsible for knowing what's important
and critical to your customers. Also you have to understand the needs and
expectations for your outputs. In this example you need to understand how your boss
intends to use the results of your analysis. This way the purpose and scope of the
analysis can be defined. As a processor you must know what's important and what must
be done well in the process. In other words, what performance criteria and process
metrics should be established and monitored, and what targets must be set and
achieved in order to satisfy customers. This information determines how you should
plan, perform, and monitor your work in the process. As a customer you are responsible
for communicating your needs and expectations to suppliers. Let them know what's
important to you, and provide them feedback at appropriate intervals to let 'em know
how well they're doing as your supplier. Every process stakeholder is a supplier, a
processor, and a customer. Executing these roles and responsibilities well is a
prerequisite for operational excellence.

Voice of the customer, CTQs, and metrics


- When you order takeout online from a restaurant, what's important to you? Well
let's say you have a peanut allergy. You order a seemingly innocent food item, like a
grilled cheese sandwich, and you state in your order that you have a peanut allergy. In
short, you probably don't want any peanuts or traces of peanuts in your food. No
peanut allergens in your grilled cheese sandwich is what's important to you, the
customer. But these are expressed from the customer's viewpoint. These are what we
call the Voice of the Customer or VOC. VOC are needs and expectations expressed in
the customer's language. Now put yourself in the shoes of the restaurant owner. How
can you make these words meaningful to your employees as they make and deliver
orders such as this one? You will have to translate them from the customer's
language. You have to put it in terms that are meaningful to your employees' work. Put
another way, you have to translate the voice of the customer into critical-to-quality
requirements or CTQs. What are CTQs? CTQs are the performance characteristics of a
process, product, or service that are critically important to customers. CTQs are
measurable. They measure how good performance needs to be to satisfy the needs and
expectations of customers. Back to our grilled cheese sandwich. From the voice of the
customer, or VOC, we know that this customer wants no peanut allergens in their food
order. We can translate no peanut allergens in the grilled cheese sandwich to no peanut
allergens in the ingredients and careful handling to avoid any cross-contamination. So
the critical-to-quality requirements, or CTQs, are no peanut allergens in the
ingredients and no cross-contamination. These CTQs can be measured by A, the amount of
peanut allergens in the sandwich, and B, how free of cross-contamination the cooking
utensils and preparation by the staff are. We can specify how well the restaurant must
perform on these metrics to satisfy this customer. In other words, we can determine the
specifications and targets for these CTQ metrics. In this example, the CTQ targets or
specifications will be zero amount of peanut allergens in the ingredients and zero cross-
contamination from all preparation and utensils. To recap, CTQs are the measurable
performance characteristics of a process, product, or service that are critically important
to customers. With CTQs you know what metrics to monitor and how well they must
perform to satisfy customers. In our example you want to monitor the ingredients,
preparation, and utensils. On an on-going basis beyond this order all food preparation
and utensils should be monitored to avoid any cross-contamination. CTQs provide
customer focus for your process on a day-in day-out basis, and if you embark on any
improvement or design projects, don't lose focus on the customer. Ensure that you
measure CTQs pre- and post-project. After all, we definitely don't want customers to
suffer a preventable adverse reaction.
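
To make this concrete, here is a minimal sketch in Python of how the VOC, the CTQs, their metrics, and the zero-tolerance targets from the sandwich example could be linked and checked. The metric names and values are hypothetical, added only for illustration.

```python
# Minimal sketch: linking VOC to CTQs, metrics, and targets (hypothetical names and values).

ctqs = [
    {
        "voc": "No peanut allergens in my order",
        "ctq": "No peanut allergens in the ingredients",
        "metric": "peanut_allergen_amount",      # measured amount found in the sandwich
        "target": 0.0,                           # zero tolerance
    },
    {
        "voc": "No peanut allergens in my order",
        "ctq": "No cross-contamination in preparation",
        "metric": "cross_contamination_events",  # contaminated utensils or prep steps found
        "target": 0,
    },
]

def check_order(measurements: dict) -> list:
    """Return the CTQs whose measured value misses the target."""
    failures = []
    for c in ctqs:
        measured = measurements.get(c["metric"])
        if measured is None or measured > c["target"]:
            failures.append(c["ctq"])
    return failures

# Example usage with made-up measurements for one order:
print(check_order({"peanut_allergen_amount": 0.0, "cross_contamination_events": 0}))  # []
```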

Kano model and its implications


- When you shop online, board a flight, or stay at a hotel, what's important to you? Is
everything equally important? For example, is having a safe flight of equal importance
as having free champagne? Definitely not. These quality attributes describe different
aspects of the experience, service, or products, and they are indeed different, and
should be treated as such. But wait, there is a model to help you do just that. It's called
the Kano model. Named and developed by Dr. Noriaki Kano to model how customers
perceive quality, it is based on the level of achievement on three types of attributes that
impact customer satisfaction and dissatisfaction. The three types are dissatisfiers, or
basic quality attributes; satisfiers, or performance quality attributes; and delighters, or
excitement quality attributes. Here is a Kano model where the vertical scale is increasing
satisfaction as you go higher up above the horizontal axis, and increasing dissatisfaction
as you go further down below the horizontal axis. The horizontal scale shows the degree of
execution from nonfulfillment on the left of the vertical axis to increasing fulfillment on
the right of the vertical axis. Let me describe each of these in turn. Dissatisfiers, or basic
quality attributes. They must be present, otherwise customers will be dissatisfied. These
attributes are essential must-haves or must-be's. Addressing them will reduce or
eliminate dissatisfaction, but will not increase satisfaction. For example, nobody jumps
for joy when they have a safe flight, or when there are clean towels in a hotel. It is a
basic requirement. However, not having a safe flight or not having clean, unused
towels brings about dissatisfaction. That is why basic quality attributes are called
dissatisfiers. Their presence does not increase satisfaction, but their absence results in
dissatisfaction. Addressing dissatisfiers, or basic quality attributes, will reduce
dissatisfaction. However, reducing dissatisfaction is not the same thing as increasing
satisfaction, and this brings us to satisfiers. Satisfiers, or performance quality
attributes. Satisfaction and dissatisfaction levels vary according to the level of execution
on these attributes. For example, with cars and trucks, better fuel economy or higher
miles per gallon increases satisfaction and lower miles per gallon lowers
satisfaction. Finally, delighters, or excitement quality attributes. The absence of these
attributes does not result in dissatisfaction, but their presence can lead to delight, or a
wow reaction, where satisfaction shoots up exponentially. For example, getting free
champagne during a flight in coach or economy class. Or a complimentary spa
treatment during your hotel stay. So what are the implications of the Kano model? Let's
discuss how they impact design projects, improvement projects, and day-to-day
management. In design projects, dissatisfiers, or basic quality attributes are
nonnegotiable and must all be addressed. Performance quality attributes, or
satisfiers, must be sufficient to achieve good customer satisfaction levels. And to
increase market share, delighters must be present. For improvement projects, if the
intent is a reduced customer dissatisfaction, focus on ensuring that all dissatisfiers, or
basic quality attributes, are addressed. If the intent is to improve customer
satisfaction, then improve on the satisfiers, or performance attributes, first before
addressing any delighters. For day-to-day operations, ensure that basic quality
requirements are always met, satisfiers, or performance quality levels, are
maintained, and any delighters that are present remain present. So the next time you
shop online, board a flight, or stay at a hotel, I hope there is no dissatisfaction. Instead,
you can get some satisfaction and even be delighted.
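
The three attribute types can also be summarized as a small lookup. Here is a minimal sketch with deliberately simplified, hypothetical rules: basic attributes only hurt when unfulfilled, performance attributes move satisfaction in both directions, and delighters only help when present.

```python
def satisfaction_effect(attribute_type: str, fulfilled: bool) -> str:
    """Simplified Kano-style rules for how fulfillment affects satisfaction."""
    if attribute_type == "basic":        # dissatisfier / must-be
        return "neutral (expected)" if fulfilled else "dissatisfaction"
    if attribute_type == "performance":  # satisfier
        return "satisfaction rises" if fulfilled else "satisfaction falls"
    if attribute_type == "delighter":    # excitement attribute
        return "delight" if fulfilled else "neutral (not expected)"
    raise ValueError(f"Unknown attribute type: {attribute_type}")

# Example: a safe flight versus free champagne in economy class.
print(satisfaction_effect("basic", True))       # neutral (expected)
print(satisfaction_effect("basic", False))      # dissatisfaction
print(satisfaction_effect("delighter", True))   # delight
```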

Variation
- A customer places an order and is told it will take two to 10 days for delivery. You
make a request for a transaction in your company and you're told it will happen in five
to 10 days. Why is there so much variation? A more important question is what
happens when you try to promise, budget, staff, schedule, and plan on getting things
done correctly and on time when there is so much variation? The answer is with a lot of
difficulty, uncertainty, increased costs, and mixed results and performance such as not
meeting deadlines, not meeting service level agreements, and other contractual
obligations. Not to mention, upsetting customers and potential loss of business. Let's
understand variation. Let's say there are five activities or processing steps needed to
complete a transaction. These steps could be mapped onto many different kinds of
processes. So, think about a process that's relevant to your work. It can be
manufacturing, services, or healthcare. But no matter what the process is, there is
variation in each step. In step one, it may take anywhere from one to four hours. Step
two, takes one to eight hours. Step three, one to three hours. Step four, one to six, and
step five, one to three hours. So the entire process can take anywhere from five to 24
hours. Mathematically speaking, averages or means add up, but standard deviations do
not. It is the squares of the standard deviations, in other words the variances, that add
up. The variance of the total processing time is the sum of the variances of each of the
five steps. That is why you see a very wide processing time curve for the whole
process. Variation does not just impact time performance. Variation also impacts how
well specifications are met or not met. In other words, variation impacts quality. There
are specifications for services and products. These specifications are expressed as
specification limits. Here's the curve showing an example of the distribution compared
to the specification limits. The performance is off target and has wide variation. Because
large areas of the curve are outside the lower and upper specification limits, defects
occur frequently. While companies make changes to adjust the mean or average to be
on target, as shown here, the wide variation still results in performance outside the
specification limits. In other words, defects still occur. How often do companies report
only means or averages? And when the mean hits a target, they celebrate, or should
they? For example, we did very well last month. Our average or mean delivery time was
33 hours and that is less than our 34 hour guarantee. But then, why are there customer
complaints on late deliveries? This diagram shows why. It is the variation. Variation has
to be reduced to improve performance. Six Sigma projects are excellent for
reducing variation. Here's the result. With performance on target and the variation
reduced, there is a huge reduction in defects and quality improves. Let me conclude by
sharing an everyday example. How many of us have experienced shopping online and
buying products that state assembly is required? What if there is variation in the size of
each of the parts or components beyond their specification limits? Well, as we learned
earlier, variances do add up. So the cabinet that you order online might not fit properly
together when you try to assemble it. Doors hang unevenly, screws don't fit in
holes. This is because of variation. That is why, where possible, I avoid buying any
product that requires assembly. So the next time anyone brags about average
performance being on target, ask them about variation.
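
The arithmetic behind "means add up, but standard deviations do not" is easy to check in code. Here is a minimal sketch that assumes the five step times are independent, uses the midpoints of the ranges above as the means, and makes up the standard deviations purely for illustration.

```python
import math

# Hypothetical per-step processing times: (mean hours, standard deviation hours).
# Means are the midpoints of the ranges in the example; SDs are illustrative only.
steps = [(2.5, 0.5), (4.5, 1.2), (2.0, 0.4), (3.5, 0.9), (2.0, 0.4)]

total_mean = sum(mean for mean, _ in steps)        # means add up
total_variance = sum(sd ** 2 for _, sd in steps)   # variances add up (independent steps)
total_sd = math.sqrt(total_variance)               # NOT the sum of the standard deviations

print(f"Total mean time: {total_mean:.1f} h")
print(f"Total standard deviation: {total_sd:.2f} h")
print(f"Sum of the individual SDs (the wrong way): {sum(sd for _, sd in steps):.2f} h")
```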

Quality at the source


- How often do you have to redo or rework someone else's work output? Or how often
are defects and problems found at the end of a process? If a defect occurs in step
two, why is it not discovered until the 99th step when inspection takes place? Think of
all of the costs incurred in processing the defective item from step two to step 99. What
about the cost of scrap or, if it's salvageable, the cost of repair and rework and re-
inspection? Or worse yet, the defects remain undetected until they end up in customers'
hands. What about the cost of recalls, warranty claims, lawsuits, the loss of customers,
and bad reputation? Quality should be assured at the source. The principle of quality at
the source means that quality is assured at the source where work is performed so that
no defects are passed down to the next step or to subsequent processes. The source can
be a person, such as a waiter taking an order, a workstation in a factory or a call
center, a department processing a transaction, or, externally, a supplier who supplies
products and services. Quality at the source can be achieved in one or more of the
following ways: quality by design, quality by process monitoring and control, and quality
by self-checks and verification. Quality by design is used to ensure that it is always done
right the first time. To ensure quality by design, mistake proofing, error proofing, or
poka-yoke as it is called in lean are used. For example, when completing an online
form, it will not allow you to proceed to the next screen if all essential fields in the
current screen are not completed correctly first. By design, it is impossible to
proceed with missing information. Let me share another example: charging your
phone. By design, the charger plug can fit into the phone facing up or down. Both ways
will fit so you don't risk damaging the phone or charger. It is designed to avoid the
tendency to shove the plug into the phone the wrong way. Quality by process
monitoring and control is where tools such as statistical process control are used to
ensure that critical process parameters are monitored and controlled. For example, the
control of baking time and oven temperature when baking pizzas. Quality by self-check
and verification. To assure quality at the source this way, develop and implement
procedures and protocols that mandate it, and ensure compliance to those
procedures. For example, airline pilots are required to perform a preflight
checklist before pulling away from the gate. Another example, before a surgery, nurses
verify the patient's identity and surgical procedure, comparing verbal responses to
patient records. Internally, everyone should have the responsibility and authority to pull
the cord to stop defects from being produced and to bring it to the attention of
supervisors or management, all without any fear of repercussions. Management and
employees must embrace this mindset, the mindset of doing it right the first time, a
mindset where quality trumps production quotas, where metrics and rewards encourage
good quality output and do not reward quantity over quality. Externally, at
suppliers, quality at the source can be implemented the same way, by design, by process
monitoring and control, and by self-checks and verification. As a customer, you can
conduct audits to verify supplier performance. Not only audits on products, but also on
processes to ensure that supplier processes have sufficiently good process capability to
produce defect-free products. If these audits are successful, inspections at the receiving
dock can be reduced or even eliminated. Just-in-time deliveries can then be made
directly from the supplier to the points of use in your production line. In summary,
quality at the source facilitates just-in-time flow, being more responsive to customer
demand without any hiccups or delays. It is more cost-effective to do it right the first
time to prevent defects from occurring or from being passed down to subsequent steps
and processes. So, the next time you have to rework or redo someone else's work, think
about quality at the source.
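
As a small illustration of quality by process monitoring and control, here is a minimal sketch of an oven-temperature check. It assumes readings are sampled periodically from a stable process and uses the common three-standard-deviation control limits; a real statistical process control chart applies more rules than this.

```python
import statistics

def control_limits(historical_readings, sigmas=3):
    """Compute mean +/- 3 sigma control limits from stable historical data."""
    mean = statistics.mean(historical_readings)
    sd = statistics.stdev(historical_readings)
    return mean - sigmas * sd, mean + sigmas * sd

def out_of_control(reading, limits):
    """Flag a reading that falls outside the control limits."""
    lower, upper = limits
    return reading < lower or reading > upper

# Hypothetical oven temperatures (degrees F) from a stable baking process:
history = [424.8, 425.3, 424.9, 425.1, 425.0, 424.7, 425.4, 425.2]
limits = control_limits(history)

print(limits)
print(out_of_control(428.5, limits))  # True -> investigate before defects are passed on
```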

Error-proofing or poka-yoke
- When you shop online, the credit card number you input has to be accurate, otherwise
you cannot complete your order. That's an example of error proofing, or mistake
proofing. In lean, it's often called poka-yoke, from the Japanese terms poka, meaning
mistake, and yokeru, meaning to avoid. To
avoid inadvertent error or human error. Wouldn't it be useful to error proof or mistake
proof your processes, products, and services? Let's discuss the basic principles of error
proofing and how they can be applied. Error proofing is best when it prevents error
from occurring. If that's not possible, the next best thing is to facilitate the work so that
errors are minimized. Lastly, if errors do take place, then detection should be
made obvious and immediate, or be automated. There are basically three levels of error
proofing. Here they are in order of preference. One, prevention. Two, facilitation. Three,
detection. The first and most preferred is error proofing by prevention. An example is
the traction control system in cars. It works actively full-time to prevent wheels from
overspinning on slippery roads or low-friction surfaces, regardless of how good or bad
the weather or the driver is. Prevention-based error proofing is also possible in
processes. For example, in-store pickup of a prepaid online order. In addition to
showing a driver's license, the same credit card must be scanned for the system to
authorize and process the pick up. If prevention is not possible, then the next choice is
error proofing by facilitation. An example of this is anti-lock brakes, in which the car
helps facilitate your application of brakes in an emergency, so it is done efficiently,
effectively, and the brakes don't lock up. Similarly, in processing transactions, dedicated
pre-programmed buttons, such as those on point-of-sale systems for cheeseburger or
large fries, are used. Also, you can connect your point-of-sale system to inventory
management. It can help you track inventory and prompt you when and how much to
order. Or as I've experienced recently, much easy-to-assemble furniture is not easy to
assemble. However, I've come across some well-designed, truly easy-to-assemble
furniture where parts to be assembled to each other are labeled with the same matching
numbers and the parts fit perfectly. Facilitation-based error proofing makes it easier and
it minimizes errors, but it does not prevent non-compliance due to forgetfulness or
human error such as pressing the wrong button by mistake. If prevention or facilitation
is not possible, then employ the third choice, error proofing by detection, where
detection of errors is immediate, either by being made obvious or by automated
detection. An example is the annoying beeping sound and warning light in your car to
alert you when you forget to put on your seatbelt. Another example is the fuel
gauge, where the warning light comes on when the gas tank is low, with a range of less
than 30 miles left. The only trouble is, for me, when I finally notice the light, I'm never
quite sure if it just came on or if it came on 25 miles ago. To summarize, if prevention is
not possible, use a combination of facilitation and detection. An example is a
combination of anti-lock brakes, seatbelts, and airbags. So prevent errors if possible. If
not, minimize errors or, at the very least, detect and mitigate their effects immediately.
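
Here is a minimal sketch of prevention-based error proofing for the online-form example. The form fields are hypothetical; the point is simply that the submit step refuses to proceed while required information is missing, so the error cannot pass downstream.

```python
REQUIRED_FIELDS = ["name", "shipping_address", "credit_card_number"]  # hypothetical fields

def submit_order(form: dict) -> str:
    """Prevention: do not allow the order to proceed with missing required fields."""
    missing = [f for f in REQUIRED_FIELDS if not form.get(f)]
    if missing:
        # The error is stopped at the source; the user is told immediately what to fix.
        raise ValueError(f"Cannot proceed, missing fields: {', '.join(missing)}")
    return "Order accepted"

# Example usage:
try:
    submit_order({"name": "Ada", "shipping_address": "123 Main St"})
except ValueError as e:
    print(e)  # Cannot proceed, missing fields: credit_card_number
```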

Lean principles
- Remember the last time you were delayed, or when you made multiple trips to a
customer work site? Or when you had to rework or redo a task? Or spend time, effort,
and money to hurry up only to wait at the next step? Bottom line, let's just say what
you've experienced is simply not lean. Lean can be viewed as a management
philosophy, a mindset, a methodology, a tool set, or an approach to daily work. At its
core, it's all about minimizing waste and maximizing value to customers. Value is what
customers need and expect and are willing to pay for. This includes receiving the right
products and services at a specified price, time, and place. For example, if you see an
extra charge on your invoice because warehouse personnel spent five hours looking for
your item, would you want to pay for that? Of course not! It is of no value to you. Waste
is anything that is of no value or adds no value for customers. Waste comes in many
forms. Eight types of waste have been identified. I know a guy named Tim Woods
who can help us remember it all. TIMWOODS is an acronym. T for transport, that's
movement of people, materials, products, or documents between activities or
locations. I for inventory, just-in-case additional inventory, whether it is raw materials,
work in process, or finished goods inventory. M for motion, movement within an
activity that does not add value for the customer. W for waiting, time wasted between
activities, waiting for required resources, materials, parts, people, or information. O for
overproduction, producing more than what's needed. For example, making 20 copies
when only two were needed. O for overprocessing, inappropriate, excessive
processing and unnecessary duplication of work that adds no value. For example,
polishing your presentation slides repeatedly after multiple rehearsals before presenting
to senior management. D for defects, defective work, defective items, or any undesired
outcome that adds no value. S for skills underutilized. This refers to underutilization of
employee skills and intellect such as when a highly qualified scientist or engineer is
assigned to administrative work. So that's TIMWOODS, the eight types of waste. Lean
targets the elimination of waste to improve efficiency, flow, and speed. Here are some
key concepts of lean. Value stream, a value stream is all those processes, activities, and
resources, including information, used to transform inputs into outputs that are salable
to customers. The elimination of waste will improve efficiency, speed, and flow of the
value stream. Theory of constraints, the output of a value stream is only as fast as the
slowest processing step, bottleneck, or constraint. To improve the rate of output or
throughput, focus on improving the constraint until it is no longer a limiting factor. That
is the crux of the theory of constraints, developed by Eli Goldratt. Pull instead of push,
customer demand pulls the order, product, or transaction throughout the value stream
from suppliers to customers. Traditionally, products have been pushed to customers by
suppliers or producers, regardless of whether they want it or not or if they're ready to
receive it or not. That is why you see year-end clearance sales. Pull, on the other hand, is
about customer demand pulling and authorizing work and delivery as and when it is
needed. Just-in-time flow, items or transactions should be produced, processed, or
delivered just-in-time, as they are needed, at the same rate as customer demand. If goods or
services are produced to the drum beat or rate of customer demand, where a demand
for an item signals a production and delivery of that item from the next upstream
workstation, then there's no need for inventory. When you find yourself redoing
something, making unnecessary trips, waiting in line, or stuck with items you don't
need, fix the issue by applying lean concepts.
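
The theory of constraints point, that throughput is capped by the slowest step, can be seen in a few lines of code. The step names and rates below are hypothetical.

```python
# Hypothetical capacity of each processing step, in units per hour.
step_rates = {"order entry": 60, "assembly": 25, "inspection": 40, "packing": 50}

def throughput(rates: dict) -> tuple:
    """The value stream can go no faster than its slowest step, the constraint."""
    bottleneck = min(rates, key=rates.get)
    return bottleneck, rates[bottleneck]

print(throughput(step_rates))     # ('assembly', 25)

# Improving a non-constraint step changes nothing...
step_rates["packing"] = 80
print(throughput(step_rates))     # still ('assembly', 25)

# ...while improving the constraint raises throughput until another step becomes the limit.
step_rates["assembly"] = 45
print(throughput(step_rates))     # ('inspection', 40)
```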

Process mapping
- They say that a picture is worth a thousand words. Yes, it is much easier if you are able
to visualize what you're trying to understand, manage or improve. A process map is a
diagram that provides a visual representation of the process flow, or a sequence of
activities or steps that take place in a process from start to finish. There are different
types of process maps. At the highest level is the high-level process map. This provides
a view of the process at 10,000 feet. A high-level process map displays the main
activities or major steps in the process, usually showing a whole process in 10 or fewer
major steps. If any of these major process steps needs more granularity to be better
understood, utilize process decomposition. This is when you drill down or decompose
those specific steps into more detail using detailed process maps. A detailed process
map provides sufficient granularity to enable the project team to understand what is
going on, not going on, and display where the decisions, re-work loops, delays,
bottlenecks, and workarounds occur. If multiple groups are involved in a process, then a
swimlane process map will be useful to map the cross-functional process. Think of a
swimlane process map as a detailed process map that has been allocated in the
respective lanes where the activities are performed. Visualize an Olympic swimming
pool where you have multiple lanes, one for each swimmer. A lane for each group,
function, or department involved in the cross-functional process. A swimlane process
map is also called a deployment map because it shows where the work is deployed. You
may ask, so what, why bother with a swimlane process map? A swimlane map shows
which group or department is performing each process step. And where the handoffs
are. Handoffs are the weak links in a process. Where things can fall through the
cracks due to miscommunication between departments resulting in delays, mistakes and
defects. Being able to see these opportunities for failure is very useful. So if you have a
cross functional process, map it using a swimlane process map. Another type of process
map that has gained popularity with lean and operational excellence is the value stream
map. A value stream map is a diagram that shows the major steps involved in getting a
product or service from supplier to customer. It shows the material and information
flows from order to delivery. It is basically a high-level process map with additional
information, such as customer data, processing data, and information flows pertinent to
the value stream. At a glance, we can see the end-to-end process. From order to
delivery. Or from check-in to check-out. It shows the flow of information and
material including process steps, processing time, cycle time and the number of
servers. We can see the backlog or work in process, in front of each process step. The
timeline at the bottom of the map shows the lead time and actual processing times. The
value stream map provides a snapshot of the entire value stream and its performance. To
summarize, the different types of process maps discussed are: high-level process
map, detailed process map, swimlane process map or deployment map, and value
stream map. Using these process maps, you can create a picture of the process that is
really worth a thousand words.

FMEA: Failure modes and effects analysis


- Take any process in your organization. On any given day, what can possibly go
wrong? As the saying goes, whatever can go wrong will go wrong. So what can be done
to anticipate and mitigate the risk of failure? Well there is a tool to help you do just
that. It's called FMEA or Failure Modes and Effects Analysis. It is a tool for mitigating the
risk of failure. First used in the 1960s for the Apollo Space Program, FMEA is widely used
in many industries and required in some, such as aerospace and automotive. There are
two types of FMEA. The Design FMEA and the Process FMEA. The Design FMEA is for
reducing the failure risks associated with a product or service design such as when you
are designing a new hospital or the next smart device. The Process FMEA is for reducing
the risk of potential failures in a process, any process, whether it is the procurement
process or on-boarding a new hire. FMEA works the same way for both Design and
Process FMEAs. Design FMEAs evaluate the risk of failure for each function, system, or
component, while the Process FMEA evaluates the risks for each process step. These
questions are evaluated. What can possibly go wrong? In what ways can it fail? These
are called potential failure modes. For each potential failure mode, what is its
effect? How severe is the potential effect on a one to 10 scale, where 10 is the
worst? That is called the Severity Score. And what could have possibly caused each
potential failure mode in the first place? How likely is the occurrence of these causes on
a one to 10 scale, where 10 is the most likely? That is called the Occurrence Score. What
controls are currently in place to detect the cause or the failure mode? What is the
likelihood of detection on a one to 10 scale where 10 is least likely to detect? That is
called the Detection Score. The FMEA then multiplies the severity score by the occurrence
score and the detection score to come up with a composite score called the risk priority
number, or RPN. The RPNs can then be used to prioritize the failure modes. The highest
scoring failure modes can be targeted for improvement first to reduce or eliminate the
risk of potential failures. Here's an example of a partial FMEA. With the FMEA, whatever
can go wrong will not go wrong but instead can be anticipated and the risk of failures
mitigated.
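
The RPN arithmetic and prioritization described above can be sketched in a few lines. The failure modes and scores below are made up for illustration; a real FMEA worksheet carries more columns, such as effects, causes, and current controls.

```python
# Hypothetical failure modes with severity (S), occurrence (O), and detection (D) scores, each 1-10.
failure_modes = [
    {"mode": "Wrong ingredient used",     "S": 9, "O": 3, "D": 4},
    {"mode": "Order entered incorrectly", "S": 5, "O": 6, "D": 2},
    {"mode": "Late delivery",             "S": 4, "O": 7, "D": 5},
]

for fm in failure_modes:
    fm["RPN"] = fm["S"] * fm["O"] * fm["D"]   # risk priority number = S x O x D

# Address the highest-RPN failure modes first.
for fm in sorted(failure_modes, key=lambda fm: fm["RPN"], reverse=True):
    print(f'{fm["mode"]}: RPN = {fm["RPN"]}')
```
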
Process control and the control plan
- Do your operators, engineers, or customer service reps know what needs to be
monitored and controlled to achieve the desired outcome? For example, in a pizza
chain, do employees working in the kitchen know what's important and what to watch
for such as what the baking temperature should be and what the actual temperature is
at anytime? Are they authorized to take any action if the heat is too high or too low? Do
they know what actions to take to make it right? Well, there's a tool to address all these
questions. It's called the control plan, or process control plan or matrix; they are all the
same thing. The control plan is a tool that provides process owners and operators with
the means to control a process so that it performs well on an ongoing basis. The control
plan specifies what needs to be controlled; these are control subjects. Control subjects
are factors or parameters that impact the outcome. In our example, one control subject
is baking temperature. What it should be, the desired target or specification. This enables
operators and process owners to know what to aim for, in this case, what should be the
correct range or target temperature. We can specify the target as 425 degrees
Fahrenheit and the desired range as 423 to 427 degrees. What reality actually is, that is,
the control plan specifies exactly how actual performance is made known, including how
the measurement is to be made, how frequently, and where it is recorded. This provides
operators and process owners with information on actual performance thus knowing
whether targets or specifications are met. When to take action and what actions to
take. This is basically a set of action triggers: something happens and a specific behavior
is meant to follow, such as: leave the pizza oven temperature alone when it is between
423 and 427 degrees; when the temperature is above 427, turn the heat down; when the
temperature is below 423, turn up the heat; and so on. Your control plan
lays out all of these important details. Who is responsible and authorized to take
action? In our example, we can state that the cook or the chef is responsible for
temperature. He's authorized to take action and record results and actions taken. The
control plan enables operators and process owners to know what's important and what
needs to be monitored by whom. It helps them know what target or specification is
desired and to know if actual performance is acceptable. Finally, following a control plan
helps you know when to take action and what actions to take. There can be multiple
control subjects specified in any control plan. In a pizza kitchen, this could be oven
temperature, baking time, weight of dough, amount of sauce and cheese, and so
on. Now, this doesn't mean you should place all elements of your operation in a control
plan. Only include the key ones, the key factors driving performance and outcome. As
the saying goes, knowledge is power. The control plan provides employees with the
knowledge, the means, and the authority to take action so as to achieve specified
performance levels.
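
The action triggers in the pizza-oven example translate directly into a small rule. Here is a minimal sketch using the 423-to-427-degree range from above; the function name is hypothetical, and a real control plan would also record who acted and what was done.

```python
LOWER_SPEC = 423.0   # degrees F, from the control plan
UPPER_SPEC = 427.0
TARGET = 425.0

def oven_action(measured_temp: float) -> str:
    """Action trigger from the control plan: intervene only when outside the specified range."""
    if measured_temp > UPPER_SPEC:
        return "Turn the heat down"
    if measured_temp < LOWER_SPEC:
        return "Turn the heat up"
    return "Leave the oven alone"

# Example readings logged by the cook, who is responsible and authorized to act:
for temp in (425.5, 429.0, 421.0):
    print(temp, "->", oven_action(temp))
```
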
Key roles in operational excellence
- So your company wants to pursue operational excellence. What are the key roles of
executives, managers, and process owners? Well, before we address those
questions, let's pause and consider what operational excellence is and isn't. For starters,
operational excellence is not and should not be thought of nor should it ever be
perceived as a one-time pursuit, a project, or one-off event with a start and end
date. The pursuit of operational excellence never ends. Operational excellence cannot be
assigned or delegated down. Operational excellence is predicated on how the
organization is led, managed, enabled, and empowered, so that everyone executes
well. Executives, managers, and process owners should all focus on executing well to
gain competitive advantage. Their key role is the focus on running a successful
enterprise so that it executes better, faster, and cheaper in delivering value to
customers. Operational excellence is not something extra. It isn't a label or a specific
job, it is a way of operating that should be integrated into all work related to the
business. It should be part of the organization's DNA. The following are key roles of
executives, managers, and process owners. Embrace and encourage the right mindset
and behaviors so that everybody wants to and is able to be operationally
excellent. Ensure that metrics, targets, performance goals, rewards, and recognition
encourage these behaviors. Talk is cheap. You need to ensure and reward the right
performance. For ongoing day-to-day operations, ensure that employees are
capable, have the right skills and training. Enable employees to excel by providing
them with the means to know what's important, what to do, when to take action, and
what actions to take in order to deliver value to customers. Ensure they have sufficient
authority that's consistent with their responsibility and personal accountability. There
will be some processes that are poorly designed and/or performed inconsistently. For
these, improve or redesign them by selecting and launching appropriate projects. But
not every project should be pursued, only the ones that achieve strategic and annual
goals. So be selective. Executives and senior managers should view projects as a means
to achieve strategic and annual goals. Establish project selection criteria to ensure
alignment of projects with those goals, review proposed projects, evaluate, and select
projects. Assign executives or senior managers to be project champions. Review updates
from champions and provide resources as needed. What is a project champion and why
have project champions? The project champion is the management team's point
person who is tasked with ensuring project success. It is also management's way of
making it very clear that they want to see the project succeed. It would be great to
improve everything, but you have limited resources, so you need to prioritize and be
selective. With a project champion from the management ranks, you will not lose sight of
the forest for the trees. Issues need to be addressed, but addressed in context. Project
champions provide that balanced perspective. The project champion should be an
executive or senior manager who has enough clout and respect to remove
roadblocks and to ensure that a project is given priority and has the right resources it
needs to succeed. For employees selected for projects, reallocate their duties to
others, so as to free them up with sufficient time to carry out project work. Recognize
individual contributions to project success in performance appraisals. Provide
resources and training in the relevant operational excellence tools and techniques. For
example, for design projects, it's Design for Six Sigma. For improvement projects, Six
Sigma and Lean. And for control, it's process management or value stream
management. Finally, executives, managers, and process owners should ensure that all
decisions and policies are consistent with and supportive of the strategies and priorities
of the enterprise. Such alignment should be enterprise-wide, across different
functions, and up and down all levels of the organization. This consistency will drive the
right behaviors and results.

Alignment for operational excellence


- As management guru Peter Drucker once said, "What gets measured gets
managed." There are several types of measurements and metrics used across many
organizations. A common complaint among employees is: "We have so many metrics
and so many so-called KPIs," or key performance indicators, "to the point where we
don't really know which ones are truly the key KPIs." There's confusion among
employees as to what is being managed and on which ones they should perform
well. For example, if I am a customer service rep in a call center, am I evaluated on
average call handling time or first call resolution rate? Should I rush through a call, or
should I take the time and effort to resolve a customer problem? Unfortunately, this
confusion is all too common, regardless of whether you're working in the front office in
customer-facing jobs, in back office operations, in manufacturing, or support functions
such as IT and HR. To add to that, employees get mixed messages. For example, those
working on projects are often told by their supervisors that they need to spend less time
on the project and more time on the job in order to hit their numbers, but the project is
to improve a key process in that department. A phrase commonly heard is: "I love
working on this project, but my boss tells me that I have this project but I also have
my real job." Another big problem is the use of counterproductive performance
measures across departments or functions. For example, sales is rewarded on sales
volume and revenue. That's good, right? But it is not uncommon to see sales reps
promise customizations and multiple versions of the same basic product to customers in
order to make the sales numbers. We may ask, "What's wrong with that?" Well,
customizations and the proliferation of products or SKUs is a nightmare for other
functions, such as manufacturing and supply chain. Cost per unit goes up. While this has
no impact on the sales team, it adversely impacts the manufacturing and supply chain
teams. Metrics, performance measures, KPIs, and the accompanying reward,
recognition, performance appraisals, and promotions, these must all be made consistent
and aligned across departments and functions and across organizational levels. Every
employee must be rowing in the same direction for the ship, the enterprise, to sail and
reach its destination successfully. To do that, reduce the number of KPIs to the truly key
ones that support higher level KPIs at higher levels of the organization. There should be
a vertical line of sight from individual employees to the highest level in the
organization. Every employee should be able to see how his or her performance
contributes to how the enterprise succeeds. In addition, there should be a horizontal line
of sight among individuals between different departments and functions who support
the same value stream or end-to-end process. For example, order fulfillment or hiring
and onboarding. Every employee should be able to see how the metrics and KPIs that an
individual is personally accountable for are supporting not just his or her
department and value stream, but the success of the entire enterprise. Alignment should
be enterprise-wide across different functions and up and down all levels of the
organization to consistently drive the right mindset, behaviors, and results.

Choose the right methodology


- When there is a need to identify efforts for operational excellence, which approach or
methodology is appropriate? Well first, we may need to ask, what are the various types
of projects and methodologies for operational excellence? Let's start with process
management or value stream management. If employees do not know how to perform
their work, or know when to intervene to take action, or what actions to take when
running a process, or if there is no effective monitoring and followup, or if there's a lack
of process metrics, discipline, or protocol, then that process requires process
management or value stream management. For design or improvement efforts, use
projects. Projects are very focused planned efforts with a start and end date to achieve
stated goals. These are launched as needed, and they do end when project goals are
achieved. They are not endless. Next, let's talk about lean projects. If the drivers of
performance are known or easily determined using the collective knowledge and
experience of the right people, then identify and select lean projects. Lean projects are
usually executed as kaizen events or workouts. The primary focus of lean projects is
efficiency, speed, and flow. Processes with redundant and wasteful activities are
streamlined. In other cases, the root causes of problems or drivers of performance are
unknown. For this, select Six Sigma D-M-A-I-C, or DMAIC, projects. DMAIC is an acronym
for the five phases of the project methodology. It stands for define, measure,
analyze, improve, and control. DMAIC is used for improvement projects where data
driven analysis is done to determine the key drivers of performance. If a new product,
service, or process is required, or if an existing product, service, or process is so
broken that it needs to be redesigned, then design projects using Design for Six
Sigma, or D-M-A-D-V, should be identified and selected. D-M-A-D-V, or DMADV, is an
acronym for the five phases of define, measure, analyze, design, and verify. Selecting the
correct methodology for each type of project is important. When projects are
selected, project scope and process boundaries, should be defined. Project scope spells
out what's included and what's excluded. For example, in scope is Product K and its
residential customers. But its commercial customers are out of scope. Process
boundaries spell out the start and end points of relevant processes. For example, if it's a
package delivery process, when does the process start? Does it start when a package is
dropped off? Or does it start when the customer calls for a pickup? When does the
process end? When the package is delivered to the receiving dock of a factory or office
building? Or does it end when the recipient gets the package at his or her desk? Usually
these processes are cross-functional processes. The project team has the challenge of
getting buy in and support from process owners and other stakeholders from across
multiple functions, departments, or groups. A tool that is useful for scoping process
boundaries and for identifying stakeholders is the S-I-P-O-C, or SIPOC, diagram. SIPOC
is short for supplier, input, process, output, customers. The use of the SIPOC diagram
helps a project team identify key process inputs and outputs, as well as suppliers to and
customers of the process. To summarize, depending on your goals, identify and select
the right processes and projects and choose the correct methodology. Define project
scope, process boundaries, inputs, and outputs. And finally, identify process owners
and other stakeholders who can drive project success.
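
The decision logic in this section can be captured as a rough selector. Here is a minimal sketch with hypothetical flags; real project selection also weighs strategic alignment, resources, and expected benefit.

```python
def choose_methodology(new_or_broken_design: bool,
                       root_causes_known: bool,
                       ongoing_operations_gap: bool) -> str:
    """Rough mapping of the situations described above to a methodology (illustrative only)."""
    if ongoing_operations_gap:
        return "Process or value stream management (standard work, control plans, KPIs)"
    if new_or_broken_design:
        return "Design for Six Sigma (DMADV) design project"
    if root_causes_known:
        return "Lean project (kaizen event or workout)"
    return "Six Sigma DMAIC improvement project"

# Example: an existing process, root causes unknown, not just a day-to-day control gap.
print(choose_methodology(new_or_broken_design=False,
                         root_causes_known=False,
                         ongoing_operations_gap=False))
```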

Process or value stream management


- As a manager or fellow employee, don't you dread it when you hear these
words: "That's the way the system works." Or, "We'll make do." Or the dreaded phrase,
"it is what it is." These are all symptomatic of the value stream or process not being
managed and work not being done. Effective value stream or process management and
control require the following. Standard work and authority, control plans, process
metrics and KPIs, process owners, value stream managers. Standard work and authority
are essential. Process steps and tasks should be streamlined, defined, documented, and
made known to employees. This includes who does what, when, and how. Standard
work, standard operating procedures, or SOPs, and visual standards are used and
reinforced through training. Employees should know what they're supposed to do. They
need to know how to do it. And they need to know that they have the responsibility and
authority to make things right. Authority must be commensurate or consistent with
responsibility and accountability. Next, control plans should be implemented to
enable employees to be in control of the process. The control plan is an organizing
tool and comes in various forms. For example, a checklist or table. They are used by
employees to monitor and regulate the process. In the control plan, several things
should be defined. What should be monitored, with what metric, to what target, how
actual performance is to be measured, and how frequently. Measurements should be
frequent enough to adjust and regulate promptly. In addition, employees need to know
when to intervene and when to leave the process alone, and if action is to be taken,
what action to take and who takes it need to be clear. Otherwise it will be nearly
impossible to regulate and bring actual performance in line with targets. Imagine that
your pizza chefs making pizzas don't know the temperature of the ovens until they see
the weekly report. You could be left with hundreds of burnt pizzas. Also imagine if these
same employees don't know who is even allowed to regulate the temperature. It seems
extreme, but all too often this breakdown in control occurs. Let's move on to process
metrics and key performance indicators, or KPIs. Process metrics and outcome
measures, including KPIs, should be established by management. This way actual performance
and desired targets are transparently visible to all employees in the process or value
stream. Keep the number of KPIs and metrics to a minimum. A common pitfall among
companies is to measure everything and label all metrics as key performance
indicators. Don't fall into this trap. The result is mass confusion, not to mention the
time and effort wasted in measuring and monitoring everything. Make sure that all
metrics are traceable to what's critical to quality, or CTQs. In other words, all metrics and
KPIs should be traceable to impacting what's important to customers. Take fast food
restaurants, for example. Customers want fast, inexpensive, tasty food. That's it. So
measuring and monitoring and reporting on customer satisfaction levels of fancy decor
and ambience is a waste of time. Process owners. Processes and value streams are cross-
functional in nature. They cut across departments and functions in the
company, sometimes even outside the company with suppliers and outside
partners. While there are managers for the respective departments or functions
involved, no one person is in charge of the end-to-end process or value stream. For
example, in the order fulfillment process, many departments, including sales, customer
service, manufacturing, supply chain and logistics are involved. But often there is no one
process owner for order fulfillment. A process owner should be assigned to ensure that
nothing falls through the cracks during the handoffs between the departments or
functions. The process owner should have accountability for that value stream meeting
its targets. Equally important, the process owner should have the necessary authority to
carry out those responsibilities. Value stream managers. In some operationally excellent
enterprises, the role of process owners goes one step further. The organization chart is
turned on its side, where there are no functional departments and department
managers. Instead, the company is organized by value streams and end-to-end business
processes. There are permanent positions called value stream managers who manage
value streams or business processes in the company. Employees working in the value
stream report to that value stream manager. The value stream manager has full
authority, responsibility, and accountability for the value stream. This includes the
resources, the authority to hire and fire, to reward and promote. Essentially, to run the
value stream. In conclusion, a couple of points should be reinforced. Timeliness and
frequency of reporting on actual performance. It should be as immediate as possible so
that appropriate actions can be taken to regulate and correct quickly to eliminate or
minimize any adverse impact. Imagine the driver of your delivery truck not having a
speedometer, but instead receiving a monthly speedometer report. Responsibility and
authority to regulate the process should be pushed down to as close as possible to
where the actual work is done. For example, imagine that same driver of your
truck having to get permission from corporate HQ before pressing the brakes. While
managing a process or value stream may sound mundane, it is critical that it is done
correctly day in and day out. Otherwise, results from the best planning, redesign, and
improvement projects will not be sustainable.

Lean event, Kaizen, or workout


- Have you ever wished you could just gather the right people together, put them all
in a room, and pick their brains to address a problem or to improve one area or
process? Well now you can. It's called a kaizen. Kaizen comes from a Japanese
term. Loosely translated, it's change for the better or simply put,
improvement. Sometimes it is called a lean event, a rapid improvement event, or a
WorkOut. A few different names all describing the same thing. The term WorkOut was
coined at GE. Essentially, you take unnecessary work out of the process to streamline it
and make it more efficient. Kaizen is a well-organized, structured, and facilitated event to
improve a work area, a department, a process, or an entire value stream. These rapid
improvement events can vary from one to five days, depending on the objective and
scope of the kaizen, such as to understand how the value stream flows or does not flow
due to mistakes, delays, and bottlenecks in order to improve it, prioritize what's
important to customers of a value stream in order to identify key process
metrics, brainstorm and identify likely causes of a problem, generate ideas and prioritize
solution alternatives, reduce equipment setup and changeover time, develop or
redesign procedures or standard work, or organize a work area to improve
workflow. The tools used do not require any intensive or rigorous data analysis, but they
do capitalize on the collective knowledge and hands on experience of participants at the
event. Examples of tools and techniques used include value stream mapping and
process maps to provide a common understanding of what's currently going on and not
going on, Gemba Walk to observe at locations where work is performed, process and
value add analysis to identify non value add steps, rework delays and
bottlenecks, spaghetti diagrams to map the physical flow of parts or transactions in a
facility, the acronyms DOWNTIME or TIM WOODS to identify the different types of
waste, Pareto analysis using available data to focus the analysis, brainstorming, cause-and-
effect diagrams, and five whys to quickly identify potential causes, creativity techniques
to develop solution alternatives, multi-voting and prioritization matrices to select
alternatives, setup reduction and workload analysis to enable smaller batches and
reduce cycle times, and 5S to sort and organize items so that there's a place for everything
and everything's in its place. So when is a kaizen or WorkOut applicable? When quick
analysis and improvements can be achieved using simple tools that do not require any
rigorous data analysis, but can benefit greatly from the collective firsthand knowledge
and experience of the right people. The right people are usually operators, stakeholders,
and customers of the process or value stream. During a kaizen event, participants
work under the guidance of a facilitator who is trained in operational excellence. More
specifically in lean tools. This is important so that the work is done right. The knowledge
and skillset of the facilitator has a big impact on the success of the kaizen. At the end of
the kaizen or lean event, the resulting output is not just a list. The deliverables include approved
recommendations and sometimes even implemented solutions. Done. The key to a
successful event is planning. Planning should be more than half of the total effort. It
should start at least three to four weeks before the event. Sufficient time and effort
should be allocated to gather existing data, obtain primary information from process
stakeholders, identify the objective and scope of the event, and develop the
agenda. Then you have to identify and invite the right participants for this event. The
lean tools used capitalize on participants' knowledge and hands on experience, so it is
critical that the right stakeholders are invited. Also, this will improve buy in and
acceptance during implementation. To summarize, if an issue does not require intensive
data analysis and you want rapid improvements, kaizen or WorkOuts are definitely
recommended.

Six Sigma DMAIC


- What if your process has poor yields? What if it can't produce defect free items or
services? Or it takes too long? Or is there so much variation that it is impossible to
plan, schedule, staff, or budget accurately? And what if you have no idea what's causing
the problem? That's when Six Sigma DMAIC projects are required. Six Sigma DMAIC
projects are what you need when you don't know the root causes, factors, or drivers of
performance. DMAIC are the five phases of Six Sigma projects and they are define,
measure, analyze, improve, and control. The underlying premise of Six Sigma is the
equation Y equals f of X or Y is a function of X. Y is the outcome and the Xs that belong
in that equation are the causes, or factors, that impact the outcome. DMAIC is data
driven. Data is used to understand, analyze, and determine the key Xs that have the
biggest impact on the performance of Y. In other words what are the key Xs in the
equation Y is a function of X? By knowing which Xs impact Y you can control the Xs to
obtain the Y you want. Armed with this knowledge you can move from reactive,
detection based interventions to proactive, prevention based routine excellence. In
other words from firefighting to fire prevention. Let's say you are the owner of a chain of
pizza restaurants in your state. Complaints have gone up. Refunds to customers
increased as a result of complaints and many repeat customers don't come
back. Hmm, what do you do? Well note that we actually do not know what the root
causes of these problems are. So we should launch a Six Sigma DMAIC project to
address the problem. Here's a summary of the five phases using a pizza problem as an
example. Define. In the define phase the project is defined, the team is selected, and
project is launched by management. What's important or critical to customers is
understood and the performance outcome to be improved or the Y in Y is a function of
X is defined. In our example Y is the number of complaints. You want to reduce the
number of complaints. The financial impact is a reduction in the amount of refunds and
the potential loss of repeat customers. Measure. In the measure phase the size and
scope of the problem is understood and performance of Y is measured. In our pizza
example we collect data to measure the number of complaints and the types of
complaints. A Pareto chart of the complaint data shows that the highest number of
complaints is on pizza crust and it is persistently the biggest complaint across the whole
state. So what do we do? Hire pizza crust inspectors? No, we will not fall into
this traditional detection based mindset. We want to set up our process to proactively
prevent poor pizza crust on a routine basis. Analyze. In the analyze phase we analyze
data to determine the causes or factors that impact performance. In other words we
diagnose and prove which X factors impact Y. The more specific the Y the quicker the
analysis as to which factors are the key Xs. Back to our pizza example what is Y in our Y
equals f of X? Is it the number of complaints? Or can we get a more specific Y? Yes, we
know that crust is the culprit for the majority of the complaints. Applying the Pareto, or
80-20, rule we can get the biggest return on our efforts by focusing on crust
complaints. So our focus Y is crust complaints. Potential Xs, or potential causes or
theories of our pizza crust problem are proposed and we test them using data to
prove or disprove which ones are the key Xs or root causes. Say the root cause analysis
shows three key Xs are causing poor crust quality. Inconsistent tossing
technique, variation in oven temperatures, and inconsistent baking times. Improve. In
the improve phase solutions are developed to address the proven X factors so that Y
can be improved. Solutions are developed, tested, piloted, and implemented to optimize
the pizza crust quality. In our example let's say we determine the proven optimal
settings are 425 degrees Fahrenheit, 11.2 minutes in the oven, and as for tossing
technique it is three times clockwise and you get the perfect crust every time. Control. In
the control phase controls are established to ensure the improvements or gains
are sustainable. Controls and procedures are put in place so that employees know
when and how to intervene to ensure superior performance. For our pizza chefs across
the pizza chain we make sure they are trained on the new procedures and control plans
are implemented. Then we would hope that the number of complaints would drop
drastically and, in turn, benefit financially from fewer refunds and higher retention of
customers. By determining the key X factors that impact Y we can be proactive in
ensuring superior Y performance simply by controlling the key X factors. For more on
DMAIC check out my other courses Six Sigma Foundations, Six Sigma Green Belt, and
Six Sigma Black Belt. With Six Sigma we shift from inspection and detection to
prevention and control.
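
To make the measure-phase Pareto thinking above a bit more concrete, here is a minimal Python sketch. The complaint categories and counts are hypothetical placeholders, not data from the video; only the technique, ranking the categories and watching the cumulative percentage, mirrors what was described.

from collections import Counter

# Hypothetical complaint counts for the pizza chain (invented for illustration)
complaints = Counter({
    "crust": 180,
    "late delivery": 60,
    "wrong toppings": 35,
    "cold pizza": 15,
    "billing error": 10,
})

total = sum(complaints.values())
cumulative = 0.0
for category, count in complaints.most_common():
    share = 100 * count / total
    cumulative += share
    print(f"{category:<16}{count:>5}  {share:5.1f}%  {cumulative:5.1f}% cumulative")

With made-up numbers like these, crust alone accounts for roughly 60 percent of all complaints, which is exactly the kind of signal that narrows the project's Y from complaints in general to crust complaints.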

DMADV: Design for Six Sigma


- Why do people prefer one product over another even though it has a higher price? Or
why do people buy services from one company instead of its competitor just around the
corner? Because the winning company provides the products, services, and experience
that customers want and are willing to pay for. Having a design that provides the right
features that work as required is what design for Six Sigma is all about. Design for Six
Sigma, or DFSS, is a systematic methodology for the design or redesign of products,
processes, and services. The goal is to ensure that design meets or exceeds customer
expectations and key requirements. Let's talk about some characteristics of DFSS. First,
quality is designed in instead of inspected in. That is, the process, product, or service
itself is crafted to ensure quality rather than relying on end of the line inspection
checks. Next, customer expectations and key requirements are prioritized and
incorporated into the design from the start. In DFSS, we call this requirements flow
down. It ensures there is a link between customer requirements and functional product
and process requirements. Lastly, quality is predictable. There are no surprises. The
capability of the design is predicted as the project progresses to ensure that design is
capable of meeting customer requirements and performance targets. DFSS projects
follow a five phase methodology. Define, measure, analyze, design, and verify, or
DMADV, pronounced as de-mad-vee. In the define phase, the business case for the
design project is established and the project charter, including objectives, scope, goals,
and design team, are approved. This phase establishes the reason for the design
effort and helps keep the team focused. Next, in the measure phase, VOC, or voice of
the customer, analysis is carried out. Customer needs are identified, prioritized, and
translated into measurable requirements called critical to quality requirements, or
CTQs. CTQs are the performance characteristics that are critical in satisfying customer
needs. For example, for a coffee maker, VOC efforts may identify the top three customer
needs as suits my taste, easy to use, and fresh coffee when I want it. We could translate
the need for fresh coffee when I want it into three CTQs, brews quickly, adjustable
amount of coffee, and fast total time from start to finish. A scorecard called a design
scorecard is established to score and track design capability against CTQs and
performance targets throughout each phase. In analyze phase, CTQs are examined to
determine the functions of functionality needed, that is we need to find out what does
the product or service need to be able to do to meet those CTQs? These functional
requirements drive the generation of conceptual designs and the best is
selected. During the design phase, product features are created and specifications are
finalized. Process requirements are developed and parameters are optimized. And last,
in the verify phase, the final design is agreed upon and tested to make sure it is
producible and still meets customer needs. Controls are designed and implementation
then takes place. Sometimes in a DMAIC project, after the A, or analyze, phase, a
redesign effort may be required. In that case, a mini DMADV cycle can deliver the design
as part of the improve phase. There is much more to learn about Design for Six
Sigma than what we discussed here. There are several courses on this site that you can
watch for a deeper discussion, such as Six Sigma: Green Belt and Six Sigma: Black
Belt. So if you ever need to design or redesign a product, process or service, and you
want to address customer needs systematically, then Design for Six Sigma is what you
need to use.
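
As a rough sketch of the design scorecard idea, here is one way the coffee maker CTQs might be tracked against targets in Python. The CTQ names, targets, and predicted values are assumptions made up for illustration, not a standard DFSS template.

# Hypothetical design scorecard: each CTQ is tracked against its target.
# "max" means the target is an upper limit, "min" means it is a lower limit.
scorecard = [
    {"ctq": "Brew time (minutes)", "target": 4.0, "direction": "max", "predicted": 3.6},
    {"ctq": "Adjustable amount (cups)", "target": 10, "direction": "min", "predicted": 12},
    {"ctq": "Total time start to finish (minutes)", "target": 6.0, "direction": "max", "predicted": 6.4},
]

for row in scorecard:
    meets = row["predicted"] <= row["target"] if row["direction"] == "max" else row["predicted"] >= row["target"]
    status = "meets target" if meets else "gap to close"
    print(f"{row['ctq']}: predicted {row['predicted']} vs target {row['target']} -> {status}")

Re-scoring a table like this at the end of each phase is what makes the predicted capability of the design visible before anything is built.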

Implementation challenges
- There are many challenges in implementing and sustaining operational
excellence. Here are some common implementation challenges. Fear of headcount
reduction and job losses. Many employees view operational excellence, especially lean,
as a means by management to lean out the organization, resulting in job cuts. Rightly or
wrongly, that's the view and fear. Senior management has to address this fear head
on. Not through pronouncements, but through actions. When processes or value
streams become more efficient, any displaced workers should be reassigned to fill useful
roles elsewhere within the company. And as the company grows, these folks can fill new
roles first, removing any need for new hires. Lack of buy-in and acceptance. Most people
don't like change. To improve buy-in and acceptance of changes and improvements,
ensure that the right metrics and performance targets are implemented. Metrics and
rewards that motivate the right mindsets, behaviors, and results must support the
changes. For example, traditionally incentive pay is based on piece count or number of
transactions. Instead, reward based on the percentage of defect-free pieces produced or
the percentage of correct and accurate transactions processed. Reverting back to old
habits and the good old days. It's just human nature to take the easiest path or the path
of least resistance. Ensure that the new way of doing the work is easier than the old
way. If it is more cumbersome and more involved, it will not be sustainable. Employees
should love the new improved way of getting things done because it is now easier and
more user-friendly than ever. Keep it simple. Too many initiatives and projects. Most
organizations are already stretched thin with very lean staffing levels. It is not
uncommon to hear that everyone's busy with their regular jobs and, to top it off, there
are too many projects in progress, tying up whatever little time and resources there are
left. A common challenge is that there's no time and no resources available for
projects. To address this, take inventory of all existing projects and initiatives. Map them
to the company's annual goals and strategies to see if there is any connection. If there is
no connection, let alone alignment, then either kill those projects, put them on hold, or
postpone them. For the remaining projects, prioritize them together with the new
operational excellence projects. Such a mapping and alignment exercise will reduce the
number of ongoing projects. And to keep it streamlined, senior management can
proactively select and prioritize the list of projects. Too many metrics. This is another
common challenge when in the rush and temptation to measure and be data-driven, the
list of metrics to be monitored and controlled keeps growing. As the list grows, so
grows the frustration of employees. To address this, evaluate the reason for every
metric by checking to see how well the metric maps to CTQs and the
company's business priorities. And any KPIs or key performance indicators should be
limited to just the key ones. Imagine the stress and nightmare if you drive a car with a
dashboard showing more than a dozen indicators on digital displays. You don't need to
measure everything, just what matters. These common challenges should be addressed
head on. Management must follow through, actively demonstrate their own buy-in and
walk the talk. It is management's responsibility to make the new operationally excellent
way of working a routine. Everyone in the organization will then be engaged, and able,
and empowered to achieve the desired results.

Audits to sustain operational excellence


- On New Year's Eve, many people are all excited about making and pronouncing their
New Year resolutions with a lot of fanfare, sharing them with friends and on social
media. Later on, when all that excitement dies down, how many of those resolutions are
kept? Well, there's always next year. But you do not want this to happen to operational
excellence. Operational excellence is not a slogan or rallying cry for excellent
performance, nor should it ever be. Slogans become old news and fade. To sustain
operational excellence, it must become part of the organization's DNA, become part of
the regular routine for creating and delivering value to customers. And the secret sauce
is process discipline. Process audits are excellent for verifying and reinforcing what has
already been implemented. The objectives and scope of these process audits should
include the following. To ensure that processes are well designed and capable, the audit
needs to evaluate process capability in meeting performance targets and
specifications. Audit for employee compliance to established procedures. Verify that
employee skills and training are current and employees are capable of performing their
work. Audit control plans and verify that employees have the means of knowing what's
important in their key job roles within the process or value stream. Know what to do in
the process. Know when to intervene and what actions to take. Verify that employees in
key roles have the necessary process authority, responsibility, and accountability. Audit
to ensure that managers and process owners have access, either directly or indirectly, to
resources, tools, and techniques for design, improvement, and control. Verify that every
project is indeed necessary. Audit linkage of projects to accomplishing the
organization's annual goals and strategic priorities. Verify that projects have the support
from management and a sufficient allocation of time and resources,
regardless of whether they are Kaizen events, DMAIC or DMADV projects, or value stream
management. Audit for results and improvements from projects and validate that they
were sustained. Review statistical process control charts for evidence. Audit to verify that
value stream managers and process owners have the authority and the means to
manage, control, and improve the entire value stream or the end-to-end process. Verify
that process metrics and KPIs are supportive of CTQs, or critical-to-quality
requirements. Audit performance metrics and rewards to verify that they are aligned to
the organization's annual goals and strategic priorities. Verify that KPIs, or key
performance indicators, are indeed key and are limited to the vital few. Verify
enterprise-wide alignment of strategies, priorities, policies, and decisions. Verify
alignment across different functions that support value streams or end-to-end
processes. Verify alignment up and down different levels of the organization. For
example, alignment between the individual level, team, department, location to division
and company level. These audits can be done in a staggered way throughout the year,
at least once a year for each area, department, or process. The frequency for any one
location or process can be increased if there are noncompliance or performance
issues. You want to get to the point when everyone is willing and able to do well to be
operationally excellent everyday. In other words, the entire enterprise can deliver
value better, faster, cheaper.

Next steps
- Congratulations. You have made it to the end of this course. Operational excellence is all about
getting work done better, faster, cheaper, while delivering superior value to customers. Just a
review. To achieve operational excellence, you need to have processes that are effective and
efficient at delivering value, tools and techniques for design, improvement, and control, the right
mindset and behaviors where everybody wants to and is able to be operationally excellent. And
enterprise wide alignment of strategies, priorities, and decisions. If you are interested in learning
more, I recommend watching my courses on Six Sigma Foundations, Six Sigma: Green Belt, and
Six Sigma: Black Belt. These courses provide a deeper dive into many of the concepts, tools, and
techniques we have discussed. For a closer look into graphs, charts, and data analysis for
operational excellence, I recommend my introductory course called Learning
Minitab. Remember, operational excellence is not, and should not be thought of nor ever be
perceived as, a one-time initiative or project. The pursuit of operational excellence
never ends.
Statistics Foundations: 1

Welcome
- In our modern society we've all become addicted to numbers, even those that say they
hate math. What's their first move every morning? They grab their phone. Check the
time and temperature, numbers. Numbers tell us how fast to move, how warm to
dress. They help us decide where to invest our money. Numbers can motivate us to
act, and that's just when the numbers are given to us. If you have the power to organize
large pools of data, you have the ability to discover trends, prove yourself right, or
maybe prove others wrong. 
This is the power of statistics. Whether you're a manager or a designer, whether you're
in business, science, sports or education, whether you're trying to save time, money or
any other valuable resource, understanding statistics is vital if you wanna be more
effective and efficient. 
Hi there, my name is Eddie Davila and I'm a university instructor with degrees in
business and engineering. I write ebooks and of course I develop other online
educational content. I'm a huge sports fan, I love to follow the entertainment
industry and I'm passionate about science and health, and I can tell you that in every
important facet of my life having a better understanding of statistics allows me to
improve my performance and often to find a greater level of satisfaction whether I'm
working or playing. 
This course, Statistics Fundamentals, is the first of a three part series that I'm hoping will
empower you to better understand the numbers you will encounter in your life. In this
course we'll discuss basic terms like mean, median and standard deviation. We'll look at
many different forms of probability. We'll explore the power of the bell shaped normal
distribution curve. We'll discuss issues like false positives, and expected monetary value,
and I'll tell you even if you know what all these things are, I think you'll walk away with a
new perspective. Actually I'm hoping you'll never look at these basic concepts the same
way again. You won't just understand what these numbers are and how they're
calculated, you'll know their inherent weakness too. So, welcome to Statistics
Fundamentals.

What you should know before watching this course


- So, you decided to dive in to a statistics course. Are you ready for it? I think you
are. While this course will explore statistical concepts, numbers, charts, and
probabilities, we won't be doing any significant mathematical gymnastics. If you know
your math basics, adding and subtracting, multiplying and dividing, square roots, if
you're comfortable with basic fractions, if you can understand that when we're
discussing probabilities, 0.05 is the same thing as 5%, 0.40 is the same thing as 40%, and
that 100% is the same as 1.00, you'll be fine. And even if some of those things make
you a little bit uncomfortable, don't worry. Often we use pictures, charts, and tables to
help illustrate the concept. Sometimes we attack problems in more than one way. And
of course, through the power of the internet, you could always pause and rewind. So,
whether your math muscles are strong, or you're just beginning to rediscover math
concepts, I think the probability of success in discovery is quite high. Thanks for
exploring Statistics Fundamentals and good luck.

Using the exercise files


If you're watching this course on a computer, you have access to the exercise files. You
can download them to the desktop for easy access as you're watching the
course. There's a folder for many of the chapters in the course. What will you find in
these exercise files? In some of them, you'll find helpful graphs, tables, or illustrations
used in the videos. In others, you might find sample problems so you can test your
statistical skills. Along with those problems, you'll get both the answers and the step-by-
step solutions. If you don't have access to those files, don't worry. The problems and
scenarios discussed in the videos are often described in detail and illustrated via on-
screen formulas, calculations, and graphics.

Why statistics matter in your life


- I know what you want. You want answers, and in the modern world, often the most
important answers come in the form of numbers, but why do you want these
answers? What could you do with those magic numbers? Perhaps, if you analyze those
numbers, you could make more informed decisions. 
Maybe you want to convince somebody of something. Perhaps, with the right
numbers, you could motivate your employees to work harder, or maybe, you can help
them work smarter. Maybe you help lead a global organization. A global organization
with employees and volunteers all around the world. You can't be everywhere, but it's
possible that numbers will allow you to manage from afar. 
Those numbers might tell you how your facility 10,000 miles away is performing, and
maybe, by looking at those numbers, you might discover something new about your
organization, your employees, and volunteers, or even your customers. Sometimes,
though, someone brings you the numbers. When someone hands you a report, sends
you a spreadsheet in an email, gives you a boardroom presentation, or when they text
you this morning's sales numbers, what are they doing? Perhaps, they're trying to
convince you of something, or it's possible that they might be trying to deceive you. 
In today's world, we're being bombarded by numbers all day long at our desk, in
meetings, in our cars, on the train, on our phones. Some of those numbers are
helpful. Some are confusing, and others are probably just distracting us from what's
really important. The problem, of course, is trying to figure out which numbers are good and which
ones are bad. And I don't care how comfortable you are with numbers. In a world where
so much data is so readily available and so much about the world is unknown, it never
hurts to develop a deeper understanding of statistics. 
Statistics can help us quantify uncertainty. Statistics can help us discern if results are
providing us true illustration of a situation, or if results are presenting us with a biased
view. By understanding data sets, the center of these data sets, the spread of the
numbers in these data sets, by knowing how to read some basic statistical charts, and by
having an understanding of basic probability, you'll be taking your first steps toward
being a confident contributor in your organization, someone that can make more
informed decisions by knowing which numbers are helpful in each situation. 
You may also become a leader that can explain and illustrate to others how certain
statistics will lead to better outcomes. As we begin our journey into statistics, I want you
to start to question the numbers that cross your path in the emails you receive, the
articles you read, and in the conversations you have with your colleagues. Where did
those numbers come from? How are they calculated? Are those the right numbers
needed to make this decision? Questioning and wondering, those are the first steps in
becoming a statistician.

Is my data set good?


- A runner's 100 meter dash time, a patient's heart rate, the number of items a customer
purchased at the grocery store, the number of students that failed an exam, a
customer's age, an employee's annual salary, these are a mix of observations and
measurements. Each represents a tiny bit of data. On their own they may tell us a tiny
bit about each person that was measured or observed, but if we can get the sex, age,
home address, sales data and browser statistics for the 20,000 customers that bought
products from our website last month, now we have a rather significant pool of data. 
This massive pool of observations and measurements it probably has hidden among it
lessons in how to better organize our website and our warehouse. It could probably tell
us how to more efficiently and effectively reach our customers, but in its disorganized
state this massive pool of data can't teach us very much. The lessons only become
known when the data is organized, when it is made visual through charts, and when it
has been processed with the right formulas. 
Once we have organized data, we can begin to discover useful facts. Once we have
organized data, we have information. How valuable is that information? Well, it depends
on many different things. Chief among them, though, is the quality of the data
itself. Before we begin to put a value on the information that was calculated and
presented, we need to question how the data was gathered. In some cases, the data is
just handed to us. We ourselves don't actually collect the data. 
This secondary data was collected by others. In some cases their methods were good
and sound. We like the data. In other cases there may have been some flaws in their
methodology. Or maybe something they did makes it harder for us to feel secure in our
calculations and conclusions. 
When we actually collect the data though, we can actually observe the subjects in their
natural environment, or in the very strict experiment we set up. We also get to write the
very specific questions on their surveys. Plus, we are the ones that will moderate the
discussions in a focus group. It's not to say that our data won't be flawed, that our
questions and observations won't be biased, but at least we know we are the ones to
blame for the flaws in the data. And, perhaps, by knowing the flaws we can more
accurately report the level of uncertainty in our conclusions. And we can also establish
how future data gathering might be modified. 
So, just as in our daily diet, you are what you eat. If you eat food made of healthy
ingredients, you have a better chance at improved health. The same holds in
statistics. When our data pools are gathered through healthy methods our data pools
have a better chance of containing truly valuable measurements and observations that
can better inform us in the decisions we make. 
Next time someone provides you with some interesting or surprising statistics, perhaps
you should take some time and ask about how the data was collected, what might be
the flaws in the data, and how that might change our opinion of the results.

Understanding statistics with the use of charts


- What does data look like? Well often, when news programs or movies are trying to
scare us with the idea of a data-driven world, they will display a screen filled with
numbers, perhaps strings of ones and zeroes, waves of words, numbers and
symbols. Much like a disassembled car or plane, it can be a bit ugly, confusing, and yes,
scary. But, when put together the right way, this intimidating array of parts can
become an attractive and functional machine, a machine that can provide us with
incredible service and convenience. 
Data works the same way. When organized properly, data can provide service and
convenience. It can help us make good decisions quickly. It can be a tool to persuade
colleagues and in doing so it can save us time and money. Similarly, imagine a world-
class novel whose words have all been jumbled. It's impossible to read, it makes no
sense, but put those words in the right order and the novel can tell a beautiful
story. And the incredible thing about stories is that the same story can have different
meanings for different people. 
So, how can we organize and assemble data such that it can be useful, attractive, and
interesting? More than that, how do we use data to tell thought-provoking
stories? Often statisticians will use tables and graphs to tell their stories. What kind of
tables and graphs? 

Well, let's say we have data relating to a number of adults. In a table, we can display all
the weights of these adults in pounds from heaviest to lightest. That's sort of
interesting. We can also create a table that reports the frequency of each weight. In
other words, how many adults weighed 170 pounds? How many weighed 130
pounds? Tables are interesting, but how about if we create a dot plot? 
This is very similar to our table that reported the frequency of each weight. But, for
some reason this is just more appealing and easier to consume for our eyes and
brains. Still, there are so many possible values that both the chart and table can still be a
bit overwhelming. How about if we grouped this data by creating 10-pound
intervals? Here's what our table would look like.

 
But we could now also use a bar graph called a histogram. As you can see, it's pretty
much the same thing as our table, but again, the histogram is a bit more appealing. One
more time, back to our interval table. 

Let's add another column. Based on our 50 observations, we can now also report in a
third column what was the relative frequency in each weight interval? As you can see,
according to our chart, 10 adults weighed between 140 and 149 pounds. 10 adults of 50
adults would be 20% of all the people measured. 
And if you'd like, we can turn this into a relative frequency histogram. 
Weight intervals define each separate bar. The height of the bar indicates the relative
frequency. How about if we aren't concerned with weights? 
Actually, how about if we aren't measuring based on numbers? For example, let's
consider the color of each person's hair. Here's the data in table form. 

Notice, the categories here are qualitative, different colors of hair, they are not
quantitative like the weight ranges we saw before. Here's a histogram for that hair color
data. Here's one with relative frequencies, but we can also use pie charts to display this
data. A circle or the whole pie represents all possible data. Each slice gives us an
indication of just how many adults are in each category. So if I provide the data in this
manner, an astute audience might be able to see both the frequency and the relative
frequency of each hair color for the population measured. These tables and charts are
merely the tip of the iceberg. While these are some of the most common tables and
charts, they are by no means the only available options. In the last decade, a group of
talented and creative statisticians have started to use their skills to create incredible
infographics that are interesting, colorful and sometimes even funny. 
So, next time you look at a chart or table, don't get intimidated, don't look for the right
answer, read it like a story. Think about what it means to you and embrace questions as
an opportunity to open a discussion about the quality of the data and how the data
might help you make good decisions.
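
If you want to build a frequency and relative-frequency table like the ones described above, a few lines of Python will do it. The weights below are hypothetical; only the grouping into 10-pound intervals follows the example.

from collections import Counter

# Hypothetical adult weights in pounds (not the 50 observations from the video)
weights = [132, 138, 141, 143, 145, 147, 149, 150, 152, 155,
           158, 160, 161, 163, 166, 168, 170, 171, 173, 176]

# Group each weight into its 10-pound interval, e.g. 140-149
bins = Counter((w // 10) * 10 for w in weights)

total = len(weights)
for low in sorted(bins):
    count = bins[low]
    print(f"{low}-{low + 9} lb: frequency {count}, relative frequency {count / total:.0%}")

The same counts are what a histogram or relative frequency histogram would plot, one bar per interval.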

The middle of the data: Means and medians


- Every new set of data is filled with mystery. You have no idea what it contains. Will it
tell us something odd, something interesting, or will it just confirm something we
already knew? So when given a new set of data, where should we get started? How can
we begin to feel comfortable with our new data set? Again, let's remember our data
set is a collection of values. Some might be big, some might be small and individually
these values are likely too much to handle, but we're hoping together all of these values
will tell us a story. 
So what might be a good beginning to this story? Oddly enough for many, they like to
start their story in the middle. With so many data points, wouldn't it be nice to know the
center of the data? It makes sense. Knowing the center of the data would seem to give
us some balance. The bigger question is what do we mean by the center of the
data? For many, the center of the data would be the average, also called the mean. 

This is the sum of all the data points divided by the total number of
observations. Looking at data set one, our test scores, we can add our 25 test scores and
then divide by 25. Our average is 65%. That doesn't look like a very good average, does
it? The students in the class might complain that the exam was too difficult. That would
be one way to look at those results. Another common way to find the center of the data
though is by finding the midpoint. We call this the median. 

For this, we organize our 25 exam scores from top to bottom. With 25 values, the 13th
value is our midpoint or median. Why? The 13th value has 12 values above and 12
values below it. When we look at our exam data this way, we find that the median
student's score is a 76. This would seem to indicate that the exam might have been
quite fair for those that studied. The problem might have been for those that did not
study. They were doomed to get horrible test scores and when we look at the lowest
test scores, we can see that just a few students really brought down the course average. 
Remember though, there could be so many different explanations for these
outcomes. Perhaps some students were not qualified to take this course. Maybe a
majority of the class had heard the exams were going to be extremely difficult and hired
tutors. Perhaps the lowest scores were earned by students that could not afford
tutors. How about if many of the lowest performing students were exchange students
that struggled with language and reading? 
As you can see, the mean and median help us identify two different types of
centers. When we investigate the exam scores, the median and average scores each tell
us a different story. Neither is complete, but together they have helped write the
beginning of our story. 
Governments, businesses, and the media love to provide people with means and
medians. Often they are intended to tell you a story. Next time you are given a mean or
a median, don't look at those numbers as the end of a story. Look at them as the
beginning of an adventure or mystery. Start to figure out how they might provide
clues. Start to imagine how the individual data points in your data set might look. Come
up with questions you'd like to ask. You never know, it's possible that their mean and
medians aren't telling a true story. Perhaps they are just part of someone's fantasy.
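
Here is a quick Python check of the two centers discussed above. The exam scores are hypothetical, but notice how a few very low scores pull the mean well below the median, much like in the exam example.

import statistics

# Hypothetical exam scores (not the 25 scores from the video)
scores = [22, 35, 41, 55, 62, 68, 71, 74, 76, 78, 80, 83, 88]

print(statistics.mean(scores))    # sum of the scores divided by how many there are, about 64.1
print(statistics.median(scores))  # the middle value once the scores are sorted, 74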

Medians for data sets with even numbers of data points


- The median is a fairly simple concept, but let's briefly discuss a common issue you'll
likely encounter in finding your median. Let's recap. The median is the midpoint in the
data. There are an equal number of data points above and below the median value. So,
if we had five data points, the third highest would be the median. 101 data points, the
51st data point would be the median. 50 smaller data points, 50 larger data points. But
what if we had an even number of data points? 

For example, let's say we have 10 data points. The fifth data point has four data points
below it and five above it. The sixth data point has five data points below it and four
above it. Neither is the true median. What to do? Simple. In this case, we just take the
average of the two middle points. Here, we can see that the fifth data point is 20 and
the sixth data point is 30. So, our median point for this data set is 25. So, now you're
ready to find the median of any data set.
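
A quick check of the even-count case in Python: the ten hypothetical values below are arranged so the fifth and sixth values are 20 and 30, as in the example, and statistics.median averages them automatically.

import statistics

data = [5, 10, 12, 18, 20, 30, 35, 40, 44, 50]  # ten values, already sorted
print(statistics.median(data))  # (20 + 30) / 2 = 25.0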

Weighted mean
- A mean, or an average, is good when all data points are created equal. But sometimes,
some data points are more important than others. Let's consider an academic course as
our example. 

If the professor had four exams, and these were your four exam scores, calculating your
average would be easy, 80%. But in many courses, instructors will also have
quizzes, homework, and term papers. In those cases, exams are usually worth more than
quizzes, and homework, and term papers, may also have their own values. How do we
figure out a student's average in a class like this? 
For this, we will use a weighted mean. Let's figure out how to calculate a weighted
mean, and then discuss issues that should be considered in calculating and interpreting
a weighted mean. So, let's look at how weighted means are calculated. First, let's
consider the weights of each category. Let's say our class has two exams, each worth
30% of the total grade. The quizzes are worth 10% of the grade, as is the
homework. The term paper is worth 20% of the grade. Notice, the weights add up to
100%. Next, let's look at each student score in each category. 90% on exam one, 80% on
exam two, the student's average on their quizzes was 75%, they got 100% on all of their
homework, and they got an 85% on their term paper. 
Now all we do is multiply the score times the weight. 90% on exam one, with a weight
of 30%, 0.30, gives us a weighted score of 27.0. We then do this for each
category, multiply the score in each category by the weight for that category. Now that
we have the weighted score for each category, we can add up all of the weighted
scores to get the weighted mean. This student has an average of 85.5% for the course. 
This doesn't just work for students in a class, we could use weighted means to rate
employees, or suppliers, we could use it to pick a home, or a school. 

And while we're presently looking at data that has already been collected, we can use
weighted means to signal to others what we value. When a professor tells you that each
exam is worth 30% of the grade, students get signals about the potential difficulty of
exams, and also, how students should best use their time. Which of course means, if you
are going to use a weighted average, the categories and the weights of each
category, are very important to consider. 
In general, there are no rules about which categories are chosen, there are certainly no
rules about how weights for each category are determined. So, it's important, that when
someone provides you with a weighted mean, that you question how the categories and
weights were chosen. Equally as important, if you are the person in charge of
developing a weighted mean, you need to carefully consider your choices in category
and weight, it is likely your audience will ask you very specific questions about your
calculations. 
Finally, if you plan to use your weighted mean to motivate people, whether they be
employees, students or suppliers, consider the messages they will receive when they see
your rubric. If someone gave you this rubric,

how would you react? How about if they gave you this one? As you can see, the
weighted mean is simple, it's flexible, so, it's a very popular tool, but, it is also fairly
arbitrary. If you are using a weighted mean, be sure to take the time and effort to
maximize the value of the data it will generate.
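
The course-grade example above is easy to reproduce in a few lines of Python; the weights and scores below are the same ones used in the example.

# Category weights must add up to 1.0 (100%)
weights = {"exam 1": 0.30, "exam 2": 0.30, "quizzes": 0.10, "homework": 0.10, "term paper": 0.20}
scores  = {"exam 1": 90,   "exam 2": 80,   "quizzes": 75,   "homework": 100,  "term paper": 85}

weighted_mean = sum(scores[category] * weights[category] for category in weights)
print(weighted_mean)  # 27.0 + 24.0 + 7.5 + 10.0 + 17.0 = 85.5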

The mode: Find it and understand it


- [Voiceover] In our quest to find a stabilizing center to our data set, we've considered
the mean, the median, and even the weighted mean. One other tool that is sometimes
used to quickly assess a data set is the mode. The mode is a simple concept; no
calculations are required. The mode is simply the data point that is most prevalent in the
data set. 
For example, in this tiny data set seven is the mode. Here, the mode is 15. This data set
has two modes, 10 and 15. How 'bout this data set?
 Well, since every number only appears once, this data set does not have a mode, but
let's get to the big question, why do people reference the mode? Well, the idea behind
the mode is that it represents the most likely outcome in a data set. Some might even
think that it points to a center point in the data set, others might like to use it in
conjunction with the median and mean. Are these things true? Is a mode helpful? Well,
one important thing to remember is that mode does not have a minimum frequency, in
other words, in this data set the mode is four. 
It shows up six times in a set of 15 numbers. Here, the mode is also four, but now it
shows up only two times in a set of 15 numbers. In one data set, the mode really did
point to a very prevalent outcome. In the other data set, there was no particular
outcome that was likely, it just happened that four was listed twice. 
Let's use histograms to better understand some of the other potential pitfalls of the
mode. 
Let's consider this data set. 60 is the mode, but is it really a very likely outcome? It's the
most frequent outcome, but there are many other data points that are not equal to
60. Also, while 60 is the most frequent outcome, it is not really representative of the rest
of the data. Most exam scores in this data set are much higher. In fact, the mean and
median are much higher, and perhaps this is one of the better ways to use the mode in
conjunction with mean and median. Consider this data set, median 50, mean 48, mode
52. 
Now, let's consider this data set, 

median 60, mean 70, mode 80. Here's what this might look like. As you can see, mode,
median, mean, none of these on their own provide us a complete picture, but when you
use mode, median, mean in a chart together you get a nice glimpse of your complete
data set, and perhaps now you can start to ask the deeper question about the data
set and how it might help you make better decisions.
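
In Python, statistics.multimode returns every value tied for the highest count, which covers the cases mentioned above. The tiny data sets here are hypothetical.

import statistics

print(statistics.multimode([7, 7, 3, 9, 7, 4]))      # [7] -- a single mode
print(statistics.multimode([10, 15, 10, 15, 2, 8]))  # [10, 15] -- two modes
print(statistics.multimode([1, 2, 3, 4, 5]))         # every value appears once, so all are returned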

The range
- Let's take a look at these two small data sets. 
Notice both have the same average, both have the same median values. Still, it's
obvious that the data sets are vastly different. These data sets are small so we can
quickly view all of the data and see the differences. What happens when the data sets
are enormous? How can we measure the differences in data sets that might have very
similar medians and means? Better yet, how can we get a better idea of what kind of
data makes up this data set? When we measured mean and median, we were looking for
the middle. Let's now measure how far out from the means and averages the farthest
data points lie. The simplest measure of variability is the range. Finding the range is
easy. You just take the largest number in the data set and the smallest number in the
data set. The difference between these two numbers is the range. 
When you look at this data set, our range is 50. 

Here, our range is 150. 


Now, when you are provided with these numbers, mean 60, median 58, range 70, we
begin to understand that while the center of the data is near 60, the difference between
values in the data set can be very large. What's a possible pitfall here? I think the most
common mental error is thinking that since 60 is our theoretical middle and the range is
70 that the biggest numbers in the data set are likely 35 units bigger than 60, and the
smallest numbers in the data set are 35 units smaller than 60. Don't just assume that a
mean of 60 with a range of 70 means that the highest score was about 95 and the
lowest score was 25. 

Suppose these are exam scores. It's possible one student didn't study at all and got a
15%, 45% less than the average, and the highest grade might've been 85%. Again, I
would recommend using a histogram to help you better understand the makeup of your
mean, median, and range. 
Here, we can see that one student really opened up the range. Perhaps the range isn't as
helpful as we thought. 

Then again, if the data set looks like this, mean 60, median 58, range 10, we know that
this data would seem to be fairly centralized. Not only are the mean and median
similar, the difference between the biggest and smallest values is only 10 units. Range is
a nice, simple tool for our statistics toolbox, but we need to remember that it's not
always indicative of the overall data set. It only takes one rogue data point to
exaggerate the size of your data set's range.
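
The range is a one-line calculation, and a small hypothetical example also shows how a single rogue score can stretch it.

scores = [55, 58, 60, 62, 63, 65, 85, 15]  # hypothetical exam scores; one student scored 15, another 85

data_range = max(scores) - min(scores)
print(data_range)  # 85 - 15 = 70, driven almost entirely by the two extreme scores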

Standard deviation: Calculate it and understand it


- Knowing the difference between our biggest and smallest data points, which we call
our range, is often interesting, but it can be deceiving when you have even one rogue
data point. So how can we get a better overall feel for the distribution of our data
points? Often statisticians turn to a handy tool called the standard deviation. 

It's a fairly common term in the world of statistics. But still, the concept can be
intimidating to many. So what exactly is the standard deviation? Well it's sort of the
average distance from the mean. Let's look at these three data sets. 
All are very different. Still, all three have an average of 18. The first data set's range
indicates that this is clearly a data set of similar numbers, but data sets two and three
have identical means and ranges. Here, standard deviation is useful. While both have
large and small data points, the standard deviation of data set three tells us the
numbers in data set three are more similar to each other than those in data set
two. Notice we keep seeing a similar pattern. Often it takes a collection of basic statistics
tools to get a sense of what a data set contains. So how do we calculate a standard
deviation for a data set? Remember when I said it was sort of an average distance from
the mean? Well I meant it. It's sort of the average distance from the mean, but not
quite. It's the square root of the average squared distance from the mean. Here's the formula. 
Yeah kind of ugly, but don't worry, most calculators and spreadsheets are capable of
doing all the work for you. Plug in the values of the data set and the machine does all
the work. So finding a standard deviation doesn't need to be so difficult. Still for the
curious, let's go ahead and show you how to use this formula. This can get ugly so let's
start out with a small data set. 

First, we need the mean. That's easy enough. 


Next, we need the sample size minus one. So now our formula looks like this. 

The last components of the formula are the individual data points. 

So for the first data point, take two minus eight. Negative six squared is 36. The sigma
tells us we need to do this for every data point and then add all of those values. 
Next, we divide by three. This value is our variance for the data set. 

When we take the square root, we find our standard deviation. 


So for this data set, we find we had an average of eight and now we know the standard
deviation is 4.32. Standard deviation, you now have an idea of what it is and how it's
calculated, but I'm guessing you still have questions. What's a good standard
deviation? How do you use a standard deviation? Let's tackle questions like that in the
next video.
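
Here are the same steps in Python. The exact data set is not shown in the video, so the four values below are an assumption chosen to match the numbers quoted (a mean of eight, one value of two, and a sample standard deviation of about 4.32). statistics.stdev uses the sample formula with n minus 1 in the denominator, just like the steps above.

import statistics

data = [2, 8, 10, 12]  # assumed values, consistent with the mean of 8 used in the video

mean = statistics.mean(data)                          # 8
squared_deviations = [(x - mean) ** 2 for x in data]  # 36, 0, 4, 16
variance = sum(squared_deviations) / (len(data) - 1)  # 56 / 3, about 18.67
print(variance ** 0.5)         # about 4.32
print(statistics.stdev(data))  # the built-in function gives the same result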

How many standard deviations?


- Suppose I tell you that one data set has a standard deviation of five, and another had a
standard deviation of 50. Which of those is a better data set? The answer is that it really
depends. First consider what you're actually measuring. Is 50 the standard deviation of
the daily temperature in Celsius for Paris in the month of January? If so, that would be
huge. Or is 50 the standard deviation of the weight in kilograms for total daily cheese
consumed in Paris? In that case, 50 would likely be very small. In those examples, we
were considering the data set as a whole. But standard deviation is also used to
investigate individual data points. How do we do that? Well often, you'll hear someone
refer to a number of standard deviations from the mean. They might say that
something is within two standard deviations from the mean, three standard deviations from the
mean, or perhaps 1.5 standard deviations from the mean. Let's look at a simple
example so we can discover what this might mean. Let's use this data set:

the weights of 10 men that visited a doctor's office today. The mean weight is 189
pounds. The standard deviation is about 90 pounds. One standard deviation from the
mean would be 90 pounds lower than 189 all the way up to 90 pounds heavier than 189
pounds. Roughly from 99 pounds to 279 pounds. When we look at our individual data
points, we can see that the first nine data points are within one standard deviation from
the mean. How about that last data point? Let's try 1.5 standard deviations. So again,
about 135 pounds in either direction from the mean. 54 to 324. Nope, 425 is not within
1.5 standard deviations from the mean. Here's two standard deviations. How about 2.5
standard deviations? No, 425 is still not within that range. And yes, we now have gotten
below zero on the low end, about negative 36 pounds. Only when we get to three
standard deviations does the last data point fall within our limits. I guess the next
question is, what does this mean? Is any of this significant? Well, for data that would be
considered symmetrical, which means that we have a nice bell-like distribution centered
at the mean, it is estimated that about 68% of your data points should fall within one
standard deviation of the mean. We had 90% fall within one standard deviation. Better
than expected. Most of our patients are probably somewhat similar. With our small data
set, we can see that this is true. If we had a huge data set and 90% of our data was
within one standard deviation from the mean, we'd probably feel pretty good. How
about that last data point? Well, as I said, 68% of the data points within one standard
deviation is considered normal. How about for two standard deviations? Well, we would
expect about 95% of our data points to be within two standard deviations of the
mean. And for three standard deviations of the mean, 99.7%. And that is where our last
data point lies. That would seem to be fairly extreme. So, that last data point is
definitely what we would consider an outlier. As you can see, standard deviation can be
a very helpful tool in understanding data sets and their individual data points. More
importantly, standard deviation will help us generate interesting questions about the
data collection methods, the entire pool of data, and even the individual data points.
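
Using only the summary figures quoted above (a mean of 189 pounds, a standard deviation of about 90 pounds, and a heaviest patient of 425 pounds), this Python sketch checks how wide the band has to get before that last data point falls inside it.

mean, std_dev = 189, 90  # summary figures from the example
heaviest = 425

print(abs(heaviest - mean) / std_dev)  # about 2.62 standard deviations from the mean

for k in (1, 1.5, 2, 2.5, 3):
    low, high = mean - k * std_dev, mean + k * std_dev
    inside = low <= heaviest <= high
    print(f"within {k} standard deviations ({low:.0f} to {high:.0f} lb): {inside}")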

Outliers
- Nowadays we hear the term outlier quite a bit. Perhaps a certain athlete is incredibly
talented and productive, much more so than any other competitor in the league. Their
statistics far exceed those of any other single player. They might be labeled an
outlier. Maybe one 10 year old in your city is taking a course in calculus. Or perhaps the
opposite occurs, another 10 year old child struggles with basic addition. In an effort to
center the discussion on the masses, educators may exclude these children as
outliers. But how about a child that scores a perfect score on a nationally standardized
test for 10 year olds? Are they an outlier? Or how about our athlete? If they average 45
points per game in basketball, at what point is our star athlete an outlier? If no one else
is above 40 points per game, is our star an outlier? If no one else is above 35 points per
game? How about if our star player averages 45, the second best player scores 40 points
per game, and the next best player is at 32 points per game? Do we have two outliers or
do we not have any outliers? So what exactly is an outlier? The most common answer
you'll get is that an outlier is a data point that is an abnormal distance from the other
values in the data set. This brings about a few questions. First, what's abnormal? There is
no set definition, but I think it's important to understand that the term outlier is not a
very specific term. So, it's less about absolutely identifying outliers, rather it's about
motivating discussions of what is normal, about what is possible. Perhaps talking about
outliers will help important issues surface. So, how can we identify outliers? Tables and
charts can be useful. Sometimes they make outliers stand out. Perhaps we use standard
deviation. Maybe we say that anything more than two standard deviations from the
mean is a statistical outlier. Sometimes an outlier is just something new, something that
we've never even considered, which brings us to the next question. What should we do
with outliers? Should we just throw them out, not consider them at all? In general, I'd
say no. Most consider outliers as freaks or freakish events that are not likely to be seen
again, as a result they're ignored, not considered worth investigating since they're so
odd. Instead though, they should be considered as opportunities. Are they the
beginning of a new trend? Does this person know something that we don't? Is it
possible others will learn from this outlier, and we might see a massive change in
behavior? Was there a special circumstance for that particular person? Why did they do
so poorly? Perhaps this person got very ill and had to leave the program. Or why did
they perform so well? Did they get extra help? Did they have additional training? Should
we consider extra training for everyone? As you encounter outliers in your data, at the
office, or even in your daily life, ask good questions. Is this really an outlier? How did this
happen? What can we learn? What needs to change? A mass of closely distributed data
points can be very instructive, but sometimes the lone outlier can provide us with a
brand new perspective.

Z-score: Measuring by using standard deviations


- Suppose you're given a data set. Perhaps you're even given many of the key statistical
tools needed to start forming a statistical opinion, mean, median, range, standard
deviation. By now, we should understand that these tools help us start asking important
questions about our data set. We also learned that these tools help us in studying not
just the entire data set, but also each individual value in the data set. In particular, the
standard deviation allows us to see whether or not individual data points might be
considered outliers. Is your data point within one standard deviation of the mean, two
standard deviations? How would I know if it is exactly 2.37 standard deviations from the
mean? Luckily, there's a very simple formula to help us find just how many standard
deviations our data points lie from the mean. This is what we call our Z-Score,
and as I said, the formula to find your Z score is quite simple. As you can see, all you
need is the data set's mean and standard deviation. Then, all you do is plug in one of
the values in the data set. Let's plug in our largest value in the data set, 231. For this
data set, we have a mean of 130.1 and a standard deviation of 47.85. As we can see, we
get a Z-Score of 2.11. That means the data point, 231, is 2.11 standard deviations from
the mean in the positive direction. Let's do this same calculation for our lowest
value. The only thing we change for this calculation is that we switch out 50 for
231. Notice what happens here. Now our Z-Score is negative 1.67. This means that this
data point is 1.67 standard deviations from our mean in the negative direction, which, of
course, makes sense since 50 is well below our mean value, 130.1. Next time someone
says that a certain outcome is 2.8 standard deviations from the mean, not only will you
know what that means, you'll also know how it was calculated. Plus, knowing this simple
formula will be helpful in determining whether an individual data point might be
considered an outlier.
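If you want to verify those two results yourself, here is a minimal Python sketch of the same calculation, using the mean (130.1) and standard deviation (47.85) quoted above; the function name z_score is just illustrative.

def z_score(value, mean, std_dev):
    # How many standard deviations the value lies from the mean.
    return (value - mean) / std_dev

mean, std_dev = 130.1, 47.85
print(round(z_score(231, mean, std_dev), 2))  # 2.11, the largest value, above the mean
print(round(z_score(50, mean, std_dev), 2))   # -1.67, the lowest value, below the mean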

Empirical rule: What symmetry tells us


- The empirical rule, that sounds a bit scary, doesn't it? But is it really that
intimidating? It's not. Actually, it is quite simple and in the end, very useful for
understanding the distribution of data points in data sets. So, what is this empirical
rule? First, it's important to note that this rule works for symmetrically distributed data, 
often illustrated by a bell-shaped curve centered on the data set's mean. Once we
understand this, the empirical rule explains how this symmetrically distributed data
follows a pattern whereby most of the data points fall within three standard deviations
of the mean. 

For this reason, it is also sometimes referred to as the three sigma rule, where sigma stands
for standard deviation. The rule goes further though. It explains that about 68% of all
the data points will lie within one standard deviation of the mean. 
Notice how this is illustrated on our bell-shaped curve. The empirical rule then goes on
to say that 95% of all data points fall within two standard deviations of the mean. Again,
notice how this is illustrated in our bell-shaped curve. Finally, the empirical rule tells
us that when you have the bell-shaped curve, often referred to as a normal
distribution, 99.7% of the data points in the data set will fall within three standard
deviations of the mean. 
So, as you can see, now almost all of the area under that bell-shaped curve has been
accounted for. One very important note, this works when we have the well-
centered, symmetrical bell-shaped curve. The rule begins to lose value the farther our
data set strays from the classic normal distribution. That said, in most cases, the 68-95-
99.7 rule, which we call the empirical rule, holds up pretty well. With this knowledge, you
can, hopefully, better evaluate data points. If someone says that a data point has a Z-
score of 1.8, you know it's within two standard deviations of the mean, and thus it is
likely among the 95% of data points that fall within that range. That said, if something had a Z-
score above 3.0, we could be pretty confident that this data point was a true
outlier, since it falls outside the range that holds 99.7% of all the data in our data set. In terms of our
bell-shaped curve, this data point would likely be out of this region. No matter if you call
it the empirical rule, the three sigma rule, or the 68-95-99.7 rule, we now better
understand the normal distribution of data in a way that allows us to better understand
our data set and the data points that lie within that data set.
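If you'd like to watch the 68-95-99.7 pattern emerge for yourself, here is a small simulation sketch. The sample size and the standard-normal parameters are arbitrary choices of mine, not figures from the course.

import random

data = [random.gauss(0, 1) for _ in range(100_000)]  # draws from a normal distribution

for k in (1, 2, 3):
    share = sum(abs(x) <= k for x in data) / len(data)
    print(f"within {k} standard deviation(s) of the mean: {share:.1%}")
# Typical output is roughly 68.3%, 95.4%, and 99.7%.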

Calculating percentiles: Where do you stand?


- Perhaps a person is trying to get into a prestigious university like Harvard or
Stanford. Maybe the university requires that the student takes a standardized test like
the SAT. The student takes the exam and they later receive their results. Would this
student rather have their score be in the top 4% of all scores or should they want their
score to be in the 96th percentile? It's a trick question. Both are the same. A score in the
top 4% would be the same as a score in the 96th percentile. Similarly, a billionaire with a
net worth in the 99th percentile is what many would call a one percenter, a person
whose worth was in the top 1% of all individual net worths. Now, a pretty simple
concept to grasp, but there is one common error that many people will make. Often
with exam scores, a person may confuse the score on the exam with a student's
percentile rank. For example, suppose a student gets an 85% on an exam. This does not
mean that this student is automatically in the 85th percentile. Why? Well, let's look at
this distribution of 10 exam scores. 

Our student got an 85% on the exam, but that was the third lowest score in the
class. Conversely, we can look at this set of 10 exam scores. Here, our student also
scored an 85%, but now, they had the highest score in the class. Which then begs the
question, how do we calculate the student's percentile rank? To calculate percentile
rank, we use this formula: count how many values fall below the score in question, add
one half to account for the score itself, divide by the total number of values in the data
set, and multiply by 100. For our most recent example of 10 exam scores, we have 10
values. Next, we count up how many values are below the score of 85%. 

For this data set, our student had the third lowest score. Only two exam scores were
below this student's score. So we set up our formula as 2 plus 0.5, divided by 10, times
100. According to this, our student scored an 85% on the exam, but she is only in the
25th percentile. Let's do the calculation again for the other data set. Here, our student
had the highest grade in the class. Nine other scores were below her score, so the
formula gives 9 plus 0.5, divided by 10, times 100. In this data set, our student is in the 95th
percentile. Did you notice what just happened there? We had 10 exam scores, yet the
highest percentile that could be achieved was 95. Why? Well, a 100th percentile is not
possible, since that would be like saying that you are in the top zero percent of your
class. The best that one person could say is that they are in the top 1% or the 99th
percentile. Still, why not put the student in the 99th percentile? Different formulas
handle this differently, and if you use a spreadsheet like Excel, it has its own internal
formulas for calculating percentiles. In any case, the numbers are often close enough
that it should not present a
significant problem, and often, as the size of the data set gets bigger, these differences
become negligible. So for the competitive folks out there, you now have two ways to
frame your goal. You can either tell people that you are shooting to be in the top 1% of
your organization or instead, you can tell them that one day, you hope to be in the
organization's 99th percentile.
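Here is a small Python sketch of the percentile-rank calculation implied by the two worked examples: count the scores below, add one half for the score itself, divide by the total, and multiply by 100. The two score lists are hypothetical stand-ins for the on-screen data sets, arranged so that 85 is the third-lowest score in one class and the highest score in the other.

def percentile_rank(scores, score):
    # Count-below-plus-one-half convention, matching the 25th and 95th percentile results above.
    below = sum(s < score for s in scores)
    return (below + 0.5) / len(scores) * 100

class_one = [82, 84, 85, 87, 89, 91, 93, 95, 97, 99]  # 85 is the third-lowest score
class_two = [60, 62, 65, 68, 70, 72, 75, 78, 80, 85]  # 85 is the highest score
print(percentile_rank(class_one, 85))  # 25.0
print(percentile_rank(class_two, 85))  # 95.0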

Defining probability
- What are the chances I can flip a coin and get heads three times in a row? What's the
likelihood a certain basketball player will make a free throw? What are the odds that it
will rain in Berlin tomorrow? What are the chances a child born today will live to be 90
years old? Everyone is interested in probability. Science, sports, business, gambling, all
of them rely on probability in an effort to make informed decisions. But, what is
probability? I guess the most basic definition would be the likelihood that some event
will occur. Typically it's measured via a ratio. The desired outcome divided by all possible
outcomes. 

So let's go back to my very first question. What are the chances I can flip a coin and get
heads three times in a row? So, we have a random experiment: tossing a coin. The
sample space, which is a list of all the possible outcomes is this.

As you can see, we have eight possible outcomes. Only one of those is the desired
outcome. So, you can see the probability of getting heads three times in a row is one in
eight, or 12.5%. Let's try one more. This time, we can roll a pair of dice. What are the
odds I will roll two sixes? Our random experiment, rolling a pair of dice. The sample
space is given here. 
So this time we have 36 possible outcomes. Only one of those is the desired
outcome. Our probability of rolling double sixes is one in 36, or about 2.8%. Dice and
coins are easy though. They tend to be fair, and thus, fairly predictable. Most of life
doesn't work that way, though. It's not always easy to quantify all the possible
outcomes. For example, what are the odds your boss will wear a black dress to work
tomorrow? It would depend on who your boss is, how many black dresses they
have, how many other outfits they might own. Perhaps, we also need to understand the
events of that work day. Some of the things we might know. Some we would not
know. Maybe only some of those things are important. Perhaps, there are other
factors we have not even considered. The formula for basic probability may be
simple, but that doesn't mean calculating probability is easy. Let's also consider this
scenario. Let's say, you and I bet. 
I tell you, that so long as you do not roll a double six with two dice, you will win the
bet. You have a 97.2% chance of winning. Suppose you roll the dice, and you roll a
double six, does this mean the probability was wrong? No, it simply means you were
very unlucky. Probability does not guarantee an outcome. It simply tries to inform you
on the possibilities. Next time someone provides you with a probability, consider how it
was calculated, whether or not you trust the probability, and what the number actually
means. Understanding probability, both its strengths and its weaknesses, definitely
increases your odds of making good decisions.
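Because both of these examples have equally likely outcomes, you can check the numbers by simply enumerating the sample spaces. Here is a short sketch that does exactly that; it is just an illustration of the desired-outcomes-over-possible-outcomes ratio.

from itertools import product

coin_flips = list(product("HT", repeat=3))          # all 8 equally likely sequences
three_heads = sum(flips == ("H", "H", "H") for flips in coin_flips)
print(three_heads / len(coin_flips))                # 0.125, i.e. 1 in 8

dice_rolls = list(product(range(1, 7), repeat=2))   # all 36 equally likely rolls
double_six = sum(roll == (6, 6) for roll in dice_rolls)
print(double_six / len(dice_rolls))                 # ~0.0278, i.e. 1 in 36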

Examples of probability
Some probabilities are easy to understand and calculate. Even odds across the
board. What are the odds of flipping a coin and getting heads? One in two, 50%. What
are the odds of rolling a six sided die and getting a four? The odds are one in six. The
nice thing about these two examples is that each possible outcome is equally as
likely. By that I mean, the odds of getting heads is 50%, the odds of getting tails is 50%. 

The odds of rolling any of these outcomes with the six sided die are equal. The odds of
getting any one of these outcomes is one in six. It's not quite as easy to calculate the
probability of a rainy day in Los Angeles tomorrow. It's not like we can say that the odds
of rain tomorrow is 50% and the odds of no rain are 50%. 
In Los Angeles, typically there are only about 20 to 30 days per year when it rains. And in
London, there are typically over 100 days per year when it rains. In these cases we say
that the odds are weighted. 

The odds that it might rain on any given day in Los Angeles is about 7%, and about 29%
in London. Then again, we also need to remember that these probabilities are stated on
the basis of an entire year. If I instead say, what are the odds that it will rain on
December 10th in London? The probability of rain on that particular day may be much
higher since December is traditionally London's wettest month. That said, the sum of the
probabilities of all possible outcomes must add up to 100%. So for our coin, 50% heads,
50% tails, 100% total. For our six sided die, each of our six outcomes has a probability of
one in six. When we add up all six outcomes, our probability is 1.0 or 100%.
So if we define each day as either a day with rain or a day without rain, in London the
probability of rain on any given day might be stated as 29%. So the probability of a dry
day must be 71%. That again, sums up to 100%. Let's consider a scenario where I
put two, red ping pong balls in a container. That's it, there is nothing else in the
container. What are the odds that if you take one ball out of the container that it will be
red? Obviously, since both balls in the container are red, the ball you take out of the
container must be red. So the probability is 100%. On the other hand, what are the odds
that the ball you take out of the container will be white? There are no white balls in the
container, so the odds are 0%. This outcome is not possible. These are both simple
scenarios, but they help us understand a few basic things. The highest probability for
any scenario is 100%. The lowest probability for any scenario is 0%. So, to recap, the
probability of all possible outcomes must sum to 100%. Sometimes the probability of
every possible outcome is equally as likely. Sometimes some outcomes are more likely
than others. But no matter what, the probability of an outcome can never be less than
0%, nor can it be greater than 100%.
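As a quick arithmetic check of the weighted odds above, here is a tiny sketch. The rainy-day counts (about 25 days for Los Angeles and about 105 for London) are my own round-number approximations consistent with the ranges quoted in the text.

days_per_year = 365
p_rain_la = 25 / days_per_year
p_rain_london = 105 / days_per_year
print(round(p_rain_la, 2))          # ~0.07, about a 7% chance of rain on a given day in LA
print(round(p_rain_london, 2))      # ~0.29, about 29% in London
print(round(1 - p_rain_london, 2))  # ~0.71, about 71% chance of a dry day; the two sum to 100%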

Types of probability
- There's a 50% chance that the result of a coin flip will be heads. There's an 80%
chance that the best basketball player on your team will make a free throw. There's a
75% chance that the unemployment rate in the United States will drop next year. Not
only are these three probabilities about three very different events, these are also three
different categories of probabilities. The coin flip is an example of classical
probability. The free throw example is an example of empirical probability. Both of these
are objective probabilities, meaning they are based on calculations. The unemployment
example is an example of subjective probability. Here, there are no calculations. So
what's the difference between these three different types of probabilities, and when is
each appropriate? Let's begin with the coin flip, an example of classical
probability. Assuming this is a fair coin, we have two possible outcomes, heads and tails,
both equally as likely. 
Let's say you win the coin flip if heads is the result. What's the probability of
victory? How do we calculate this probability? One outcome is a winner, and we divide it
by two, the total number of outcomes. Your chance of winning is 50%. We could do this
with dice also. 

We have a six-sided die, six possible outcomes. You win if you roll a one or a two. Two
winning outcomes divided by six possible outcomes. Your probability of winning here is
33%. As you can see, classical probability works well when you know all possible
outcomes and all the possible outcomes are equally likely to occur. When things are fair
and equal, when we understand every possible outcome, classical probability works
well. But what happens when not everything is fair and equal? For this, we turn to
empirical probability. 

In this case, we're trying to understand the probability that a particular basketball
player will make a free throw. To calculate this probability, we simply divide the number
of free throws this player has made this season by the number of free throws they have
attempted. This is different from classical probability, because here each free throw is
different. Consider all the factors that go into each free throw. Player health, player
fatigue, game situation, early in the game, late in the game, playoff game, home game
or away game. Since every situation is different, we can only count on the
observations we have made up to this point. Obviously, the more observations we
have, the safer we feel about the validity of our calculated probability. Early in a
season, you may want to rely on last season's free throw data. In the playoffs, you may
want to rely only on free throws made in pressure situations. Empirical probability is not
perfect. But when you have some data for repeatable situations, sometimes empirical
probability can give you a nice idea of what to expect. But what happens when reliable
data just isn't available? How can we possibly know if the unemployment rate in the
United States will drop next year? There are so many factors to consider. And what
happened last year, or 10 years ago, or even what's happened over the last 100
years, may not be a good indicator of what will happen next year. In the case of
unemployment probability, people often use their opinions, their experiences, and
perhaps some related data to influence their statements about probability. So yes,
people sort of just guess. Some guesses might be better than others, though. An
economist's opinion may be more valuable than a lawyer's opinion. A CEO's opinion
may be more valuable than a rock star's opinion. Next time you read a probability
somewhere, consider what type of probability it is. Is it a classical probability, one that
you know is fair and reliable? Is it an empirical probability, one that is based on past
data? Or is it a subjective probability, simply an opinion-based probability based on
someone's experience and knowledge? If you understand what type of
probability you've been given, you can better understand how reliable it might be.
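Here is a brief sketch contrasting the two objective types of probability described above. The die numbers come from the text; the free-throw counts are hypothetical, included only to show the shape of the empirical calculation.

# Classical probability: all outcomes known and equally likely.
winning_outcomes = 2        # you win on a roll of 1 or 2
possible_outcomes = 6
print(winning_outcomes / possible_outcomes)       # ~0.33, a 33% chance of winning

# Empirical probability: based on observed data.
free_throws_made = 160      # hypothetical season totals
free_throws_attempted = 200
print(free_throws_made / free_throws_attempted)   # 0.8, an 80% free-throw probability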

Probability of two events: Either event? Both events?

- Sometimes there are multiple outcomes that would lead us to the same
conclusion. For example, suppose we flip two coins, you win if one or both of the
coins turns up heads, what are your odds of winning? Well, let's look at all the possible
outcomes. 

We can get heads on flip one, tails flip two. Heads flip one, heads flip two. Tails flip one,
tails flip two. And finally, tails flip one, heads flip two. We can see here that two different
events will win this contest for you. Event one, heads on flip one, two out of the four
scenarios provide that result. Event two, heads on flip two, same here, two out of the
four scenarios provide that result, but we also see that there is an overlap here.

We wanna be sure not to double-count that outcome. So to calculate the probability of
getting heads on at least one of the two coin flips, we add the probability of event one
plus the probability of event two, 

but we subtract the overlap, which is when both event one and event two occur. We can
see that our probability of getting heads on at least one of two coin flips is 75%. This is
what is called the addition rule. Let's up the difficulty level just a tiny bit. Let's do this
with a pair of six-sided dice. You win if you roll either a six with roll one or roll two. Here
are all the 36 possible outcomes. 
There are six outcomes where we roll six on the first die. There are six outcomes where
we roll six on the second die, but one of those outcomes overlaps. So to calculate the
total probability we add 6/36ths plus 6/36ths and subtract the overlap, 1/36th, thus the
probability is 11/36ths or 30.56%. Sometimes there are no overlapping scenarios, for
example, what are the odds that the sum of two rolled dice will sum to either seven or
11? Here are the scenarios where the dice add up to seven, here are the two scenarios
where the dice add up to 11, notice there is no overlap. So here there is no need to
subtract anything, just add the odds of rolling a seven, 6/36ths, and add the odds of
rolling 11, 2/36ths. Here the probability is 8/36ths or 22.2%. We can also flip this
around. What are the odds of not rolling a seven or 11? Again, here are the eight
scenarios where the sum of the two dice would be seven or 11. We can count all the
other scenarios, but since we know eight of 36 scenarios sum to seven or 11, we know
that 28 of 36 scenarios do not sum up to seven or 11. The probability is 77.8%. These
were all relatively simple scenarios, but hopefully as you go forward you'll remember
you can add probabilities and you can also subtract probabilities.
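Here is a short sketch that checks the addition-rule results above by enumerating all 36 outcomes of rolling two six-sided dice.

from itertools import product

rolls = list(product(range(1, 7), repeat=2))            # all 36 equally likely outcomes

# P(six on roll one or roll two) = P(A) + P(B) - P(A and B)
p_a = sum(d1 == 6 for d1, d2 in rolls) / 36             # 6/36
p_b = sum(d2 == 6 for d1, d2 in rolls) / 36             # 6/36
p_overlap = sum(roll == (6, 6) for roll in rolls) / 36  # 1/36, the double six
print(round(p_a + p_b - p_overlap, 4))                  # ~0.3056, about 30.56%

# P(sum is 7 or 11): no overlap, so we just add.
p_seven_or_eleven = sum(d1 + d2 in (7, 11) for d1, d2 in rolls) / 36
print(round(p_seven_or_eleven, 3))                      # ~0.222, about 22.2%
print(round(1 - p_seven_or_eleven, 3))                  # ~0.778, about 77.8% chance of neither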

Explanation of conditional probability: If X happens, then...


Suppose I have the names of six people, and I put one person's name on each of six cards.

I'm gonna pick two cards at random and award $100 to each person whose name is on one of
those two cards. Let's say Jose and Sally are two of the six people on my list, what are the odds I will pick
both of their names? Well, here are the 15 possible outcomes, 

only one of which contains both Jose and Sally. So, the probability that both will win together is
1/15 or 6.67%. Let's say I change the way we play the game. Let's say that instead of picking
both cards at the same time, I will pick one name and then pick the second name a few minutes
later. What are the odds of both Sally and Jose winning if the first card I pick has the name
Audrey? For this, we use the concept of conditional probability. This helps us answer questions
like this one. In this case, now that Audrey is one of the winners, it is impossible for both Sally
and Jose to win. The probability that both Sally and Jose are the two winners has dropped from
6.67% before we picked Audrey's name to zero after Audrey's name was picked. But, what about
if Sally's name was on the first card? What are the odds that both Sally and Jose win
now? Remember these were all of the possible outcomes. But now that Sally is out of the pool of
names, here are the five possible outcomes that remain. 

As we can see, initially the odds that Sally and Jose both win were 6.67%. Once Sally's name is
chosen, the probability went up to 20% since Sally and Jose were one of five possible
outcomes. Often it helps to draw probability trees to visualize what's happening. Here's a
probability tree for three coin flips. 

As you can see, the odds of getting tails three times in a row is initially 12.5%. We get that by
multiplying the probabilities along this branch. 
Once we flip the first tails, though, the odds of getting tails three times in a row increase to
25%, since only two parts of that branch remain. 

After the second tails is flipped, then the odds jump to 50%. 


Let's look at one more problem that does not include cards or coins. 

Here's a set of health-related data: 1,000 people, how long they lived, and whether or not they
exercised at least three days per week. Consider these two events: event A, people that lived
more than 85 years, event B, people that exercised at least 30 minutes three or more days per
week. How would we find the probability that someone lived more than 85 years given that they
exercised at least three days per week? Let's build the tree for this scenario. 
The given event is that this person exercised three days per week. So this is like the first coin
flip. 240 out of 1,000 people exercised three days per week. 760 of 1,000 people did not
exercise at least three days per week. Now comes the second event, how long did they live? 40 of
the 240 exercisers lived less than 75 years, 16.6%. 70 of the 240 exercisers lived 75 to 85 years,
29.2%. And 130 of the 240 exercisers lived more than 85 years, 54.2%. Here's what the other
branch on that tree would look like. So, from all the data we can see that only 13%, 130 of the
1,000 people lived more than 85 years and exercised at least three days per week. But once we
are given the fact that the person in question worked out at least three days per week, the
probability that this person lived more than 85 years is 54.2%. According to this data, it looks
like exercising three days per week might have its advantages. I guess we can say that working
out is an enormous factor in living past 85. Not so fast. Don't go overboard. People who are
committed to working out three days a week likely have other good habits. So, the exercise alone
may not be the only contributing factor. Understanding how to calculate conditional
probabilities is very important in the world of statistics. But understanding what those numbers
might mean and knowing which questions to ask, that might even be more important.
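Here is a compact sketch of the two conditional-probability examples above. Only Jose, Sally, and Audrey are named in the text, so the other three names are placeholders; the exercise counts are the ones quoted in the health example.

from itertools import combinations

names = ["Jose", "Sally", "Audrey", "Name4", "Name5", "Name6"]  # last three are placeholders
pairs = list(combinations(names, 2))                            # 15 equally likely pairs
p_both = sum(set(pair) == {"Jose", "Sally"} for pair in pairs) / len(pairs)
print(round(p_both, 4))   # ~0.0667, i.e. 6.67% before any card is drawn

# Once Sally's card is drawn, only five names remain for the second card.
print(1 / 5)              # 0.2, i.e. 20%

# The health data: P(lived past 85 | exercised at least three days per week)
exercisers = 240
lived_past_85_and_exercised = 130
print(round(lived_past_85_and_exercised / exercisers, 3))  # ~0.542, about 54.2%
print(lived_past_85_and_exercised / 1000)                  # 0.13: 13% of all 1,000 people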

Statistics Foundations: 2

Welcome
- As a person that loves statistics, or perhaps as someone who just appreciates
statistics, you're probably comfortable with the basics: means, medians, standard
deviations, probabilities, and normal distributions. They're all part of your stats
vocabulary. But perhaps for you, stats appreciation is not enough. You want to collect
your own data. You want to make reasonable predictions. You'd like to test statistical
assumptions. You've come to the right place because that is what this course is all
about. My name is Eddie Davila, and I'm a university instructor with degrees in business
and engineering. I write ebooks, and of course I develop online educational content. I'm
a huge sports fan. I love to follow the entertainment industry. And I'm passionate about
science and health. And I can tell you that in every important facet of my life, having a
better understanding of statistics allows me to improve my performance and often to
find a greater level of satisfaction whether I'm working or playing. This course, Statistics
Fundamentals Part Two, is the second of a three-part series that I hope will better
empower you to understand the numbers you will encounter in your life. In this
course, we'll discuss the collection of data and the importance of the simple random
sample. You'll look at confidence intervals. We'll explore what margins of error
mean. We'll discover the importance of hypothesis testing in the fields of science,
business, and beyond. And I'll tell you, even if you know what many of these things are, I
think you'll walk away with a new perspective. Actually, I'm hoping you'll never look at
these concepts the same way again. You won't just understand the power of data and
statistics. You'll know their inherent weaknesses too. Welcome to Statistics
Fundamentals Part Two. Improved performance and increased satisfaction are just
around the corner.
