
ECE 510 Quality and Reliability Engineering

Lecture 1. Introduction, Monte Carlo

Scott Johnson
Glenn Shirley

• Introduction to the ECE 510 course
• Introduction to quality and reliability concepts
• A sample exercise showing the sort of exercise we will do most days in this class

7 Jan 2013 ECE 510 S.C.Johnson, C.G.Shirley 2

Course Introduction


Instructor Biographies
• C. Glenn Shirley
– PhD in Physics (ASU). 7 years at Motorola, 23 at Intel mostly in TD Q&R.
Retired in 2007. Joined PSU ECE in the IC Design and Test Lab in 2008 as an
Adjunct Prof.
• Scott C. Johnson
– PhD in Physics (U of CA Davis). 11 years at Intel as a reliability engineer and
statistician. Before that, BS in Electrical Engineering (U of IL) and 5 years at
• Course will be co-taught by Glenn and Scott.


Course Goals
• Provide knowledge and skills required of a QRE for integrated circuit
– Product design
– Product development
– Technology development
– Manufacturing
• Understand the essential trade-offs among
– Cost
– Performance
– Quality and Reliability
– Producer and Customer risks
• Learn how to assess risks and write qualification plans.
• Develop skills to use statistical methods and tools such as SQL and Excel to analyze data, develop models and make decisions.

Skills
• Calculate failure rates, MTTF, etc., including confidence limits.
• Design a reliability experiment and fit data to a model.
• Use the reliability model to calculate figures of merit for various design “what-ifs” and hypothetical manufacturing and use specifications.
• Be able to build system-level quality and reliability models from component-level models.
• Design a reliability validation plan.
• Design a statistical manufacturing monitor or control chart with specified producer and customer risk levels.
• Compute test statistics such as coverage, yield, test time, customer defect level from test data, and so on.
• Handle large datasets using SQL and Excel.
• Use Monte-Carlo simulation to model complicated real-world scenarios which are intractable by classical statistical methods.

Logistics
• 4 credits, 10 weeks, Monday and Wednesday, 5:00pm–6:50pm.
• Format is lecture and in-class exercises intermingled.
• Lecture will be recorded and accessible online. But in-class activities will not be recorded.
• In-class exercises usually will require a computer running Microsoft Excel.
– Other spreadsheet programs (e.g., Open Office, Google Docs) might work.
– Later in the course we may use SQLite Expert.
• Recommended textbook is Tobias & Trindade, Applied Reliability, 3rd ed.
• Presentation materials will be posted to internet right after the class in which they are presented.
• Watch the web site for
– Schedule changes.
– Presentation materials, and lecture recording.
– Reading assignments, including links to extra materials.

Grading Model
• Homework. 25% of grade.
– 16 exercises, one per class with exceptions for exam week, etc.
– Due by 3P on day one week following assignment.
– All homework will be submitted electronically.
– Best 14 of 16 will be used to determine “Homework” part of grade.
– Unless noted, homework is expected to be an individual effort. Discussion is OK, but copying is not.
• Attendance. 10% of grade.
– Attendance for class exercises is important for understanding.
– Up to 2 unexcused absences with no loss of credit for attendance (18/20).
• Exams. Mid-Term, 25% of grade. Final, 40% of grade.
– Open-book, open Excel, open to slides, but offline (closed to internet).
– Material covered in exams will be from lecturer’s presentation materials, not the text unless referred to in presentation materials.
– Final exam will cover entire course (1/3 pre-midterm, 2/3 post-midterm).

Syllabus
Week       Date    Day  Title                                                          Owner
1          7-Jan   Mon  Introduction and Excel Exercise                                Scott & Glenn
1          9-Jan   Wed  Plotting and Fitting Data 1                                    Scott
2          14-Jan  Mon  Introduction to Quality & Reliability                          Scott & Glenn
2          16-Jan  Wed  Statistics 1                                                   Scott
3          21-Jan  Mon  Statistics 2                                                   Scott
3          23-Jan  Wed  Plotting and Fitting Data 2                                    Scott
4          28-Jan  Mon  Plotting and Fitting Data 3                                    Scott
4          30-Jan  Wed  Plotting and Fitting Data 4                                    Scott
5          4-Feb   Mon  Plotting and Fitting Data 5                                    Scott
5          6-Feb   Wed  Reliability Mechanisms 1                                       Scott
6          11-Feb  Mon  Reliability Mechanisms 2                                       Glenn
6          13-Feb  Wed  Midterm Exam
7          18-Feb  Mon  Reliability Mechanisms 3                                       Bill Roesch, Triquint
7          20-Feb  Wed  Seminar (Follow-on from Bill and other Mechanisms lectures)    Bill, Scott, Glenn
8          25-Feb  Mon  Qual Methodology                                               Glenn
8          27-Feb  Wed  Qual Methodology                                               Glenn
9          4-Mar   Mon  Test Methodology I (Description of Test)                       Glenn
9          6-Mar   Wed  Test Methodology II                                            Glenn
10         11-Mar  Mon  Test Methodology III                                           Glenn
10         13-Mar  Wed  Review Session                                                 Glenn & Scott
Exam Week  18-Mar  Mon  Final Exam Mon or Wed.
Exam Week  20-Mar  Wed

Introduction to Quality and Reliability Engineering

Semiconductor Manufacturing

How Moore’s Law Works
• Transistor density doubles every 18 months (Moore’s Law).
• New products take 1.5 to 2 years from conception to manufacture.
• Technology development and product design must occur in parallel.
[Figure: timeline with synch points every 18 months. Technology Development leads to Technology Certification and is followed by the Next Technology Development. In parallel, Product Development (Pre-silicon Development, Post-silicon Development) leads to Lead Product Qualification and Manufacturing, with Follow-on Product Development, Qualification, and Manufacturing behind it.]
• QRE is involved at all stages of the Technology and Product Lifecycles.
• QRE role is control of customer quality and reliability risks.

Product Lifecycle
[Figure: product lifecycle chart. Phases: Definition, Planning, Early Design, Late Design, Validation, Ramp, High Volume, split into Pre-Silicon and Post-Silicon at a synch point. Durations shown: ~ 1 year, ~ 1.5 years, ~ 0.5 y, ~ 0.25 y. Rows: Requirements, Information, Decisions, Data, Milestones.]
• Data: Market Research, Technology Projections/Scaling, Device/Ckt Simulation, Test Vehicle Data, Product Data.
• Milestones: Product Feasibility Go/NoGo, Design Go/NoGo, Pre-silicon Qualification, Product Qualification.
• Market/Feature Reqts: End User Features, OEM DFM Features, OEM Datasheet.
• Design Goals: Area/power budgets, Performance, Power, Quality and Reliability Goals, DFM Features.
• Design: Logical description (RTL), Gate-level, Circuit/layout level.
• Manuf’g Optimization: Test setpoints, limits. Product tuning, fusing. Quality and reliability monitors. Containment. Feedback to design and manufacturing.
• Other items shown: Technology Selection, Producer Mfg Goals, Capacity Planning, Quality Systems Goals, Manufacturing Reqts, Refined projections, Process control.

Technology Lifecycle
[Figure: technology lifecycle chart. Phases: Discovery, Development, Pilot Manufacturing, Volume, split into Pre-Silicon and Post-Silicon at a synch point. Milestones shown: Target Spec, Technology Certification, Lead Product Qualification, Factory Qualification. Unit volumes rise through 100’s, 1Ku’s, 10Ku’s, 100Ku’s, 1+Mu’s. Test Vehicles span ~ 1-2 Quarters.]
• Pre-Silicon: Discovery, Risk Assessment, Data Auto Req’ts, and plan for Post-Si.
• Post-Silicon test vehicles (TV’s): Debug, Process Sensitivity Charact’n, Defect Sensitivity Charact’n.
• Mfg Validation of Test Process Technology, Xfer, & HVM.

Quality

Role of the Q&R Engineer (QRE)
• The QRE is a proxy for the customer, at all stages of the Product and Technology lifecycles.
– Assesses Risk. Assess product, technology, and manufacturing risks.
– Gives Q&R Requirements. Provide quality and reliability requirements for products, technologies, and manufacturing.
– Designs Experiments. Provides requirements to acquire needed data.
– Builds Q&R Models. What-if models providing estimates of Q&R figures of merit to enable decision-making involving Q&R attributes of the product.
– Writes Validation Plans. Technology Certification, Product Qualification.
• Kinds of QRE
– Technology Development QRE. Silicon and Package reliability models. Process Certification.
– Product QRE. Responsible for specific product. Product Qualification.
– Test QRE. Test program requirements. Fault coverage analysis and modeling, burn-in development.
– Manufacturing/Factory QRE. Monitors, Containment, MRB, DRB.
– Customer QRE. Find customer Q&R req’ts, deal with Q&R “escapes”.
– Materials QRE. Quality control of incoming materials and vendors.

Life of an Integrated Circuit
[Figure: stresses on an integrated circuit across four stages (OEM/ODM Assembly, Shipping, Storage, End-user Environment) for three product types (Desktop, Mobile PC, Handheld). Assembly: Singulation, Reflow, Handling, Bend, Bent Pins. Shipping: Shipping Shock. Storage: Temp, RH. End use: Power Cycle, Temperature Cycle, User Drop & Vibe, Keypad press. Source: Eric Monroe, 2003. Slide by Scott C. Johnson.]

Physical Manufacturing/Use Flow
• The QRE must keep in mind the entire physical life of the component.
• Component Mfr and System Mfr generate failures, as does End Use.
• Failures are analyzed. Feedback improves processes.
• The failure rates are key Q&R figures of merit.
[Figure: physical units flow from the Component Manufacturer (Fab/Sort, Assembly, Burn-In/Test) to the System Manufacturer (System Assembly, System Test) to the End User. Failed units go to Analysis or Debug/Repair, and information feeds back. Fallout labels shown along the flow: 10-20%, 1-2%, 500 PPM, ~ 0.5 %.]

Quality vs Reliability
• Quality
– Conformance to specification at the customer, usually the System Mfr. Out of box experience.
– Impact: Rework at System Manufacturer. Brand image of Component Mfr to System Mfr.
– Measures of Quality (Figures of Merit)
   • How (which attributes) unit fails to meet specification.
   • Fraction of the population failing to meet specification. For discussion: What “population” do we mean?
– Eg. 2%, 500 PPM, 300 DPM, 300 DPPM.
• Reliability
– Conformance to specification usually at the End User, through time.
– Impact: Warranty cost to System Mfr. Brand image of System Mfr to End User.
– Measures of Reliability (Figures of Merit)
   • How (which attributes) and when unit fails to meet specification.
   • Fraction of the population failing to meet specification in a specified time interval.
– Eg. 500 DPM in 30 days, 1% in 7 years, 0.1% in warranty period.
– Eg. 100 Fits, 10 %/kh.

Measures of Reliability
• Equivalent failure rate units
Fail Fraction per Hour   % per 1000 hrs   FIT
0.00001                  1.0              10,000
0.000001                 0.1              1,000
0.0000001                0.01             100
0.00000001               0.001            10
0.000000001              0.0001           1
• Conversion Factors
– Fail fraction per hour x 10^5 = % per Khr
– Fail fraction per hour x 10^9 = FIT
– % per Khr x 10^4 = FIT
FITs = “Failures in Time”
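The conversion factors above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration and are not part of the course materials.

```python
# Failure-rate unit conversions, using the factors listed on the slide.
# Function names are illustrative assumptions, not from the course.

def fraction_per_hour_to_pct_per_khr(rate):
    """Fail fraction per hour -> percent per 1000 hours (x 10^5)."""
    return rate * 1e5

def fraction_per_hour_to_fit(rate):
    """Fail fraction per hour -> FIT (x 10^9)."""
    return rate * 1e9

def pct_per_khr_to_fit(pct_per_khr):
    """Percent per 1000 hours -> FIT (x 10^4)."""
    return pct_per_khr * 1e4

if __name__ == "__main__":
    # Reproduce the rows of the equivalence table.
    for rate in (1e-5, 1e-6, 1e-7, 1e-8, 1e-9):
        pct = fraction_per_hour_to_pct_per_khr(rate)
        fit = fraction_per_hour_to_fit(rate)
        print(f"{rate:.0e} per hour = {pct:g} %/Khr = {fit:,.0f} FIT")
```

For instance, the "10 %/kh" figure quoted on the Quality vs Reliability slide corresponds to `pct_per_khr_to_fit(10)`, that is, 100,000 FIT.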

Quality Assurance
• Quality assurance is achieved by thorough test.
• Test occurs throughout the manufacturing process, not just at the end.
[Figure: test flow from Fabric’n through Wafer Sort (wafer-level), Assembly, Pre-Burn In Test (unit-level), Burn In (unit-level stress: Very Hot, High Voltage), Post-Burn In Test aka “Class” (unit-level, Hot and Cold), and a “Use”-Like Monitor, to “Use”.]
• Purpose of Test
– Screen grossly defective and out-of-spec parts.
– Classify parts into various speed, power, etc. bins.
[Figure: Manufacture 1.3M Die. Screen out the Bad Die (defects 300K), giving 300K Bad Die and 1M Good Die. Speed Bin the Good Die.]

Q&R FOMs and Targets
• Measures of quality and reliability called Figures of Merit (FOMs) are compared with “Targets” (sometimes called “Goals”).
– Targets are usually “do-not-exceed” limits for FOMs.
• Comparison of FOMs with Targets determines whether or not a process or product passes or fails a Process Certification or Product Qualification.
• Targets are set at the highest corporate level based on Cost models, Competition, Brand image.
• Some Targets are internal and highly proprietary (eg. Yield Loss).
• Other Targets are published to the customers and are highly visible anyway (eg. shipped DPM).
• Typical Targets
– Yield Loss < 10-20%
– Class < 1-2 %
– Customer Line Fallout < 500 DPM
– Early Life Reliability < 500 DPM, 0 – 30 days.
– Warranty < 1%, within warranty.
– End-of-Life < 3%, 0 – 7 years
Not representative of any particular product.
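Comparing FOMs with "do-not-exceed" Targets amounts to a set of less-than checks. The sketch below makes several assumptions: the dictionary keys and function names are invented for illustration, the upper end of each range on the slide is used as the limit, and the measured values are hypothetical.

```python
# "Do-not-exceed" targets from the Typical Targets list, as fractions.
# Where the slide gives a range (e.g. 10-20%), the upper end is used (an assumption).
TARGETS = {
    "yield_loss": 0.20,
    "class": 0.02,
    "customer_line_fallout": 500e-6,   # 500 DPM
    "early_life_0_30_days": 500e-6,    # 500 DPM
    "warranty": 0.01,
    "end_of_life_0_7_years": 0.03,
}

def check_foms(foms, targets=TARGETS):
    """Return {name: True/False}; True means the FOM is below its target."""
    return {name: foms[name] < limit
            for name, limit in targets.items() if name in foms}

def passes(foms, targets=TARGETS):
    """A product passes only if every supplied FOM is below its target."""
    return all(check_foms(foms, targets).values())

if __name__ == "__main__":
    # Hypothetical measured FOMs, not data from any real product.
    measured = {"yield_loss": 0.15, "class": 0.012,
                "customer_line_fallout": 650e-6}
    for name, ok in check_foms(measured).items():
        print(f"{name}: {'pass' if ok else 'FAIL'}")
    print("overall:", "pass" if passes(measured) else "fail")
```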

Reliability Goals from ITRS 2009 (http://www. ... html)
[Figure: failure rate regions labeled Infant Mortality, Constant fail rate, and Wear-out.]

Reliability

The Reliability Problem
• Quality fails can be handled by thorough testing
• Reliability is harder because the fails come long after we’ve sold the product
– How can we tell which parts are going to fail in the future?

The Reliability Solution
• The answer is that we must do research and determine
– How the parts fail (the fail mechanisms)
– What stresses made the part fail
• Then we must
– Design our product so that none of the fail mechanisms are activated by the stresses encountered during normal use
– Specify those use conditions clearly
– Test our product under accelerated stress conditions

Understanding Reliability: A Keyboard
• How might a keyboard key fail? (mechanisms)
– Material that gives tactile “click” might fatigue and break
– Electric contacts might corrode or become blocked with dirt
• What might cause these fails? (stresses)
– Being pressed too many times (wearout)
– Heat, humidity, dust, dirt, spills, being pressed too hard
• How can we test a key’s entire life? (stress test)
– Use a machine to press it 1,000,000 times
– Before that, heat it and shake it with dirt and water
• How can we make it more reliable? (design for rel.)
– Find what breaks and make that (and only that) stronger

Reliability Example: Aircraft Reliability
• De Havilland Comet was an early commercial jet
• A few crashes were initially unexplained
• Thorough research led to understanding of metal fatigue (fail mechanism) from cabin pressurization (stress)
• Great 1-hour video (TV show) about it:
– http://www.

Reliability Assurance
The stresses and fail mechanisms for integrated circuits are different, but the concepts are the same:
• Stresses: voltage, current, temperature, humidity, temperature cycling, radiation, mechanical stress
• Mechanisms: transistors (degradation), interconnects (cracking), package (corrosion, fatigue)
Reliability is assured by a qualification process, where statistical samples of products are stressed to simulate a lifetime.
[Figure: A burn-in system and HAST system perform stress tests on chips]

We will now give an example of the sort of exercise we will do most days in this class: Monte Carlo Methods

What is Monte Carlo?
• Answer: A famous casino.
[Figure: Monte Carlo, Monaco, Europe]
• Monte Carlo (MC) methods are numerical calculations using random numbers

What Can Monte Carlo Do?
• Answer: Give approximate numerical solutions to problems we can’t solve exactly
• Best suited for:
– Distributions
– Integrals (areas under a curve)
when we have
– Many parameters (= many dimensions)
– Complicated or non-analytic functional forms
• Many simulations of the manufacturing process are ideal for MC

Finding an Area Using MC
[Figure: circle of radius 1 centered at the origin, inside a square with x and y running from -1 to 1, sampled with random points.]
• Here is a MC example: find the area of a circle.
– For 4000 random points, area=3.115
– Actual area = π ≈ 3.1416
• Can be done using
– Excel (shown here, formulas and results)
– VB, C++, or any programming language
– JMP or many other math-related software programs
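The slide shows the calculation in Excel. The same idea could be sketched in Python as below, where the function name and the `seed` argument are assumptions made for illustration. Random points are drawn uniformly in the square from -1 to 1, and the fraction landing inside the circle is scaled by the square's area of 4.

```python
import random

def estimate_circle_area(n_points=4000, seed=None):
    """Estimate the area of the unit circle by sampling the enclosing square.

    Returns (area_estimate, points_inside). Names are illustrative only.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:      # point falls inside the circle
            inside += 1
    square_area = 4.0                 # the square spans -1..1 in x and y
    return square_area * inside / n_points, inside

if __name__ == "__main__":
    area, inside = estimate_circle_area(4000)
    print(f"{inside} of 4000 points inside the circle, area = {area:.3f}")
```

Each run gives a slightly different answer unless `seed` is fixed, which is exactly the statistical uncertainty that distinguishes MC from a grid calculation.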

Thinking Like an Integral
[Figure: circle of radius 1 inside the blue square with x and y running from -1 to 1.]
\[ f(x,y) = \begin{cases} 1 & \text{inside circle} \\ 0 & \text{outside} \end{cases} \qquad V = \text{blue square} = 4 \]
• Many of our simulations can be interpreted as integrals (area under a curve)
• The area under this function is
\[ \int f \, dV \approx V \langle f \rangle, \qquad \langle f \rangle = \frac{1}{N} \sum_{i=1}^{N} f(x_i) \]
Angle brackets are an average.
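Read as an integral, the circle calculation generalizes directly: average f over N random points in a box of volume V, then multiply by V. The sketch below is one way to write that in Python. The names `mc_integrate` and `in_unit_ball` are hypothetical, and the three-dimensional case is added only to show that nothing changes when the number of dimensions grows.

```python
import random

def mc_integrate(f, bounds, n_points, seed=None):
    """Approximate the integral of f over a box as V * <f>.

    bounds is a list of (low, high) pairs, one per dimension.
    """
    rng = random.Random(seed)
    volume = 1.0
    for low, high in bounds:
        volume *= (high - low)        # V: volume of the sampling box
    total = 0.0
    for _ in range(n_points):
        point = [rng.uniform(low, high) for low, high in bounds]
        total += f(point)
    return volume * total / n_points  # V times the average of f

def in_unit_ball(point):
    """f = 1 inside the unit circle (or sphere), 0 outside."""
    return 1.0 if sum(c * c for c in point) <= 1.0 else 0.0

if __name__ == "__main__":
    area = mc_integrate(in_unit_ball, [(-1.0, 1.0)] * 2, 4000)
    volume = mc_integrate(in_unit_ball, [(-1.0, 1.0)] * 3, 4000)
    print(f"circle area = {area:.3f}, sphere volume = {volume:.3f}")
```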

Exercise 1.1
• Do a Monte Carlo calculation of Pi as shown in the previous slide. Use 4000 samples.
• Be prepared to give your numerical answer (x points out of 4000 are in the circle) in class.
• Turn in your spreadsheet by email before beginning of class 1 week from today.

Exercise 1.1 Solution
• Already shown two slides back.

Distribution of Answers
• Collect and plot answers from everyone in class

Accuracy of MC
[Figure: distributions of the MC answers (axis from 3 to 3.4) for three sample sizes, and percent spread σ/μ versus N (axis from 100 to 10^5), with 40 trials per N. μ=mean, σ=standard dev.]
• N=400: σ/μ=2.7%
• N=4000: σ/μ=0.88%
• N=40,000: σ/μ=0.22%
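The trend in these numbers, roughly a factor of 3 improvement for every 10x more points, can be reproduced with a short Python sketch that repeats the circle calculation 40 times per N. The function names are illustrative. The comparison formula is not on the slide: it is the standard binomial result for the fraction of points inside, σ/μ = sqrt((1 - p) / (p N)) with p = π/4.

```python
import math
import random
import statistics

def estimate_pi(n_points, rng):
    """One MC trial: 4 times the fraction of random points inside the circle."""
    inside = 0
    for _ in range(n_points):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_points

def relative_spread(n_points, n_trials=40, seed=None):
    """Sample sigma/mu over repeated trials, as in the figure."""
    rng = random.Random(seed)
    estimates = [estimate_pi(n_points, rng) for _ in range(n_trials)]
    return statistics.stdev(estimates) / statistics.mean(estimates)

def expected_relative_spread(n_points):
    """Binomial prediction for sigma/mu (an added comparison, not from the slide)."""
    p = math.pi / 4.0
    return math.sqrt((1.0 - p) / (p * n_points))

if __name__ == "__main__":
    for n in (400, 4000, 40000):
        print(f"N={n}: simulated sigma/mu = {relative_spread(n):.2%}, "
              f"predicted = {expected_relative_spread(n):.2%}")
```

With only 40 trials per N, the simulated spread itself scatters by roughly 10% from run to run, so agreement with the slide's values is expected to be approximate.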

When to Use MC
[Figure: the circle in the square, shown twice: once sampled with random points and once with a regular grid.]
• For 2D functions (or shapes, like this), a grid technique is better
– No statistical uncertainty
• For high-dimensional functions (many parameters), MC is the only practical technique
– Example: 20 parameters with a grid of 10 points per parameter
– That is 10^20 calculations
– For a fast computer, 300M calcs/sec, still take 10,000 years to evaluate
• Bottom line: Use MC when you have many parameters
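The 10,000-year figure follows from simple arithmetic, sketched here in Python. The function names are made up for illustration, and a year is taken as 365.25 days.

```python
def grid_evaluations(n_params, points_per_param):
    """Number of grid points when every parameter gets the same resolution."""
    return points_per_param ** n_params

def years_to_evaluate(n_evals, evals_per_second):
    """Wall-clock years needed at a fixed evaluation rate."""
    seconds_per_year = 365.25 * 24 * 3600
    return n_evals / evals_per_second / seconds_per_year

if __name__ == "__main__":
    n_evals = grid_evaluations(20, 10)            # 10^20 grid points
    years = years_to_evaluate(n_evals, 300e6)     # 300M calcs/sec
    print(f"{n_evals:.0e} evaluations take about {years:,.0f} years")
    # For contrast, a 2D grid at the same resolution is only 100 points.
    print(f"2 parameters: {grid_evaluations(2, 10)} evaluations")
```

The grid cost grows exponentially with the number of parameters, while the MC error for a fixed number of points does not depend on the dimension in this way, which is the reason behind the bottom line above.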

The End