BUSINESS STATISTICS & ANALYTICS

The American Cancer Society estimates that about 1.7% of women have breast cancer. A study for the Cure Foundation states that mammography correctly identifies about 78% of the women who truly have breast cancer. An article published in 2021 suggests that around 10% of all mammograms are false positives. Answer the following, clearly stating your assumptions, if any, with appropriate justification.

a. A woman has tested positive in a mammography. What is the probability that she has breast cancer?
b. To double-check, the same woman undergoes a second test with the same mammography kit. This test also comes back positive. Does the probability of her having breast cancer change, or does it remain the same as before? Justify your response.

Census vs. Sample
Time. Cost. Accurate?
Population to sample: the sample is a subset of the population.
Sample to population: information from the sample is generalized to the population.
Sampling is the process of drawing a sample. The sample should be representative of the population.

Probability Sampling (based on the theory of probability) and Non-Probability Sampling

Sampling Frame
A list of people. What should this list contain? Is the sampling frame the same as the population?

Sampling Techniques: Probability Sampling

Simple Random Sampling
What is random? Every population element has an equal probability of getting selected.
Population = {a, b, c, d}, population size N = 4.
Draw a sample of size 2.
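The four-element population above is small enough to list every possible sample. The sketch below is only an illustration: the variable names are not from the slides, and it assumes sampling without replacement with order ignored.

```python
from fractions import Fraction
from itertools import combinations

population = ["a", "b", "c", "d"]  # N = 4, as in the example
n = 2                              # sample size

# Every possible sample of size n, without replacement, order ignored.
samples = list(combinations(population, n))

# Under simple random sampling, each sample is equally likely.
prob_each_sample = Fraction(1, len(samples))

# Inclusion probability of an element = share of samples that contain it.
inclusion = {
    element: Fraction(sum(element in s for s in samples), len(samples))
    for element in population
}

print(len(samples))       # 6
print(prob_each_sample)   # 1/6
print(inclusion["a"])     # 1/2
```

Each element turns up in 3 of the 6 samples, so its chance of being included equals n/N = 2/4.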
Probability of each element (on a single draw) = 1/N = 1/4.
Sample size n = 2. Each element appears in 3 of the 6 possible samples, so its probability of being in the sample is 3/6 = 1/2 = 2/4 = n/N.
In general: probability of each sample = 1/C(N, n); probability of each element being included = n/N, the Sampling Fraction.

Systematic Random Sampling
Sampling interval N/n = 10; for example, elements 3, 13, ...
[Figure: population elements shown as points]

Stratified Sampling
Stratum            1     2     3     ...   k     Total
Population size    N1    N2    N3    ...   Nk    N
Sample size        n1    n2    n3    ...   nk    n

Multi-Stage Sampling
Selected hamlets are called the First Stage Sampling Units.
Households are the Ultimate Stage Sampling Units.

Cluster Sampling
One-stage (single-stage) cluster sampling: survey every household in the selected cluster.
Two-stage cluster sampling: draw a random sample of households in the selected cluster.

In a rural area, 10% of households need to be sampled, so the probability of selection of a household is 10%.

                          Scenario 1              Scenario 2
Hamlets                   Sample half             Sample one-fifth
Households                Sample one-fifth        Sample half
P(Ham) × P(Hous | Ham)    1/2 × 1/5 = 1/10        1/5 × 1/2 = 1/10

How well has the State Government handled the Covid situation in Gujarat?
Responses: Good or Bad.
Good: count ng, relative frequency ng/n.
Bad: count nb, relative frequency nb/n.
Relative frequencies shown: 47% and 53%.

Get the mean monthly food expenses of Higher Education Students in Anand City.
What is the population? All Higher Education Students of Anand City, population size N.
A census would be difficult.
Draw a sample of size n = 500. How do you draw a sample using SRS?
Calculate the sample mean (Rs. 5200).
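One way to answer "How do you draw a sample using SRS?" is to number every student in the sampling frame and let a random number generator pick 500 of them. The sketch below assumes a made-up population of 20,000 students with invented expense figures; none of these numbers come from the slides, and `srs_mean` is a hypothetical helper name.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# ASSUMPTION: an invented population of N = 20,000 students whose monthly
# food expenses (Rs.) are centred near 5,000. This is not real data.
N = 20_000
population = [max(0.0, random.gauss(5000, 1500)) for _ in range(N)]

def srs_mean(pop, n):
    """Draw a simple random sample without replacement and return its mean."""
    sample = random.sample(pop, n)  # every subset of size n is equally likely
    return statistics.mean(sample)

mu = statistics.mean(population)  # the parameter, unknown in practice
means = [srs_mean(population, 500) for _ in range(4)]  # four sample means

print(round(mu))
print([round(m) for m in means])
```

Each run of `srs_mean` returns a different value, while `mu` stays fixed for the population.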
Quantity                 Symbol   Type
Population mean          μ        Parameter
Population proportion    P        Parameter
Sample mean              xbar     Statistic (Rs. 5200 in the food-expense sample)
Sample proportion        p^       Statistic

A parameter is unknown: its value can never be known with certainty.
Estimate the value of the parameter using the relevant statistic.

Samples of size n = 500:
Sample 1: 5900
Sample 2: 5200
Sample 3: 4900
Sample 4: 5000

The parameter is constant and unknown. The statistic is a variable. The statistic estimates the parameter.
For a proportion, code each response as G = 1 and B = 0, so G, G, B, G, B becomes 1, 1, 0, 1, 0.

A statistic is an unbiased estimator of a parameter if E(Statistic) = Parameter.

SRS with replacement (WR) and without replacement (WOR)
              WR              WOR
E(xbar)       μ (unbiased)    μ (unbiased)
Var(xbar)     σ²/n            (σ²/n) × (N - n)/(N - 1)
The factor (N - n)/(N - 1) is the Finite Population Correction (FPC). When N >> n, the FPC is close to 1 and Var(xbar) is approximately σ²/n.
SD(xbar) = σ/√n is the Standard Error of xbar.
Assuming random sampling, as n increases, does the standard error increase or decrease?

Sample proportion: p^ = x/n, so E(p^) = E(x/n) and Var(p^) = Var(x/n).
x is the number of elements having the attribute of interest, x = 0, 1, ..., n.
Are there repeated trials? Are the trials independent? Does each trial result in one of two possible outcomes? Is the probability of success fixed?
If so, x ~ B(n, P), with E(x) = nP and V(x) = nP(1 - P) = nPQ, where Q = 1 - P.
E(x/n) = (1/n) E(x) = (1/n) × nP = P
V(x/n) = (1/n²) V(x) = (1/n²) × nP(1 - P) = P(1 - P)/n
SD(p^) = √(P(1 - P)/n) = √(PQ/n)

Sample mean is a variable. Is xbar a random variable?
Probability distribution of xbar: values xbar1, xbar2, ... with probabilities Prob(xbar).
E(xbar) = μ, Var(xbar) = σ²/n, Standard Error of xbar = σ/√n.
What is the PDF of xbar? Does it depend on something? It depends on the population distribution of the variable of interest.

[Figure: frequency curves. Skewed: Mode < Median < Mean. Symmetric: Mean = Median = Mode.]
Mesokurtic: not too flat and not too peaked. Platykurtic: flat distribution. Leptokurtic: peaked distribution.

Draw 10000 samples, each of size 1.
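The repeated-sampling experiment, and the question of how the standard error changes with n, can be tried out numerically. The sketch below is an illustration under stated assumptions: the population is taken to be exponential with mean 3000 (a deliberately skewed choice that is not from the slides), and samples are drawn independently, as in sampling with replacement.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# ASSUMPTION: a right-skewed (exponential) population with mean 3000.
# For an exponential distribution the SD equals the mean.
mu = 3000.0
sigma = 3000.0

def draw():
    """One observation from the assumed population."""
    return random.expovariate(1 / mu)

def xbar_distribution(n, reps=10_000):
    """Return `reps` sample means, each computed from a sample of size n."""
    return [statistics.fmean(draw() for _ in range(n)) for _ in range(reps)]

for n in (1, 30):
    xbars = xbar_distribution(n)
    # Columns: n, mean of xbar, SD of xbar, theoretical sigma / sqrt(n)
    print(n,
          round(statistics.fmean(xbars)),
          round(statistics.stdev(xbars)),
          round(sigma / n ** 0.5))
```

With n = 1 each sample mean is just the single value drawn, so the spread of xbar matches the spread of the population. With n = 30 the spread shrinks toward σ/√n.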
Sample No.    1       2       3       ...   10000
Values        2500    3000    3125    ...   4200
xbar          2500    3000    3125    ...   4200

Distribution of xbar
If the population is N(μ, σ): xbar ~ N(μ, σ/√n), regardless of sample size.
If the population is NOT normally distributed: xbar ~ N(μ, σ/√n) approximately, provided n is reasonably large (n ≥ 30). This is the Central Limit Theorem (CLT).

Standard Normal Variable
Z = (xbar - μ)/(σ/√n) ~ N(0, 1)

Distribution of p^
E(p^) = P, Var(p^) = P(1 - P)/n, SD(p^) = √(P(1 - P)/n)
PDF of p^: x ~ B(n, P) and p^ = x/n.
When n is large, with nP ≥ 5 and n(1 - P) ≥ 5:
x ~ N(nP, √(nP(1 - P))), with E(x) = nP and V(x) = nP(1 - P)
p^ = x/n ~ N(P, √(P(1 - P)/n))
Z = (p^ - P)/√(P(1 - P)/n) ~ N(0, 1)

Confidence Interval
With 95% probability, the distance of p^ from P is at most 1.96 S.E. of p^:
(p^ - P) = ±1.96 √(P(1 - P)/n), roughly 2 S.E.
p^ is within 2 S.E. of P, so P is within 2 S.E. of p^: p^ ± 1.96 √(P(1 - P)/n).
Confidence interval: (p^ - Z √(P(1 - P)/n), p^ + Z √(P(1 - P)/n))
UCL = p^ + 1.96 √(P(1 - P)/n)
LCL = p^ - 1.96 √(P(1 - P)/n)
For the mean: μ = xbar ± 1.96 σ/√n.
Is a 99% interval better than a 95% interval?

Margin of Error
|p^ - P| is the margin of error B.
B = 1.96 √(P(1 - P)/n), approximately 2 √(P(1 - P)/n)
B² = 4P(1 - P)/n, so n = 4P(1 - P)/B²
P(1 - P) is largest at P = 0.5.

Chi-Squared Distribution
If Z ~ N(0, 1), then Z² ~ χ² with one degree of freedom.
If Z1 and Z2 are independent, then Z1² + Z2² ~ χ² with two degrees of freedom.

Degrees of Freedom
Find the sum of any 5 numbers: df = 5.
Choose 5 numbers that sum to 100: df = 5 - 1 = 4.
Sample variance: S² = Σ(xi - xbar)²/(n - 1).
Deviations: (x1 - xbar), (x2 - xbar), (x3 - xbar), ..., (xn - xbar).
Because Σ(xi - xbar) = 0, only n - 1 of the deviations are free, so the sample variance has n - 1 df.

F = [χ1²(K1)/K1] / [χ2²(K2)/K2], where χ1² and χ2² are independent.
K1 is the numerator df and K2 is the denominator df.
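The confidence-interval and margin-of-error formulas for a proportion can be put to work in a few lines. The sketch below is illustrative only. It assumes that the unknown P in the standard error is replaced by p^, which is common practice but is not stated in the slides, and that the 47% figure came from a sample of n = 500. The function names are hypothetical.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Approximate interval p_hat ± z * sqrt(p_hat * (1 - p_hat) / n).

    ASSUMPTION: p_hat is plugged in for the unknown P in the standard error.
    """
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

def sample_size(margin, p=0.5):
    """n = 4P(1 - P)/B², using z ≈ 2; P = 0.5 gives the largest n."""
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(4 * p * (1 - p) / margin ** 2, 6))

# ASSUMPTION: 47% answered one way in a sample of n = 500.
lcl, ucl = proportion_ci(0.47, 500)
print(round(lcl, 3), round(ucl, 3))  # 0.426 0.514
print(sample_size(0.05))             # 400
```

Widening the interval to 99% (z ≈ 2.576) raises the confidence but also raises the margin of error for the same n, which is the trade-off behind the question of which interval is "better".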
