Version 2

RAT-STATS Companion Manual
TABLE OF CONTENTS
Random Numbers
    Single-Stage Random Numbers
    Sets of Two Numbers
    Sets of Three Numbers
    Sets of Four Numbers
    Frames - Single Stage
    Frames - Sets of Two
    RHC Sample Selection
        Procedure for Two-Stage RHC Sampling
        Procedure for Three-Stage RHC Sampling
        Summary of Input for RHC Sample Selection
        Generating Spares for RHC Sampling
        Comparison of RHC and Multistage SRS
Attribute Appraisals
    Unrestricted Attribute Appraisal
    Stratified Attribute Appraisal
    Two-Stage Unrestricted
    Three-Stage Unrestricted
    RHC Two Stage
    RHC Three Stage
    Stratified Cluster
    Stratified Multistage
Variable Appraisals
    Unrestricted Variable Appraisal
    Stratified Variable Appraisal
        Using a Stratified Sample
        Strata Formation
    Two-Stage Unrestricted Variable Appraisal
    Three-Stage Unrestricted Variable Appraisal
    RHC Two Stage Variable Sampling
    RHC Three Stage Variable Sampling
    Stratified Cluster
    Stratified Multistage
    Post Stratification
    Unknown Universe Size
Sample Size Determination
    Variable Sample Size Determination
        Unrestricted Using a Probe Sample
        Unrestricted Using Estimated Error Rate
        Stratified
            Total Sample Size Unknown
            Total Sample Size Known
    Attribute Sample Size Determination

PREFACE
The purpose of this manual is to provide:

- an overview of each program in the Windows version of RAT-STATS,
- examples illustrating the application of the software,
- snapshots of data sets used by the programs,
- some discussion regarding the program output, and
- formulas used within the software.

The intent is for the auditor/specialist to use as much of this discussion as he/she finds helpful. While the RAT-STATS Users Guide gives descriptions of program input and output, this Companion Manual should provide insight as to how to better use the software and exactly how the program derives the results. The formulas are provided so that OAS has a single source for all formulas in the event that a question is raised as to exactly how a particular result was obtained.

We hope you find that the manual makes the OAS software easier to understand and easier to apply. Please pass on any suggestions or corrections to the Office of Inspector General, Office of Public Affairs, at paffairs@oig.hhs.gov.

RANDOM NUMBERS

Whatever statistical sampling design you end up using (including stratified and/or multistage), at some point in the data collection you will need one or more random samples. The next section, dealing with Unrestricted Random Sampling, will examine the mechanics and estimation procedures using such a sample in detail, but first it is necessary to discuss procedures for generating a random sample. A number of programs exist for such purposes, namely:

- Single-Stage Random Numbers
- Sets of Two Numbers
- Sets of Three Numbers
- Sets of Four Numbers
- Frames - Single Stage
- Frames - Sets of Two
- RHC Sample Selection

(Rev. 10/2004)

Single-Stage Random Numbers

This program generates an unduplicated quantity of random numbers. Random values (the output from this program) can be output in sequential order, random order, or a mixture of both. For values in sequential order, you will see the values printed in sequential order, beginning with the smallest selected item number and proceeding to the largest item number. For values in random order, the items are printed in the order in which they were selected by the program.

Example 1. A universe contains 1,000 payments and a simple random sample of 10 payments (with four spares) is needed. What items should be selected?

Solution: Using this program and a seed value of 12345, the program generates the following 14 payment numbers, which comprise the 10 sampled payments and the four spares:

9, 236, 337, 340, 346, 404, 497, 556, 624, 641, 658, 884, 927, and 947

The list of sampled payments begins with payment 9 and ends with payment 884. The list of four spares begins with payment 404 and ends with payment 947.
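The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `single_stage_sample` is hypothetical, Python's generator is not the RAT-STATS generator (so a seed of 12345 will not reproduce the values in Example 1), and returning the sample in sequential order with the spares in selection order is one of the output choices the program offers, not a statement of how RAT-STATS arranges its spares.

```python
import random


def single_stage_sample(universe_size, sample_size, spares=0, seed=None):
    """Draw unduplicated item numbers from 1..universe_size.

    Returns (sample, spare_items): the sample in sequential (ascending)
    order and the spares in the order in which they were drawn.
    """
    rng = random.Random(seed)
    # random.sample never repeats a value, so the draw is unduplicated.
    drawn = rng.sample(range(1, universe_size + 1), sample_size + spares)
    sample = sorted(drawn[:sample_size])   # sequential order
    spare_items = drawn[sample_size:]      # random (selection) order
    return sample, spare_items


# Same shape as Example 1: 1,000 payments, 10 sampled, four spares.
sample, spare_items = single_stage_sample(1000, 10, spares=4, seed=12345)
print("Sample:", sample)
print("Spares:", spare_items)
```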

Sets of Two Numbers

This program will generate unduplicated pairs of random numbers. This is useful when sample items are selected through a two-step process (e.g., page number and line number). Random values (the output from this program) can be output in sequential order, random order, or a mixture of both. Values requested in sequential order are printed beginning with the smallest selected item number and proceeding to the largest item number. For values in random order, the items are printed in the order in which they were selected by the program.

Example 2. Items are selected from a computer printout that had pages numbered 1 through 658 and had 66 lines on each page. A simple random sample of 10 items (with four spares) is needed. Which items should be selected?

Solution: Using this program and a seed value of 12345, the sampled payments are:

PAGE:  224  258  266  327  366  400  422  433  561  610
ITEM:   23   37   42    1   16    7   23   59   40   63

The spares are:

PAGE:  579  109  188  330
ITEM:   61   54   50   49

Sets of Three Numbers

This program will generate unduplicated sets of three random numbers. This should be used when sample items are selected through a three-step process (e.g., month, page, and line number). Random values (the output from this program) can be output in sequential order, random order, or a mixture of both. Values requested in sequential order are printed beginning with the smallest selected item number and proceeding to the largest item number. For values in random order, the items are printed in the order in which they were selected by the program.

Example 3. Same as Example 2. Here the universe consists of 1 year's worth (12 months) of computer printouts, where the pages are numbered 1 through 658 for each month. We need four sample items and two spares.

Solution: Using this program and a seed value of 12345, the program generates the following six sets, which comprise the four sampled items and the two spares:

MONTH:    3    5    6   12    8    8
PAGE:   224  266    6  623  582  258
ITEM:    23   42   37   57   43   37

Sets of Four Numbers

This program will generate unduplicated sets of four random numbers. This should be used when sample items are selected through a four-step process (e.g., year, month, page, and line number). Random values (the output from this program) can be output in sequential order, random order, or a mixture of both. For values in sequential order, you will see the values printed in sequential order, beginning with the smallest selected item number and proceeding to the largest item number. For values in random order, the items are printed in the order in which they were selected by the program.

Example 4. Same as Example 3, where the pages are numbered 1 through 658 for each month and year (total of 5 years). We need three sample items and two spares.

Solution: Using this program and a seed value of 12345, the program generates the following five sets, which comprise the three sampled items and the two spares:

YEAR:     2    3    4    5    2
MONTH:    5    1    5   12    7
PAGE:   433  366  266  561  400
ITEM:    59   16   42   40    7
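The "sets of numbers" programs can be illustrated with a short Python sketch that draws unduplicated sets of any length. It is a sketch only: the function name `sample_sets` and the flat-index approach are assumptions made for illustration, not a description of how RAT-STATS works internally, and the values it produces will not match the examples above.

```python
import random


def sample_sets(ranges, count, seed=None):
    """Draw `count` unduplicated sets, one number per (low, high) range.

    ranges: list of inclusive (low, high) tuples, e.g. year, month,
    page, line. Each possible set corresponds to exactly one flat
    index, so sampling flat indices without replacement guarantees
    that no set is repeated.
    """
    rng = random.Random(seed)
    sizes = [high - low + 1 for low, high in ranges]
    total = 1
    for size in sizes:
        total *= size
    picks = rng.sample(range(total), count)

    sets = []
    for flat in picks:
        parts = []
        # Peel off the last position first (mixed-radix decoding).
        for (low, _high), size in zip(reversed(ranges), reversed(sizes)):
            flat, offset = divmod(flat, size)
            parts.append(low + offset)
        sets.append(tuple(reversed(parts)))
    return sets


# Same shape as Example 4: 5 years, 12 months, 658 pages, 66 lines.
for item in sample_sets([(1, 5), (1, 12), (1, 658), (1, 66)], 5, seed=1):
    print(item)
```

Sorting the returned list gives the sequential-order output; leaving it unsorted gives the random-order output.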

Frames - Single Stage

This program will generate an unduplicated set of random numbers which is useful when the universe of sampling items either (1) contains gaps of numbers or (2) the numbering system repeats within the universe. Random values (the output from this program) can be output in sequential order, random order, or a mixture of both. For values in sequential order, you will see the values printed in sequential order, beginning with the smallest selected item number in the first frame (if any) and proceeding to the largest item number in the last frame (if any). For values in random order, the items are printed in the order in which they were selected by the program.

Example 5. A universe of items that refer to payment of a particular medical procedure are numbered as follows:

1 - 1,050 and 8,405 - 9,565

A sample of five items is needed. Three of these items should be in sequential order and the remaining two in random order.

Solution: The universe of items consists of two frames, numbered 1 through 1,050 (frame 1) and 8,405 through 9,565 (frame 2). Using this program and a seed number of 12345, the three sample items in sequential order are:

FRAME    ITEM NUMBER
1        20
2        8,452
2        8,584

The two items in random order are:

FRAME    ITEM NUMBER
1        520
1        752

Explanation: For this example there are 2,211 items in the frame since there are 1,050 items in the first frame and (9,565 - 8,405 + 1), i.e., 1,161 items, in the second frame. For the sequential items, three values between 1 and 2,211 are generated. These values are 20, 1,098, and 1,230. Since 1,098 is outside the first frame, it is in the second frame; in particular, its location in the second frame would be [8,405 + (1,098 - 1,050) - 1], that is, item number 8,452. Similarly, the value of 1,230 points to item number 8,584.

Similarly, for the two items in random order, the program generated two values between 1 and 2,211. These values were 520 and 752. Since both values are less than 1,050, these locations are items 520 and 752 in the first frame.
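The arithmetic in this explanation can be checked with a small Python sketch. The function name `locate_in_frames` is hypothetical; the frame boundaries and the generated values 20, 1,098, 1,230, 520, and 752 are the ones given in Example 5.

```python
def locate_in_frames(value, frames):
    """Map a value between 1 and the combined frame size to
    (frame number, item number).

    frames: list of inclusive (low, high) item-number ranges.
    """
    if value < 1:
        raise ValueError("value must be at least 1")
    remaining = value
    for number, (low, high) in enumerate(frames, start=1):
        size = high - low + 1
        if remaining <= size:
            # Position `remaining` within this frame, counted from `low`.
            return number, low + remaining - 1
        remaining -= size
    raise ValueError("value exceeds the combined frame size")


frames = [(1, 1050), (8405, 9565)]
for value in (20, 1098, 1230, 520, 752):
    print(value, "->", locate_in_frames(value, frames))
# 1098 -> (2, 8452), matching 8,405 + (1,098 - 1,050) - 1
```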

Frames - Sets of Two

This program is a combination of two programs, Frames - Single Stage and Sets of Two Numbers. The program will generate an unduplicated set of random numbers which should be used when (1) pairs of numbers are used to locate sample items (as in Sets of Two Numbers) and (2) the universe has gaps or the numbering system repeats (as in Frames - Single Stage). Random values (the output from this program) can be output in sequential order, random order, or a mixture of both. For values in sequential order, you will see the values printed in sequential order, beginning with the smallest selected item number in the first frame (if any) and proceeding to the largest item number in the last frame (if any). For values in random order, the items are printed in the order in which they were selected by the program.

Example 6. A universe of transactions consists of three sets of computer pages, numbered as follows:

FRAME    RANGE (Page Numbers)
1        1 - 100
2        1 - 456
3        45 - 832

In addition, within each frame there are an equal number of line items per page. The range within each frame is:

FRAME    RANGE (Number of Lines)
1        1 - 66
2        1 - 66
3        1 - 66

A sample of three items in sequential order and two items in random order is needed.

Solution: Using this program and a seed value of 12345, the three sample items in sequential order are:

FRAME    PAGE NO.    LINE
1        12          45
3        156         21
3        236         14

The two items in random order are:

FRAME    PAGE NO.    LINE
2        216         64
2        357         36

Explanation: For this example, the frame consists of (100)(66) + (456)(66) + (832 - 45 + 1)(66) = 6,600 + 30,096 + 52,008 = 88,704 items. Random numbers between 1 and 88,704 are generated. Values between 1 and 6,600 will be in the first frame, values between 6,601 and 36,696 are in the second frame, and values between 36,697 and 88,704 will come from the third frame. The three random values (not the spares) generated are 771, 44,043, and 49,316.

To find the value corresponding to 771:

1. Find (771/66) + 1 = 12.682. The integer part of this is 12. So, this value is on page number (subframe) 12 of frame 1.
2. The decimal part of this number is .682. Multiply this by 66 (the number of lines per page in the first frame for this example) and round to the nearest integer. This is 45. This item is on line 45 of page 12.

To find the sample value corresponding to 44,043:

This value is larger than 36,696, so it is in the third frame.

1. Find [(44,043 - 36,696)/66] + 45 = 156.318. Here, 45 is the low number (input) for frame 3. The integer part of this value is 156. So, this value is on page number (subframe) 156 of frame 3.
2. The decimal part of this number is .318. Multiply this by 66 (the number of lines per page in the third frame for this example) and round to the nearest integer. This is 21. This item is on line 21 of page 156.

Finally, consider the third randomly generated value of 49,316. This value is larger than 36,696, so it is in the third frame.

1. Find [(49,316 - 36,696)/66] + 45 = 236.212. The integer part of 236.212 is 236. So, this value is on page number 236 of frame 3.
2. The decimal part of this number is .212. Multiply this by 66 (the number of lines per page) and round to the nearest integer. This is 14. This item is on line 14 of page 236.
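The page-and-line lookup for Example 6 can be sketched in Python as follows. The function name `locate_page_and_line` is hypothetical. The sketch uses integer division and remainder instead of the decimal-part rounding described in the text; for the three generated values 771, 44,043, and 49,316 it gives the same frame, page, and line as the sequential sample items listed in the solution.

```python
def locate_page_and_line(value, frames):
    """Map a value between 1 and the combined universe size to
    (frame number, page number, line number).

    frames: list of (low_page, high_page, lines_per_page) tuples.
    """
    if value < 1:
        raise ValueError("value must be at least 1")
    remaining = value
    for number, (low_page, high_page, lines) in enumerate(frames, start=1):
        size = (high_page - low_page + 1) * lines
        if remaining <= size:
            # Zero-based position within the frame, split into a page
            # offset and a line offset.
            page_offset, line_offset = divmod(remaining - 1, lines)
            return number, low_page + page_offset, line_offset + 1
        remaining -= size
    raise ValueError("value exceeds the combined universe size")


frames = [(1, 100, 66), (1, 456, 66), (45, 832, 66)]
for value in (771, 44043, 49316):
    print(value, "->", locate_page_and_line(value, frames))
```

One reason to prefer integer arithmetic in a sketch like this is that a value falling exactly on the last line of a page has a decimal part of zero, which the multiply-and-round step would turn into line 0.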

RHC Sample Selection

The RHC selection/appraisal procedure is named after three statisticians, J.N.K. Rao, H.O. Hartley, and William Cochran, and was originally proposed in 1962. It provides a method of sample selection that allows sampling without replacement (the usual procedure) while "maintaining the flavor" of using probability proportional to size. This procedure is essentially the same as single-stage SRS sampling except that the size of each primary unit (cluster) is used to select the sample. It can be used to select primary units (P.U.s) in a two-stage design or primary and secondary units (S.U.s) in a three-stage design.

Comment: Strictly speaking, you cannot use pure probability proportional to size (pps) sampling when sampling without replacement. To understand why, consider a situation in which a population contains 10 primary units with seven "large" P.U.s and three "small" ones. If the sample size is eight, then one of the small P.U.s must be selected, regardless of its small size. The RHC procedure is not pure pps sampling, but comes very close while allowing the auditor to sample without replacement.

Procedure for Two-Stage RHC Sampling

Suppose that you have N P.U.s and you want a sample of n P.U.s. The procedure is to:

1. Randomly put (partition) the N P.U.s into n groups (no attention to size here).
2. Within each of the n groups, select one P.U. using pps.

Example 7. N = 15, n = 3

1. Generate 3 groups, each containing 5 P.U.s.

One possibility: Generate 15 random numbers between 0 and 1. Suppose the smallest value is in location 8, the next smallest in location 12, the next smallest in location 5, the next smallest in location 13, and the next smallest in location 2. The first group consists of P.U.s 2, 5, 8, 12, and 13. Continue, to get the remaining two groups.

2. Suppose P.U. #7 is put into group 3. Size of group 3 is 1,000 beds and size of P.U. #7 is 100 beds. P.U. #7 will be selected from group 3 with probability 100/1000 = .1.

Example 8. In a particular region of the U.S., there are N = 90 universities with government research grants. We know that there are a total of M = 4,500 grants in all 90 universities. Because these universities are so widespread, it was decided to use a sample of n = 10 universities. Rather than audit all grants at a selected university, it was decided (based on available resources) to audit roughly 20% of the grants at each selected university.

Size: As a measure of the size for each university, use the total grant dollars.

Dataset: There are 90 rows of data (one for each P.U.) contained in this data set, named UNIVRHC.TXT. Each row of the data file will contain:

University ID, number of grants, total grant dollars

i.e.,

ID of P.U., number of S.U.s (universe) in this P.U., size of P.U.

Output: The 10 universities to use in the sample (see last page of computer output) are: UNIV78, UNIV42, UNIV49, UNIV5, UNIV19, UNIV38, UNIV62, UNIV28, UNIV60, and UNIV75.

This program will create an output file specified by the user (OutRHCsummary.txt for this illustration) that is used as one of the input files by the RHC appraisal program.
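The two-step RHC procedure (random grouping, then one pps selection per group) can be sketched in Python as shown below. This is an illustration only: the function name `rhc_select` and the way the groups are formed are assumptions, and the sketch does not reproduce the RAT-STATS generator or its two seed values.

```python
import random


def rhc_select(units, n, seed=None):
    """Two-stage RHC selection of primary units.

    units: list of (unit_id, size) tuples.
    Returns one (selected_id, group_size, units_in_group) tuple per group.
    """
    rng = random.Random(seed)

    # Step 1: randomly partition the units into n groups, ignoring size.
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = [shuffled[i::n] for i in range(n)]  # counts differ by at most 1

    # Step 2: within each group, select one unit with probability
    # proportional to its size (unit size / group size).
    selections = []
    for group in groups:
        ids = [unit_id for unit_id, _size in group]
        sizes = [size for _unit_id, size in group]
        chosen = rng.choices(ids, weights=sizes, k=1)[0]
        selections.append((chosen, sum(sizes), len(group)))
    return selections


# Same shape as Example 7: N = 15 primary units, n = 3 groups of 5.
units = [("PU%d" % i, 10 * i) for i in range(1, 16)]
for chosen, group_size, count in rhc_select(units, 3, seed=1):
    print(chosen, group_size, count)
```

In the terms of Example 7, a unit of size 100 in a group of total size 1,000 is chosen in step 2 with probability 100/1000 = .1.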

Dataset UNIVRHC.TXT, the program output, and output file OutRHCsummary.txt are contained in the following pages.

UNIVRHC.TXT

(1)     (2)  (3)      (1)     (2)  (3)      (1)     (2)  (3)
UNIV1    42    8      UNIV31   52   11      UNIV61   66   13
UNIV2    21    4      UNIV32   66   14      UNIV62   77   18
UNIV3    63   13      UNIV33   25    5      UNIV63   31    7
UNIV4    74   16      UNIV34   60   12      UNIV64   46    9
UNIV5    51   11      UNIV35   19    4      UNIV65   32    7
UNIV6    43    9      UNIV36   24    5      UNIV66   68   14
UNIV7    57   11      UNIV37   44    9      UNIV67   41    9
UNIV8    49   10      UNIV38   76   17      UNIV68   28    6
UNIV9    63   13      UNIV39   41    9      UNIV69   66   14
UNIV10   18    4      UNIV40   77   18      UNIV70   31    7
UNIV11   64   13      UNIV41   37    8      UNIV71   27    6
UNIV12   56   11      UNIV42   63   12      UNIV72   33    7
UNIV13   19    4      UNIV43   52   11      UNIV73   23    4
UNIV14   44    9      UNIV44   76   17      UNIV74   71   15
UNIV15   20    4      UNIV45   51   10      UNIV75   75   16
UNIV16   34    7      UNIV46   23    4      UNIV76   47   10
UNIV17   25    6      UNIV47   24    5      UNIV77   50   10
UNIV18   38    9      UNIV48   68   15      UNIV78   37    7
UNIV19   72   16      UNIV49   34    7      UNIV79   77   18
UNIV20   46   10      UNIV50   49   10      UNIV80   49   10
UNIV21   44    9      UNIV51   55   11      UNIV81   76   17
UNIV22   64   13      UNIV52   38    9      UNIV82   66   14
UNIV23   45    9      UNIV53   72   16      UNIV83   28    6
UNIV24   55   11      UNIV54   51   10      UNIV84   77   17
UNIV25   29    7      UNIV55   71   15      UNIV85   27    6
UNIV26   36    7      UNIV56   59   12      UNIV86   75   17
UNIV27   40    9      UNIV57   23    4      UNIV87   71   15
UNIV28   78   18      UNIV58   57   11      UNIV88   59   12
UNIV29   49   10      UNIV59   53   11      UNIV89   71   15
UNIV30   60   12      UNIV60   64   13      UNIV90   72   16

Columns:
(1) primary unit ID
(2) number of grants
(3) grant dollar amount (x $100,000) <-- This is the size of the university.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/15/2004    Time: 12:52
GENERATION OF PRIMARY UNIT SAMPLE
NAME OF INPUT FILE: C:\TEMP\UNIVRHC.TXT

GROUPS OF PRIMARY UNITS

********* GROUP 1 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV51                          11                   55
UNIV44                          17                   76
UNIV32                          14                   66
UNIV78 <-- selected              7                   37
UNIV79                          18                   77
UNIV2                            4                   21
UNIV52                           9                   38
UNIV33                           5                   25
UNIV47                           5                   24
GROUP TOTALS: 9                 90                  419

********* GROUP 2 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV6                            9                   43
UNIV42 <-- selected             12                   63
UNIV65                           7                   32
UNIV40                          18                   77
UNIV45                          10                   51
UNIV1                            8                   42
UNIV80                          10                   49
UNIV36                           5                   24
UNIV70                           7                   31
GROUP TOTALS: 9                 86                  412

********* GROUP 3 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV46                           4                   23
UNIV7                           11                   57
UNIV90                          16                   72
UNIV49 <-- selected              7                   34
UNIV21                           9                   44
UNIV4                           16                   74
UNIV54                          10                   51
UNIV61                          13                   66
UNIV77                          10                   50
GROUP TOTALS: 9                 96                  471

********* GROUP 4 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV73                           4                   23
UNIV50                          10                   49
UNIV58                          11                   57
UNIV57                           4                   23
UNIV82                          14                   66
UNIV23                           9                   45
UNIV5 <-- selected              11                   51
UNIV64                           9                   46
UNIV34                          12                   60
GROUP TOTALS: 9                 84                  420

********* GROUP 5 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV84                          17                   77
UNIV35                           4                   19
UNIV16                           7                   34
UNIV81                          17                   76
UNIV27                           9                   40
UNIV85                           6                   27
UNIV19 <-- selected             16                   72
UNIV68                           6                   28
UNIV26                           7                   36
GROUP TOTALS: 9                 89                  409

********* GROUP 6 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV37                           9                   44
UNIV83                           6                   28
UNIV63                           7                   31
UNIV14                           9                   44
UNIV43                          11                   52
UNIV31                          11                   52
UNIV15                           4                   20
UNIV48                          15                   68
UNIV38 <-- selected             17                   76
GROUP TOTALS: 9                 89                  415

********* GROUP 7 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV69                          14                   66
UNIV18                           9                   38
UNIV25                           7                   29
UNIV59                          11                   53
UNIV30                          12                   60
UNIV10                           4                   18
UNIV24                          11                   55
UNIV62 <-- selected             18                   77
UNIV17                           6                   25
GROUP TOTALS: 9                 92                  421

********* GROUP 8 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV28 <-- selected             18                   78
UNIV41                           8                   37
UNIV89                          15                   71
UNIV66                          14                   68
UNIV11                          13                   64
UNIV86                          17                   75
UNIV56                          12                   59
UNIV12                          11                   56
UNIV72                           7                   33
GROUP TOTALS: 9                115                  541

********* GROUP 9 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV71                           6                   27
UNIV8                           10                   49
UNIV67                           9                   41
UNIV3                           13                   63
UNIV60 <-- selected             13                   64
UNIV76                          10                   47
UNIV74                          15                   71
UNIV9                           13                   63
UNIV20                          10                   46
GROUP TOTALS: 9                 99                  471

********* GROUP 10 *********
PRIMARY UNIT IDENTIFICATION    PRIMARY UNIT SIZE    SECONDARY UNIVERSE
==============================  =============        =============
UNIV22                          13                   64
UNIV39                           9                   41
UNIV88                          12                   59
UNIV55                          15                   71
UNIV29                          10                   49
UNIV75 <-- selected             16                   75
UNIV87                          15                   71
UNIV13                           4                   19
UNIV53                          16                   72
GROUP TOTALS: 9                110                  521

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/15/2004    Time: 12:52
GENERATION OF PRIMARY UNIT SAMPLE
NAME OF OUTPUT FILE: C:\TEMP\OutRHCsummary.txt

FIRST SEED NUMBER: 100.00
SECOND SEED NUMBER: 200.00
NUMBER OF PRIMARY UNITS IN THE POPULATION: 90
NUMBER OF PRIMARY UNITS SAMPLED: 10

PRIMARY UNIT ID    SECONDARY UNIVERSE    PRIMARY UNIT SIZE    GROUP SIZE    UNITS IN GROUP
===============    ==================    =================    ==========    ==============
UNIV78             37                     7                    90           9
UNIV42             63                    12                    86           9
UNIV49             34                     7                    96           9
UNIV5              51                    11                    84           9
UNIV19             72                    16                    89           9
UNIV38             76                    17                    89           9
UNIV62             77                    18                    92           9
UNIV28             78                    18                   115           9
UNIV60             64                    13                    99           9
UNIV75             75                    16                   110           9

NOTE: In practice, it is recommended that you not set the two seed values unless you are trying to duplicate prior results.

Output file OutRHCsummary.txt

(1)       (2)   (3)   (4)   (5)
UNIV78    37     7    90    9
UNIV42    63    12    86    9
UNIV49    34     7    96    9
UNIV5     51    11    84    9
UNIV19    72    16    89    9
UNIV38    76    17    89    9
UNIV62    77    18    92    9
UNIV28    78    18   115    9
UNIV60    64    13    99    9
UNIV75    75    16   110    9

Columns:
(1) selected primary unit
(2) number of grants (secondary units)
(3) grant dollar amount (x $100,000) <-- This is the size of the primary unit.
(4) size of the group containing this primary unit
(5) number of universities (primary units) in this group

Procedure for Three-Stage RHC Sampling

1. A sample of primary units (clusters) is obtained as in the two-stage procedure, where pps sampling is used for each group of primary units. The size of the primary units is considered for this sample.
2. A sample of secondary units is obtained within each chosen primary unit by partitioning the primary unit into random groups of secondary units. No attention is paid to "size" here. The numbers of S.U.s in each group are chosen to be as nearly equal as possible. Using pps sampling and the size of each secondary unit, one secondary unit is chosen from each of the secondary groups.
3. A random sample of third-stage units is obtained for each of the chosen secondary units. This is a random sample.

Example 9. The previous example was expanded to include geographical regions.

Primary units: 12 regions (select four)
Secondary units: Universities (select 10 from each region)
Third stage units: Grants (audit 20% from each university)

Selection of Primary Units

A file must be constructed containing (for each region) (1) the number of secondary units (universities) in this region and (2) the size of this region (total grant dollars). This file is GRANTSPU.TXT. The selected regions are 4, 6, 8, and 10 using seed values of 100 and 200.

File GRANTSPU.TXT

(1)         (2)    (3)
REGION1     117    1250
REGION2      63     610
REGION3      91     720
REGION4     123    1320
REGION5     107    1160
REGION6     116    1240
REGION7     102     960
REGION8     118    1300
REGION9     122    1320
REGION10     85     640
REGION11     94     930
REGION12     62     550

Columns:
(1) region ID
(2) number of universities (secondary units)
(3) size (total grant amount x $100,000)

Selection of Secondary Units

The three-stage RHC sample selection procedure requires the user to only obtain information for each selected primary unit (i.e., regions 4, 6, 8, and 10 here). The information in each of these four files consists of the size of each secondary unit (university, here) and the number of third-stage units in the universe for each secondary unit. Each of these files should resemble file UNIVRHC.TXT contained in the previous two-stage RHC discussion. Consequently, for each sampled P.U., each line of the corresponding file should contain:

university ID, number of grants at this university, total grant dollars

i.e.,

secondary unit ID, no. of third-stage units, size of S.U.

After running the RHC Sample Selection program on each of these four regions, the following universities were selected:

REGION    UNIVERSITIES
4         85, 46, 7, 82, 30, 34, 27, 66, 65, 80
6         113, 43, 78, 104, 89, 112, 30, 65, 3, 99
8         112, 6, 7, 93, 75, 111, 62, 115, 70, 99
10        78, 43, 7, 73, 55, 33, 10, 59, 64, 39

Selection of Third-Stage Units

Suppose that approximately 20% of the grants at each selected university are to be audited. Each of these 40 samples (4 regions x 10 universities) is obtained randomly using the Single-Stage Random Numbers program.

NOTES:
(1) The previous five program runs (one at the primary level and four at the secondary level) created five output files. Using a word processor, these files can be joined to form one of the input files (the one containing primary/secondary unit information) for the three-stage RHC appraisal program which calculates the confidence interval.
(2) This example is examined in more detail in the three-stage RHC appraisal section.

Summary of Input for RHC Sample Selection

RHC Two-Stage

The user must know:

1. The number of P.U.s in the universe and the sample.
2. The size of all P.U.s in the universe.
3. The number of S.U.s in each universe P.U. The user can set the number of S.U.s in each universe P.U. equal to one if these are difficult to determine. This is the middle column in file UNIVRHC.TXT used in the previous illustration.

Procedure:

1. Run the RHC Two-Stage Sample Selection program. Store the output in a text file.

2. Using a word processor or spreadsheet, change the number of S.U.s for each sampled P.U. from one to the correct value.

RHC Three-Stage

The user must know:

1. The number of P.U.s in the universe and the sample.
2. The size of all P.U.s in the universe.
3. The number of S.U.s in each universe P.U. The user can set the number of S.U.s in each universe P.U. equal to 1 if these are difficult to determine.

Procedure:

1. Run the RHC Two-Stage Sample Selection program. Store the output in a text file.
2. Using a word processor or spreadsheet, change the number of S.U.s for each sampled P.U. from 1 to the correct value.
3. For each sampled P.U., obtain:
   The number of S.U.s in the universe within this P.U.
   The number of S.U.s to be sampled within each P.U.
   The size of all S.U.s within this P.U.
4. For each sampled P.U., build a data file where each row consists of
   a. S.U. ID
   b. Number of third-stage units for this S.U. (OK to use a value of 1 here and correct later).
   c. Size of this S.U.
5. For each sampled P.U., use the data set in step 4 as input to the RHC Two-Stage Sample Selection program. Store the output in a text file.
6. Using a word processor or spreadsheet, change the number of third-stage units for each sampled S.U. from 1 to the correct value.
7. Merge the results from step 2 and each sampled P.U. into one file. This is one of the input files to the RHC Three-Stage Appraisal program. See PUSURHC3.TXT (below) for an example. The values in the second column (123, 54, 44, ...) can be set to one and later changed to the correct values.

8. the required size information must be known.RHC Three-Stage Sample Selection RAT-STATS Companion Manual NOTE: Although this procedure allows for substituting 1s for the number of second.TXT REGION4 123 UNIV85 54 UNIV46 44 UNIV7 77 UNIV82 52 UNIV30 54 UNIV34 50 UNIV27 76 UNIV66 76 UNIV65 62 UNIV80 70 REGION6 116 UNIV113 33 UNIV43 39 UNIV78 63 UNIV104 25 UNIV89 35 UNIV112 27 UNIV30 58 UNIV65 57 UNIV3 56 UNIV99 80 REGION8 118 UNIV112 75 UNIV6 34 UNIV7 51 UNIV93 54 UNIV75 52 UNIV111 84 UNIV62 64 UNIV115 59 UNIV70 65 UNIV99 60 REGION10 85 UNIV78 39 UNIV43 42 10 11 9 15 10 11 10 15 15 12 14 10 7 8 13 5 7 5 12 11 11 16 10 15 7 10 11 10 17 13 12 13 12 10 8 8 1320 3410 3 11 125 12 9 131 12 17 119 12 11 129 12 11 141 12 10 140 12 16 138 12 16 128 13 14 125 13 15 155 13 1240 3100 3 8 108 11 7 105 11 12 104 11 9 96 11 7 124 12 10 108 12 11 95 12 10 109 12 11 115 12 14 113 12 1300 3170 3 16 125 11 8 127 11 11 120 12 11 136 12 11 126 12 17 134 12 14 123 12 13 137 12 14 143 12 13 129 12 640 2320 3 7 62 8 7 68 8 Page 1-22 (Rev. File PUSURHC3. 10/2004) . This is the other input file required by the RHC three-stage appraisal program.TXT for this illustration) containing the sampled third-stage units. Build the data file (file PUSURHC3. Using a word processor or spreadsheet.and third-stage units in the original pass. the column of sample sizes (highlighted) was added to the files created by the five RHC Sample Selection programs.

File PUSURHC3.TXT - continued

UNIV7       56   11      9     54    8
UNIV73      27    5      5     63    8
UNIV55      78   16     12     70    8
UNIV33      65   13     10     77    9
UNIV10      60   12      9     76    9
UNIV59      52   10      8     71    9
UNIV64      50   10      8     73    9
UNIV39      38    8      6     68    9

Generating Spares for RHC Sampling

One question that arises here is what to do if one or more of the selected primary units is unattainable or unusable for some reason. There is a method of generating spares without having to start the sample selection process all over again, once the nonusable primary units have been identified. The following example will illustrate how to recover when one or more primary units are nonusable with a two-stage RHC sampling plan. A similar approach can be used in a three-stage plan if one or more secondary units are nonusable within a selected primary unit.

Example 10.
Population: N = 90 P.U.s (universities)
Sample: n = 5 P.U.s

The final section of the output using the RHC sample selection program is shown below.

PRIMARY UNIT ID   SECONDARY UNIVERSE   PRIMARY UNIT SIZE   GROUP SIZE   UNITS IN GROUP
===============   ==================   =================   ==========   ==============
UNIV47                    24                    5              191            18
UNIV64                    46                    9              175            18
UNIV52                    38                    9              185            18
UNIV69                    66                   14              203            18        <-- can't use
UNIV51                    55                   11              196            18        <-- can't use

The corresponding output file created from the first pass is shown below.

UNIV47   24    5   191   18
UNIV64   46    9   175   18
UNIV52   38    9   185   18
UNIV69   66   14   203   18
UNIV51   55   11   196   18

It turns out that universities 51 (in group 5) and 69 (in group 4) could not be used. The section of the output containing the contents of group 4 follows.

********* GROUP 4 *********

PRIMARY UNIT IDENTIFICATION   PRIMARY UNIT SIZE   SECONDARY UNIVERSE
===========================   =================   ==================
UNIV15                                 4                  20
UNIV81                                17                  76
UNIV38                                17                  76
UNIV59                                11                  53
UNIV13                                 4                  19
UNIV76                                10                  47
UNIV20                                10                  46
UNIV66                                14                  68
UNIV14                                 9                  44
UNIV29                                10                  49
UNIV55                                15                  71
UNIV26                                 7                  36
UNIV17                                 6                  25
UNIV87                                15                  71
UNIV22                                13                  64
UNIV50                                10                  49
UNIV84                                17                  77
UNIV69 <-- Selected, can't use        14                  66
GROUP TOTALS:       18               203                 957

Remove UNIV69 from the population and this group. Construct a data file (same format as UNIVRHC.TXT) using this group only. This file (TEMP1.TXT) is shown below.

NOTE: When constructing this file, notice that columns 2 and 3 above (i.e., PRIMARY UNIT SIZE and SECONDARY UNIVERSE) need to be switched. This was done correctly in TEMP1.TXT.

File TEMP1.TXT

UNIV15   20    4
UNIV81   76   17
UNIV38   76   17
UNIV59   53   11
UNIV13   19    4
UNIV76   47   10
UNIV20   46   10
UNIV66   68   14
UNIV14   44    9
UNIV29   49   10
UNIV55   71   15
UNIV26   36    7
UNIV17   25    6
UNIV87   71   15
UNIV22   64   13
UNIV50   49   10
UNIV84   77   17
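The column switch described in the NOTE can be sketched in a few lines of Python. This is only an illustration, not part of RAT-STATS: the function name `build_spare_input` and the tuple layout are assumptions made here, and only the first three usable units of group 4 are included to keep the example short.

```python
def build_spare_input(group_rows, unusable_id):
    """Return rows in input-file order: (unit ID, secondary universe, unit size).

    group_rows holds (unit ID, primary unit size, secondary universe) tuples,
    i.e. the column order of the group listing in the program output.
    The function name and tuple layout are hypothetical.
    """
    rows = []
    for unit_id, size, universe in group_rows:
        if unit_id == unusable_id:
            continue  # drop the primary unit that cannot be used
        rows.append((unit_id, universe, size))  # columns 2 and 3 switched
    return rows


# A few rows of group 4, including the unit that could not be used.
group4 = [
    ("UNIV15", 4, 20),
    ("UNIV81", 17, 76),
    ("UNIV38", 17, 76),
    ("UNIV69", 14, 66),
]

for unit_id, universe, size in build_spare_input(group4, "UNIV69"):
    print(f"{unit_id} {universe} {size}")
```

The printed rows have the same three-column layout as the rows of TEMP1.TXT.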

Next, run the RHC Sample Selection program. Your input file is TEMP1.TXT and your sample size is 1. This generates another P.U. (university) from this group. The output from this program is shown below. UNIV22 was selected.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
GENERATION OF PRIMARY UNIT SAMPLE

Date: 10/15/2004        Time: 13:17

NAME OF INPUT FILE: C:\TEMP\TEMP1.TXT

GROUPS OF PRIMARY UNITS

********* GROUP 1 *********

PRIMARY UNIT IDENTIFICATION   PRIMARY UNIT SIZE   SECONDARY UNIVERSE
===========================   =================   ==================
UNIV81                                17                  76
UNIV76                                10                  47
UNIV15                                 4                  20
UNIV20                                10                  46
UNIV59                                11                  53
UNIV13                                 4                  19
UNIV50                                10                  49
UNIV87                                15                  71
UNIV22 <-- Selected                   13                  64
UNIV29                                10                  49
UNIV84                                17                  77
UNIV55                                15                  71
UNIV26                                 7                  36
UNIV66                                14                  68
UNIV38                                17                  76
UNIV14                                 9                  44
UNIV17                                 6                  25
GROUP TOTALS:       17               189                 891

NAME OF OUTPUT FILE: C:\TEMP\OUTTEMP1.txt
FIRST SEED NUMBER: 100.00        SECOND SEED NUMBER: 200.00
NUMBER OF PRIMARY UNITS IN THE POPULATION: 17
NUMBER OF PRIMARY UNITS SAMPLED: 1

PRIMARY UNIT ID   SECONDARY UNIVERSE   PRIMARY UNIT SIZE   GROUP SIZE   UNITS IN GROUP
===============   ==================   =================   ==========   ==============
UNIV22                    64                   13              189            17

Again, repeat this for group 5. Remove UNIV51 from the population and this group. Construct data file TEMP2.TXT.

NOTE: As before, be sure to switch columns 2 and 3 when building this file.

File TEMP2.TXT

UNIV63   31    7
UNIV18   38    9
UNIV58   57   11
UNIV31   52   11
UNIV56   59   12
UNIV90   72   16
UNIV65   32    7
UNIV12   56   11
UNIV16   34    7
UNIV2    21    4
UNIV79   77   18
UNIV74   71   15
UNIV8    49   10
UNIV86   75   17
UNIV53   72   16
UNIV23   45    9
UNIV33   25    5

Next, run the RHC Sample Selection program. The input file is TEMP2.TXT and the sample size is 1. This generates another P.U. from this group. The output from this program is shown below. UNIV86 was selected.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
GENERATION OF PRIMARY UNIT SAMPLE

Date: 10/15/2004        Time: 13:24

NAME OF INPUT FILE: C:\TEMP\Temp2.txt

GROUPS OF PRIMARY UNITS

********* GROUP 1 *********

PRIMARY UNIT IDENTIFICATION   PRIMARY UNIT SIZE   SECONDARY UNIVERSE
===========================   =================   ==================
UNIV18                                 9                  38
UNIV90                                16                  72
UNIV63                                 7                  31
UNIV65                                 7                  32
UNIV31                                11                  52
UNIV56                                12                  59
UNIV23                                 9                  45
UNIV86 <-- Selected                   17                  75
UNIV53                                16                  72
UNIV2                                  4                  21
UNIV33                                 5                  25

< OUTPUT -- continued >

UNIV79                                18                  77
UNIV74                                15                  71
UNIV12                                11                  56
UNIV58                                11                  57
UNIV16                                 7                  34
UNIV8                                 10                  49
GROUP TOTALS:       17               185                 866

NAME OF OUTPUT FILE: C:\TEMP\OUTTEMP2.TXT
FIRST SEED NUMBER: 100.00        SECOND SEED NUMBER: 200.00
NUMBER OF PRIMARY UNITS IN THE POPULATION: 17
NUMBER OF PRIMARY UNITS SAMPLED: 1

PRIMARY UNIT ID   SECONDARY UNIVERSE   PRIMARY UNIT SIZE   GROUP SIZE   UNITS IN GROUP
===============   ==================   =================   ==========   ==============
UNIV86                    75                   17              185            17

Finally, be sure to update the original output file shown earlier to reflect the two new selected universities. This is shown below. This file is one of the input files to the RHC Two-Stage Appraisal program.

Final output file (input file to RHC appraisal program)

UNIV47   24    5   191   18
UNIV64   46    9   175   18
UNIV52   38    9   185   18
UNIV22   64   13   189   17
UNIV86   75   17   185   17

Discussion: RHC Three-Stage sampling

A similar procedure can be used to generate "spare" secondary units. For example, if one of the secondary units within a selected primary unit is nonusable, another secondary unit can be selected from this group using the procedure outlined above.

Discussion: Final-stage units

At the second stage for RHC Two-Stage sampling and the third stage for RHC Three-Stage sampling, a random sample of units is obtained. Spares for this stage can be obtained in the usual manner using the single-stage random number generator software (Single Stage Random Numbers).

Comparison of RHC and Multistage SRS

In general, you can expect greater precision with the RHC procedure, provided there is a significant correlation between the second and third columns (Number of Units and Size of Unit) of each file using the RHC sample selection procedure. To illustrate, consider the file containing the primary unit information used in the three-stage RHC illustration.

(1)         (2)    (3)
REGION1     117   1250
REGION2      63    610
REGION3      91    720
. . .
REGION10     85    640
REGION11     94    930
REGION12     62    550

Columns: (1) unit ID (2) number of units (3) size of unit

For this example, the correlation between Size of Unit and Number of Units is .958, and we would expect a two-stage RHC procedure to work quite well. For a three-stage procedure, this correlation rule must also apply within each of the sampled primary units, at the secondary unit level.
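One way to check the correlation rule before choosing RHC is to compute the ordinary (Pearson) correlation between the two columns. The sketch below is an illustration only: the helper name `pearson` is an assumption, and it uses just the six regions printed in the excerpt, so it will not reproduce the .958 figure, which is based on all twelve regions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Only the six regions shown in the excerpt (REGION1-3 and REGION10-12).
number_of_units = [117, 63, 91, 85, 94, 62]
size_of_unit = [1250, 610, 720, 640, 930, 550]

r = pearson(number_of_units, size_of_unit)
print(f"correlation on the six listed regions: {r:.3f}")
```

For a three-stage plan the same check would be repeated on the secondary-unit file of each sampled primary unit.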

The benefits of RHC sampling include:
• increased precision if the above correlation rule is satisfied.
• maintaining the flavor of pps sampling, since pps sampling is used to select a unit from each random group.
• relatively simple and straightforward computations.
• an unbiased and stable point estimate of the universe total ($\hat{T}$). This implies that when sampling indefinitely, $\hat{T}$, on the average, is equal to the actual universe total, T, and $\hat{T}$ will exhibit relatively small variation.
• a stable point estimate of the variance of $\hat{T}$, producing more reliable confidence intervals. This implies that when sampling indefinitely, the lower confidence limits will exhibit relatively small variation.

ATTRIBUTE APPRAISALS

An attribute appraisal is carried out to estimate a particular universe proportion (p) and its corresponding sampling error. This proportion is typically an error rate (proportion of the universe in error) but, more generally, it is the proportion of the universe items that meet (or do not meet) a specified set of criteria. Also of interest may be the total number of items in the universe (Np) that meet the criteria. In an attribute sample, each sample item is either a yes response (met the criteria) or no response (did not meet the criteria).

This version of RAT-STATS contains eight modules that can be used to appraise an attribute sample. These sampling strategies are listed below and described in the sections to follow.
• Unrestricted
• Stratified
• Two-Stage Unrestricted
• Three-Stage Unrestricted
• RHC Two Stage
• RHC Three Stage
• Stratified Cluster
• Stratified Multistage

729 to 2. 10/2004) .000)(.205 (i.29 + 24. Consequently. The 90% confidence interval for the universe error rate (p) is from 17.03%.03)/2 = 20.000. then the estimated number of universe items in error is (10. If the universe size is N = 10. rather than the normal approximation.Unrestricted Attribute Appraisal RAT-STATS Companion Manual Unrestricted Attribute Appraisal An unrestricted sample is the same as a simple random sample. the corresponding 90% confidence interval for the total number of universe items in error is from 1.e. Example 1. every sample of size n has the same chance of being selected.03% but it is not in the center of this interval. The resulting 95% confidence interval for p is 16.29% to 24. Page 2-2 (Rev.66%.5%).29% and 24. The reason for this result is that this estimation procedure is based on the exact hypergeometric distribution.5% is between 17. a sample of size n is randomly obtained and the number of sample elements meeting the criteria (say. The sample error rate is then 82/400 = . For an unrestricted sample. The center of the 90% confidence interval is (17. 82 of the items did not contain the proper signature (were in error). Notice that the (point) estimate of 20.73% to 24.673 to 2.205) = 2.470. This is the estimate of p. the error rate for the entire universe.70% and for Np (the total number of errors in the universe) is from 1. In the sample.050 items.403. 20. x) is recorded. An unrestricted sample of 400 documents was obtained and examined to determine if they had the proper approval signature.. Using the RAT-STATS software.

Discussion. Consider the 90% confidence interval. Define TAIL = (1 - .90) / 2 = .05. The 90% confidence interval for Np is, say, k1 to k2. There were x = 82 sample items in error, so (referring to the Formulas section below) k1 is the smallest value of k for which the probability of observing 82 or more errors is > .05, where .05 is the value of TAIL. This value of k is k1 = 1,729. The corresponding error rate is 1,729/10,000 = .1729 (i.e., 17.29%). To find the upper limit of the 90% confidence interval, the program determines the largest value of k (say, k2) for which the probability of observing x = 82 or fewer errors is > TAIL = .05. This is k2 = 2,403 with a corresponding error rate of 2,403/10,000 = .2403 (i.e., 24.03%). A similar argument applies to the 95% confidence interval, where now the value of TAIL is .025.

NOTES:
1. Using these definitions of k1 and k2 for a 90% confidence interval, the user can be assured that the actual confidence level is at least 90%. This also applies to 80% and 95% confidence intervals.
2. In the event that no items having the characteristic(s) of interest are found in the sample, the user has the option of having the program determine both confidence limits or only the upper confidence limits.
3. In the event that the number of items having the characteristic(s) of interest in the sample is the same as the sample size, the user has the option of having the program determine both confidence limits or only the lower confidence limits.
4. The universe size (N) is declared to be a long integer in the RAT-STATS program. Consequently, the largest allowable universe size is N = 2^31 − 1 = 2,147,483,647.

FORMULAS

To determine a 90% confidence interval for the total number of universe items in error, define TAIL = (1 - .90)/2 = .05.

Upper Limit: Let k2 = largest value of k for which

$$\sum_{i=0}^{x} \frac{\binom{k}{i}\binom{N-k}{n-i}}{\binom{N}{n}} > .05$$

where
N = universe size
n = sample size
k = total number of universe items in error
x = number of sample items in error

Lower Limit: Let k1 = smallest value of k for which

$$\sum_{i=x}^{n} \frac{\binom{k}{i}\binom{N-k}{n-i}}{\binom{N}{n}} > .05$$

The resulting 90% confidence interval for the total number of universe items in error is from k1 to k2 and the corresponding 90% confidence interval for the error rate (p) is k1/N to k2/N.

For a 95% confidence interval, use the same two equations, where .05 is replaced with TAIL = .025. For an 80% confidence interval, the value of TAIL is .10.

The procedure used to derive this confidence interval can be found in the following article.

John P. Buonaccorsi (1987), "A Note on Confidence Intervals for Proportions in Finite Populations," The American Statistician, Vol. 41, No. 3, 215-218.
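The two limit definitions can be evaluated directly with exact integer arithmetic. The following Python sketch is an illustration of the formulas above, not the RAT-STATS implementation; the function names `tail_exceeds` and `exact_limits` and the bisection search are assumptions made here. The search relies on the fact that the probability of x or more errors increases with k, while the probability of x or fewer errors decreases with k.

```python
from math import comb


def tail_exceeds(N, n, k, lo, hi, num, den):
    """True if sum_{i=lo}^{hi} C(k,i) C(N-k,n-i) / C(N,n) > num/den (exact)."""
    total = sum(comb(k, i) * comb(N - k, n - i) for i in range(lo, hi + 1))
    return total * den > num * comb(N, n)


def exact_limits(N, n, x, num=1, den=20):
    """Return (k1, k2) for TAIL = num/den; the default 1/20 gives a 90% interval."""
    k_min, k_max = x, N - (n - x)  # feasible numbers of universe errors

    # k1: smallest k for which P(x or more errors) > TAIL.
    lo, hi = k_min, k_max
    while lo < hi:
        mid = (lo + hi) // 2
        if tail_exceeds(N, n, mid, x, n, num, den):
            hi = mid
        else:
            lo = mid + 1
    k1 = lo

    # k2: largest k for which P(x or fewer errors) > TAIL.
    lo, hi = k_min, k_max
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if tail_exceeds(N, n, mid, 0, x, num, den):
            lo = mid
        else:
            hi = mid - 1
    k2 = lo
    return k1, k2


# Example 1: N = 10,000, n = 400, x = 82, 90% confidence level.
k1, k2 = exact_limits(10_000, 400, 82)
print(k1, k2, k1 / 10_000, k2 / 10_000)
```

For Example 1 the manual reports limits of 1,729 and 2,403; this sketch has not been checked against the program, so small differences in how ties or boundary cases are handled are possible. Passing `num=1, den=40` corresponds to TAIL = .025 (95%), and `num=1, den=10` to TAIL = .10 (80%).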

Standard Errors

For universe proportion:

$$\text{Standard Error} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n-1}\left(1-\frac{n}{N}\right)} \quad \text{where } \hat{p} = x/n.$$

For universe total:

$$\text{Standard Error} = N \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n-1}\left(1-\frac{n}{N}\right)}$$

NOTE: RAT-STATS does not use the preceding standard errors when deriving a confidence interval for the universe proportion and universe total. Other software packages use this standard error to derive an approximate confidence interval based on the normal distribution. RAT-STATS derives an exact confidence interval based on the hypergeometric distribution.
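As a point of comparison, the standard errors above and the approximate normal-based interval that other packages would report can be computed as follows. This is a minimal sketch using the Example 1 figures; the function name `unrestricted_se` is an assumption made here, and the interval it produces is the normal approximation, not the exact interval RAT-STATS reports.

```python
from math import sqrt


def unrestricted_se(x, n, N):
    """Return (p_hat, SE of proportion, SE of total) for an unrestricted sample."""
    p_hat = x / n
    se_prop = sqrt(p_hat * (1 - p_hat) / (n - 1) * (1 - n / N))
    return p_hat, se_prop, N * se_prop


# Example 1: 82 errors in a sample of 400 from a universe of 10,000.
p_hat, se_prop, se_total = unrestricted_se(82, 400, 10_000)

z90 = 1.644853626951  # normal multiplier for a two-sided 90% interval
approx_lower = p_hat - z90 * se_prop
approx_upper = p_hat + z90 * se_prop
print(f"p_hat={p_hat:.4f}  SE={se_prop:.5f}  approx 90% CI={approx_lower:.4f} to {approx_upper:.4f}")
```

The approximate interval is symmetric about 20.5%, unlike the exact interval of 17.29% to 24.03% quoted in Example 1.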

Stratified Attribute Appraisal

In a stratified attribute sampling plan, the universe is divided into two or more nonoverlapping categories (strata). This plan involves obtaining a random sample from each of the strata. As with an unrestricted sample, the intent is to make a statistical estimate for a universe proportion (p) or a universe total (Np) that meets a specified set of criteria. The program will request the number of universe items in each stratum and these values must be known. The program will develop estimates for each stratum as well as for the entire universe.

NOTE: In the discussion to follow, we will refer to the proportion, p, as the "error rate."

Example 2. A universe of 2,500 Medicare claims is stratified into inpatient (Stratum 1) and outpatient (Stratum 2) claims. The universe sizes are N1 = 1,000 inpatient claims and N2 = 1,500 outpatient claims. Of interest is the proportion, p, of claims in error (containing improper charges). A random sample of n1 = 100 inpatient claims revealed x1 = 2 errors and a random sample of n2 = 100 outpatient claims uncovered x2 = 6 errors.

NOTE: Both random samples were obtained using the Single-Stage Random Numbers program whereby 100 random numbers between 1 and 1,000 were obtained for stratum 1 and 100 random numbers between 1 and 1,500 were obtained for stratum 2.

The following output was obtained from the stratified attribute appraisal program.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 2/7/2004        Time: 10:55
STRATIFIED ATTRIBUTE APPRAISAL
AUDIT/REVIEW: Attribute - Stratified

                                                        PROJ. ITEMS
STRATUM     SAMPLE   *ITEMS**   **RATIO*   *UNIVERSE*   IN UNIVERSE
=======     ======   ========   ========   ==========   ===========
1              100          2     2.000%        1,000            20
2              100          6     6.000%        1,500            90
COMBINED       200          8     4.400%        2,500           110

STANDARD ERROR: 1.483%

                         PRECISION    PRECISION    PRECISION
STRATUM                  AT 80% CL    AT 90% CL    AT 95% CL
=======                  =========    =========    =========
1                           1.711%       2.196%       2.616%
2                           2.955%       3.793%       4.519%
COMBINED                    1.901%       2.439%       2.907%
LOWER LIMIT - QUANTITY          62           49           37
              PERCENT       2.499%       1.961%       1.493%
UPPER LIMIT - QUANTITY         158          171          183
              PERCENT       6.301%       6.839%       7.307%

Discussion. The strata sample error rates are 2% and 6%. The projected number of inpatient claims in error is (.02)(1,000) = 20 and the projected number for the outpatient stratum is (.06)(1,500) = 90. Consequently, the projected value for the universe is 20 + 90 = 110 (highlighted) with a corresponding error rate of (110/2,500) x 100% = 4.4% (highlighted).

A look at the inpatient stratum: The estimated error rate is 2%. The corresponding precision at the 90% confidence level is 2.196% (highlighted). The term "precision" refers to the amount that is added and subtracted to the point estimate (2%, here) in deriving a confidence interval.

Consequently, the 90% confidence interval for the proportion of inpatient claims in error is 2% ± 2.196%, that is, -0.196% to 4.196%. Since the lower limit is negative, it may be set equal to zero. Similarly, the 95% confidence interval for the proportion of inpatient claims in error is 2% ± 2.616% (highlighted), that is, 0% to 4.616%, once again setting the lower limit equal to zero.

NOTE: These confidence intervals are not actually contained in the program output.

A look at the outpatient stratum: The estimated error rate is 6%. Continuing the discussion from the inpatient stratum, the 90% confidence interval for the proportion of outpatient claims in error is 6% ± 3.793% (highlighted), that is, 2.207% to 9.793%. The corresponding 95% confidence interval is 6% ± 4.519% (highlighted), that is, 1.481% to 10.519%. As before, these confidence intervals are not actually contained in the program output.

A look at the overall precision: The precision at the 90% level is 2.439% (highlighted) and so the resulting 90% confidence interval for the universe proportion of claims in error is 4.4% ± 2.439%, that is, 1.961% to 6.839%. Multiplying these two values by 2,500 (and dividing by 100), the corresponding 90% confidence interval for the total number of universe claims in error is 49 to 171. Notice that these values are rounded to the nearest integer. Using the precision at the 95% confidence level (i.e., 2.907%), the 95% confidence interval in the previous output can be obtained.

FORMULAS

The estimated proportion for stratum i is $\hat{p}_i$, where $\hat{p}_i = x_i / n_i$ and where $x_i$ is the number of sample elements in stratum i in error and $n_i$ is the number of sample items from stratum i. The value of Ratio is $\hat{p}_i \times 100\%$. The Projected Items in Universe for stratum i is $(\hat{p}_i)(N_i)$, where $N_i$ is the number of universe items in stratum i. The estimated standard error of $\hat{p}_i$ is

$$SE(\hat{p}_i) = \sqrt{\frac{\hat{p}_i(1-\hat{p}_i)}{n_i-1}\cdot\frac{N_i-n_i}{N_i}}$$

The PRECISION AT 90% CL (CL stands for confidence level) is 1.644853626951 times the standard error of $\hat{p}_i$, that is

$$1.644853626951\cdot\sqrt{\frac{\hat{p}_i(1-\hat{p}_i)}{n_i-1}\cdot\frac{N_i-n_i}{N_i}}$$

To obtain the Precision at 95% (80%) CL value for the i-th stratum, replace 1.644853626951 with 1.959963984540 (1.281551565545).

Overall estimates: The estimate of the universe proportion (error rate) is $\hat{p}$ (under the Ratio heading), where

$$\hat{p} = \sum_{i=1}^{L}\left(\frac{N_i}{N}\right)\hat{p}_i$$

the summation is over all of the L strata and $N = \sum N_i$ is the total universe size. The estimated standard error of $\hat{p}$ is

$$SE(\hat{p}) = \sqrt{\sum_{i=1}^{L}\left(\frac{N_i}{N}\right)^2\left[SE(\hat{p}_i)\right]^2}$$

The Precision at 90% CL value is $1.644853626951 \cdot SE(\hat{p})$. The Precision at 95% CL value is obtained by replacing 1.644853626951 by 1.959963984540 in the above formula and for an 80% confidence interval, 1.644853626951 is replaced by 1.281551565545. The resulting confidence intervals for the universe proportion (error rate) are

$$\hat{p} \pm (\text{PRECISION})$$

To obtain the confidence intervals for the universe total, multiply both ends of the confidence interval for the error rate by the universe size, N, and round to the nearest integer.
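The stratified formulas can be traced numerically with the Example 2 figures. The sketch below is an illustration under stated assumptions: the function names `stratum_se` and `stratified_appraisal` and the `(x_i, n_i, N_i)` tuple layout are choices made here, not RAT-STATS names.

```python
from math import sqrt

# Normal multipliers quoted in the formulas for the three confidence levels.
Z = {80: 1.281551565545, 90: 1.644853626951, 95: 1.959963984540}


def stratum_se(x, n, N):
    """Return (p_i, SE(p_i)) for one stratum."""
    p = x / n
    return p, sqrt(p * (1 - p) / (n - 1) * (N - n) / N)


def stratified_appraisal(strata):
    """strata: list of (x_i, n_i, N_i). Return (p_hat, SE(p_hat), N)."""
    N = sum(N_i for _, _, N_i in strata)
    p_hat = 0.0
    variance = 0.0
    for x, n, N_i in strata:
        p_i, se_i = stratum_se(x, n, N_i)
        p_hat += (N_i / N) * p_i
        variance += (N_i / N) ** 2 * se_i ** 2
    return p_hat, sqrt(variance), N


# Example 2: inpatient (2 errors in 100 of 1,000), outpatient (6 in 100 of 1,500).
p_hat, se, N = stratified_appraisal([(2, 100, 1000), (6, 100, 1500)])
precision_90 = Z[90] * se
lower_total = round(N * (p_hat - precision_90))
upper_total = round(N * (p_hat + precision_90))
print(f"p_hat={100 * p_hat:.3f}%  SE={100 * se:.3f}%  precision(90%)={100 * precision_90:.3f}%")
print(f"90% interval for the universe total: {lower_total} to {upper_total}")
```

A negative lower limit for a proportion can be set to zero, as the discussion of the inpatient stratum notes; the sketch leaves that step to the caller.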

Two-Stage Unrestricted

This is a special case of multistage sampling. For a two-stage procedure, the universe can be broken down into "subgroups." These are called clusters.

Example:
1st Stage: Carriers (P.U.s)
2nd Stage: Hospitals (S.U.s)

So, the procedure is to first obtain a random sample of P.U.s. Then, obtain a random sample of S.U.s within each selected P.U. Notice that at the first stage, clusters are the sampling unit (sampling units are not always individual people, records, etc.).

General Comments
1. Multistage sampling is a very cost-effective sampling procedure when (1) obtaining a frame that lists all elements in the universe is very costly or impossible, or (2) the cost of obtaining observations increases as the distance separating the elements increases. Put another way, multistage sampling is cost effective when it is more costly to get to the sampling unit than it is to audit the sampling unit.
2. This is a very convenient sampling procedure for many situations because you don't have to visit all the locations; it is very useful for large, widespread universes. You can estimate cost overpayments for the entire universe with multistage sampling.
3. The goal of multistage sampling is to get the most precise results per unit of examination cost.

Example 3. In a particular region of the U.S., there are N = 90 universities with government research grants. Because these universities are so widespread, it was decided to use a two-stage sample using 10 universities. Rather than audit all grants at a selected university, it was decided

(based on available resources) to audit roughly 20% of the grants at each selected university to estimate the proportion of grants containing charges after the scheduled completion of the grant. The following data were obtained, where ai (pi) is the number (proportion) of grants in the sample from the i-th university containing such charges, mi is the number of audited (sampled) grants at the i-th university, and Mi is the total number of grants in the audit universe at the i-th university.

Univ.    Mi    mi   ai     pi     Mipi
  1      50    10    4   .400    20.00
  2      65    13    5   .385    25.00
  3      45     9    2   .222    10.00
  4      48    10    3   .300    14.40
  5      52    10    5   .500    26.00
  6      58    12    3   .250    14.50
  7      42     8    3   .375    15.75
  8      66    13    4   .308    20.31
  9      40     8    2   .250    10.00
 10      56    11    4   .364    20.36
        522   104                176.32

Define M to be the total number of secondary units (grants) in the universe. In practice, M may be known or unknown. If M is unknown, the Two-Stage Unrestricted program estimates the universe proportion using a ratio estimator. No estimate of the universe total (total number of grants containing improper charges) is available. This is illustrated in the computer output to follow where M is unknown.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 1/31/2004        Time: 14:13
TWO STAGE UNRESTRICTED ATTRIBUTE APPRAISAL
AUDIT/REVIEW: Example
DATA FILE: C:\Temp\DATA2STG.TXT

PRIMARY                   SAMPLE    SAMPLE ITEMS WITH
UNIT       UNIVERSE       SIZE      CHARACTERISTIC(S)    RATIO
=======    ========       ======    =================    =====
1                50           10                    4    40.00%
2                45            9                    2    22.22%
3                52           10                    5    50.00%
4                42            8                    3    37.50%
5                40            8                    2    25.00%
6                65           13                    5    38.46%
7                48           10                    3    30.00%
8                58           12                    3    25.00%
9                66           13                    4    30.77%
10               56           11                    4    36.36%
TOTALS          522          104                   35

TOTAL PRIMARY UNITS IN THE UNIVERSE      90
OVERALL RATIO                        33.78%
STANDARD ERROR                        2.85%

CONFIDENCE LEVEL               80 PERCENT   90 PERCENT   95 PERCENT
LOWER LIMIT FOR PROPORTION         30.13%       29.09%       28.19%
UPPER LIMIT FOR PROPORTION         37.43%       38.47%       39.36%

Discussion. The estimate of the universe proportion, p, is

$\hat{p}_r$ = (estimated number of grants containing improper charges in the sampled universities) / (number of grants in the sampled universities)
= [(50)(.4000) + (45)(.2222) + ··· + (56)(.3636)] / (50 + 45 + ··· + 56)
= 176.32 / 522 = .3378 (that is, 33.78%)

The estimated variance of $\hat{p}_r$ is

$v(\hat{p}_r)$ = (Standard Error)^2 = (.0285)^2 = .000812.

NOTE: There is a formula for $v(\hat{p}_r)$ in the formula section.


The corresponding approximate 95% confidence interval for the universe proportion is

.3378 ± (Precision at 95% Confidence Level)

The Precision at 95% Confidence Level value is the same as 1.959963984540 times the (Standard Error). So, the resulting 95% confidence interval can also be written

.3378 ± (1.959963984540)(Standard Error)
.3378 ± (1.959963984540)(.0285)
.3378 ± .0559
.2819 to .3937 (28.19% to 39.36%).

NOTE 1: When the value of $\bar{M}$ is unknown in the formula for $v(\hat{p}_r)$ (the case here), it is acceptable to replace this value with the average value of $M_i$ for the sample (as was done in this illustration). This value is 522/10 = 52.2. This is an advantage of using this estimator, since it does not require knowledge of M. If M is known, the user has two choices: (1) use the above ratio estimator, where now M is known or (2) use an unbiased estimator of p, illustrated in Example 4.
NOTE 2: If the value of M is known, the RAT-STATS software uses the unbiased estimator, illustrated in Example 4.

Example 4. Suppose that in Example 3, it is known that there is a total number of M = 4,500 grants (secondary units) in all 90 universities. As a result, $\bar{M}$ (the average number of grants per university) is known and is equal to M/N = 4,500/90 = 50. The following output is obtained. Notice that estimated (projected) totals for each sampled university (primary unit) and for the entire universe are provided.


DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 1/31/2004        Time: 13:52
TWO STAGE UNRESTRICTED ATTRIBUTE APPRAISAL
AUDIT/REVIEW: Example
DATA FILE: C:\Temp\DATA2STG.TXT

PRIMARY                       SAMPLE    SAMPLE ITEMS WITH
UNIT             UNIVERSE     SIZE      CHARACTERISTIC(S)    RATIO     PROJECTED
=======          ========     ======    =================    =====     =========
1                      50         10                    4    40.00%           20
2                      45          9                    2    22.22%           10
3                      52         10                    5    50.00%           26
4                      42          8                    3    37.50%           16
5                      40          8                    2    25.00%           10
6                      65         13                    5    38.46%           25
7                      48         10                    3    30.00%           14
8                      58         12                    3    25.00%           15
9                      66         13                    4    30.77%           20
10                     56         11                    4    36.36%           20
TOTALS                522        104                   35

OVERALL TOTALS   90   4,500                                  35.26%        1,587
STANDARD ERROR                                                3.67%          165

CONFIDENCE LEVEL               80 PERCENT   90 PERCENT   95 PERCENT
LOWER LIMIT FOR PROPORTION         30.56%       29.22%       28.06%
UPPER LIMIT FOR PROPORTION         39.97%       41.31%       42.47%
LOWER LIMIT FOR TOTAL               1,375        1,315        1,263
UPPER LIMIT FOR TOTAL               1,799        1,859        1,911

Discussion. An unbiased estimator of the universe proportion, p, is

$$\left(\frac{N}{M}\right)\left(\frac{A}{B}\right)$$

where
A = estimated number of grants containing improper charges in the sampled universities
B = number of sampled universities
and A/B = the projected average number of grants containing improper charges for the sampled universities

So, A/B = [(50)(.4000) + (65)(.3846) + ··· + (56)(.3636)] / 10 = 176.32 / 10 = 17.632 grants


The projected number of grants containing improper charges for the universe is N·(A/B); that is, (90)(17.632) = 1,586.88 grants. Since there are 4,500 grants in the universe, the estimated proportion of grants containing improper charges is 1,586.88/4,500 = .3526 (35.26%).

The corresponding approximate 95% confidence interval for the universe proportion is .3526 ± (Precision at 95% Confidence Level), which is the same as .3526 ± 1.959963984540(Standard Error). So, the resulting 95% confidence interval is

.3526 ± (1.959963984540)(Standard Error)
.3526 ± (1.959963984540)(.0367)
.3526 ± .0720
.2806 to .4247 (28.06% to 42.47%).

The corresponding 95% confidence interval for the total number of grants in the universe containing improper charges is 1587 ± 324; that is, 1263 to 1911 grants.

NOTE: Formulas for $\hat{p}_u$ and the corresponding confidence interval are contained in the formula section.

FORMULAS
Case 1: When the total number of secondary units in the universe (M) is unknown, the ratio estimator for the universe proportion is used. This estimator will be called $\hat{p}_r$. Define:

$M_i$ = number of secondary units in the universe for the i-th sampled primary unit, $m_i$ of which are sampled
$\hat{p}_i$ = proportion of secondary units having the attribute of interest in the i-th sampled primary unit
n = number of sampled primary units
N = number of primary units in the universe (must be known)
M = number of secondary units in the universe (may be known or unknown)
$\bar{M}$ = average number of secondary units per primary unit in the universe. This is equal to M/N if M is known. It can be estimated using $\bar{m}$ if M is unknown, where $\bar{m}$ is the average number of secondary units in the sampled primary units.

The estimate of the universe proportion having the attribute of interest is

$$\hat{p}_r = \frac{\sum_{i=1}^{n} M_i \hat{p}_i}{\sum_{i=1}^{n} M_i}$$

The estimated variance of $\hat{p}_r$ is

$$v(\hat{p}_r) = \left(\frac{N-n}{N}\right)\left(\frac{1}{n\bar{M}^2}\right)\left(\frac{\sum_{i=1}^{n} M_i^2(\hat{p}_i-\hat{p}_r)^2}{n-1}\right) + \frac{1}{nN\bar{M}^2}\sum_{i=1}^{n} M_i^2\left(\frac{M_i-m_i}{M_i}\right)\frac{\hat{p}_i(1-\hat{p}_i)}{m_i-1}$$

NOTE: The standard error of $\hat{p}_r$ is the square root of $v(\hat{p}_r)$.
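The Case 1 formulas can be traced with the Example 3 data. The sketch below is an illustration only; the function name `two_stage_ratio` and its argument layout are assumptions made here. When `M_bar` is not supplied, the sample average of the $M_i$ is used in its place, as NOTE 1 of Example 3 allows.

```python
from math import sqrt


def two_stage_ratio(M, m, a, N, M_bar=None):
    """Ratio estimator p_r and its standard error for a two-stage sample.

    M, m, a: per sampled primary unit, the universe size, the sample size,
    and the number of sampled items having the attribute.
    N: number of primary units in the universe.
    """
    n = len(M)
    p = [a_i / m_i for a_i, m_i in zip(a, m)]
    p_r = sum(M_i * p_i for M_i, p_i in zip(M, p)) / sum(M)
    if M_bar is None:
        M_bar = sum(M) / n  # sample average stands in for the unknown M-bar
    between = sum(M_i ** 2 * (p_i - p_r) ** 2 for M_i, p_i in zip(M, p)) / (n - 1)
    within = sum(
        M_i ** 2 * ((M_i - m_i) / M_i) * p_i * (1 - p_i) / (m_i - 1)
        for M_i, m_i, p_i in zip(M, m, p)
    )
    variance = ((N - n) / N) * between / (n * M_bar ** 2) + within / (n * N * M_bar ** 2)
    return p_r, sqrt(variance)


# Example 3 data (universities 1-10 as listed in the data table).
M = [50, 65, 45, 48, 52, 58, 42, 66, 40, 56]
m = [10, 13, 9, 10, 10, 12, 8, 13, 8, 11]
a = [4, 5, 2, 3, 5, 3, 3, 4, 2, 4]

p_r, se_r = two_stage_ratio(M, m, a, N=90)
print(f"p_r={p_r:.4f}  standard error={se_r:.4f}")
```

The program output for Example 3 reports an overall ratio of 33.78% and a standard error of 2.85%, which this calculation should approximate.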


Case 2: When the total number of secondary units in the universe (M) is known, an unbiased estimator for the universe proportion is used. This estimator will be called $\hat{p}_u$.

$$\hat{p}_u = \left(\frac{N}{M}\right)\frac{\sum_{i=1}^{n} M_i \hat{p}_i}{n}$$

The estimated variance of $\hat{p}_u$ is

$$v(\hat{p}_u) = \left(\frac{N-n}{N}\right)\left(\frac{1}{n\bar{M}^2}\right)\left(\frac{\sum_{i=1}^{n} (M_i\hat{p}_i-\bar{M}\hat{p}_u)^2}{n-1}\right) + \frac{1}{nN\bar{M}^2}\sum_{i=1}^{n} M_i^2\left(\frac{M_i-m_i}{M_i}\right)\frac{\hat{p}_i(1-\hat{p}_i)}{m_i-1}$$

NOTE 1: The standard error of $\hat{p}_u$ is the square root of $v(\hat{p}_u)$.

NOTE 2: When estimating the total number of secondary units in the universe having the attribute of interest, both $\hat{p}_u$ and the standard error of $\hat{p}_u$ are multiplied by M. The Precision at the 95% Confidence Level value for the universe total is (1.959963984540)(M)(standard error of $\hat{p}_u$). For the Precision at the 90% Confidence Level value, replace 1.959963984540 with 1.644853626951 and for the Precision at the 80% Confidence Level value, replace 1.959963984540 with 1.281551565545.
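The Case 2 formulas can be traced the same way with the Example 4 figures (M = 4,500 known). Again this is a sketch under stated assumptions: the function name `two_stage_unbiased` and its arguments are choices made here, not RAT-STATS names.

```python
from math import sqrt


def two_stage_unbiased(M, m, a, N, M_total):
    """Unbiased estimator p_u and its standard error when M_total is known."""
    n = len(M)
    M_bar = M_total / N
    p = [a_i / m_i for a_i, m_i in zip(a, m)]
    projected = [M_i * p_i for M_i, p_i in zip(M, p)]  # M_i * p_i per sampled unit
    p_u = (N / M_total) * sum(projected) / n
    between = sum((t - M_bar * p_u) ** 2 for t in projected) / (n - 1)
    within = sum(
        M_i ** 2 * ((M_i - m_i) / M_i) * p_i * (1 - p_i) / (m_i - 1)
        for M_i, m_i, p_i in zip(M, m, p)
    )
    variance = ((N - n) / N) * between / (n * M_bar ** 2) + within / (n * N * M_bar ** 2)
    return p_u, sqrt(variance)


# Example 4: same sample as Example 3, with M = 4,500 grants in 90 universities.
M = [50, 65, 45, 48, 52, 58, 42, 66, 40, 56]
m = [10, 13, 9, 10, 10, 12, 8, 13, 8, 11]
a = [4, 5, 2, 3, 5, 3, 3, 4, 2, 4]
M_total = 4500

p_u, se_u = two_stage_unbiased(M, m, a, N=90, M_total=M_total)
total = M_total * p_u                      # projected universe total
se_total = M_total * se_u                  # standard error of the total
precision_95 = 1.959963984540 * se_total   # precision for the total at 95%
print(f"p_u={p_u:.4f}  SE={se_u:.4f}  total={total:.0f}  SE(total)={se_total:.0f}")
print(f"95% interval for the total: {total - precision_95:.0f} to {total + precision_95:.0f}")
```

The program output for Example 4 reports 35.26%, a standard error of 3.67% (165 for the total), and a 95% interval of 1,263 to 1,911 for the total.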


Three-Stage Unrestricted

Example 5. The situation discussed in Example 4 was extended the following year to a three stage procedure by defining:

Stage 1: REGION (select 4 out of 12 regions)
Stage 2: UNIVERSITY (select 10 from each selected region)
Stage 3: GRANT (select approximately 20% of all grants at each university)

Using the random number module (Single-Stage Random Numbers), regions 5, 7, 8, and 10 were selected as the sampled primary units. Next, 10 universities (secondary units) were randomly selected (again using program Single-Stage Random Numbers) from the available universities in each of the four selected regions. The following data were obtained, where Mi is the number of grants in the universe for each university, mi is the number of audited grants at each university (chosen to be roughly 20% of Mi), and ai is the number of grants in the sample from the i-th university containing charges after the scheduled completion of the grant (in error).

REGION 5 (contains 90 secondary units, 10 to be sampled)

Univ.    Mi    mi   ai
  1      47     9    3
  2      51    10    2
  3      45     9    4
  4      46     9    1
  5      46     9    3
  6      50    10    1
  7      50    10    4
  8      57    11    3
  9      54    11    4
 10      64    13    2

REGION 7 (contains 110 secondary units, 10 to be sampled)
Univ.    1    2    3    4    5    6    7    8    9   10
Mi      53   59   52   67   59   73   51   75   66   58
mi      11   12   10   13   12   15   10   15   13   12
ai       2    5    1    3    1    6    3    2    1    4

REGION 8 (contains 85 secondary units, 10 to be sampled)
Univ.    1    2    3    4    5    6    7    8    9   10
Mi      45   39   43   34   54   54   34   59   49   43
mi       9    8    9    7   11   11    7   12   10    9
ai       3    2    4    1    2    3    1    1    4    2

REGION 10 (contains 120 secondary units, 10 to be sampled)
Univ.    1    2    3    4    5    6    7    8    9   10
Mi      59   68   57   72   70   73   83   89   73   77
mi      12   14   11   14   14   15   17   18   15   15
ai       2    6    3    6    1    2    5    4    3    2

The resulting data set is called DATA3ST.TXT and is shown on the next page. The corresponding computer output using the Three-Stage Unrestricted program immediately follows. For this illustration, the total number of third-stage units in the universe (S) is unknown.

--- Data set DATA3ST.TXT ---
REGION 5     90   10
UNIV1        47    9    3
UNIV2        51   10    2
UNIV3        45    9    4
UNIV4        46    9    1
UNIV5        46    9    3
UNIV6        50   10    1
UNIV7        50   10    4
UNIV8        57   11    3
UNIV9        54   11    4
UNIV10       64   13    2
REGION 7    110   10
UNIV1        53   11    2
UNIV2        59   12    5
UNIV3        52   10    1
UNIV4        67   13    3
UNIV5        59   12    1
UNIV6        73   15    6
UNIV7        51   10    3
UNIV8        75   15    2
UNIV9        66   13    1
UNIV10       58   12    4
REGION 8     85   10
UNIV1        45    9    3
UNIV2        39    8    2
UNIV3        43    9    4
UNIV4        34    7    1
UNIV5        54   11    2
UNIV6        54   11    3
UNIV7        34    7    1
UNIV8        59   12    1
UNIV9        49   10    4
UNIV10       43    9    2
REGION 10   120   10
UNIV1        59   12    2
UNIV2        68   14    6
UNIV3        57   11    3
UNIV4        72   14    6
UNIV5        70   14    1
UNIV6        73   15    2
UNIV7        83   17    5
UNIV8        89   18    4
UNIV9        73   15    3
UNIV10       77   15    2

REGION 5 line: There are 90 secondary units (universities) in this primary unit (region). 10 were audited.
REGION 7, UNIV1 line: In UNIV1, there were 53 grants (third-stage units). Eleven of these grants were sampled and two contained improper charges.

Date: 1/31/2004          DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 15:18
                         OIG - OFFICE OF AUDIT SERVICES
                         THREE STAGE ATTRIBUTE APPRAISAL
AUDIT/REVIEW: Example
NAME OF INPUT FILE: C:\Temp\DATA3ST.TXT

FIRST STAGE                   NEXT STAGE    SAMPLE    MEETING
  SECOND STAGE                UNIVERSE      SIZE      CRITERIA    RATIO
============================  ============  ========  ========  =======
REGION 5                            90         10
  UNIV1                             47          9         3      33.33%
  UNIV2                             51         10         2      20.00%
  UNIV3                             45          9         4      44.44%
  UNIV4                             46          9         1      11.11%
  UNIV5                             46          9         3      33.33%
  UNIV6                             50         10         1      10.00%
  UNIV7                             50         10         4      40.00%
  UNIV8                             57         11         3      27.27%
  UNIV9                             54         11         4      36.36%
  UNIV10                            64         13         2      15.38%
  TOTALS                           510        101        27
REGION 7                           110         10
  UNIV1                             53         11         2      18.18%
  UNIV2                             59         12         5      41.67%
  UNIV3                             52         10         1      10.00%
  UNIV4                             67         13         3      23.08%
  UNIV5                             59         12         1       8.33%
  UNIV6                             73         15         6      40.00%
  UNIV7                             51         10         3      30.00%
  UNIV8                             75         15         2      13.33%
  UNIV9                             66         13         1       7.69%
  UNIV10                            58         12         4      33.33%
  TOTALS                           613        123        28
REGION 8                            85         10
  UNIV1                             45          9         3      33.33%
  UNIV2                             39          8         2      25.00%
  UNIV3                             43          9         4      44.44%
  UNIV4                             34          7         1      14.29%
  UNIV5                             54         11         2      18.18%
  UNIV6                             54         11         3      27.27%
  UNIV7                             34          7         1      14.29%
  UNIV8                             59         12         1       8.33%
  UNIV9                             49         10         4      40.00%
  UNIV10                            43          9         2      22.22%
  TOTALS                           454         93        23
REGION 10                          120         10
  UNIV1                             59         12         2      16.67%
  UNIV2                             68         14         6      42.86%
  UNIV3                             57         11         3      27.27%
  UNIV4                             72         14         6      42.86%
  UNIV5                             70         14         1       7.14%
  UNIV6                             73         15         2      13.33%
  UNIV7                             83         17         5      29.41%
  UNIV8                             89         18         4      22.22%
  UNIV9                             73         15         3      20.00%
  UNIV10                            77         15         2      13.33%
  TOTALS                           721        145        34

OVERALL TOTALS                   UNIVERSE    SAMPLED
==============                   ========    =======
FIRST STAGE                            12          4
SECOND STAGE                          405*        40
THIRD STAGE                         2,298*       462
SAMPLED ITEMS MEETING CRITERIA                   112

* UNIVERSE SIZES FOR THE SECOND AND THIRD STAGES REPRESENT THE UNIVERSES FOR THE SAMPLED PRIOR STAGE.

OVERALL POINT ESTIMATE OF THE PROPORTION      24.06%
OVERALL STANDARD ERROR (PROPORTION)            1.35%
OVERALL POINT ESTIMATE OF UNIVERSE TOTAL      17,210
OVERALL STANDARD ERROR (TOTAL)                 2,415

CONFIDENCE LEVEL               80 PERCENT   90 PERCENT   95 PERCENT
LOWER LIMIT FOR PROPORTION        22.33%       21.84%       21.42%
UPPER LIMIT FOR PROPORTION        25.78%       26.27%       26.70%
LOWER LIMIT FOR TOTAL             14,115       13,238       12,477
UPPER LIMIT FOR TOTAL             20,304       21,182       21,942

Highlighted values:
(1) 33.33% is 3/9
(2) 405 = 90 (in Region 5) + 110 (in Region 7) + 85 (in Region 8) + 120 (in Region 10)
(3) 2,298 = (47 + ... + 64) in Region 5 + (53 + ... + 58) in Region 7 + (45 + ... + 43) in Region 8 + (59 + ... + 77) in Region 10.

Discussion. Based on the preceding output, the following results were obtained:
(1) Estimate of the proportion of grants in the universe containing improper charges is .2406 (24.06%). This uses Equation 1 in the Formulas and Definitions section.
(2) The 95% confidence interval for the universe proportion is from .2142 to .2670.
(3) Estimate of the total number of grants in the universe containing improper charges is 17,210. This uses Equation 5 in the Formulas and Definitions section.
(4) The 95% confidence interval for the total number of grants containing improper charges is from 12,477 to 21,942.


The standard error of the proportion estimate is .0135 (1.35%) and the 95% confidence interval is .2406 ± (1.959963984540)(.0135); that is, .2142 to .2670. The value of .0135 is found by taking the square root of the value obtained using Equation 2. For the universe total, the standard error is 2,415 (using the square root of Equation 6) and the 95% confidence interval is 17,210 ± (1.959963984540)(2,415); that is, 12,477 to 21,942.
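The interval arithmetic in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration, not part of RAT-STATS: the names `Z_VALUES` and `confidence_interval` are hypothetical, and the inputs are the rounded values printed in the output, so the limits can differ from the program's in the last digit (the program works from unrounded values).

```python
# Z-values quoted in this manual for the 80%, 90%, and 95% confidence levels.
Z_VALUES = {80: 1.281551565545, 90: 1.644853626951, 95: 1.959963984540}


def confidence_interval(estimate, std_error, level=95):
    """Return (lower, upper) as estimate -/+ z * standard error."""
    half_width = Z_VALUES[level] * std_error
    return estimate - half_width, estimate + half_width


# Rounded inputs taken from the Example 5 output.
low_p, high_p = confidence_interval(0.2406, 0.0135)   # proportion
low_t, high_t = confidence_interval(17210, 2415)      # universe total

print(f"proportion: {low_p:.4f} to {high_p:.4f}")
print(f"total:      {low_t:,.0f} to {high_t:,.0f}")
```

Passing `level=90` or `level=80` swaps in the other two z-values in the same way the notes describe.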

Is the total number of third-stage units in the universe known or unknown?
Let S = the total number of third-stage units in the universe. Two cases will be considered:
Case 1: S is unknown.
Case 2: S is known.

For Case 1: To estimate the proportion, p, use the ratio (biased) estimator ($\hat{p}_r$). To estimate the number in the population (T) having the attribute of interest, use the unbiased estimator ($\hat{T}_u$).

For Case 2: To estimate the proportion, p, use the unbiased estimator ($\hat{p}_u$). To estimate the number in the population (T) having the attribute of interest, use the unbiased estimator ($\hat{T}_u$).

NOTE: In the preceding example, the unbiased estimator $\hat{T}_u$ was used where $\hat{T}_u$ = 17,210. Here S was unknown and so the proportion estimator ($\hat{p}_r$) is from case 1 (Equation 1 in the Formulas and Definitions section). The standard error of $\hat{p}_r$ (.0135) is the square root of the value obtained from Equation 2.


FORMULAS
Definitions
S = total number of third-stage units in the universe
N = number of primary units in the universe
n = number of primary units in the sample
Mi = number of secondary units (universe) in i-th primary unit
mi = number of secondary units (sample) in i-th primary unit
Bij = number of third-stage units (universe) in j-th secondary unit within i-th primary unit
bij = number of third-stage units (sample) in j-th secondary unit within i-th primary unit
$\hat{p}_{ij}$ = proportion of bij sampled third-stage units in error

Formulas for $\hat{p}_r$ (Case 1) and $\hat{p}_u$ (Case 2)

The ratio estimator $\hat{p}_r$ (Equation 1)

\[
\hat{p}_r = \frac{\displaystyle\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}B_{ij}\hat{p}_{ij}}{\displaystyle\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}B_{ij}}
\]

Estimated variance of $\hat{p}_r$ (Equation 2)

\[
\begin{aligned}
v(\hat{p}_r) ={}& \frac{N-n}{n(n-1)N\bar{S}^2}\sum_{i=1}^{n}\left[\left(\hat{T}_i-\frac{\hat{T}_u}{N}\right)-\hat{R}\left(\hat{B}_i-\frac{\hat{B}_u}{N}\right)\right]^2 \\
&+\frac{1}{nN\bar{S}^2}\sum_{i=1}^{n}\frac{M_i(M_i-m_i)}{m_i(m_i-1)}\sum_{j=1}^{m_i}\left[\left(\hat{T}_{ij}-\frac{\hat{T}_i}{M_i}\right)-\hat{R}\left(B_{ij}-\frac{\hat{B}_i}{M_i}\right)\right]^2 \\
&+\frac{1}{nN\bar{S}^2}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}\frac{B_{ij}(B_{ij}-b_{ij})\,\hat{p}_{ij}(1-\hat{p}_{ij})}{b_{ij}-1}
\end{aligned}
\]


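where

\[
\hat{T}_u = \frac{N}{n}\sum_{i=1}^{n}\hat{T}_i, \qquad
\hat{T}_i = \frac{M_i}{m_i}\sum_{j=1}^{m_i}B_{ij}\hat{p}_{ij}, \qquad
\hat{T}_{ij} = B_{ij}\hat{p}_{ij}
\]

\[
\hat{B}_u = \frac{N}{n}\sum_{i=1}^{n}\hat{B}_i, \qquad
\hat{B}_i = \frac{M_i}{m_i}\sum_{j=1}^{m_i}B_{ij}, \qquad
\bar{S} = \frac{S}{N}, \qquad
\hat{R} = \frac{\hat{T}_u}{\hat{B}_u}
\]

Notes:
(1) In Equation 2, $\hat{R} = \hat{p}_r$.
(2) To estimate $\bar{S}$, use the sample estimate s where

\[
s = \frac{\hat{B}_u}{N} = \frac{1}{n}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}B_{ij}
\]

The unbiased estimator $\hat{p}_u$ (Equation 3)

\[
\hat{p}_u = \frac{N}{nS}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}B_{ij}\hat{p}_{ij}
\]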


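Estimated variance of $\hat{p}_u$ (Equation 4)

This variance can most easily be determined by setting $\hat{R} = 0$ in Equation 2. Consequently,

\[
\begin{aligned}
v(\hat{p}_u) ={}& \frac{N-n}{n(n-1)N\bar{S}^2}\sum_{i=1}^{n}\left(\hat{T}_i-\frac{\hat{T}_u}{N}\right)^2 \\
&+\frac{1}{nN\bar{S}^2}\sum_{i=1}^{n}\frac{M_i(M_i-m_i)}{m_i(m_i-1)}\sum_{j=1}^{m_i}\left(\hat{T}_{ij}-\frac{\hat{T}_i}{M_i}\right)^2 \\
&+\frac{1}{nN\bar{S}^2}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}\frac{B_{ij}(B_{ij}-b_{ij})\,\hat{p}_{ij}(1-\hat{p}_{ij})}{b_{ij}-1}
\end{aligned}
\]

The unbiased estimator $\hat{T}_u$ (Equation 5)

Since $\hat{T}_u = S\cdot\hat{p}_u$, then

\[
\hat{T}_u = \frac{N}{n}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}B_{ij}\hat{p}_{ij}
\]

NOTE: The value of S is not needed here.

Estimated variance of $\hat{T}_u$ (Equation 6)

Since $v(\hat{T}_u) = S^2\cdot v(\hat{p}_u)$, then

\[
\begin{aligned}
v(\hat{T}_u) ={}& \frac{N(N-n)}{n(n-1)}\sum_{i=1}^{n}\left(\hat{T}_i-\frac{\hat{T}_u}{N}\right)^2 \\
&+\frac{N}{n}\sum_{i=1}^{n}\frac{M_i(M_i-m_i)}{m_i(m_i-1)}\sum_{j=1}^{m_i}\left(\hat{T}_{ij}-\frac{\hat{T}_i}{M_i}\right)^2 \\
&+\frac{N}{n}\sum_{i=1}^{n}\frac{M_i}{m_i}\sum_{j=1}^{m_i}\frac{B_{ij}(B_{ij}-b_{ij})\,\hat{p}_{ij}(1-\hat{p}_{ij})}{b_{ij}-1}
\end{aligned}
\]

NOTE: The value of S is not needed here.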
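As a cross-check on Equations 1, 2, 5, and 6, the following Python sketch recomputes the Example 5 estimates from the DATA3ST.TXT values. It is an illustration under stated assumptions, not the RAT-STATS source: all variable and function names (`regions`, `variance_sums`, and so on) are made up here, S is treated as unknown (Case 1), and S-bar is replaced by the sample estimate s from Note (2).

```python
from math import sqrt

N = 12  # primary units (regions) in the universe

# region: (M_i, B_ij list, b_ij list, a_ij list); a_ij = sampled grants in error
regions = {
    5: (90, [47, 51, 45, 46, 46, 50, 50, 57, 54, 64],
        [9, 10, 9, 9, 9, 10, 10, 11, 11, 13],
        [3, 2, 4, 1, 3, 1, 4, 3, 4, 2]),
    7: (110, [53, 59, 52, 67, 59, 73, 51, 75, 66, 58],
        [11, 12, 10, 13, 12, 15, 10, 15, 13, 12],
        [2, 5, 1, 3, 1, 6, 3, 2, 1, 4]),
    8: (85, [45, 39, 43, 34, 54, 54, 34, 59, 49, 43],
        [9, 8, 9, 7, 11, 11, 7, 12, 10, 9],
        [3, 2, 4, 1, 2, 3, 1, 1, 4, 2]),
    10: (120, [59, 68, 57, 72, 70, 73, 83, 89, 73, 77],
         [12, 14, 11, 14, 14, 15, 17, 18, 15, 15],
         [2, 6, 3, 6, 1, 2, 5, 4, 3, 2]),
}
n = len(regions)  # primary units in the sample

units = []  # per region: (M_i, m_i, B_ij list, T_ij list, third-stage sum)
for M, B, b, a in regions.values():
    T_ij = [Bj * aj / bj for Bj, bj, aj in zip(B, b, a)]  # T_ij = B_ij * p_ij
    third = sum(Bj * (Bj - bj) * (aj / bj) * (1 - aj / bj) / (bj - 1)
                for Bj, bj, aj in zip(B, b, a))
    units.append((M, len(B), B, T_ij, third))

T_i = [M / m * sum(T) for M, m, B, T, _ in units]
B_i = [M / m * sum(B) for M, m, B, T, _ in units]
T_u = N / n * sum(T_i)   # Equation 5
B_u = N / n * sum(B_i)
p_r = T_u / B_u          # Equation 1 (equals R-hat)
s_bar = B_u / N          # sample estimate of S-bar, Note (2)


def variance_sums(R):
    """The three sums shared by Equations 2 and 6; R = 0 drops the ratio adjustment."""
    first = sum(((Ti - T_u / N) - R * (Bi - B_u / N)) ** 2
                for Ti, Bi in zip(T_i, B_i))
    second = third = 0.0
    for (M, m, B, T, third_i), Ti, Bi in zip(units, T_i, B_i):
        dev = sum(((Tj - Ti / M) - R * (Bj - Bi / M)) ** 2
                  for Tj, Bj in zip(T, B))
        second += M * (M - m) / (m * (m - 1)) * dev
        third += M / m * third_i
    return first, second, third


f1, f2, f3 = variance_sums(p_r)   # Equation 2
se_p_r = sqrt((N - n) / (n * (n - 1) * N * s_bar ** 2) * f1
              + (f2 + f3) / (n * N * s_bar ** 2))
g1, g2, g3 = variance_sums(0.0)   # Equation 6
se_T_u = sqrt(N * (N - n) / (n * (n - 1)) * g1 + N / n * (g2 + g3))

print(f"p_r = {p_r:.4f}, standard error = {se_p_r:.4f}")
print(f"T_u = {T_u:,.0f}, standard error = {se_T_u:,.0f}")
```

The printed values should agree closely with the program output for Example 5 (.2406, .0135, 17,210, and 2,415). Sharing one helper between the two variances mirrors the remark that Equation 4 follows from Equation 2 by setting R-hat to zero.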


RHC Two Stage

For a discussion on the motivation behind the RHC sampling procedure, refer to the RHC SAMPLE SELECTION section on page 1-11, contained in the RANDOM NUMBERS section of this manual. It provides a method of sample selection that allows sampling without replacement while “maintaining the flavor” of sampling using probability proportional to size. When the P.U.s are selected, the size of each P.U. is considered rather than obtaining a simple random sample of P.U.s. The size of each P.U. is rather arbitrary and can be the number of people, beds (for hospitals), dollars, and so forth. In general, you can expect improved precision using the RHC procedure if there is a high correlation between the size of each P.U. and the number of S.U.s within each P.U. In other words, P.U.s having a larger size should contain a larger number of S.U.s.

The P.U.s are selected using the RHC SAMPLE SELECTION program. A random sample is then obtained for each selected P.U. and the number of S.U.s having the attribute of interest (e.g., in error) is recorded.

Example 6. An audit was carried out for state-supported university grants in a particular region. The universe consisted of all charge vouchers recorded for these grants. Because these universities are so widespread, it was decided to employ a two-stage sample using three of the 27 state-supported universities. Rather than audit all the vouchers at a selected university, it was decided (based on available resources) to audit 250 vouchers at each selected university to estimate the proportion of vouchers containing improper charges. The universities (P.U.s) were to be selected using the RHC procedure where the “size” of each university was the total grant dollars awarded to that university. The following file (RHC2STAGE.TXT) was constructed:

Data file RHC2STAGE.TXT
(1)       (2)     (3)
UNIV1     14928     8
UNIV2     12454     4
UNIV3     17404    13
UNIV4     18700    16
UNIV5     15989    11
UNIV6     15046     9
UNIV7     16696    11
UNIV8     15754    10
UNIV9     17404    13
UNIV10    12100     4
UNIV11    17522    13
UNIV12    16578    11
UNIV13    12218     4
UNIV14    15164     9
UNIV15    12336     4
UNIV16    13986     7
UNIV17    12925     6
UNIV18    14457     9
UNIV19    18464    16
UNIV20    15400    10
UNIV21    15164     9
UNIV22    17522    13
UNIV23    15282     9
UNIV24    16461    11
UNIV25    13396     7
UNIV26    14222     7
UNIV27    14693     9

Columns: (1) unit ID (2) number of vouchers (3) size of university (dollar amount of grants x $10,000)

Using the RHC SAMPLE SELECTION program, the following output is produced:

Date: 10/22/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 14:29
                         OIG - OFFICE OF AUDIT SERVICES
                         GENERATION OF PRIMARY UNIT SAMPLE
NAME OF INPUT FILE: C:\TEMP\RHC2STAGE.txt

GROUPS OF PRIMARY UNITS

********* GROUP 1 *********
PRIMARY UNIT                     PRIMARY UNIT    SECONDARY
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV27                                 9             14,693
UNIV2                                  4             12,454
UNIV6                                  9             15,046
UNIV1                                  8             14,928
UNIV7      <-- Selected               11             16,696
UNIV21                                 9             15,164
UNIV4                                 16             18,700
UNIV5                                 11             15,989
UNIV16                                 7             13,986
GROUP TOTALS:    9                    84            137,656

********* GROUP 2 *********
PRIMARY UNIT                     PRIMARY UNIT    SECONDARY
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV19                                16             18,464
UNIV20     <-- Selected               10             15,400
UNIV14                                 9             15,164
UNIV26                                 7             14,222
UNIV15                                 4             12,336
UNIV18                                 9             14,457
UNIV24                                11             16,461
UNIV10                                 4             12,100
UNIV23                                 9             15,282
GROUP TOTALS:    9                    79            133,886

********* GROUP 3 *********
PRIMARY UNIT                     PRIMARY UNIT    SECONDARY
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV17                                 6             12,925
UNIV11                                13             17,522
UNIV12                                11             16,578
UNIV8      <-- Selected               10             15,754
UNIV3                                 13             17,404
UNIV9                                 13             17,404
UNIV22                                13             17,522
UNIV25                                 7             13,396
UNIV13                                 4             12,218
GROUP TOTALS:    9                    90            140,723

Date: 10/22/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 14:29
                         OIG - OFFICE OF AUDIT SERVICES
                         GENERATION OF PRIMARY UNIT SAMPLE
NAME OF OUTPUT FILE: C:\TEMP\OutRHCsummary.txt

FIRST SEED NUMBER:  100.00          SECOND SEED NUMBER:  200.00
NUMBER OF PRIMARY UNITS IN THE POPULATION:   27
NUMBER OF PRIMARY UNITS SAMPLED:              3

                            SECONDARY       PRIMARY         GROUP           UNITS IN
PRIMARY UNIT ID             UNIVERSE        UNIT SIZE       SIZE            GROUP
=========================   =============   =============   =============   =====
UNIV7                          16,696            11              84            9
UNIV20                         15,400            10              79            9
UNIV8                          15,754            10              90            9

The final portion of the preceding output was stored by the sample selection program in file RHC2PU.TXT. This file is shown below:

Primary unit file RHC2PU.TXT
UNIV7     16696   11   84   9
UNIV20    15400   10   79   9
UNIV8     15754   10   90   9

The selected universities are UNIV7, UNIV8, and UNIV20. A sample of 250 vouchers is obtained at each university with the following results:

              Number of           Number of
University    sampled vouchers    vouchers in error
UNIV7              250                  8
UNIV20             250                 12
UNIV8              250                  5

This information is recorded in the data file (RHC2DATA.TXT) required by the appraisal program (TWO-STAGE RHC) and is shown below:

1   250    8
2   250   12
3   250    5     <-- Data file RHC2DATA.TXT

Using these two files, the following output is generated by the TWO-STAGE RHC program:

Date: 10/22/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 14:40
                         OIG - OFFICE OF AUDIT SERVICES
                         TWO STAGE RHC ATTRIBUTE APPRAISAL
AUDIT/REVIEW: Example
NAME OF DATA FILE: C:\TEMP\RHC2DATA.TXT
NAME OF PRIMARY UNIT FILE: C:\TEMP\RHC2PU.TXT
OUTPUT FILE: C:\TEMP\OutRHC2attr.txt

PRIMARY    SAMPLE    == ATTRIBUTE ==
UNIT       SIZE      SAMPLE TOTAL
=======    ======    =============
1            250           8
2            250          12
3            250           5
TOTALS       750          25

P.U.                                SECONDARY       PRIMARY         GROUP           UNITS IN
NBR     PRIMARY UNIT ID             UNIVERSE        UNIT SIZE       SIZE            GROUP
====    =========================   =============   =============   =============   =====
1       UNIV7                          16,696            11              84            9
2       UNIV20                         15,400            10              79            9
3       UNIV8                          15,754            10              90            9
TOTALS:                                47,850            31             253           27

P.U.    SAMPLE    SECONDARY       SAMPLE            POINT
NBR     SIZE      UNIVERSE        MEAN              ESTIMATE
====    ======    =============   ==============    =============
1         250        16,696            .03              4,079.90
2         250        15,400            .05              5,839.68
3         250        15,754            .02              2,835.72
TOTALS:   750        47,850                            12,755.30

--- VARIANCE COMPONENTS ---
P.U.    SIZES       WITHIN           BETWEEN            TOTAL
NBR     RATIO       VARIANCE         VARIANCE           VARIANCE
====    ========    ============     ==============     ==============
1         7.636       260,846.68          32,187.19         293,033.87
2         7.900       338,251.15       4,907,290.25       5,245,541.41
3         9.000       173,034.95       3,618,097.64       3,791,132.59
TOTALS:               772,132.78       8,557,575.09       9,329,707.87

PRIMARY UNITS SAMPLED:                 3
PRIMARY UNITS NOT SAMPLED:            24
PRIMARY UNITS IN POPULATION:          27
PROJECTED QUANTITY IN UNIVERSE:   12,755
STANDARD ERROR:                    3,054

CONFIDENCE LEVEL       80 PERCENT        90 PERCENT        95 PERCENT
LOWER LIMIT                 8,841             7,731             6,769
UPPER LIMIT                16,670            17,779            18,742
PRECISION AMOUNT            3,914             5,024             5,987
PRECISION PERCENT          30.69%            39.39%            46.93%
Z-VALUE USED       1.281551565545    1.644853626951    1.959963984540

Final results: The point estimate of the total number of vouchers in error is 12,755 with a corresponding standard error of 3,054. The resulting 90% confidence interval is from 7,731 to 17,779 vouchers. The PRECISION PERCENT is 39.39%, obtained by multiplying the standard error by 1.644853626951 and dividing by the point estimate (expressed as a percentage), that is (100)(1.644853626951)(3054)/12755. Notice that this is a very wide confidence interval. In general, this can be reduced by sampling a larger number of P.U.s.

Discussion
For this example, $\hat{p}_1$ = 8/250 = .032, $\hat{p}_2$ = 12/250 = .048, and $\hat{p}_3$ = 5/250 = .020. Referring to the Formula section on the next page, the estimate for the total number of vouchers containing improper charges is:

$\hat{T}$ = (84/11)(16696)(.032) + (79/10)(15400)(.048) + (90/10)(15754)(.02)
  = 4,079.90 + 5,839.68 + 2,835.72 = 12,755 (rounded)

To determine the variance of $\hat{T}$, the first component of this variance is the “within variance” equal to V2 = 772,132.78. This accounts for the variation within the primary units (universities). The larger variance component is the variation between the primary units measured by V1 = 8,557,575.09. The total variance is V1 + V2 = 9,329,707.87 and the estimated standard error of $\hat{T}$ is $\sqrt{9,329,707.87}$ = 3,054.

The 95% confidence interval for T is 12,755 ± (1.959963984540)(3,054), that is 6,769 to 18,742.

FORMULAS

Definitions
1. P.U. stands for primary unit and S.U. is secondary unit
2. $A_i$ = size of i-th P.U.
3. $S_i$ = (size of i-th P.U.)/(size of entire population) = $A_i$/(size of entire population)
4. $B_i$ = total size for i-th group
5. $\pi_i$ = (total size for i-th group)/(size of entire population) = $B_i$/(size of entire population)
6. N = number of P.U.s in the population
7. $N_i$ = number of P.U.s in the i-th group
8. n = number of P.U.s in the sample
9. $M_i$ = number of S.U.s in the i-th sampled P.U. (population)
10. $m_i$ = number of S.U.s in the i-th sampled P.U. (sample)

Estimator of population total (T)

\[
\hat{T} = \sum_{i=1}^{n}\left(\frac{B_i}{A_i}\right)M_i\hat{p}_i
\]

where $\hat{p}_i$ = proportion of $m_i$ sampled S.U.s having the attribute of interest.

Estimated variance of $\hat{T}$

$v(\hat{T}) = V_1 + V_2$ where

\[
V_1 = \frac{\sum_{i=1}^{n}N_i^2 - N}{N^2-\sum_{i=1}^{n}N_i^2}\;\sum_{i=1}^{n}\pi_i\left(\frac{M_i\hat{p}_i}{S_i}-\hat{T}\right)^2
\]

and

\[
V_2 = \sum_{i=1}^{n}\frac{\pi_i}{S_i}\,M_i(M_i-m_i)\,\frac{\hat{p}_i(1-\hat{p}_i)}{m_i-1}
\]

NOTE: The estimated standard error of $\hat{T}$ is $\sqrt{v(\hat{T})}$.

Approximate 95% confidence interval for the population total (T)

\[
\hat{T} \pm 1.959963984540\,\sqrt{v(\hat{T})}
\]

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval, replace 1.959963984540 with 1.281551565545.
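The two-stage RHC formulas can be checked against Example 6 with a short Python sketch. It is an illustration only, not the TWO-STAGE RHC program: the variable names are hypothetical, and the "size of entire population" is taken to be 253, the sum of the three group sizes (which together cover all 27 universities).

```python
from math import sqrt

N = 27            # P.U.s in the population
TOTAL_SIZE = 253  # size of entire population = 84 + 79 + 90

# One row per sampled P.U.: (A_i, B_i, N_i, M_i, m_i, a_i)
# A_i = P.U. size, B_i = group size, N_i = units in group,
# M_i = vouchers in universe, m_i = vouchers sampled, a_i = vouchers in error
sample = [
    (11, 84, 9, 16696, 250, 8),    # UNIV7
    (10, 79, 9, 15400, 250, 12),   # UNIV20
    (10, 90, 9, 15754, 250, 5),    # UNIV8
]

# Point estimate: sum of (B_i / A_i) * M_i * p_i
T_hat = sum(B / A * M * a / m for A, B, Ni, M, m, a in sample)

# Between-P.U. component V1
sum_Ni_sq = sum(Ni ** 2 for _, _, Ni, _, _, _ in sample)
factor = (sum_Ni_sq - N) / (N ** 2 - sum_Ni_sq)
V1 = factor * sum(
    (B / TOTAL_SIZE) * (M * (a / m) / (A / TOTAL_SIZE) - T_hat) ** 2
    for A, B, Ni, M, m, a in sample
)

# Within-P.U. component V2 (pi_i / S_i reduces to B_i / A_i)
V2 = sum(
    (B / A) * M * (M - m) * (a / m) * (1 - a / m) / (m - 1)
    for A, B, Ni, M, m, a in sample
)

std_error = sqrt(V1 + V2)
print(f"T = {T_hat:,.2f}  V1 = {V1:,.2f}  V2 = {V2:,.2f}  SE = {std_error:,.0f}")
```

The results should match the VARIANCE COMPONENTS totals in the program output to within rounding. Note that V1 dominates here, which is consistent with the remark that sampling more P.U.s is the way to narrow the interval.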

RHC Three Stage

The RHC sampling procedure can be used for a three-stage design. The steps for such a procedure are the following:

1. A sample of primary units (clusters) is obtained as in the one- and two-stage procedures, where pps sampling is used for each group of primary units. The size of the primary units is considered for this sample.
2. A sample of secondary units is obtained within each chosen primary unit by partitioning the primary unit into random groups. The group sizes are chosen to be as nearly equal as possible. Using pps sampling and the size of each secondary unit, one secondary unit is chosen from each of the secondary groups.
3. A random sample of third-stage units is obtained for each of the chosen secondary units. This is a random sample. No attention is paid to “size” here.

Prior to running the appraisal program, the user must run the RAT-STATS RHC SAMPLE SELECTION program.

Example 7. The situation discussed in Example 5 in the THREE-STAGE UNRESTRICTED section will be appraised using the RHC methodology. For this example, the stages are:

Stage 1: REGION (select 4 out of 12 regions)
Stage 2: UNIVERSITY (select 10 from each selected region)
Stage 3: GRANT (select approximately 20% of all grants at each university)

Selection of Primary Units
A file must be constructed containing (for each region) (1) the number of secondary units (universities) in this region and (2) the size of this region (total dollars of grants). This file is GRANTSPU.TXT. The selected regions are 4, 6, 8, and 10 and the output file created by the RHC SAMPLE SELECTION program is GRANTSPUOUT.TXT.
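The group-then-select idea behind the first two steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the RHC SAMPLE SELECTION program: the function name `rhc_select`, the way leftover units are assigned to the last groups, and the use of Python's own random generator (rather than the program's two seed numbers) are all choices made for this sketch.

```python
import random


def rhc_select(units, n, seed=None):
    """Split units into n random groups of near-equal count, then pick one
    unit per group with probability proportional to its size.

    units: list of (unit_id, size) pairs.
    Returns one (unit_id, unit_size, group_size, units_in_group) per group.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                     # random assignment to groups
    base, extra = divmod(len(shuffled), n)
    selections, start = [], 0
    for g in range(n):
        # Assumption: any leftover units go to the last groups.
        count = base + (1 if g >= n - extra else 0)
        group = shuffled[start:start + count]
        start += count
        sizes = [size for _, size in group]
        unit_id, size = rng.choices(group, weights=sizes, k=1)[0]  # pps draw
        selections.append((unit_id, size, sum(sizes), len(group)))
    return selections


# Region sizes from GRANTSPU.TXT, column (3).
regions = [
    ("REGION1", 1250), ("REGION2", 610), ("REGION3", 720), ("REGION4", 1320),
    ("REGION5", 1160), ("REGION6", 1240), ("REGION7", 960), ("REGION8", 1300),
    ("REGION9", 1320), ("REGION10", 640), ("REGION11", 930), ("REGION12", 550),
]
picks = rhc_select(regions, 4, seed=2004)
for unit_id, size, group_size, units_in_group in picks:
    print(unit_id, size, group_size, units_in_group)
```

Each returned row carries the same four quantities the appraisal needs for a selected unit: its own size, the total size of its group, and the number of units in that group. The regions drawn will generally differ from those in the example because the random stream is different.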

Data set GRANTSPU.TXT
(1)         (2)    (3)
REGION1     117    1250
REGION2      63     610
REGION3      91     720
REGION4     123    1320
REGION5     107    1160
REGION6     116    1240
REGION7     102     960
REGION8     118    1300
REGION9     122    1320
REGION10     85     640
REGION11     94     930
REGION12     62     550

Columns: (1) unit ID (2) number of universities (S.U.s) (3) size (total grant dollar amount x $100,000)

NOTE: It is okay to set the number of S.U.s [column (2)] equal to one in this file. The actual number of S.U.s must be known for the selected P.U.s. The correct number of S.U.s must then be inserted into file GRANTSPUOUT.TXT (the highlighted values).

--- Data set GRANTSPUOUT.TXT ---
REGION6     116   1240   3100   3
REGION4     123   1320   3410   3
REGION8     118   1300   3170   3
REGION10     85    640   2320   3

Date: 10/22/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 15:09
                         OIG - OFFICE OF AUDIT SERVICES
                         GENERATION OF PRIMARY UNIT SAMPLE
NAME OF INPUT FILE: C:\TEMP\GRANTSPU.TXT

GROUPS OF PRIMARY UNITS

********* GROUP 1 *********
PRIMARY UNIT                     PRIMARY UNIT    SECONDARY
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
REGION2                               610               63
REGION6    <-- Selected             1,240              116
REGION1                             1,250              117
GROUP TOTALS:    3                  3,100              296

********* GROUP 2 *********
PRIMARY UNIT                     PRIMARY UNIT    SECONDARY
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
REGION4    <-- Selected             1,320              123
REGION5                             1,160              107
REGION11                              930               94
GROUP TOTALS:    3                  3,410              324

********* GROUP 3 *********
PRIMARY UNIT                     PRIMARY UNIT    SECONDARY
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
REGION12                              550               62
REGION8    <-- Selected             1,300              118
REGION9                             1,320              122
GROUP TOTALS:    3                  3,170              302

********* GROUP 4 *********
PRIMARY UNIT                     PRIMARY UNIT    SECONDARY
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
REGION3                               720               91
REGION7                               960              102
REGION10   <-- Selected               640               85
GROUP TOTALS:    3                  2,320              278

FIRST SEED NUMBER:  100.00          SECOND SEED NUMBER:  200.00
In practice, do not set these seed values.
NUMBER OF PRIMARY UNITS IN THE POPULATION:   12
NUMBER OF PRIMARY UNITS SAMPLED:              4

< Program output - continued >

                            SECONDARY       PRIMARY         GROUP           UNITS IN
PRIMARY UNIT ID             UNIVERSE        UNIT SIZE       SIZE            GROUP
=========================   =============   =============   =============   =====
REGION6                          116            1,240           3,100          3
REGION4                          123            1,320           3,410          3
REGION8                          118            1,300           3,170          3
REGION10                          85              640           2,320          3

NOTE: This is file GRANTSPUOUT.TXT

Selection of Secondary Units
The input for three-stage RHC program can be greatly simplified if you only obtain information for each selected primary unit (i.e., regions 4, 6, 8, and 10 here). The information consists of the size of each secondary unit (here, university) and the number of third-stage units in the universe for each secondary unit (it is acceptable to set these equal to one and change later). This input is shown in files REGION4.TXT, REGION6.TXT, REGION8.TXT, and REGION10.TXT. Each line in the files contains the number of third-stage units (grants) in the universe and the size of that secondary unit (total grant $ x $100,000). After each of these four files is the computer output using the RHC SAMPLE SELECTION program. A sample of 10 universities is selected for each region. The results are:

REGION    UNIVERSITIES
4         85, 46, 7, 82, 30, 34, 27, 66, 65, 80
6         113, 43, 78, 104, 89, 112, 30, 65, 3, 99
8         112, 6, ..., 99
10        78, ..., 39

The previous five program runs (one at the primary level and four at the secondary level) created five output files. Using a word processor or spreadsheet, these files can be joined to form one of the input files (the one containing primary/secondary unit information) for the three-stage RHC program which calculates the confidence interval. The file for this example is PUSURHC3.TXT.

Data set REGION4.TXT
(1)        (2)   (3)
UNIV1       52    11
UNIV2       37     9
UNIV3       38     9
UNIV4       20     5
UNIV5       69    15
UNIV6       69    15
UNIV7       77    17
UNIV8       32     7
UNIV9       49    10
UNIV10      73    15
UNIV11      21     5
UNIV12      62    13
UNIV13      55    11
UNIV14      59    12
UNIV15      55    11
UNIV16      36     8
UNIV17      51    11
UNIV18      26     7
UNIV19      25     6
UNIV20      73    15
UNIV21      71    15
UNIV22      47    10
UNIV23      34     8
UNIV24      25     6
UNIV25      39     9
UNIV26      49    10
UNIV27      76    16
UNIV28      21     5
UNIV29      33     8
UNIV30      54    11
UNIV31      45    10
UNIV32      74    16
UNIV33      69    14
UNIV34      50    10
UNIV35      29     7
UNIV36      56    12
UNIV37      64    14
UNIV38      66    14
UNIV39      63    14
UNIV40      57    12
UNIV41      71    15
UNIV42      45    10
UNIV43      21     5
UNIV44      46    10
UNIV45      48    10
UNIV46      44     9
UNIV47      71    15
UNIV48      67    14
UNIV49      23     6
UNIV50      54    11
UNIV51      62    13
UNIV52      52    11
UNIV53      56    11
UNIV54      70    15
UNIV55      41     9
UNIV56      65    14
UNIV57      76    16
UNIV58      30     7
UNIV59      75    16
UNIV60      27     7
UNIV61      36     8
UNIV62      61    13
UNIV63      58    12
UNIV64      61    13
UNIV65      62    14
UNIV66      76    16
UNIV67      71    15
UNIV68      34     8
UNIV69      62    13
UNIV70      23     6
UNIV71      28     7
UNIV72      46    10
UNIV73      62    14
UNIV74      67    14
UNIV75      25     6
UNIV76      24     6
UNIV77      57    12
UNIV78      44    10
UNIV79      73    16
UNIV80      70    15
UNIV81      45    10
UNIV82      52    11
UNIV83      34     8
UNIV84      59    12
UNIV85      54    11
UNIV86      31     7
UNIV87      69    14
UNIV88      22     6
UNIV89      47    10
UNIV90      57    12
UNIV91      31     7
UNIV92      73    15
UNIV93      52    11
UNIV94      22     6
UNIV95      22     6
UNIV96      29     7
UNIV97      56    12
UNIV98      74    16
UNIV99      43     9
UNIV100     57    12
UNIV101     34     8
UNIV102     28     7
UNIV103     73    15
UNIV104     65    14
UNIV105     68    14
UNIV106     28     7
UNIV107     55    11
UNIV108     37     9
UNIV109     54    11
UNIV110     47    10
UNIV111     44     9
UNIV112     24     6
UNIV113     50    10
UNIV114     52    11
UNIV115     66    14
UNIV116     50    10
UNIV117     66    14
UNIV118     34     8
UNIV119     73    16
UNIV120     37     8
UNIV121     42     9
UNIV122     59    12
UNIV123     45    11

NOTE: This file has 123 lines.
Columns: (1) unit ID (2) number of grants (3) size of university (grant amount x $100,000)

Date: 10/25/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 14:21
                         OIG - OFFICE OF AUDIT SERVICES
                         GENERATION OF SECONDARY UNIT SAMPLE
NAME OF INPUT FILE: C:\TEMP\REGION4.TXT

GROUPS OF SECONDARY UNITS

********* GROUP 1 *********
SECONDARY UNIT                   SECONDARY UNIT  3RD STAGE
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV57                                16               76
UNIV48                                14               67
UNIV35                                 7               29
UNIV107                               11               55
UNIV85     <-- Selected               11               54
UNIV103                               15               73
UNIV86                                 7               31
UNIV2                                  9               37
UNIV81                                10               45
UNIV58                                 7               30
UNIV36                                12               56
UNIV49                                 6               23
GROUP TOTALS:   12                   125              576

********* GROUP 2 *********
SECONDARY UNIT                   SECONDARY UNIT  3RD STAGE
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV52                                11               52
UNIV6                                 15               69
UNIV46     <-- Selected                9               44
UNIV69                                13               62
UNIV108                                9               37
UNIV44                                10               46
UNIV50                                11               54
UNIV121                                9               42
UNIV1                                 11               52
UNIV43                                 5               21
UNIV87                                14               69
UNIV39                                14               63
GROUP TOTALS:   12                   131              611

< GROUPS 3 THROUGH 9 ARE OMITTED HERE >

********* GROUP 10 *********
SECONDARY UNIT                   SECONDARY UNIT  3RD STAGE
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV53                                11               56
UNIV24                                 6               25
UNIV42                                10               45
UNIV120                                8               37
UNIV105                               14               68
UNIV97                                12               56
UNIV119                               16               73
UNIV32                                16               74
UNIV80     <-- Selected               15               70
UNIV96                                 7               29
UNIV13                                11               55
UNIV62                                13               61
UNIV59                                16               75
GROUP TOTALS:   13                   155              724

Date: 10/25/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 14:21
                         OIG - OFFICE OF AUDIT SERVICES
                         GENERATION OF SECONDARY UNIT SAMPLE
NAME OF OUTPUT FILE: C:\TEMP\OutRegion4.txt

FIRST SEED NUMBER:  100.00          SECOND SEED NUMBER:  200.00
NUMBER OF SECONDARY UNITS IN THE POPULATION:   123
NUMBER OF SECONDARY UNITS SAMPLED:              10

                            3RD STAGE       SECONDARY       GROUP           UNITS IN
SECONDARY UNIT ID           UNIVERSE        UNIT SIZE       SIZE            GROUP
=========================   =============   =============   =============   =====
UNIV85                            54             11              125          12
UNIV46                            44              9              131          12
UNIV7                             77             17              119          12
UNIV82                            52             11              129          12
UNIV30                            54             11              141          12
UNIV34                            50             10              140          12
UNIV27                            76             16              138          12
UNIV66                            76             16              128          13
UNIV65                            62             14              125          13
UNIV80                            70             15              155          13

Data set REGION6.TXT
UNIV1       56    10
UNIV2       27     5
UNIV3       56    11
UNIV4       23     5
UNIV5       72    13
UNIV6       24     5
UNIV7       61    11
UNIV8       65    12
UNIV9       68    13
UNIV10      40     8
UNIV11      64    12
UNIV12      66    13
UNIV13      80    14
UNIV14      53     9
UNIV15      36     7
UNIV16      53    10
UNIV17      47     9
UNIV18      73    14
UNIV19      41     8
UNIV20      58    11
UNIV21      45     9
UNIV22      43     8
UNIV23      56    10
UNIV24      35     7
UNIV25      34     7
UNIV26      65    13
UNIV27      78    14
UNIV28      35     7
UNIV29      31     6
UNIV30      58    11
UNIV31      29     6
UNIV32      76    14
UNIV33      57    10
UNIV34      42     8
UNIV35      69    13
UNIV36      58    11
UNIV37      31     6
UNIV38      33     6
UNIV39      40     8
UNIV40      51     9
UNIV41      60    11
UNIV42      78    14
UNIV43      39     7
UNIV44      46     9
UNIV45      58    11
UNIV46      59    11
UNIV47      53    10
UNIV48      57    10
UNIV49      28     6
UNIV50      63    12
UNIV51      31     6
UNIV52      60    11
UNIV53      30     6
UNIV54      30     6
UNIV55      40     8
UNIV56      26     5
UNIV57      24     5
UNIV58      44     8
UNIV59      67    13
UNIV60      56    10
UNIV61      33     7
UNIV62      40     8
UNIV63      68    13
UNIV64      70    13
UNIV65      57    10
UNIV66      40     7
UNIV67      54    10
UNIV68      65    12
UNIV69      62    12
UNIV70      28     5
UNIV71      56    10
UNIV72      41     8
UNIV73      31     6
UNIV74      31     6
UNIV75      46     9
UNIV76      38     7
UNIV77      62    12
UNIV78      63    12
UNIV79      50     9
UNIV80      53     9
UNIV81      39     7
UNIV82      39     7
UNIV83      39     7
UNIV84      25     5
UNIV85      67    13
UNIV86      47     9
UNIV87      54    10
UNIV88      50     9
UNIV89      35     7
UNIV90      66    13
UNIV91      65    12
UNIV92      71    13
UNIV93      29     6
UNIV94      74    14
UNIV95      66    13
UNIV96      71    13
UNIV97      43     8
UNIV98      62    11
UNIV99      80    14
UNIV100     57    11
UNIV101     22     5
UNIV102     33     6
UNIV103     78     5
UNIV104     25     9
UNIV105     76     8
UNIV106     39     5
UNIV107     48     5
UNIV108     54     7
UNIV109     63    12
UNIV110     28     8
UNIV111     69     8
UNIV112     27    10
UNIV113     33     8
UNIV114     52     7
UNIV115     33    15
UNIV116     23    10

NOTE: This file has 116 lines.

Date: 10/25/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 13:57
                         OIG - OFFICE OF AUDIT SERVICES
                         GENERATION OF SECONDARY UNIT SAMPLE
NAME OF INPUT FILE: C:\TEMP\REGION6.txt

GROUPS OF SECONDARY UNITS

********* GROUP 1 *********
SECONDARY UNIT                   SECONDARY UNIT  3RD STAGE
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV52                                11               60
UNIV45                                11               58
UNIV32                                14               76
UNIV86                                 9               47
UNIV113    <-- Selected                8               33
UNIV85                                13               67
UNIV109                               12               63
UNIV87                                10               54
UNIV2                                  5               27
UNIV80                                 9               53
UNIV53                                 6               30
GROUP TOTALS:   11                   108              568

********* GROUP 2 *********
SECONDARY UNIT                   SECONDARY UNIT  3RD STAGE
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV33                                10               57
UNIV48                                10               57
UNIV6                                  5               24
UNIV43     <-- Selected                7               39
UNIV68                                12               65
UNIV41                                11               60
UNIV46                                11               59
UNIV1                                 10               56
UNIV40                                 9               51
UNIV88                                 9               50
UNIV36                                11               58
GROUP TOTALS:   11                   105              576

< GROUPS 3 THROUGH 9 ARE OMITTED HERE >

********* GROUP 10 *********
SECONDARY UNIT                   SECONDARY UNIT  3RD STAGE
IDENTIFICATION                   SIZE            UNIVERSE
==============================   =============   =============
UNIV20                                11               58
UNIV22                                 8               43
UNIV39                                 8               40
UNIV111                                8               69
UNIV100                               11               57
UNIV29                                 6               31
UNIV105                                8               76
UNIV79                                 9               50
UNIV99     <-- Selected               14               80
UNIV13                                14               80
UNIV60                                10               56
UNIV54                                 6               30
GROUP TOTALS:   12                   113              670

Date: 10/25/2004         DEPARTMENT OF HEALTH & HUMAN SERVICES          Time: 13:57
                         OIG - OFFICE OF AUDIT SERVICES
                         GENERATION OF SECONDARY UNIT SAMPLE
NAME OF OUTPUT FILE: C:\TEMP\OutRegion6.txt

FIRST SEED NUMBER:  100.00          SECOND SEED NUMBER:  200.00
NUMBER OF SECONDARY UNITS IN THE POPULATION:   116
NUMBER OF SECONDARY UNITS SAMPLED:              10

                            3RD STAGE       SECONDARY       GROUP           UNITS IN
SECONDARY UNIT ID           UNIVERSE        UNIT SIZE       SIZE            GROUP
=========================   =============   =============   =============   =====
UNIV113                           33              8              108          11
UNIV43                            39              7              105          11
UNIV78                            63             12              104          11
UNIV104                           25              9               96          11
UNIV89                            35              7              124          12
UNIV112                           27             10              108          12
UNIV30                            58             11               95          12
UNIV65                            57             10              109          12
UNIV3                             56             11              115          12
UNIV99                            80             14              113          12

10/2004) .continued ....continued .TXT UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 UNIV11 UNIV12 UNIV13 UNIV14 UNIV15 UNIV16 UNIV17 UNIV18 UNIV19 UNIV20 UNIV21 UNIV22 UNIV23 UNIV24 UNIV25 UNIV26 UNIV27 UNIV28 UNIV29 UNIV30 UNIV31 UNIV32 UNIV33 UNIV34 UNIV35 UNIV36 UNIV37 UNIV38 UNIV39 UNIV40 UNIV41 UNIV42 UNIV43 UNIV44 UNIV45 UNIV46 UNIV47 UNIV48 UNIV49 UNIV50 72 44 43 55 27 34 51 42 54 25 82 65 33 48 32 82 35 54 34 62 26 31 58 61 61 54 53 56 57 26 25 37 79 60 57 27 31 75 26 36 36 49 83 71 31 42 62 54 31 80 15 10 10 12 7 8 11 10 12 6 17 14 8 10 8 17 8 12 8 14 6 7 13 13 14 12 11 12 12 6 5 9 16 13 12 7 7 15 6 9 9 10 17 15 7 10 14 11 7 16 < ....> UNIV101 UNIV102 UNIV103 UNIV104 UNIV105 UNIV106 UNIV107 UNIV108 UNIV109 UNIV110 UNIV111 UNIV112 UNIV113 UNIV114 UNIV115 UNIV116 UNIV117 UNIV118 24 26 40 77 27 65 61 36 26 38 84 75 26 45 59 59 57 58 5 6 10 16 6 15 13 9 6 9 17 16 6 10 13 13 12 12 NOTE: This file has 118 lines..RHC Three Stage Attribute Appraisal RAT-STATS Companion Manual Data set REGION8. Page 2-46 (Rev..> UNIV51 UNIV52 UNIV53 UNIV54 UNIV55 UNIV56 UNIV57 UNIV58 UNIV59 UNIV60 UNIV61 UNIV62 UNIV63 UNIV64 UNIV65 UNIV66 UNIV67 UNIV68 UNIV69 UNIV70 UNIV71 UNIV72 UNIV73 UNIV74 UNIV75 UNIV76 UNIV77 UNIV78 UNIV79 UNIV80 UNIV81 UNIV82 UNIV83 UNIV84 UNIV85 UNIV86 UNIV87 UNIV88 UNIV89 UNIV90 UNIV91 UNIV92 UNIV93 UNIV94 UNIV95 UNIV96 UNIV97 UNIV98 UNIV99 UNIV100 77 36 75 68 34 55 42 36 36 66 61 64 72 65 58 49 30 75 33 65 55 38 36 60 52 65 49 27 48 36 66 62 70 68 53 38 35 36 26 26 51 25 54 56 81 73 44 50 60 31 16 9 16 15 8 12 10 9 9 15 13 14 15 14 13 11 7 16 8 14 12 9 9 13 11 14 10 7 10 9 15 14 15 15 11 9 8 9 6 6 11 5 11 12 17 15 10 11 13 7 < .

TXT GROUPS OF SECONDARY UNITS Time: 14:03 ********* GROUP 1 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV54 15 UNIV46 10 UNIV33 16 UNIV86 9 UNIV112 <-.Selected 8 UNIV44 15 UNIV68 16 UNIV42 10 UNIV48 11 UNIV1 15 UNIV41 9 UNIV89 6 UNIV37 7 GROUP TOTALS: 11 127 < GROUPS 3 THROUGH 9 ARE OMITTED HERE > ********* GROUP 10 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV21 6 UNIV23 13 UNIV40 9 UNIV110 9 UNIV100 7 UNIV30 6 UNIV104 16 UNIV81 15 UNIV99 <-.RAT-STATS Companion Manual RHC Three Stage Attribute Appraisal Date: 10/25/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .Selected 13 UNIV13 8 UNIV60 15 3RD STAGE UNIVERSE ============= 26 58 36 38 31 26 77 66 60 33 66 (Rev.Selected 16 UNIV85 11 UNIV108 9 UNIV87 8 UNIV2 10 UNIV55 8 UNIV34 13 GROUP TOTALS: 11 125 3RD STAGE UNIVERSE ============= 68 42 79 38 75 53 36 35 44 34 60 564 3RD STAGE UNIVERSE ============= 62 80 34 71 75 49 54 72 36 26 31 590 ********* GROUP 2 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV47 14 UNIV50 16 UNIV6 <-.OFFICE OF AUDIT SERVICES GENERATION OF SECONDARY UNIT SAMPLE NAME OF INPUT FILE: C:\TEMP\REGION8. 10/2004) Page 2-47 .

SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV56                                     12               55
GROUP TOTALS:  12                         129              572

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 14:03

NAME OF OUTPUT FILE: C:\TEMP\OutRegion8.txt
FIRST SEED NUMBER:  100.00
SECOND SEED NUMBER: 200.00
NUMBER OF SECONDARY UNITS IN THE POPULATION: 118
NUMBER OF SECONDARY UNITS SAMPLED:            10

SECONDARY UNIT ID   3RD STAGE   SECONDARY   GROUP   UNITS IN
                    UNIVERSE    UNIT SIZE   SIZE    GROUP
=================   =========   =========   =====   ========
UNIV112                    75          16     125         11
UNIV6                      34           8     127         11
UNIV7                      51          11     120         12
UNIV93                     54          11     136         12
UNIV75                     52          11     126         12
UNIV111                    84          17     134         12
UNIV62                     64          14     123         12
UNIV115                    59          13     137         12
UNIV70                     65          14     143         12
UNIV99                     60          13     129         12

10/2004) Page 2-49 . (Rev.TXT UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 UNIV11 UNIV12 UNIV13 UNIV14 UNIV15 UNIV16 UNIV17 UNIV18 UNIV19 UNIV20 UNIV21 UNIV22 UNIV23 UNIV24 UNIV25 UNIV26 UNIV27 UNIV28 UNIV29 UNIV30 UNIV31 UNIV32 UNIV33 UNIV34 UNIV35 UNIV36 UNIV37 UNIV38 UNIV39 UNIV40 UNIV41 UNIV42 UNIV43 UNIV44 UNIV45 34 32 69 23 60 72 56 28 38 60 58 37 70 37 81 53 63 32 33 37 77 52 63 41 45 34 61 70 34 22 66 69 65 26 43 65 80 74 38 43 47 59 42 54 73 6 5 10 4 9 11 9 5 6 9 9 6 10 6 12 9 10 5 5 6 11 8 10 7 8 6 10 10 5 4 10 10 10 4 7 10 12 11 6 7 8 9 7 9 11 <--continued --> UNIV46 UNIV47 UNIV48 UNIV49 UNIV50 UNIV51 UNIV52 UNIV53 UNIV54 UNIV55 UNIV56 UNIV57 UNIV58 UNIV59 UNIV60 UNIV61 UNIV62 UNIV63 UNIV64 UNIV65 UNIV66 UNIV67 UNIV68 UNIV69 UNIV70 UNIV71 UNIV72 UNIV73 UNIV74 UNIV75 UNIV76 UNIV77 UNIV78 UNIV79 UNIV80 UNIV81 UNIV82 UNIV83 UNIV84 UNIV85 78 72 30 47 52 24 26 22 57 78 62 57 68 52 54 41 61 79 50 54 53 40 44 39 72 76 34 27 40 41 25 41 39 58 71 37 30 78 59 29 12 11 5 8 8 4 4 4 9 12 10 9 10 8 9 7 10 12 8 9 9 7 7 7 11 11 5 5 7 7 4 7 7 9 11 6 5 12 9 5 Note: This file has 85 lines.RAT-STATS Companion Manual RHC Three Stage Attribute Appraisal Data set REGION10.

Selected 7 UNIV2 5 UNIV50 8 UNIV34 4 UNIV46 12 GROUP TOTALS: 8 62 3RD STAGE UNIVERSE ============= 54 69 41 39 32 52 26 78 391 3RD STAGE UNIVERSE ============= 72 42 61 47 34 43 58 65 422 ********* GROUP 2 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV6 11 UNIV43 <-.RHC Three Stage Attribute Appraisal RAT-STATS Companion Manual Date: 10/25/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .Selected 7 UNIV62 10 UNIV41 8 UNIV1 6 UNIV40 7 UNIV79 9 UNIV36 10 GROUP TOTALS: 8 68 < GROUPS 3 THROUGH 9 ARE OMITTED HERE > ********* GROUP 10 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV71 11 UNIV9 6 UNIV21 11 UNIV23 10 UNIV39 <-.TXT GROUPS OF SECONDARY UNITS Time: 13:49 ********* GROUP 1 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV44 9 UNIV32 10 UNIV77 7 UNIV78 <-. 10/2004) .OFFICE OF AUDIT SERVICES GENERATION OF SECONDARY UNIT SAMPLE NAME OF INPUT FILE: C:\TEMP\REGION10.Selected 6 UNIV29 5 UNIV72 5 UNIV13 10 UNIV51 4 GROUP TOTALS: 9 68 3RD STAGE UNIVERSE ============= 76 38 77 63 38 34 34 70 24 454 Page 2-50 (Rev.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 13:49

NAME OF OUTPUT FILE: C:\TEMP\OutRegion10.txt
FIRST SEED NUMBER:  100.00
SECOND SEED NUMBER: 200.00
NUMBER OF SECONDARY UNITS IN THE POPULATION:  85
NUMBER OF SECONDARY UNITS SAMPLED:            10

SECONDARY UNIT ID   3RD STAGE   SECONDARY   GROUP   UNITS IN
                    UNIVERSE    UNIT SIZE   SIZE    GROUP
=================   =========   =========   =====   ========
UNIV78                     39           7      62          8
UNIV43                     42           7      68          8
UNIV7                      56           9      54          8
UNIV73                     27           5      63          8
UNIV55                     78          12      70          8
UNIV33                     65          10      77          9
UNIV10                     60           9      76          9
UNIV59                     52           8      71          9
UNIV64                     50           8      73          9
UNIV39                     38           6      68          9

--- Data set PUSURHC3.TXT ---

REGION4     123   1320   3410     3   10
UNIV85       54     11    125    12
UNIV46       44      9    131    12
UNIV7        77     17    119    12
UNIV82       52     11    129    12
UNIV30       54     11    141    12
UNIV34       50     10    140    12
UNIV27       76     16    138    12
UNIV66       76     16    128    13
UNIV65       62     14    125    13
UNIV80       70     15    155    13
REGION6     116   1240   3100     3   10
UNIV113      33      8    108    11
UNIV43       39      7    105    11
UNIV78       63     12    104    11
UNIV104      25      9     96    11
UNIV89       35      7    124    12
UNIV112      27     10    108    12
UNIV30       58     11     95    12
UNIV65       57     10    109    12
UNIV3        56     11    115    12
UNIV99       80     14    113    12
REGION8     118   1300   3170     3   10
UNIV112      75     16    125    11
UNIV6        34      8    127    11
UNIV7        51     11    120    12
UNIV93       54     11    136    12
UNIV75       52     11    126    12
UNIV111      84     17    134    12
UNIV62       64     14    123    12
UNIV115      59     13    137    12
UNIV70       65     14    143    12
UNIV99       60     13    129    12
REGION10     85    640   2320     3   10
UNIV78       39      7     62     8
UNIV43       42      7     68     8
UNIV7        56      9     54     8
UNIV73       27      5     63     8
UNIV55       78     12     70     8
UNIV33       65     10     77     9
UNIV10       60      9     76     9
UNIV59       52      8     71     9
UNIV64       50      8     73     9
UNIV39       38      6     68     9

NOTE: This is the data file constructed using the RHC SAMPLE SELECTION program to select the primary units (regions) and, within each selected primary unit, the 10 secondary units (universities). The four lines beginning with REGIONx are from the output file created during the primary unit selection (GRANTSPUOUT.TXT). A value of 10 (the number of selected universities for that region) is added to the end of each of these lines. The 10 lines after each REGIONx line consist of the output file created when selecting the universities from each region (OUTREGION4.TXT, ..., OUTREGION10.TXT).

Selection of Third-Stage Units

Since approximately 20% of the grants at each university in the sample are to be audited, the following sample sizes are determined:

Region 4:
University    Grants in universe    Number to be audited
UNIV85                54                     11
UNIV46                44                      9
UNIV7                 77                     15
UNIV82                52                     10
UNIV30                54                     11
UNIV34                50                     10
UNIV27                76                     15
UNIV66                76                     15
UNIV65                62                     12
UNIV80                70                     14
                                            122

Region 6:
University    Grants in universe    Number to be audited
UNIV113               33                      7
UNIV43                39                      8
UNIV78                63                     13
UNIV104               25                      5
UNIV89                35                      7
UNIV112               27                      5
UNIV30                58                     12
UNIV65                57                     11
UNIV3                 56                     11
UNIV99                80                     16
                                             95

Region 8:
University    Grants in universe    Number to be audited
UNIV112               75                     15
UNIV6                 34                      7
UNIV7                 51                     10
UNIV93                54                     11
UNIV75                52                     10
UNIV111               84                     17
UNIV62                64                     13
UNIV115               59                     12
UNIV70                65                     13
UNIV99                60                     12
                                            120

Region 10:
University    Grants in universe    Number to be audited
UNIV78                39                      8
UNIV43                42                      8
UNIV7                 56                     11
UNIV73                27                      5
UNIV55                78                     16
UNIV33                65                     13
UNIV10                60                     12
UNIV59                52                     10
UNIV64                50                     10
UNIV39                38                      8
                                            101
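The sample sizes above can be checked with a short script. This is only a sketch: the manual says "approximately 20%" without stating a rounding rule, so the helper `audit_size` (a hypothetical name) assumes rounding 20% of the universe to the nearest whole grant, which happens to reproduce every value in the four tables.

```python
# Grants in the universe for each sampled university, copied from the tables above.
GRANTS = {
    "Region 4": [54, 44, 77, 52, 54, 50, 76, 76, 62, 70],
    "Region 6": [33, 39, 63, 25, 35, 27, 58, 57, 56, 80],
    "Region 8": [75, 34, 51, 54, 52, 84, 64, 59, 65, 60],
    "Region 10": [39, 42, 56, 27, 78, 65, 60, 52, 50, 38],
}


def audit_size(grants, rate_percent=20):
    """Round grants * rate_percent / 100 to the nearest integer (half rounds up).

    Assumption: the rounding rule is inferred from the tables, not stated in the manual.
    Integer arithmetic avoids floating-point surprises.
    """
    return (2 * grants * rate_percent + 100) // 200


# Sample sizes per region, then the regional and overall totals.
SAMPLE_SIZES = {region: [audit_size(g) for g in grants] for region, grants in GRANTS.items()}
REGION_TOTALS = {region: sum(sizes) for region, sizes in SAMPLE_SIZES.items()}
TOTAL_AUDITED = sum(REGION_TOTALS.values())

for region, total in REGION_TOTALS.items():
    print(region, SAMPLE_SIZES[region], total)
print("Total grants audited:", TOTAL_AUDITED)
```

The overall total is the number of audited grants whose results are recorded in RHC3DATA.TXT.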

The data file containing the errors for these 438 audited grants is RHC3DATA.TXT.

File RHC3DATA.TXT

1.1    11   2          3.1    15   1
1.2     9   4          3.2     7   3
1.3    15   3          3.3    10   2
1.4    10   2          3.4    11   4
1.5    11   5          3.5    10   3
1.6    10   2          3.6    17   6
1.7    15   2          3.7    13   0
1.8    15   4          3.8    12   1
1.9    12   1          3.9    13   2
1.10   14   3          3.10   12   3
2.1     7   2          4.1     8   5
2.2     8   4          4.2     8   1
2.3    13   5          4.3    11   5
2.4     5   1          4.4     5   3
2.5     7   3          4.5    16   3   (*)
2.6     5   2          4.6    13   4
2.7    12   3          4.7    12   0
2.8    11   2          4.8    10   2
2.9    11   4          4.9    10   3
2.10   16   3          4.10    8   3

Note: This file has 40 lines. Each line contains (1) a counter, (2) the number of sampled (audited) third-stage units (grants), and (3) the number of grants containing improper charges (in error).

(*) To illustrate, the fifth university in the fourth sampled P.U. (Region 10) had 16 grants (third-stage units) audited and three of them contained improper charges.

Finally, the three-stage RHC program is run to generate a confidence interval for the universe total using input files PUSURHC3.TXT and RHC3DATA.TXT. The output from this program is shown at the end of this section.

Summary of results

The estimate for the number of grants in error for the universe (all 12 regions) is the OVERALL POINT ESTIMATE of 15,861 with a corresponding estimated OVERALL STANDARD ERROR of 2,039 grants. The 95% confidence interval for the number of grants in error is 11,865 to 19,857.

NOTE: This estimate does not require knowing the number of grants in the universe. If this value is known, you can convert the point estimate into a proportion. For example, if the total number of grants in all 12 regions is 59,200, then the point estimate for the proportion of grants in error is 15,861/59,200 = .268 (26.8%) with a corresponding standard error of 2,039/59,200 = .034 (3.4%).

The program also provides estimates for the number of grants in error for each sampled P.U. (region) and for each of the groups of S.U.s (universities) within each sampled region. To illustrate, the estimated number of grants in error for Region 4 is 1,531 and the estimated number of grants in error for the group of 12 universities containing UNIV85 is 112.

The SIZES RATIO refers to the ratio of the size of the group containing this university to the size of this university. For example, UNIV85 in Region 4 has a size of 11 and is in a group of size 125 (look at file REGION4.TXT). The SIZES RATIO here is 125/11 = 11.3636.

The PRECISION AMOUNT is the amount added and subtracted to the point estimate (15,861) to obtain the corresponding confidence interval. In the 95% confidence interval, the lower limit of 11,865 is obtained by subtracting the precision amount of 3,996 from 15,861. If the total number of third-stage units in the universe is known (say, 59,200), this interval can be converted into an interval for the proportion of grants in error by dividing both limits by this value (here, .200 to .335).

The PRECISION PERCENT is the precision amount divided by the point estimate, expressed as a percentage.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
THREE STAGE RHC ATTRIBUTE APPRAISAL
AUDIT/REVIEW: RHC 3-Stage
Date: 10/26/2004     Time: 10:04

DATA FILE USED: C:\TEMP\RHC3DATA.txt
PRIMARY/SECONDARY UNIVERSE FILE USED: C:\TEMP\PUSURHC3.txt
OUTPUT FILE: C:\TEMP\OutRHC3.txt

**** SAMPLED UNITS ****                            *** ATTRIBUTE ***
PRIMARY / SECONDARY                 THIRD STAGE   SAMPLE   NO. WITH
IDENTIFICATION                      UNIVERSE      SIZE     ATTRIBUTE
==================================  ===========   ======   ============
REGION4
  UNIV85                                     54       11       2
  UNIV46                                     44        9       4
  UNIV7                                      77       15       3
  UNIV82                                     52       10       2
  UNIV30                                     54       11       5
  UNIV34                                     50       10       2
  UNIV27                                     76       15       2
  UNIV66                                     76       15       4
  UNIV65                                     62       12       1
  UNIV80                                     70       14       3
REGION6
  UNIV113                                    33        7       2
  UNIV43                                     39        8       4
  UNIV78                                     63       13       5
  UNIV104                                    25        5       1
  UNIV89                                     35        7       3
  UNIV112                                    27        5       2
  UNIV30                                     58       12       3
  UNIV65                                     57       11       2
  UNIV3                                      56       11       4
  UNIV99                                     80       16       3
REGION8
  UNIV112                                    75       15       1
  UNIV6                                      34        7       3
  UNIV7                                      51       10       2
  UNIV93                                     54       11       4
  UNIV75                                     52       10       3
  UNIV111                                    84       17       6
  UNIV62                                     64       13       0
  UNIV115                                    59       12       1
  UNIV70                                     65       13       2
  UNIV99                                     60       12       3
REGION10
  UNIV78                                     39        8       5
  UNIV43                                     42        8       1
  UNIV7                                      56       11       5
  UNIV73                                     27        5       3
  UNIV55                                     78       16       3
  UNIV33                                     65       13       4
  UNIV10                                     60       12       0
  UNIV59                                     52       10       2

13 8.25 0.15 0.25 7.36 0. 0.20 11.07 0.45 0.60 0.43 0.27 8.6364 10.44 14.0000 UNIV82 This group contained 12 0.8125 15.20 14.9231 0.19 POINT ESTIMATE ============= 112 285 108 122 315 140 87 162 46 155 Estimate for Region 4 º 1.7273 UNIV30 universities shown earlier 0.3636 11.7143 6.8182 UNIV34 in the output using 0.6250 UNIV66 0. 10/2004) Page 2-57 .9091 12.3333 TOTAL REGION6 UNIV113 UNIV43 UNIV78 UNIV104 UNIV89 UNIV112 UNIV30 UNIV65 UNIV3 UNIV99 TOTAL REGION8 UNIV112 UNIV6 UNIV7 UNIV93 UNIV75 UNIV111 UNIV62 UNIV115 UNIV70 UNIV99 TOTAL REGION10 UNIV78 UNIV43 UNIV7 UNIV73 UNIV55 UNIV33 0.50 0.8333 7.43 0.31 8.7857 10.20 0.0000 8.45 12.9286 UNIV80 0.RAT-STATS Companion Manual RHC Three Stage Attribute Appraisal UNIV64 UNIV39 TOTALS 50 38 2.5000 15.18 11.40 0.0000 12.8571 9.29 0.638 39 231 111 243 179 234 0 52 102 149 1.270 10 8 438 3 3 111 --.20 0.20 7.7143 10.340 216 51 153 204 85 154 (Rev.08 8.3636 0.TXT.6000 5.7000 0.5556 UNIV46 for the group containing UNIV7 UNIV85 (not just UNIV85).4545 7.531 13.POINT ESTIMATES --*** ATTRIBUTE *** **** SAMPLED UNITS **** SIZES PRIMARY / SECONDARY IDENTIFICATION SAMPLE MEAN RATIO ================================== =========== =========== REGION4 UNIV85 NOTE: 112 is the estimate 0.6667 10.08 0.8824 8.63 0.36 0.0714 127 293 210 53 266 117 125 113 213 121 1.9000 10.0000 UNIV65 0.4545 8. 0.38 0.30 0.5385 10.2143 9.35 0.00 0.18 0.13 0.19 0.0000 UNIV27 data set REGION4.8000 8.8750 10.21 10.6667 17.

102 3.508 19.039 90 PERCENT 12.3333 0 92 137 162 1.RHC Three Stage Attribute Appraisal RAT-STATS Companion Manual UNIV10 UNIV59 UNIV64 UNIV39 TOTAL 0.254 --.SUMMARY OF APPRAISAL RESULTS --PRIMARY UNITS SAMPLED 4 PRIMARY UNITS NOT SAMPLED 8 TOTAL PRIMARY UNITS 12 PROJECTED QUANTITY IN UNIVERSE STANDARD ERROR CONFIDENCE LEVEL LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED 80 PERCENT 13.385 72.157.19% 1.14% 1.466.857 3.38 8.474 2.996 25.212 TOTAL VARIANCE ======================= 4.COMBINED VARIANCE COMPONENTS --STAGE 1 ======================= 3.VARIANCE COMPONENTS FOR PRIMARY UNITS --**** SAMPLED UNITS **** PRIMARY UNIT IDENTIFICATION ============================== REGION4 REGION6 REGION8 REGION10 WITHIN VARIANCE ============== 4.687 64.865 19.722 4.959963984540 Page 2-58 (Rev.651 68.734 4.215 3.285 (Values of V4) (Values of V3) --.272 51.861 2.169 47.47% 1. 10/2004) .4444 8.281551565545 15.30 0.1250 11.104 (Value of V1) (Value of V2) *** ATTRIBUTE *** --.934 TOTAL VARIANCE ============== 64.613 16.644853626951 95 PERCENT 11.892 STAGES 2 AND 3 ======================= 690.20 0.351 BETWEEN VARIANCE ============== 59.8750 9.354 21.965 59.00 0.248 18.

FORMULAS

Definitions

1. $N$ = number of P.U.s (population)
2. $n$ = number of P.U.s (sample)
3. $S_i$ = (size of i-th P.U.)/(size of entire population)
4. $S_{ij}$ = (size of j-th S.U. in the i-th sampled P.U.)/(size of i-th sampled P.U.) (Note: denominator of $S_{ij}$ = numerator of $S_i$)
5. $\pi_i = \sum S_i$ over the i-th group of P.U.s
6. $\pi_{ij} = \sum S_{ij}$ over the j-th group in i-th sampled P.U.
7. $M_i$ = number of S.U.s in i-th sampled P.U. (population)
8. $m_i$ = number of S.U.s in i-th sampled P.U. (sample)
9. $K_{ij}$ = number of records for j-th sampled S.U. in i-th sampled P.U. (population)
10. $k_{ij}$ = number of records for j-th sampled S.U. in i-th sampled P.U. (sample)

Estimator of population total (T)

$$\hat{T} = \sum_{i=1}^{n} \pi_i \left( \frac{\hat{T}_i}{S_i} \right) \qquad \text{(equation 1)}$$

where $\hat{T}_i$ = estimator of total for i-th sampled P.U.,

$$\hat{T}_i = \sum_{j=1}^{m_i} \pi_{ij} \left( \frac{\hat{T}_{ij}}{S_{ij}} \right)$$

where $\hat{T}_{ij}$ = estimator of population total for j-th sampled S.U. in i-th sampled P.U., and

$$\hat{T}_{ij} = K_{ij}\, \hat{p}_{ij}$$

with $\hat{p}_{ij}$ = proportion of $k_{ij}$ records having the attribute of interest.
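The estimator can be traced through the example data. The sketch below is illustrative only and is not the RAT-STATS program itself: it assumes the values transcribed from PUSURHC3.TXT and RHC3DATA.TXT, and the names `REGIONS`, `region_total`, and `universe_total` are hypothetical. Because $\pi_{ij}$ and $S_{ij}$ share the same denominator, $\pi_{ij}/S_{ij}$ reduces to (group size)/(unit size), which is the SIZES RATIO; the same holds for $\pi_i/S_i$.

```python
# Region entry: (size of region, size of the region's random group, rows).
# Row: (K = grants in universe, k = grants audited, a = grants in error,
#       size of university, size of the university's random group).
REGIONS = {
    "REGION4": (1320, 3410, [
        (54, 11, 2, 11, 125), (44, 9, 4, 9, 131), (77, 15, 3, 17, 119),
        (52, 10, 2, 11, 129), (54, 11, 5, 11, 141), (50, 10, 2, 10, 140),
        (76, 15, 2, 16, 138), (76, 15, 4, 16, 128), (62, 12, 1, 14, 125),
        (70, 14, 3, 15, 155)]),
    "REGION6": (1240, 3100, [
        (33, 7, 2, 8, 108), (39, 8, 4, 7, 105), (63, 13, 5, 12, 104),
        (25, 5, 1, 9, 96), (35, 7, 3, 7, 124), (27, 5, 2, 10, 108),
        (58, 12, 3, 11, 95), (57, 11, 2, 10, 109), (56, 11, 4, 11, 115),
        (80, 16, 3, 14, 113)]),
    "REGION8": (1300, 3170, [
        (75, 15, 1, 16, 125), (34, 7, 3, 8, 127), (51, 10, 2, 11, 120),
        (54, 11, 4, 11, 136), (52, 10, 3, 11, 126), (84, 17, 6, 17, 134),
        (64, 13, 0, 14, 123), (59, 12, 1, 13, 137), (65, 13, 2, 14, 143),
        (60, 12, 3, 13, 129)]),
    "REGION10": (640, 2320, [
        (39, 8, 5, 7, 62), (42, 8, 1, 7, 68), (56, 11, 5, 9, 54),
        (27, 5, 3, 5, 63), (78, 16, 3, 12, 70), (65, 13, 4, 10, 77),
        (60, 12, 0, 9, 76), (52, 10, 2, 8, 71), (50, 10, 3, 8, 73),
        (38, 8, 3, 6, 68)]),
}


def region_total(rows):
    """T_i-hat: sum over sampled universities of (pi_ij / S_ij) * K_ij * p_ij-hat."""
    return sum((group / size) * K * (a / k) for K, k, a, size, group in rows)


def universe_total(regions):
    """T-hat: sum over sampled regions of (pi_i / S_i) * T_i-hat."""
    return sum((group / size) * region_total(rows)
               for size, group, rows in regions.values())


for name, (_, _, rows) in REGIONS.items():
    print(name, round(region_total(rows)))
print("Overall point estimate:", round(universe_total(REGIONS)))
```

The first term of the Region 4 sum is 54 x (2/11) x (125/11), about 112, which is the group estimate quoted for UNIV85 in the summary of results.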

NOTE: It can be shown that $\hat{T}$ is an unbiased estimator of T.

Estimated variance of $\hat{T}$

$$v(\hat{T}) = V_1 + V_2$$

where

$$V_1 = \frac{\sum_{i=1}^{n} N_i^2 - N}{N^2 - \sum_{i=1}^{n} N_i^2} \; \sum_{i=1}^{n} \pi_i \left( \frac{\hat{T}_i}{S_i} - \hat{T} \right)^2 \qquad \text{(equation 2)}$$

and

$$V_2 = \sum_{i=1}^{n} \left( \frac{\pi_i}{S_i} \right) v(\hat{T}_i) \qquad \text{(equation 3)}$$

and $N_i$ = number of P.U.s in the i-th group after the random split into n groups.

$v(\hat{T}_i)$ is obtained by applying the two stage RHC procedure within the i-th sampled P.U.; i.e., where the i-th P.U. is viewed as the entire population. Consequently,

$$v(\hat{T}_i) = V_{3,i} + V_{4,i}$$

where

$$V_{3,i} = \frac{\sum_{j=1}^{m_i} M_{ij}^2 - M_i}{M_i^2 - \sum_{j=1}^{m_i} M_{ij}^2} \; \sum_{j=1}^{m_i} \pi_{ij} \left( \frac{K_{ij}\, \hat{p}_{ij}}{S_{ij}} - \hat{T}_i \right)^2$$

and

$$V_{4,i} = \sum_{j=1}^{m_i} \frac{\pi_{ij}}{S_{ij}} \; K_{ij} (K_{ij} - k_{ij}) \; \frac{\hat{p}_{ij} (1 - \hat{p}_{ij})}{k_{ij} - 1}$$

and where

(1) $M_{ij}$ = the number of S.U.s in the j-th random group, within the i-th sampled P.U.
(2) $\hat{p}_{ij}$ = proportion of the $k_{ij}$ items having the attribute of interest for the j-th sampled S.U. within the i-th sampled P.U.

Comments

1. $V_1$ is essentially the same expression obtained for the single-stage RHC procedure and will be referred to as the "between unit" variation.
2. $V_2$ is the contribution of the second- and third-stage variation and is obtained by treating each sampled P.U. as the population to be sampled using two (additional) stages.
3. The estimated standard error of $\hat{T}$ is $\sqrt{v(\hat{T})}$.

Approximate 95% confidence interval for the population total (T)

$$\hat{T} \pm 1.959963984540 \sqrt{v(\hat{T})}$$

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval, replace 1.959963984540 with 1.281551565545.
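As a rough check of the interval formula, the following sketch applies the three z-values to the rounded point estimate (15,861) and standard error (2,039) reported in the summary of results. The function name `confidence_interval` is hypothetical, and because the inputs are already rounded, results at other confidence levels may differ from the program output by a unit.

```python
# z-values quoted in the manual for each confidence level (percent).
Z_VALUES = {80: 1.281551565545, 90: 1.644853626951, 95: 1.959963984540}


def confidence_interval(point_estimate, standard_error, level=95):
    """Return (lower limit, upper limit, precision amount) for the universe total."""
    precision = Z_VALUES[level] * standard_error
    return point_estimate - precision, point_estimate + precision, precision


lower, upper, precision = confidence_interval(15861, 2039, level=95)
precision_percent = 100 * precision / 15861

print(f"95% interval: {lower:,.0f} to {upper:,.0f}")
print(f"Precision amount: {precision:,.0f} ({precision_percent:.2f}%)")

# If the universe size is known (59,200 in the manual's illustration),
# the limits convert to proportions by simple division.
UNIVERSE_SIZE = 59200
print(f"As proportions: {lower / UNIVERSE_SIZE:.3f} to {upper / UNIVERSE_SIZE:.3f}")
```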

Stratified Cluster

With this procedure, you first stratify, then obtain a cluster (single-stage) sample within each stratum.

Example 8. In a large section of the U.S., an audit was conducted for 583 universities with health-related research grants. It was decided to estimate the proportion of contracts containing charges after the scheduled completion of the contract using the same two strata. This is motivated by the discussion in the RAT-STATS User's Guide. Two strata were defined: Stratum 1: state-supported universities and Stratum 2: private universities. The strata sizes were N1 = 415 and N2 = 168. Within each stratum, a single-stage cluster sample was obtained with n1 = 25 universities selected from Stratum 1 (state supported universities) and n2 = 10 universities from Stratum 2 (private universities).

NOTE 1: These sample sizes are not adequate according to OAS policy and are used here for illustration purposes only.

The total number of grants in the universe for Stratum 1 is 2,500 and for Stratum 2, the total number is 1,000 grants. Consequently, there are 3,500 grants in the entire universe.

NOTE 2: This procedure does require knowledge of the total number of elements in the universe for each stratum.

The following data were obtained, where aj is the number of grants containing charges after the scheduled completion of the grant for the j-th university, Mj is the number of grants (universe) for this university (all of which are audited), and pj is aj/Mj.

Summary Using Computer Output and Corresponding Formulas:

Stratum 1: $\sum a_j = 38$, $\sum M_j = 151$, $\hat{p}_1 = 38/151 = .2517$
The projected number in the universe for Stratum 1 is $\hat{T}_1 = (2500)(.2517) = 629$

Stratum 2: $\sum a_j = 19$, $\sum M_j = 49$, $\hat{p}_2 = 19/49 = .3878$
The projected number in the universe for Stratum 2 is $\hat{T}_2 = (1000)(.3878) = 388$

The estimate of the total number of grants in the universe with charges after the scheduled grant completion is $\hat{T} = 629 + 388 = 1017$

The estimated standard error for $\hat{T}$ (using the square root of Equation 5) is 46.

The estimated proportion of grants with such charges is $\hat{p} = 1017 / 3500 = .2905$

The estimated standard error for $\hat{p}$ (using the square root of Equation 6) is .0132.

14 15 16 17 18 19 20 21 22 23 24 25 Mj 10 9 3 6 5 5 4 6 8 7 3 8 aj 3 1 1 2 1 1 1 1 1 2 1 2 pj .40 .11 .33 .50 .20 .25 . 10/2004) .33 .25 Stratum 2 -.State Universities Univ.37 . 1 2 3 4 5 6 7 8 9 10 11 12 13 Mj 8 12 4 5 6 6 7 5 8 3 2 6 5 aj 2 3 2 1 1 2 2 2 2 1 0 2 1 pj .50 .20 .25 .50 .29 .33 . 1 2 3 4 5 Mj 2 5 7 4 3 aj 1 2 2 2 1 pj .20 . 6 7 8 9 10 Mj 8 6 10 3 1 aj 3 2 4 1 1 pj .17 .17 .25 .33 .20 Univ.00 .33 Univ.00 These results are combined into data set DATACLUS.33 1.40 .Private Universities Univ.29 .Stratified Cluster Attribute Appraisal RAT-STATS Companion Manual Stratum 1 -.29 .33 .12 .30 .40 .TXT (39 lines) Data set STATE UNIVERSITIES 2500 UNIV1 8 2 UNIV2 12 3 UNIV3 4 2 UNIV4 5 1 UNIV5 6 1 UNIV6 6 2 UNIV7 7 2 UNIV8 5 2 UNIV9 8 2 UNIV10 3 1 UNIV11 2 0 UNIV12 6 2 UNIV13 5 1 UNIV14 10 3 UNIV15 9 1 UNIV16 3 1 UNIV17 6 2 415 25 Page 2-64 (Rev.33 .33 .25 .

Stratified Cluster NAME OF INPUT FILE: C:\Temp\DATACLUS.RAT-STATS Companion Manual Stratified Cluster Attribute Appraisal UNIV18 5 1 UNIV19 5 1 UNIV20 4 1 UNIV21 6 1 UNIV22 8 1 UNIV23 7 2 UNIV24 3 1 UNIV25 8 2 PRIVATE UNIVERSITIES 10 1000 UNIV1 2 1 UNIV2 5 2 UNIV3 7 2 UNIV4 4 2 UNIV5 3 1 UNIV6 8 3 UNIV7 6 2 UNIV8 10 4 UNIV9 3 1 UNIV10 1 1 168 The following computer printout is produced: DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .OFFICE OF AUDIT SERVICES STRATIFIED CLUSTER ATTRIBUTE APPRAISAL AUDIT/REVIEW: Attribute .TXT SAMPLE UNIVERSE =========== 415 8 12 4 5 6 6 7 5 8 3 2 6 5 10 9 3 6 5 5 4 6 8 SAMPLE SIZE ====== 25 8 12 4 5 6 6 7 5 8 3 2 6 5 10 9 3 6 5 5 4 6 8 MEETING CRITERIA ======== 2 3 2 1 1 2 2 2 2 1 0 2 1 3 1 1 2 1 1 1 1 1 PERCENT ======= Date: 2/16/2004 Time: 15:36 STRATUM IDENTIFICATION CLUSTER IDENTIFICATION =========================== STATE UNIVERSITIES UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 UNIV11 UNIV12 UNIV13 UNIV14 UNIV15 UNIV16 UNIV17 UNIV18 UNIV19 UNIV20 UNIV21 UNIV22 PROJECTED QUANTITY ========= (Rev. 10/2004) Page 2-65 .

UNIV23                              7         7         2
UNIV24                              3         3         1
UNIV25                              8         8         2
STRATUM TOTALS                  2,500       151        38    25.17%       629

PRIVATE UNIVERSITIES              168        10
UNIV1                               2         2         1
UNIV2                               5         5         2
UNIV3                               7         7         2
UNIV4                               4         4         2
UNIV5                               3         3         1
UNIV6                               8         8         3
UNIV7                               6         6         2
UNIV8                              10        10         4
UNIV9                               3         3         1
UNIV10                              1         1         1
STRATUM TOTALS                  1,000        49        19    38.78%       388

STRATA TOTALS                     583        35
CLUSTER UNIT TOTALS             3,500       200        57
OVERALL TOTALS                                               29.05%     1,017
OVERALL STANDARD ERROR                                        1.32%        46

CONFIDENCE LEVEL               80 PERCENT   90 PERCENT   95 PERCENT
LOWER LIMIT FOR PROPORTION       27.36%       26.88%       26.47%
UPPER LIMIT FOR PROPORTION       30.74%       31.22%       31.64%
LOWER LIMIT FOR TOTAL               958          941          926
UPPER LIMIT FOR TOTAL             1,076        1,093        1,107

FORMULAS

1. Estimated proportion in stratum h that possess the attribute of interest

$$\hat{p}_h = \frac{\sum_{j=1}^{n_h} a_{j,h}}{\sum_{j=1}^{n_h} M_{j,h}}$$

where $a_{j,h}$ is the number of elements in the j-th secondary unit in stratum h possessing the attribute of interest, $M_{j,h}$ is the number of secondary units in the j-th primary unit in stratum h, and $n_h$ is the number of sample items in stratum h.

2. Estimated total number of elements in stratum h that possess the attribute of interest

$$\hat{T}_h = M_h\, \hat{p}_h$$

where $M_h$ = number of secondary units in the universe for stratum h (must be known)

3. Estimated universe total having the attribute of interest

$$\hat{T} = \sum_{h=1}^{L} \hat{T}_h$$

summed over the L strata

4. Estimated universe proportion having the attribute of interest is $\hat{p} = \hat{T} / M$ where M is the total number of secondary units in the universe and $M = \sum M_h$ (summed over the L strata).

5. Estimated variance of $\hat{T}$

$$v(\hat{T}) = \sum_{h=1}^{L} \frac{N_h (N_h - n_h)}{n_h (n_h - 1)} \sum_{j=1}^{n_h} \left( a_{j,h} - \hat{p}_h M_{j,h} \right)^2$$

where $N_h$ is the number of universe items in stratum h.

6. Estimated variance of $\hat{p}$

$$v(\hat{p}) = v(\hat{T}) / M^2$$

7. Approximate 95% confidence interval for T

$$\hat{T} \pm 1.959963984540 \sqrt{v(\hat{T})}$$

8. Approximate 95% confidence interval for p

$$\hat{p} \pm 1.959963984540 \sqrt{v(\hat{p})}$$

NOTE: For the Precision at the 90% Confidence Level, replace 1.959963984540 with 1.644853626951 and for the Precision at the 80% Confidence Level, replace 1.959963984540 with 1.281551565545.
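The formulas in this section can be applied to the Example 8 data as a cross-check. This is a sketch rather than the program's own code: the dictionary layout and the function name `appraise` are hypothetical, and each cluster is entered as (Mj, aj) from the Stratum 1 and Stratum 2 data, with N the number of universities in the stratum and M the number of grants in the stratum.

```python
import math

# Each cluster is (Mj = grants at the university, aj = grants with late charges).
STRATA = {
    "STATE UNIVERSITIES": {"N": 415, "M": 2500, "clusters": [
        (8, 2), (12, 3), (4, 2), (5, 1), (6, 1), (6, 2), (7, 2), (5, 2), (8, 2),
        (3, 1), (2, 0), (6, 2), (5, 1), (10, 3), (9, 1), (3, 1), (6, 2), (5, 1),
        (5, 1), (4, 1), (6, 1), (8, 1), (7, 2), (3, 1), (8, 2)]},
    "PRIVATE UNIVERSITIES": {"N": 168, "M": 1000, "clusters": [
        (2, 1), (5, 2), (7, 2), (4, 2), (3, 1), (8, 3), (6, 2), (10, 4), (3, 1),
        (1, 1)]},
}


def appraise(strata):
    """Return (T-hat, se(T-hat), p-hat, se(p-hat)) using formulas 1 through 6."""
    total, variance, universe = 0.0, 0.0, 0
    for stratum in strata.values():
        clusters = stratum["clusters"]
        n_h, N_h, M_h = len(clusters), stratum["N"], stratum["M"]
        # Formula 1: ratio of summed attribute counts to summed cluster sizes.
        p_h = sum(a for _, a in clusters) / sum(m for m, _ in clusters)
        # Formulas 2 and 3: projected stratum total, accumulated over strata.
        total += M_h * p_h
        # Formula 5: contribution of this stratum to the variance of T-hat.
        squares = sum((a - p_h * m) ** 2 for m, a in clusters)
        variance += N_h * (N_h - n_h) / (n_h * (n_h - 1)) * squares
        universe += M_h
    se_total = math.sqrt(variance)
    # Formulas 4 and 6: convert the total and its standard error to proportions.
    return total, se_total, total / universe, se_total / universe


T_hat, se_T, p_hat, se_p = appraise(STRATA)
print(f"T-hat = {T_hat:.0f}, standard error = {se_T:.0f}")
print(f"p-hat = {p_hat:.4f}, standard error = {se_p:.4f}")
```

Under these assumptions the script agrees with the summary values for Example 8 (a total of about 1,017 with standard error 46, and a proportion of about .2905 with standard error .0132).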

Stratified Multistage

As with the stratified cluster procedure, you must first stratify the universe. Rather than take a cluster (single-stage) sample within each stratum, you will obtain a multistage (two-stage or three-stage) sample within each stratum. These multistage samples may be random (using the Two-Stage Unrestricted or Three-Stage Unrestricted programs) or may be obtained using the RHC procedure and the RHC Two-Stage or RHC Three-Stage programs.

Unlike the Stratified Cluster program, this program requires that you first run the appropriate multistage program on each stratum and record the results. The output results are then used as input to the Stratified Multistage program. You may store the results from each stratum (point estimate, standard error, universe size) in a file or simply input these values interactively.

NOTE: The "universe size" refers to the number of units at the most detailed level of the multistage sample. For example, if you are obtaining a three-stage sample within each stratum, then the "universe size" refers to the total number of third-stage units within this stratum.

Example 9. This example is similar to Example 8 in the Stratified Cluster section. The universe consisting of university grants is stratified by defining Stratum 1: state-supported universities (5,600 grants) and Stratum 2: private universities (3,500 grants). Because these universities are so widespread, it was decided to employ a two-stage sample using 20 state-supported universities and 10 private universities. Rather than audit all grants at a selected university, it was decided (based on available resources) to audit roughly 20% of the grants at each selected university to estimate the proportion of grants containing charges after the scheduled completion of the grant.

The following data were obtained, where ai is the number of grants in the sample from the i-th university containing such charges, mi is the number of audited (sampled) grants at the i-th university, and Mi is the total number of grants in the audit universe at the i-th university.

State-supported universities
Univ.    Mi    mi    ai
  1      41     8     2
  2      62    12     3
  3      21     4     2
  4      23     5     1
  5      31     6     1
  6      32     6     2
  7      33     7     2
  8      27     5     2
  9      41     8     2
 10      16     3     1
 11       9     2     0
 12      31     6     2
 13      27     5     1
 14      49    10     3
 15      46     9     1
 16      15     3     1
 17      30     6     2
 18      24     5     1
 19      23     5     1
 20      21     4     1
These values are stored in data set MULSTAT1.TXT.

Private universities
Univ.    Mi    mi    ai
  1      11     2     1
  2      25     5     2
  3      34     7     2
  4      18     4     2
  5      16     3     1
  6      40     8     3
  7      31     6     2
  8      50    10     4
  9      14     3     1
 10      12     2     1
These values are stored in data set MULSTAT2.TXT.

00% 25.00% 16.63% 20.00% 25.00% 33. DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .Stratified Multistage Attribute Appraisal RAT-STATS Companion Manual The following two computer outputs are obtained using the Two-Stage Unrestricted program.00% 25.00% 20.00% 30.33% 20.00% 50.67% 33.00% 20.11% 33.79% 728 1.90% 1.00% 20.600 PROJECTED ========= 10 16 11 5 5 11 9 11 10 5 0 10 5 15 5 5 10 5 5 5 OVERALL TOTALS 120 STANDARD ERROR CONFIDENCE LEVEL LOWER LIMIT FOR PROPORTION UPPER LIMIT FOR PROPORTION LOWER LIMIT FOR TOTAL UPPER LIMIT FOR TOTAL Page 2-70 (Rev.TXT SAMPLE SIZE =========== 8 12 4 5 6 6 7 5 8 3 2 6 5 10 9 3 6 5 5 4 119 SAMPLE ITEMS WITH CHARACTERISTIC(S) ================= 2 3 2 1 1 2 2 2 2 1 0 2 1 3 1 1 2 1 1 1 31 16.OFFICE OF AUDIT SERVICES TWO STAGE UNRESTRICTED ATTRIBUTE APPRAISAL AUDIT/REVIEW: Stratum1 DATA FILE: C:\Temp\MULSTAT1.00% Date: 2/11/2004 Time: 15:02 PRIMARY UNIT ======= 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 TOTALS UNIVERSE ============ 41 62 21 23 31 32 33 27 41 16 9 31 27 49 46 15 30 24 23 21 602 5.089 90 PERCENT 13.44% 803 1.33% 0.00% 33. 10/2004) .33% 33.33% 28.99% 80 PERCENT 14.33% 20.57% 40.164 RATIO ===== 25.00% 11.17% 763 1.35% 19.129 946 111 95 PERCENT 13.

00% 33. 10/2004) Page 2-71 .85% 28.33% 37.50% 33.009 CONFIDENCE LEVEL LOWER LIMIT FOR PROPORTION UPPER LIMIT FOR PROPORTION LOWER LIMIT FOR TOTAL UPPER LIMIT FOR TOTAL The values used as input to the Stratified Multistage program are highlighted in the preceding computer output.97% 27.83% 520 1.TXT SAMPLE SIZE =========== 2 5 7 4 3 8 6 10 3 2 50 SAMPLE ITEMS WITH CHARACTERISTIC(S) ================= 1 2 2 2 1 3 2 4 1 1 19 21. The following computer screen illustrates how to enter these values: (Rev.00% 33.70% 559 970 RATIO ===== 50.00% Time: 15:13 PRIMARY UNIT ======= 1 2 3 4 5 6 7 8 9 10 TOTALS UNIVERSE ============ 11 25 34 18 16 40 31 50 14 12 251 3.33% 40.00% 40.33% 50.00% 28.27% 26.41% 604 924 90 PERCENT 15.500 PROJECTED ========= 6 10 10 9 5 15 10 20 5 6 OVERALL TOTALS 80 STANDARD ERROR 764 125 95 PERCENT 14.OFFICE OF AUDIT SERVICES TWO STAGE UNRESTRICTED ATTRIBUTE APPRAISAL AUDIT/REVIEW: Stratum 2 DATA FILE: C:\Temp\MULSTAT2.57% 80 PERCENT 17.RAT-STATS Companion Manual Stratified Multistage Attribute Appraisal Date: 2/11/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .57% 50.84% 3.

The following output is obtained from the Stratified Multistage program.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
STRATIFIED MULTISTAGE ATTRIBUTE APPRAISAL
AUDIT/REVIEW: Illustration
Date: 2/11/2004     Time: 16:08

THE ESTIMATORS ARE BASED ON THE FOLLOWING ENTRIES:

STRATUM    ATTRIBUTE    STANDARD ERROR    UNIVERSE SIZE
   1         16.90%          1.99%            5,600
   2         21.84%          3.57%            3,500

= = = = = = = = = = = = = = = = = RESULTS = = = = = = = = = = = = = = = = = =

ESTIMATED PERCENTAGE:            18.80%
ESTIMATED TOTAL:                  1,711
STANDARD ERROR (PERCENTAGE):      1.84%
STANDARD ERROR (TOTAL):             167

CONFIDENCE LEVEL               80 PERCENT   90 PERCENT   95 PERCENT
LOWER LIMIT FOR PROPORTION       16.44%       15.77%       15.19%
UPPER LIMIT FOR PROPORTION       21.16%       21.83%       22.41%
LOWER LIMIT FOR TOTAL             1,496        1,435        1,383
UPPER LIMIT FOR TOTAL             1,925        1,986        2,039

Final results: The point estimate for the percentage of grants containing improper charges is 18.8% (standard error of 1.84%) and the 90% confidence interval for this proportion is from 15.77% to 21.83%. The 90% confidence interval for the universe total is from 1,435 to 1,986.

Discussion. Using Equation 1 in the Formulas section,

$$\hat{p} = (5600/9100)(.169) + (3500/9100)(.2184) = .188 \;\; (18.8\%)$$

and

$$v(\hat{p}) = (5600/9100)^2 (.0199)^2 + (3500/9100)^2 (.0357)^2 = .000339$$

The estimated standard error of $\hat{p}$ is $\sqrt{v(\hat{p})} = .0184$ (1.84%).

The corresponding 90% confidence interval is 18.8 ± 1.644853626951(1.84), that is 15.77% to 21.83%.

The estimate of the universe total and corresponding confidence interval are obtained by multiplying the previous results by the total universe size = 5,600 + 3,500 = 9,100.

FORMULAS

1. Estimated universe proportion having the attribute of interest

$$\hat{p} = \sum_{i=1}^{L} \left( \frac{M_i}{M} \right) \hat{p}_i$$

where
L = number of strata
$M_i$ = universe size for the most detailed level of the multistage sample
$M$ = total universe size = $\sum M_i$

\(\hat{p}_i\) = estimated proportion for the i-th stratum

2. Estimated variance of \(\hat{p}\)

\[ v(\hat{p}) = \sum_{i=1}^{L} \left( \frac{M_i}{M} \right)^2 (\text{standard error of } \hat{p}_i)^2 \]

3. 90% confidence interval for p

\[ \hat{p} \pm 1.644853626951 \sqrt{v(\hat{p})} \]

4. Estimated universe total having the attribute of interest

\[ \hat{T} = M \hat{p} \]

5. Estimated variance of \(\hat{T}\)

\[ v(\hat{T}) = M^2 \, v(\hat{p}) \]

6. 90% confidence interval for T

\[ \hat{T} \pm 1.644853626951 \sqrt{v(\hat{T})} \]

NOTE: For the Precision at the 95% Confidence Level, replace 1.644853626951 with 1.959963984540 and for the Precision at the 80% Confidence Level, replace 1.644853626951 with 1.281551565545.
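The six formulas above can be traced with a short Python sketch. This is only an illustration of the arithmetic, not the RAT-STATS implementation; the function name and the tuple layout are assumptions made for this example, and the inputs are the stratum entries from the illustration (universe sizes 5,600 and 3,500, proportions .169 and .2184, standard errors .0199 and .0357).

```python
import math

def stratified_multistage_attribute(strata, z):
    """Combine per-stratum results; strata holds (M_i, p_i, se_i) tuples.

    Hypothetical helper for illustration only. z is the normal quantile
    for the desired confidence level (1.644853626951 for 90%).
    """
    M = sum(m for m, _, _ in strata)                             # total universe size
    p_hat = sum((m / M) * p for m, p, _ in strata)               # Formula 1
    var_p = sum((m / M) ** 2 * se ** 2 for m, _, se in strata)   # Formula 2
    se_p = math.sqrt(var_p)
    total = M * p_hat                                            # Formula 4
    se_total = M * se_p                                          # square root of Formula 5
    return {
        "p_hat": p_hat,
        "se_p": se_p,
        "p_limits": (p_hat - z * se_p, p_hat + z * se_p),        # Formula 3
        "total": total,
        "se_total": se_total,
        "total_limits": (total - z * se_total, total + z * se_total),  # Formula 6
    }

strata = [(5600, 0.169, 0.0199), (3500, 0.2184, 0.0357)]
result = stratified_multistage_attribute(strata, 1.644853626951)
print(round(result["p_hat"], 4), round(result["se_p"], 4))
print([round(limit) for limit in result["total_limits"]])
```

Passing 1.959963984540 or 1.281551565545 as `z` gives the 95% and 80% limits.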

VARIABLE APPRAISALS

A variable appraisal is carried out to estimate a particular universe total (T) and its corresponding sampling error. For example, the audit intent may be to determine the dollar value of an inventory or the amount of duplicate payments made by an organization. A variety of procedures can be used to obtain and appraise a variable sample. There are ten sampling strategies utilized in the Variable Appraisals modules. They are listed below and described in the sections to follow.

• Unrestricted
• Stratified
• Two-Stage Unrestricted
• Three-Stage Unrestricted
• RHC Two Stage
• RHC Three Stage
• Stratified Cluster
• Stratified Multistage
• Post Stratification
• Unknown Universe Size

Unrestricted Variable Appraisal

An unrestricted sample is the same as a simple random sample. Consequently, every sample of size n has the same chance of being selected. For an unrestricted sample, a sample of size n is randomly obtained and the variable of interest is recorded for each sample item. Actually, the user may input a set of single values (examined amounts, audit amounts, or difference amounts) or a set of two values (examined and audit amounts, examined and difference amounts, or audit and difference amounts).

Example 1. An unrestricted sample of 50 items resulted in the 50 examined/audited values contained in data set DATASRS.TXT. For this sample, all of the resulting differences (examined value - audited value) were nonzero since all the examined (book) values were unequal to the corresponding audit (actual) values.

Each line contains a line counter, examined value, and audited value separated by one or more spaces, a tab delimiter, or a comma.

Data file DATASRS.TXT

 1   300   267
 2   900   774
 3   300   255
 4   200   174
 5   900   810
 6   700   560
 7  1000   820
 8   100    80
 9   900   765
10   700   630
11   700   630
12   400   332
13   300   255
14   100    84
15   200   168
16   100    88
17   600   528
18   400   340

19   900   747
20  1000   800
21  1000   862
22   600   504
23   800   648
24   200   176
25   200   172
26  1000   890
27   900   792
28   600   540
29   500   525
30   200   172
31   200   178
32   500   425
33   200   164
34   500   420
35   500   400
36   400   324
37   200   160
38   600   540
39   500   425
40   300   264
41   900   765
42   100    84
43   100    85
44   900   810
45   300   240
46   500   415
47   500   425
48   300   237
49   500   435
50   100    86

The following output was obtained from the Unrestricted Variable Appraisal program.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
VARIABLE UNRESTRICTED APPRAISAL
AUDIT/REVIEW: Variable SRS          Date: 4/5/2004   Time: 11:23
DATA FILE USED: C:\Temp\DATASRS.TXT

SAMPLE SIZE                     50
EXAMINED VALUE           24,800.00
NONZERO DIFFS                   50
TOTAL OF AUD VALUES      21,270.00
TOTAL OF DIFF VALUES      3,530.00

----------------------- E X A M I N E D -----------------------
MEAN / UNIVERSE             496.00 / 10,000
STANDARD DEVIATION          296.90
SKEWNESS                       .32
KURTOSIS                      1.81
STANDARD ERROR (MEAN)        41.88
STANDARD ERROR (TOTAL)     418,823
POINT ESTIMATE           4,960,000

CONFIDENCE LIMITS
CONFIDENCE   LOWER       UPPER       PRECISION   PRECISION   T-VALUE USED
LEVEL        LIMIT       LIMIT       AMOUNT      PERCENT
80%          4,415,921   5,504,079   544,079     10.97%      1.299068784748
90%          4,257,823   5,662,177   702,177     14.16%      1.676550892617
95%          4,118,344   5,801,656   841,656     16.97%      2.009575237129

----------------------- A U D I T E D -------------------------
MEAN / UNIVERSE             425.40 / 10,000
STANDARD DEVIATION          256.20
SKEWNESS                       .30
KURTOSIS                      1.78
STANDARD ERROR (MEAN)        36.14
STANDARD ERROR (TOTAL)     361,412
POINT ESTIMATE           4,254,000

CONFIDENCE LIMITS
CONFIDENCE   LOWER       UPPER       PRECISION   PRECISION   T-VALUE USED
LEVEL        LIMIT       LIMIT       AMOUNT      PERCENT
80%          3,784,500   4,723,500   469,500     11.04%      1.299068784748

90%          3,648,074   4,859,926   605,926     14.24%      1.676550892617
95%          3,527,715   4,980,285   726,285     17.07%      2.009575237129

--------------------- D I F F E R E N C E ----------------------
MEAN / UNIVERSE              70.60 / 10,000
STANDARD DEVIATION           48.25
SKEWNESS                       .64
KURTOSIS                      2.98
STANDARD ERROR (MEAN)         6.81
STANDARD ERROR (TOTAL)      68,068
POINT ESTIMATE             706,000

CONFIDENCE LIMITS
CONFIDENCE   LOWER       UPPER       PRECISION   PRECISION   T-VALUE USED
LEVEL        LIMIT       LIMIT       AMOUNT      PERCENT
80%          617,575     794,425      88,425     12.52%      1.299068784748
90%          591,881     820,119     114,119     16.16%      1.676550892617
95%          569,213     842,787     136,787     19.37%      2.009575237129

Explanation.

NOTE: The following discussion can be applied to the examined values, the audited values, or the difference values. The difference values will be used when discussing the computer output.

The estimated mean of the difference amounts in the universe is the sample mean, that is, \(\bar{x}\) = $70.60. The estimated total difference for the universe (T) is the sample mean times the universe size, that is, \(\hat{T}\) = (70.60)(10,000) = $706,000. This is referred to as the POINT ESTIMATE in the computer output.

The sample standard deviation is s = 48.2519 and the corresponding (estimated) standard error for the mean is

\[ 48.2519 \sqrt{\frac{10000 - 50}{(50)(10000)}} = 6.80677 \]

The (estimated) standard error for the total is 10,000 x 6.80677 = 68,067.7 (68,068 rounded).

The sample skewness is a measure of the symmetry of the sample data. This value is SKEWNESS = 0.64, indicating a very slight positive (right-tail) skew. The sample kurtosis is a measure of the sample "peakedness" and is equal to KURTOSIS = 2.98. Essentially, this value is small whenever the frequency of observations close to the mean is high and the frequency of observations far from the mean is low.

The 95% confidence interval for the universe total of the difference amounts is 706,000 ± (2.009575237129)(68,067.7) = 706,000 ± 136,787, that is, 569,213 to 842,787. The PRECISION AMOUNT is the amount added and subtracted to the POINT ESTIMATE, that is, $136,787. This value is 19.37% of the point estimate and is referred to as the PRECISION PERCENT.(1)

(1) When the POINT ESTIMATE is negative, the PRECISION PERCENT is set equal to zero.

FORMULAS

\[ \text{STANDARD DEVIATION} = s = \sqrt{ \frac{ \sum_{i=1}^{n} (x_i - \bar{x})^2 }{ n - 1 } } \]

\[ \text{STANDARD ERROR (MEAN)} = s \sqrt{ \frac{N - n}{nN} } \quad \text{and} \quad \text{STANDARD ERROR (TOTAL)} = N s \sqrt{ \frac{N - n}{nN} } \]

where n = sample size, N = universe size.

\[ \text{SKEWNESS} = \frac{ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3 }{ \left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^{3/2} } \]

\[ \text{KURTOSIS} = \frac{ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^4 }{ \left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^{2} } \]

95% confidence interval for the universe total (T)

\[ \hat{T} \pm t_{.025,\,n-1} \cdot s \cdot \sqrt{ \frac{N (N - n)}{n} } \]

where
(1) \(\hat{T} = \bar{x} \cdot N\)
(2) \(t_{.025,\,n-1}\) is the t-value with n - 1 df having a right-tail area = .025 (RAT-STATS provides t-values accurate to 12 decimal places).

NOTE: For a 90% confidence interval, replace \(t_{.025,\,n-1}\) with \(t_{.05,\,n-1}\) and for an 80% confidence interval, replace \(t_{.025,\,n-1}\) with \(t_{.10,\,n-1}\).
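As a rough check on the standard deviation, standard error, and confidence interval formulas, the Python sketch below recomputes the DIFFERENCE results of Example 1. It is an illustration under stated assumptions rather than the RAT-STATS code: the function name is hypothetical, the 50 differences are the examined minus audited values from DATASRS.TXT, and the t-value is taken from the program output instead of being computed.

```python
import math

def unrestricted_variable_appraisal(values, universe_size, t_value):
    """Point estimate, standard errors, and precision for a simple random sample.

    Hypothetical helper for illustration; t_value is the t quantile with
    n - 1 degrees of freedom for the chosen confidence level.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    se_mean = s * math.sqrt((universe_size - n) / (n * universe_size))
    se_total = universe_size * se_mean
    point_estimate = mean * universe_size
    precision = t_value * se_total
    return {
        "mean": mean,
        "std_dev": s,
        "se_mean": se_mean,
        "se_total": se_total,
        "point_estimate": point_estimate,
        "precision": precision,
        "limits": (point_estimate - precision, point_estimate + precision),
    }

# Examined minus audited value for the 50 lines of DATASRS.TXT.
differences = [
    33, 126, 45, 26, 90, 140, 180, 20, 135, 70,
    70, 68, 45, 16, 32, 12, 72, 60, 153, 200,
    138, 96, 152, 24, 28, 110, 108, 60, -25, 28,
    22, 75, 36, 80, 100, 76, 40, 60, 75, 36,
    135, 16, 15, 90, 60, 85, 75, 63, 65, 14,
]

result = unrestricted_variable_appraisal(differences, 10000, 2.009575237129)
print(round(result["point_estimate"]), round(result["se_total"]), round(result["precision"]))
```

The same function can be applied to the examined or audited values by passing those columns instead of the differences.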

Stratified Variable Appraisal

In a stratified variable sampling plan, the universe is divided into two or more nonoverlapping categories (strata). This plan involves obtaining a random sample from each of the strata. As with an unrestricted sample, the intent is to make a statistical estimate for a universe total (T) for a particular variable of interest. The program will develop estimates for each stratum as well as for the entire universe. The program will request the number of universe items in each stratum and these values must be known.

Using a Stratified Sample

Purpose: To divide (partition) the universe into separate strata so that variation within individual strata is less than variation within the entire universe.

Simple Illustration: Universe consists of {5 7 8 10 55 60 66 70 120 133 145 150}

Mean of universe is μ = 69.08
Variance of universe is σ² = 2871.9

Partition the universe into three strata:

#1 {5 7 8 10}
#2 {55 60 66 70}
#3 {120 133 145 150}

The strata variances are:

Stratum   Variance
   1         3.25
   2        32.69
   3       134.50

Compare these to σ² = 2871.9. Consequently, the individual strata are much more homogeneous. So, when obtaining a stratified sample, the user can take a larger sample (perhaps 100%) from the stratum containing the large dollar items.
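The figures in the simple illustration can be reproduced with Python's standard `statistics` module. The sketch below assumes the quoted variances are universe (population) variances, which is what `pvariance` computes; the variable names are chosen only for this example.

```python
from statistics import mean, pvariance

universe = [5, 7, 8, 10, 55, 60, 66, 70, 120, 133, 145, 150]

# The three strata used in the illustration.
strata = [universe[0:4], universe[4:8], universe[8:12]]

universe_mean = mean(universe)
universe_variance = pvariance(universe)            # divides by the number of items
strata_variances = [pvariance(stratum) for stratum in strata]

print(round(universe_mean, 2), round(universe_variance, 1))
print([round(v, 2) for v in strata_variances])
```

Each stratum variance is a small fraction of the universe variance, which is the point of the illustration.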

Reasons for Using Stratified Sampling:

A. Improved Sampling Precision
Stratification tends to make the sampling more efficient, that is, the user will obtain narrower confidence intervals for the same sample size. When a sample is skewed or has a high degree of variability, the sample size required to provide a reasonable degree of precision using simple random sampling may be quite large. Precision is improved because each stratum should have a relatively small variance and the weighted sum of the strata variances is less than the variance for the entire universe.

B. Separate Information About Strata and the Universe
Strata may be formed because separate estimates are desired for subuniverses. For example, a nationwide audit of nursing homes can be planned in advance such that separate estimates are published for each state (stratum). When an auditor selects a simple random sample from the entire universe, he/she cannot control the sample size within each stratum. Stratified sampling permits the auditor to also impose different precision requirements on different strata, such as requiring more precise estimates for large accounts.

C. Accommodation of Different Techniques
It may be desirable to employ different sampling methods or audit techniques in various portions of the universe. For example, in a sample of health service employees, the headquarters employees (Stratum 1) may be sampled as individuals and the employees scattered throughout the state (Stratum 2) may be sampled as clusters to save travel time and cost.

Comments:

(1) Defining effective strata is no accident! The user can incorporate all sorts of prior knowledge in defining the strata. Such a technique does not introduce any bias into the final estimate since strata are defined prior to obtaining the sample and each sampling item has a known (although not the same) chance of being selected. As a result, a well-designed stratified plan can provide audit protection and/or improved precision.

(2) Strata can be defined after sample data are obtained provided the proportions of the universe in each stratum are known (with negligible error) and samples of at least 20 are obtained from each stratum.

(3) Generally, it is not a good idea to stratify for convenience (unlike cluster or multistage sampling) since the resulting estimator may be less efficient than the estimator which uses a single simple random sample.

(4) Even though random selection is performed within strata, this does not mean that the user cannot take a close look at the individual findings to determine nature, cause, source, trend and impact.

(5) A careful balance must be maintained between the gains expected in sample precision and the additional time and resources involved in introducing a stratified scheme into the sample design.

Strata Formation

Strata are typically defined using the dollar value of the items being sampled. An alternative is to stratify using some other variable which is highly correlated with the principal variable, such as using the number of hospital beds to measure the "size" of a hospital.

Basic Rule: Select strata so that their means are as different as possible and their standard deviations are as small as possible.

Guidelines:

• A few strata yield most of the gains (say, 2 to 6).
• Quantitative rather than qualitative (sex, race, etc.) variables are preferable for defining strata.
• Coarser divisions of several stratifying variables are preferable to finer divisions of one variable.
• It is better to use unrelated stratifying variables.
• Experience, intuition, and the judgment of the auditor are extremely useful in improving the sampling precision through effective stratification.

Example 2. Random samples of size 25 were obtained from two strata:

Stratum 1: Examined amounts under $200 (N1 = 5,200) and
Stratum 2: Examined amounts ≥ $200 (N2 = 3,500)

NOTE: These sample sizes are too small to meet OAS standards and are used for illustrative purposes only.

The sample difference amounts for the two strata are stored in data file DATASTRAT.TXT and the universe/sample sizes are stored in file UNIVSTRAT.TXT.

Data set DATASTRAT.TXT (lines 1 through 25 at left, lines 26 through 50 at right)

 1    80        26   354
 2    43        27   328
 3   133        28   313
 4   125        29   250
 5   116        30   261
 6    84        31   294
 7   111        32   380
 8   148        33   296
 9   104        34   248
10   114        35   277
11    83        36   331
12   132        37   305
13    96        38   360
14    86        39   348
15    66        40   318
16    89        41   290
17    72        42   249
18   114        43   362
19   135        44   348
20    71        45   355
21   127        46   295
22   105        47   277
23   102        48   355
24    69        49   314
25    76        50   277

Universe File UNIVSTRAT.TXT

1   5200   25
2   3500   25

The sample results are:

Stratum 1: n1 = 25, mean = 99.24, std. dev. = 26.3317
Stratum 2: n2 = 25, mean = 311.40, std. dev. = 39.6432

The following computer output was obtained from the Stratified Variables Appraisal program:

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
STRATIFIED VARIABLE APPRAISAL
AUDIT/REVIEW: Variable - Stratified          Date: 4/5/2004   Time: 12:17
DATA FILE USED: C:\Temp\DATASTRAT.TXT

STRATUM NUMBER   SAMPLE SIZE   VALUE OF SAMPLE   NONZERO ITEMS
      1               25            2,481.00           25
      2               25            7,785.00           25
   TOTALS             50           10,266.00           50

--------------------- D I F F E R E N C E ----------------------
Stratum 1
MEAN / UNIVERSE              99.24 / 5,200
STANDARD DEVIATION           26.33
SKEWNESS                      -.07
KURTOSIS                      2.24
STANDARD ERROR (MEAN)         5.25
STANDARD ERROR (TOTAL)      27,319
POINT ESTIMATE             516,048

CONFIDENCE LIMITS
CONFIDENCE   LOWER     UPPER     PRECISION   PRECISION   T-VALUE USED
LEVEL        LIMIT     LIMIT     AMOUNT      PERCENT
80%          480,046   552,050   36,002       6.98%      1.317835933673
90%          469,308   562,788   46,740       9.06%      1.710882079909
95%          459,664   572,432   56,384      10.93%      2.063898561628

Stratum 2
MEAN / UNIVERSE             311.40 / 3,500
STANDARD DEVIATION           39.64
SKEWNESS                      -.06
KURTOSIS                      1.85
STANDARD ERROR (MEAN)         7.90
STANDARD ERROR (TOTAL)      27,651
POINT ESTIMATE           1,089,900

CONFIDENCE LIMITS
CONFIDENCE   LOWER       UPPER       PRECISION   PRECISION   T-VALUE USED
LEVEL        LIMIT       LIMIT       AMOUNT      PERCENT
80%          1,053,461   1,126,339   36,439      3.34%       1.317835933673
90%          1,042,592   1,137,208   47,308      4.34%       1.710882079909
95%          1,032,831   1,146,969   57,069      5.24%       2.063898561628

OVERALL
POINT ESTIMATE / UNIVERSE    1,605,948 / 8,700
STANDARD ERROR                  38,870

CONFIDENCE LIMITS
CONFIDENCE   LOWER       UPPER       PRECISION   PRECISION   Z-VALUE USED
LEVEL        LIMIT       LIMIT       AMOUNT      PERCENT
80%          1,556,134   1,655,762   49,814      3.10%       1.281551565545
90%          1,542,012   1,669,884   63,936      3.98%       1.644853626951
95%          1,529,764   1,682,132   76,184      4.74%       1.959963984540

Discussion. The point estimates for the universe total difference amounts are $516,048 (Stratum 1) and $1,089,900 (Stratum 2). Referring to the formula section and the OVERALL section in the computer output, the estimate of the universe total difference is

\[ \hat{T} = (5200)(99.24) + (3500)(311.40) = \$1,605,948 \]

The estimated variance of \(\hat{T}\) is

\[ 5200^2 \left( \frac{5200 - 25}{5200} \right) \frac{26.3317^2}{25} + 3500^2 \left( \frac{3500 - 25}{3500} \right) \frac{39.6432^2}{25} = 1,510,906,287 \]

The (estimated) standard error of \(\hat{T}\) is

\[ \sqrt{1,510,906,287} = 38,870 \]

The 95% confidence interval for universe total (T) is 1,605,948 ± (1.959963984540)(38,870), that is, 1,605,948 ± 76,184 ($1,529,764 to $1,682,132). The PRECISION AMOUNT here is $76,184 and is 4.74% of the point estimate.

NOTE: When the POINT ESTIMATE is negative, the PRECISION PERCENT is set equal to zero.

FORMULAS

NOTE: For definitions and formulas of the statistics within each stratum (standard deviation, standard error, skewness, and kurtosis), refer to the previous section (Unrestricted Variable Appraisal).

1. Estimate of universe mean (μ):

\[ \bar{y}_{st} = (N_1 / N) \bar{y}_1 + (N_2 / N) \bar{y}_2 + \cdots + (N_L / N) \bar{y}_L \]

where
L = number of strata
N_i = number of items in i-th stratum (universe)

10/2004) .281551565545. replace Z.05 = 1.Stratified Variable Appraisal RAT-STATS Companion Manual N = N1 + N2 + @@@ + NL yi = average of sample items in the i-th stratum 2.025 = 1.025 v ( T ) NOTES: 1. For a 90% confidence interval.025 v ( yst ) where Z. Estimated variance of T 2 $ v (T ) = N v ( yst ) 5.10 = 1. 2. replace Z. 6. Page 3-16 (Rev. Estimated variance of yst : 2 2 ⎛ Ni − ni ⎞ si ⎟ v ( yst ) = 2 Ni ⎜ ⎝ Ni ⎠ ni N i=1 1 ∑ L where ni = number of sampled items in i-th stratum si2 = sample variance for i-th stratum $: 4.025 with Z. Approximate 95% confidence interval for universe total (T): $± Z $ T .644853626951 and for an 80% confidence interval.959963984540. The confidence intervals for each stratum total use t-values that are accurate to 12 decimal places.025 with Z. Approximate 95% confidence interval for universe mean (:): yst ± Z. Estimate of universe total (T): $ = N ⋅ y = N ⋅ y + N ⋅ y + ⋅⋅⋅ + N ⋅ y T st 1 1 2 2 L L 3.

Two-Stage Unrestricted Variable Appraisal

This is a special case of multistage sampling. This is a very convenient sampling procedure for many situations. The goal of multistage sampling is to get the most precise results per unit of examination cost. For a two-stage procedure, the universe can be broken down into "subgroups." These are called clusters.

Example:
1st Stage: Carriers (primary units, P.U.s)
2nd Stage: Hospitals (secondary units, S.U.s)

So, the procedure is to first obtain a random sample of P.U.s. Then, obtain a random sample of S.U.s within each selected P.U.

General Comments

1. Notice that at the first stage, clusters are the sampling unit (sampling units are not always individual people, records, etc.).

2. Multistage sampling is a very cost-effective sampling procedure when (1) obtaining a frame that lists all elements in the universe is very costly or impossible or (2) the cost of obtaining observations increases as the distance separating the elements increases. Put another way, multistage sampling is cost effective when it is more costly to get to the sampling unit than it is to audit the sampling unit.

3. You can estimate cost recoveries for the entire universe with multistage sampling and it is very useful for large, widespread universes. You don't have to visit all locations. The program will accept a maximum of 20 clusters.

Example 3. In a particular region of the U.S., there are N = 90 universities with government research grants. Because these universities are so widespread, it was decided to use a cluster

sample of n = 10 universities. The 10 universities to be sampled may be obtained using the Single-Stage Random Numbers module discussed in the Random Numbers section. Enter the values shown in the following input screen:

The resulting output is shown below. Note that the selected universities are in sequential order:

Universities: 2, 5, 7, 23, 28, 46, 56, 67, 70, 76

Department of Health and Human Services
OIG - Office of Audit Services
Random Number Generator
Date: 10/26/2004   Time: 11:03
AUDIT: select

FRAME SIZE: 90
SEED NUMBER: 1357.00
FILE OF RANDOM NUMBERS: C:\TEMP\SELECT.TXT
TOTAL RANDOM NUMBERS GENERATED: 10

THE NUMBERS ARE IN THE FOLLOWING FORMAT IN YOUR FILE:
POSITIONS 1 THROUGH 6 - ORDER OF SELECTION
POSITIONS 7 THROUGH 17 - RANDOM NUMBER
EACH COLUMN OF NUMBERS IS RIGHT JUSTIFIED.

Selection Order   Value
       4             2
       7             5
       9             7
      10            23
       5            28
       2            46
       6            56
       1            67
       8            70
       3            76

SUMMATION OF RANDOM NUMBERS = 380

Example 3 -- continued. Rather than audit all grants at a selected university, it was decided (based on available resources) to audit roughly 20% of the grants at each selected university. There are a total of M = 4500 grants in all 90 universities (see the NOTES under Formula 2). The following data were obtained where yi is the dollars (in thousands) of improper charges for the i-th sampled university, Mi is the total number of grants at the i-th university, and mi is the number of audited grants at the i-th university. Also, ȳi and si² are the mean and variance of the sample values from the i-th university.

For ease of illustration, university 1 refers to university 2, university 2 refers to university 5, and so on.

Univ.   Mi   mi    ȳi      si²
  1     50   10   5.40   11.38
  2     65   13   4.00   10.67
  3     45    9   5.67   16.75
  4     48   10   4.80   13.29
  5     52   10   4.30   11.12
  6     58   12   3.83   14.88
  7     42    8   5.00    5.14
  8     66   13   3.85    4.31
  9     40    8   4.88    6.13
 10     56   11   5.00   11.80

NOTE: This example violates OAS minimum sample sizes of at least 30 grants at each university. It is used for illustration only.

The individual dollar amounts (yi, in thousands) are in data set DATA2STG.TXT. The universe sizes (the Mi's) are in data set UNIV2STG.TXT. The program output immediately follows.

Dataset DATA2STG.TXT

 1    5
 2    7
 3    9
 4    0
 5   11
 6    2
 7    8
 8    4
 9    3
10    5
11    4
12    3
13    7
14    2
15   11
16    0
17    1
18    9
19    4
20    3
21    2
22    1

Dataset UNIV2STG.TXT

 1   50   10
 2   65   13
 3   45    9
 4   48   10
 5   52   10
 6   58   12
 7   42    8
 8   66   13
 9   40    8
10   56   11

 23    5
  .
  .
  .
 94    6
 95    7
 96    5
 97   10
 98   11
 99    2
100    1
101    4
102    0
103    5
104    4

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
TWO-STAGE UNRESTRICTED VARIABLE APPRAISAL
AUDIT/REVIEW: Variable 2-Stage          Date: 10/29/2004   Time: 10:07
DATA FILE USED: C:\Temp\DATA2STG.TXT

--------------------- D I F F E R E N C E ----------------------
UNIT             SAMPLE SIZE/     SAMPLE              UNIVERSE   POINT
NBR              NONZERO ITEMS    MEAN     VARIANCE   SIZE       ESTIMATE
  1                  10/9          5.40      11.38        50        270
  2                 13/12          4.00      10.67        65        260
  3                   9/8          5.67      16.75        45        255
  4                  10/8          4.80      13.29        48        230
  5                  10/9          4.30      11.12        52        224
  6                 12/10          3.83      14.88        58        222
  7                   8/8          5.00       5.14        42        210
  8                 13/12          3.85       4.31        66        254
  9                   8/8          4.88       6.13        40        195
 10                 11/10          5.00      11.80        56        280
TOTALS             104/94                                 522      2,400
NOT SAMPLED   80                                        3,978
OVERALL TOTALS 90                                       4,500     21,602
STANDARD ERROR                                                       867

CONFIDENCE LIMITS
CONFIDENCE   LOWER    UPPER    PRECISION   PRECISION   Z-VALUE USED
LEVEL        LIMIT    LIMIT    AMOUNT      PERCENT
80%          20,491   22,712   1,111       5.14%       1.281551565545

90%          20,176   23,027   1,425       6.60%       1.644853626951
95%          19,903   23,300   1,699       7.86%       1.959963984540

Discussion. The point estimate (highlighted) for the universe total difference amount is 21,602 (that is, $21,602,000). The 95% confidence interval for this amount is from 19,903 to 23,300 ($19,903,000 to $23,300,000). The PRECISION AMOUNT at the 95% confidence level is (Z-value)(standard error of \(\hat{T}\)) = (1.959963984540)(867) = 1,699 (that is, $1,699,000). This value is 7.86% of the point estimate.

Notice that the output also contains the estimated totals for each primary unit (university). For example, the estimated total difference for university 1 is 270 ($270,000). The sample average of these estimates is 2,400/10 = 240 ($240,000). Since there are 90 universities in the universe, the point estimate for the universe total is (90)(240) = 21,600, that is, $21,600,000 (more precisely, $21,602,000).

FORMULAS

1. The point estimate for the universe total (T) is

\[ \hat{T} = \frac{N}{n} \sum_{i=1}^{n} M_i \bar{y}_i \]

2. The estimated variance of \(\hat{T}\) is

\[ v(\hat{T}) = \frac{N (N - n)}{n} \cdot \frac{ \sum_{i=1}^{n} \left( M_i \bar{y}_i - \frac{\hat{T}}{N} \right)^2 }{ n - 1 } + \frac{N}{n} \sum_{i=1}^{n} M_i^2 \left( \frac{M_i - m_i}{M_i} \right) \frac{s_i^2}{m_i} \]

NOTES:
1. n = number of primary units in the sample and N = number of primary units in the universe.
2. The total number of secondary units in the universe (M) may be known or unknown and is not used in any of the calculations.

3. The STANDARD ERROR of \(\hat{T}\) is the square root of \(v(\hat{T})\).

4. The PRECISION AMOUNT at the 95% confidence level for the universe total is (1.959963984540)(standard error of \(\hat{T}\)). For the PRECISION AMOUNT at the 90% confidence level, replace 1.959963984540 with 1.644853626951. For the PRECISION AMOUNT at the 80% confidence level, replace 1.959963984540 with 1.281551565545.

5. The approximate 95% confidence interval for T is

\[ \hat{T} \pm 1.959963984540 \sqrt{v(\hat{T})} \]

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval replace 1.959963984540 with 1.281551565545.
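The two-stage estimator and its variance can be sketched in Python as follows. This is an illustration, not the program's code: the function name and tuple layout are assumptions, and the inputs are the rounded per-university summaries (Mi, mi, sample mean, variance) as read from the Example 3 output, so the results match the printed 21,602 and 867 only approximately.

```python
import math

def two_stage_appraisal(units, N, z):
    """Two-stage estimate from (M_i, m_i, mean_i, var_i) for each sampled primary unit.

    Hypothetical helper for illustration; N is the number of primary units
    in the universe and z is the normal quantile for the confidence level.
    """
    n = len(units)
    unit_totals = [M * ybar for M, _, ybar, _ in units]        # estimated total per primary unit
    total = (N / n) * sum(unit_totals)                         # Formula 1
    between = sum((t - total / N) ** 2 for t in unit_totals) / (n - 1)
    within = sum(M ** 2 * ((M - m) / M) * var / m for M, m, _, var in units)
    variance = N * (N - n) / n * between + (N / n) * within    # Formula 2
    se = math.sqrt(variance)                                   # Formula 3
    return total, se, z * se                                   # z * se is the precision amount

# Rounded summaries for the 10 sampled universities (assumed inputs).
units = [
    (50, 10, 5.40, 11.38), (65, 13, 4.00, 10.67), (45, 9, 5.67, 16.75),
    (48, 10, 4.80, 13.29), (52, 10, 4.30, 11.12), (58, 12, 3.83, 14.88),
    (42, 8, 5.00, 5.14), (66, 13, 3.85, 4.31), (40, 8, 4.88, 6.13),
    (56, 11, 5.00, 11.80),
]

total, se, precision = two_stage_appraisal(units, 90, 1.959963984540)
print(round(total), round(se), round(precision))
```

Consistent with NOTE 2, the total number of grants in the universe (M = 4500) is not needed anywhere in the calculation.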

Three-Stage Unrestricted Variable Appraisal

Example 4. The situation discussed in Example 3 was extended the following year to a three-stage procedure by defining:

Stage 1: REGION (select 4 out of 12 regions)
Stage 2: UNIVERSITY (select 10 from each selected region)
Stage 3: GRANT (select approximately 20% of all grants at each university)

NOTE: This example violates OAS minimum sample sizes and is used for illustration only.

Using the Single-Stage Random Numbers module, regions 5, 7, 8, and 10 were selected as the sampled primary units. Next, 10 universities (secondary units) were randomly selected from the available universities in each of the four selected regions. The data on the following four pages were obtained, where Mi is the number of grants in the universe for each university, mi is the number of sampled grants at each university (chosen to be roughly 20% of Mi), ȳij is the sample average of the items from the j-th university within the i-th region, and sij is the sample standard deviation of the items from the j-th university within the i-th region. The resulting data are stored in file DATA3STG.TXT.

REGION 5 -- Universe contains 90 universities.

Univ.    1    2    3    4    5    6    7    8    9   10
Mi      47   51   45   46   46   50   50   57   54   64
mi       9   10    9    9    9   10   10   11   11   13

Data (total thousands of dollars of improper charges)

91 5.20 5. 1 2 3 4 5 6 7 8 9 10 11 12 13 1 * 8 * 0 * 6 * 6 * 0 * 13 * 1 * 7 * 2 * * * * 2 13 13 4 6 0 15 12 9 0 13 3 10 0 12 14 13 13 13 2 9 4 11 9 12 1 10 11 15 7 8 University 5 6 14 14 5 1 2 11 14 15 11 5 11 0 4 13 8 15 2 0 14 7 0 14 4 10 13 10 8 0 7 3 8 0 3 0 6 1 0 3 13 7 5 4 9 2 9 15 4 13 12 14 6 11 0 1 10 12 13 14 11 6 10 11 0 7 12 9 11 7 9.73 y1 j s1 j 4.02 3.33 3.79 6.90 5. 1 2 3 4 5 6 7 8 9 * * * * * * * * * 1 0 4 0 10 11 18 18 16 2 2 12 0 15 11 0 18 0 17 8 3 4 19 16 12 4 2 1 5 1 4 5 10 2 10 12 7 3 0 20 University 5 6 0 4 6 9 13 19 9 18 16 2 18 0 0 12 17 0 0 16 7 4 16 0 8 3 8 13 0 0 8 19 13 0 13 4 0 0 0 0 9 17 8 15 12 0 20 17 6 9 10 16 17 6 10 2 6 13 0 12 (Rev.78 4.38 8.82 3.91 9.46 3. Univ.67 5.56 5. 10/2004) Page 3-25 . 1 2 3 4 5 6 7 8 9 10 Mi 53 59 52 67 59 73 51 75 66 58 mi 11 12 10 13 12 15 10 15 13 12 Data (total thousands of dollars of improper charges) Obs.64 9.52 7.13 9.RAT-STATS Companion Manual Three-Stage Unrestricted Variable Appraisal Obs.92 7.50 5.Universe contains 110 universities.52 REGION 7 -.

90 7 13 10 10 6 9 8 1 8 0 13 9 3 14 14 12 11 11 6 15 11 9.46 10.33 6.08 7.40 8.78 0 0 14 11 14 5 1 13 0 5 10 6.64 5.Three-Stage Unrestricted Variable Appraisal RAT-STATS Companion Manual 10 11 12 13 14 15 * * * * * * 18 18 12 13 5 8 10 0 0 15 15 0 13 3 0 4 3 6 5 5.41 7.86 2.92 4.Universe contains 85 universities.23 s2 j REGION 8 -.10 3.56 5. 10/2004) .67 8.59 7.20 6.23 10.38 5.17 6.73 6.66 9 6 8 3 12 4 8 1 2 4 3 10 14 0 0 10 12 0 7 6 0 y3 j 5.30 6.01 6.14 3.69 9.45 7.25 6.44 5.20 3.60 10.80 5.65 6.12 6.65 9 13 4 16 13 0 4 16 7 2 7 20 19 0 y2 j 10. Univ.10 5. 1 2 3 4 5 6 7 8 9 10 Mi 45 39 43 34 54 54 34 59 49 43 mi 9 8 9 7 11 11 7 12 10 9 Data (total thousands of dollars of improper charges) Obs. 1 2 3 4 5 6 7 8 9 10 11 12 * * * * * * * * * * * * 1 6 5 1 3 12 7 0 3 12 2 0 8 0 1 10 15 1 14 3 10 15 11 6 12 2 14 7 0 4 4 1 2 5 2 9 4 University 5 6 13 13 15 10 10 7 0 0 3 13 1 7.21 7.68 s3 j Page 3-26 (Rev.73 5.59 6.44 4.

38 8 16 2 14 0 0 0 6 19 17 13 12 13 12 11 14 13 0 1 9.39 6. each line begins with a counter (1. 1 2 3 4 5 6 7 8 9 10 mi Mi 59 12 68 14 57 11 72 14 70 14 73 15 83 17 89 18 73 15 77 15 Data (total thousands of dollars of improper charges) Obs.57 5.21 10.67 6.25 8.06 6.17 s4 j Using the file construction suggested in the User’s Guide for this module.. 2.67 8.Universe contains 120 universities. a value identifying the primary unit number in the second (Rev.55 8.80 6.27 7. 10/2004) Page 3-27 .53 6.64 6.47 7.44 9. In the secondary unit file.77 9 20 14 5 0 19 0 12 5 15 15 2 15 2 2 1 10 18 0 8 4 5 17 0 0 13 8 10 7 10 0 0 y4 j 7.RAT-STATS Companion Manual Three-Stage Unrestricted Variable Appraisal REGION 10 -. the primary unit file and secondary unit file could be constructed as shown below. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 * * * * * * * * * * * * * * * * * * 1 0 10 14 0 18 0 8 20 19 0 0 3 2 3 13 0 12 12 7 1 13 2 0 16 14 17 5 3 8 10 1 8 19 18 15 0 0 17 18 4 9 10 15 16 6 17 5 2 8 9 16 7 0 0 University 5 6 7 6 3 15 12 10 17 0 0 0 0 0 7 16 15 15 7 12 15 4 0 14 18 17 13 6 0 3 4 7 18 10 17 18 0 0 8 0 3 18 12 0 11 19 0 9 6 8. 3.76 7.). Univ.32 8.36 6. ..

column, and a value identifying the secondary unit number within each primary unit in the third column.

Data set PRIMARY3STG.TXT

1   REGION 5     90   10
2   REGION 7    110   10
3   REGION 8     85   10
4   REGION 10   120   10

Data set SECONDARY3STG.TXT

 1   1    1   UNIV1    47    9
 2   1    2   UNIV2    51   10
 3   1    3   UNIV3    45    9
 4   1    4   UNIV4    46    9
 5   1    5   UNIV5    46    9
 6   1    6   UNIV6    50   10
 7   1    7   UNIV7    50   10
 8   1    8   UNIV8    57   11
 9   1    9   UNIV9    54   11
10   1   10   UNIV10   64   13
11   2    1   UNIV1    53   11
12   2    2   UNIV2    59   12
13   2    3   UNIV3    52   10
14   2    4   UNIV4    67   13
15   2    5   UNIV5    59   12
16   2    6   UNIV6    73   15
17   2    7   UNIV7    51   10
18   2    8   UNIV8    75   15
19   2    9   UNIV9    66   13
20   2   10   UNIV10   58   12
21   3    1   UNIV1    45    9
22   3    2   UNIV2    39    8
23   3    3   UNIV3    43    9
24   3    4   UNIV4    34    7
25   3    5   UNIV5    54   11
26   3    6   UNIV6    54   11
27   3    7   UNIV7    34    7
28   3    8   UNIV8    59   12
29   3    9   UNIV9    49   10
30   3   10   UNIV10   43    9
31   4    1   UNIV1    59   12
32   4    2   UNIV2    68   14
33   4    3   UNIV3    57   11
34   4    4   UNIV4    72   14
35   4    5   UNIV5    70   14
36   4    6   UNIV6    73   15
37   4    7   UNIV7    83   17
38   4    8   UNIV8    89   18
39   4    9   UNIV9    73   15
40   4   10   UNIV10   77   15

Once again using the file construction suggested in the User's Guide for this module, the data file could be constructed as shown below. Each line begins with a counter (1, 2, 3, ...), a value identifying the primary unit number in the second column, a value identifying the secondary unit number within each primary unit in the third column, and a value identifying the third-stage unit number within each sampled primary/secondary unit in the fourth column. The sample value appears in the fifth column. The data lines for the first two universities in Region 5 and the last two universities in Region 10 are shown.

Data set DATA3STG.TXT

  1   1    1    1    8
  2   1    1    2    0
  3   1    1    3    6
  4   1    1    4    6
  5   1    1    5    0
  6   1    1    6   13
  7   1    1    7    1
  8   1    1    8    7
  9   1    1    9    2
 10   1    2    1   13
 11   1    2    2   13
 12   1    2    3    4
 13   1    2    4    6
 14   1    2    5    0
 15   1    2    6   15
 16   1    2    7   12
 17   1    2    8    9
 18   1    2    9    0
 19   1    2   10   13
  .
  .
  .
433   4    9    1   20
434   4    9    2   14
435   4    9    3    5
436   4    9    4    0
437   4    9    5   19
438   4    9    6    0
439   4    9    7   12
440   4    9    8    5
441   4    9    9   15

442   4    9   10   15
443   4    9   11    2
444   4    9   12   15
445   4    9   13    2
446   4    9   14    2
447   4    9   15    1
448   4   10    1   18
449   4   10    2    0
450   4   10    3    8
451   4   10    4    4
452   4   10    5    5
453   4   10    6   17
454   4   10    7    0
455   4   10    8    0
456   4   10    9   13
457   4   10   10    8
458   4   10   11   10
459   4   10   12    7
460   4   10   13   10
461   4   10   14    0
462   4   10   15    0

The program output using these three files is shown on the following pages.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
THREE-STAGE UNRESTRICTED VARIABLE APPRAISAL
Date: 10/4/2004                                  Time: 17:34

AUDIT/REVIEW: Variable 3-Stage
DATA FILE USED: C:\Temp\DATA3STG.TXT

--------------------- D I F F E R E N C E ----------------------
FIRST STAGE        SAMPLE   NON-
SECOND STAGE       SIZE     ZEROES   SAMPLE MEAN   VARIANCE   UNIVERSE   POINT ESTIMATE
================   ======   ======   ===========   ========   ========   ==============
REGION 5
  UNIV1               9        7         4.78        19.19        47            225
  UNIV2              10        8         8.50        31.83        51            434
  UNIV3               9        8         9.56        26.28        45            430
  UNIV4               9        9         9.33        15.25        46            429
  UNIV5               9        9         9.67        30.50        46            445
  UNIV6              10        8         7.20        33.51        50            360
  UNIV7              10        8         6.90        25.21        50            345
  UNIV8              11        8         3.82        15.36        57            218
  UNIV9              11       10         7.91        30.49        54            427
  UNIV10             13       12         9.46        13.94        64            606
  COMBINED          101                   392                     90         35,256

388 8 6 8 7 9 8 7 11 10 5 5.20 7.48 21.21 31.06 59 452 UNIV2 14 12 8.73 6.21 39.65 72 617 UNIV5 14 9 6.86 7.78 40.643 245 239 368 131 417 358 277 585 250 234 26.281551565545 604 UNIVERSE 12 405 2.64 8.48 70 465 UNIV6 15 13 9.53 39.57 33.60 52.44 6.08 506 59.27 53 59 52 67 59 73 51 75 66 58 110 554 546 374 484 600 419 311 495 690 585 55.17 5.56 3.76 54.615 131.26 68 559 UNIV3 11 9 10.45 9.10 5.14 33.98 73 696 UNIV7 17 12 8.64 41.69 43.314 700.298 120 SAMPLED 4 40 462 569.10 6.07 43.44 83 727 UNIV8 18 14 9.14 9.67 68.55 73 618 UNIV10 15 10 6.464 102.67 38.88 49.151 23.337 72.42 44.05 57 591 UNIV4 14 12 8.46 10.06 45. 10/2004) Page 3-31 .92 5.72 11.44 310 18.12 8.47 54.10 77 513 COMBINED 145 STAGES FIRST SECOND THIRD OVERALL POINT ESTIMATE OVERALL STANDARD ERROR CONFIDENCE LIMITS 80% CONFIDENCE LEVEL 438.25 7.73 6.03% 1.23 10.07 39.RAT-STATS Companion Manual Three-Stage Unrestricted Variable Appraisal REGION 7 UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 COMBINED REGION 8 UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 COMBINED 11 12 10 13 12 15 10 15 13 12 123 9 8 9 7 11 11 7 12 10 9 93 9 9 10 10 10 10 7 9 12 10 10.43 32.85 14.03 7.98 27.82 89 806 UNIV9 15 13 8.42 34.D I F F E R E N C E ---------------------FIRST STAGE SAMPLE NONSECOND STAGE SIZE ZEROES SAMPLE MEAN VARIANCE UNIVERSE POINT ESTIMATE ================ ====== ====== =========== ======== ========== ============== REGION 10 UNIV1 12 7 7.36 57.48 41.11 38.60 10.534 LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED (Rev.28 45 39 43 34 54 54 34 59 49 43 85 --------------------.

                      90% CONFIDENCE LEVEL
LOWER LIMIT                 401,134
UPPER LIMIT                 737,795
PRECISION AMOUNT            168,330
PRECISION PERCENT            29.56%
Z-VALUE USED         1.644853626951

                      95% CONFIDENCE LEVEL
LOWER LIMIT                 368,887
UPPER LIMIT                 770,042
PRECISION AMOUNT            200,578
PRECISION PERCENT            35.22%
Z-VALUE USED         1.959963984540

Some highlighted values:

(1) 405 is 90 + 110 + 85 + 120.
(2) 2,298 is the total number of third-stage units (universe) for the four sampled primary units.
(3) 462 is 101 + 123 + 93 + 145.

The point estimate and confidence intervals:

The point estimate (highlighted) for the universe total difference amount is 569,464 (that is, $569,464,000). The 95% confidence interval for this amount is from 368,887 to 770,042 ($368,887,000 to $770,042,000). The PRECISION AMOUNT at the 95% confidence level is (Z-value)(standard error of \(\hat{T}\)) = (1.959963984540)(102,337) = 200,578 (that is, $200,578,000). This value is 35.22% of the point estimate.

Notice that the output also contains the estimated totals for each primary unit (region) and each secondary unit (university). For example, the estimated total difference for UNIV1 within region 5 is 225 ($225,000). The sample average of the 10 university estimates in region 5 is 392 ($392,000). Since there are 90 universities in this region, the point estimate for the region 5 total is (90)(392) = 35,280 (more precisely, 35,256 or $35,256,000). The average of the four regional estimates is (35,256 + 55,388 + 72,643 + 26,534)/4 = 47,455.25 ($47,455,250). Since there are 12 regions in the universe, the (unbiased) point estimate for the universe total is (12)($47,455,250) = approximately $569,463,000. The actual amount (highlighted) is $569,464,000.

FORMULAS

1. Point estimate of the universe total (T):

\[ \hat{T} = \frac{N}{n} \sum_{i=1}^{n} \hat{T}_i \]

where

\[ \hat{T}_i = \frac{M_i}{m_i} \sum_{j=1}^{m_i} \hat{T}_{ij} \]

is the estimate of the total for the i-th sampled P.U., and where

\[ \hat{T}_{ij} = \frac{B_{ij}}{b_{ij}} \sum_{k=1}^{b_{ij}} y_{ijk} \]

is the estimate of the total for the j-th S.U. within the i-th sampled P.U.

Notation:

n = number of primary units (P.U.s) in the sample
N = number of P.U.s in the universe
mi = number of sampled secondary units (S.U.s) in the i-th P.U. (i = 1, ..., n)
Mi = number of S.U.s in the universe in the i-th P.U. (i = 1, ..., n)
bij = number of 3rd-stage items in the sample for the i-th P.U. and j-th S.U. (i = 1, ..., n and j = 1, ..., mi)
Bij = number of 3rd-stage items in the universe for the i-th P.U. and j-th S.U. (i = 1, ..., n and j = 1, ..., mi)
yijk = sample value of the k-th item from the i-th P.U. and j-th S.U. (i = 1, ..., n and j = 1, ..., mi and k = 1, ..., bij)

NOTE: The values of n, N, Mi, mi, and bij, along with Bij for each sampled primary and secondary unit, must be known.

2. Estimated variance of \(\hat{T}\):

\[ v(\hat{T}) = \frac{N(N-n)}{n}\, s^2 + \frac{N}{n} \sum_{i=1}^{n} \frac{M_i (M_i - m_i)}{m_i}\, s_i^2 + \frac{N}{n} \sum_{i=1}^{n} \frac{M_i}{m_i} \sum_{j=1}^{m_i} \frac{B_{ij} (B_{ij} - b_{ij})}{b_{ij}}\, s_{ij}^2 \]

where

\[ s^2 = \left[ \sum_{i=1}^{n} \hat{T}_i^2 - \left( \sum_{i=1}^{n} \hat{T}_i \right)^2 / n \right] / (n - 1) \]

\[ s_i^2 = \left[ \sum_{j=1}^{m_i} \hat{T}_{ij}^2 - \left( \sum_{j=1}^{m_i} \hat{T}_{ij} \right)^2 / m_i \right] / (m_i - 1) \]

\[ s_{ij}^2 = \left[ \sum_{k=1}^{b_{ij}} y_{ijk}^2 - \left( \sum_{k=1}^{b_{ij}} y_{ijk} \right)^2 / b_{ij} \right] / (b_{ij} - 1) \]

3. The approximate 95% confidence interval for T:

\[ \hat{T} \pm 1.959963984540 \sqrt{v(\hat{T})} \]

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval, replace 1.959963984540 with 1.281551565545.
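The three-stage formulas above can be sketched in a few lines of Python. This is only an illustration of formulas 1 and 2, not the RAT-STATS implementation; the function name `three_stage_estimate` and the nested-list input layout are assumptions made for this sketch. It requires at least two sampled units at every stage, because the sample variances divide by (n - 1), (mi - 1), and (bij - 1).

```python
from statistics import variance


def three_stage_estimate(N, sample):
    """Point estimate and estimated variance for a three-stage design.

    N      -- number of primary units in the universe
    sample -- one entry per sampled primary unit: (M_i, secondaries),
              where secondaries is a list of (B_ij, [y_ij1, y_ij2, ...]).
              This input layout is hypothetical, chosen for the sketch.
    """
    n = len(sample)
    primary_totals = []   # T-hat_i for each sampled primary unit
    stage2 = 0.0          # sum of M_i (M_i - m_i) / m_i * s_i^2
    stage3 = 0.0          # sum of (M_i / m_i) * sum_j B_ij (B_ij - b_ij) / b_ij * s_ij^2
    for M_i, secondaries in sample:
        m_i = len(secondaries)
        secondary_totals = []   # T-hat_ij
        inner = 0.0
        for B_ij, ys in secondaries:
            b_ij = len(ys)
            secondary_totals.append(B_ij * sum(ys) / b_ij)
            inner += B_ij * (B_ij - b_ij) / b_ij * variance(ys)
        primary_totals.append(M_i / m_i * sum(secondary_totals))
        stage2 += M_i * (M_i - m_i) / m_i * variance(secondary_totals)
        stage3 += M_i / m_i * inner
    t_hat = N / n * sum(primary_totals)
    v_hat = (N * (N - n) / n * variance(primary_totals)
             + N / n * stage2
             + N / n * stage3)
    return t_hat, v_hat


# Small made-up sample: 2 of 3 primary units, 2 secondary units in each.
toy = [
    (2, [(2, [1, 3]), (3, [2, 2, 2])]),
    (4, [(4, [0, 2]), (2, [5, 5])]),
]
t_hat, v_hat = three_stage_estimate(3, toy)
```

As a cross-check against the example, the last step of formula 1 applied to the four regional estimates quoted in the discussion, (12/4)(35,256 + 55,388 + 72,643 + 26,534), gives 569,463, which agrees with the highlighted point estimate up to rounding.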

RHC Two-Stage Variable Sampling

For a discussion on the motivation behind the RHC (developed by Rao, Hartley, and Cochran) sampling procedure, refer to the RHC Sample Selection section, contained in the Random Numbers section of this manual. It provides a method of sample selection that allows sampling without replacement while maintaining the flavor of sampling using probability proportional to size. When the primary units (P.U.s) are selected, the size of each P.U. is considered rather than obtaining a simple random sample of P.U.s. The size of each P.U. is rather arbitrary and can be the number of people, dollars, beds (for hospitals), and so forth. In general, you can expect improved precision using the RHC procedure if there is a high correlation between the size of each P.U. and the number of secondary units (S.U.s) within each P.U. In other words, larger P.U.s should contain a larger number of S.U.s.

The P.U.s are selected using the RHC Sample Selection program. A random sample is then obtained for each selected P.U., and the variable(s) of interest (e.g., dollars in error) is/are recorded.

Example 5. (Note: This is the same example used in Example 8 in the RHC Sample Selection discussion contained in the Random Numbers section.) In a particular region of the United States there are N = 90 universities (primary units) with Government research grants. Because these universities are so widespread, it was decided to use a cluster sample of n = 10 universities. As a measure of the size for each university, the total grant dollars were used. Rather than audit all grants at a selected university, it was decided (based on available resources) to audit roughly 20% of the grants at each selected university.

DATA: University ID, number of grants, total grant dollars (90 rows). The data are contained in UNIVRHC.TXT.

OUTPUT: The 10 universities to use in the sample (see last page of computer output): UNIV78, UNIV42, UNIV49, UNIV5, UNIV19, UNIV38, UNIV62, UNIV28, UNIV60, UNIV75. Here there are 10 groups with 9 universities per group. The output file created by this program is OUTRHC.TXT. This file is required by the RHC appraisal program.

Data set UNIVRHC.TXT and the program output are contained in the pages to follow.

RAT-STATS Companion Manual RHC Two-Stage Variable Appraisal Dataset UNIVRHC..continued . 10/2004) Page 3-37 .000) µ This is the size of the university..TXT < ...> (1) (2) (3) UNIV31 UNIV32 UNIV33 UNIV34 UNIV35 UNIV36 UNIV37 UNIV38 UNIV39 UNIV40 UNIV41 UNIV42 UNIV43 UNIV44 UNIV45 UNIV46 UNIV47 UNIV48 UNIV49 UNIV50 UNIV51 UNIV52 UNIV53 UNIV54 UNIV55 UNIV56 UNIV57 UNIV58 UNIV59 UNIV60 52 66 25 60 19 24 44 76 41 77 37 63 52 76 51 23 24 68 34 49 55 38 72 51 71 59 23 57 53 64 11 14 5 12 4 5 9 17 9 18 8 12 11 17 10 4 5 15 7 10 11 9 16 10 15 12 4 11 11 13 < .continued ...> UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 UNIV11 UNIV12 UNIV13 UNIV14 UNIV15 UNIV16 UNIV17 UNIV18 UNIV19 UNIV20 UNIV21 UNIV22 UNIV23 UNIV24 UNIV25 UNIV26 UNIV27 UNIV28 UNIV29 UNIV30 42 21 63 74 51 43 57 49 63 18 64 56 19 44 20 34 25 38 72 46 44 64 45 55 29 36 40 78 49 60 8 4 13 16 11 9 11 10 13 4 13 11 4 9 4 7 6 9 16 10 9 13 9 11 7 7 9 18 10 12 UNIV61 UNIV62 UNIV63 UNIV64 UNIV65 UNIV66 UNIV67 UNIV68 UNIV69 UNIV70 UNIV71 UNIV72 UNIV73 UNIV74 UNIV75 UNIV76 UNIV77 UNIV78 UNIV79 UNIV80 UNIV81 UNIV82 UNIV83 UNIV84 UNIV85 UNIV86 UNIV87 UNIV88 UNIV89 UNIV90 66 77 31 46 32 68 41 28 66 31 27 33 23 71 75 47 50 37 77 49 76 66 28 77 27 75 71 59 71 72 13 18 7 9 7 14 9 6 14 7 6 7 4 15 16 10 10 7 18 10 17 14 6 17 6 17 15 12 15 16 Columns: (1) unit ID (2) number of grants (3) grant dollar amount (x $100.. (Rev..

TXT GROUPS OF PRIMARY UNITS Time: 12:52 ********* GROUP 1 ********* PRIMARY UNIT PRIMARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV51 11 UNIV44 17 UNIV32 14 UNIV78 <-.RHC Two-Stage Variable Appraisal RAT-STATS Companion Manual Date: 10/15/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .selected 7 UNIV79 18 UNIV2 4 UNIV52 9 UNIV33 5 UNIV47 5 GROUP TOTALS: 9 90 SECONDARY UNIVERSE ============= 55 76 66 37 77 21 38 25 24 419 SECONDARY UNIVERSE ============= 43 63 32 77 51 42 49 24 31 412 ********* GROUP 2 ********* PRIMARY UNIT PRIMARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV6 9 UNIV42 <-.selected 12 UNIV65 7 UNIV40 18 UNIV45 10 UNIV1 8 UNIV80 10 UNIV36 5 UNIV70 7 GROUP TOTALS: 9 86 < Groups 3 Through 9 Are Omitted Here > ********* GROUP 10 ********* PRIMARY UNIT PRIMARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV22 13 UNIV39 9 UNIV88 12 UNIV55 15 UNIV29 10 UNIV75 <-.selected 16 UNIV87 15 UNIV13 4 UNIV53 16 GROUP TOTALS: 9 110 SECONDARY UNIVERSE ============= 64 41 59 71 49 75 71 19 72 521 Page 3-38 (Rev.OFFICE OF AUDIT SERVICES GENERATION OF PRIMARY UNIT SAMPLE NAME OF INPUT FILE: C:\TEMP\UNIVRHC. 10/2004) .

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
GENERATION OF PRIMARY UNIT SAMPLE
Date: 10/15/2004                                 Time: 12:52

NAME OF OUTPUT FILE: C:\TEMP\OutRHCsummary.txt

FIRST SEED NUMBER:  100.00        SECOND SEED NUMBER:  200.00

NUMBER OF PRIMARY UNITS IN THE POPULATION:   90
NUMBER OF PRIMARY UNITS SAMPLED:             10

                            SECONDARY       PRIMARY                       UNITS IN
PRIMARY UNIT ID             UNIVERSE        UNIT SIZE      GROUP SIZE     GROUP
=========================   =============   =============  =============  =====
UNIV78                           37               7              90          9
UNIV42                           63              12              86          9
UNIV49                           34               7              96          9
UNIV5                            51              11              84          9
UNIV19                           72              16              89          9
UNIV38                           76              17              89          9
UNIV62                           77              18              92          9
UNIV28                           78              18             115          9
UNIV60                           64              13              99          9
UNIV75                           75              16             110          9
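The selection step that produces output like the listing above can be sketched as follows: the primary units are split at random into n groups, and one unit is drawn from each group with probability proportional to its size. This is a minimal sketch of the RHC idea only. The function name `rhc_select` is hypothetical, Python's `random` module stands in for the RAT-STATS random number generator (so the two seed numbers and the selected units will not be reproduced), and the rule for groups of unequal length (shorter groups first) is an assumption.

```python
import random


def rhc_select(units, n, seed=None):
    """Select one unit per random group with probability proportional to size.

    units -- list of (unit_id, size) pairs, with every size > 0
    n     -- number of groups, which is also the number of units selected
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                      # random assignment to groups
    base, extra = divmod(len(shuffled), n)
    # Group lengths as nearly equal as possible (assumed: shorter groups first).
    lengths = [base] * (n - extra) + [base + 1] * extra
    picks, start = [], 0
    for length in lengths:
        group = shuffled[start:start + length]
        start += length
        ids = [unit_id for unit_id, _ in group]
        sizes = [size for _, size in group]
        selected = rng.choices(ids, weights=sizes, k=1)[0]   # pps draw within the group
        picks.append({
            "selected": selected,
            "group_size": sum(sizes),          # total size of the group
            "units_in_group": length,
        })
    return picks


# Illustrative input only: 90 made-up units, not the UNIVRHC.TXT data.
demo_units = [(f"UNIV{i}", 4 + i % 15) for i in range(1, 91)]
demo_picks = rhc_select(demo_units, 10, seed=1)
```

Each returned entry carries the three quantities the appraisal step needs for a selected unit: which unit was chosen, the total size of its group, and the number of units in that group.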

This file (PRIMRHC2. are used as the input files for the RHC Two-Stage appraisal program. and mi is the number of audited grants at the i-th university. 10/2004) . The output file created using the sample selection program is also used as input to the appraisal program in this two-step procedure.TXT.) was chosen to be approximately 20% of the universe size.TXT). Both files are listed on the next page and the computer output from the appraisal program immediately follows.RHC Two-Stage Variable Appraisal RAT-STATS Companion Manual Example--continued. The sample size for each selected university (P.TXT. Page 3-40 (Rev. The illustrated data file (DATARHC2. University UNIV78 UNIV42 UNIV49 UNIV5 UNIV19 UNIV38 UNIV62 UNIV28 UNIV60 UNIV75 Mi 37 63 34 51 72 76 77 78 64 75 mi 7 13 7 10 14 15 15 16 13 15 125 Data from these 125 secondary units (grants) were obtained by recording the total amount that was charged to each grant after the scheduled completion of this grant (dollars in error).U. along with DATARHC2. This leads to the following table where Mi is the total number of grants at the i-th university.TXT) contains the data for the first two universities (primary units) and the last two universities. The error amounts (in thousands of dollars) for each grant are contained in data set DATARHC2.

RAT-STATS Companion Manual RHC Two-Stage Variable Appraisal Dataset DATARHC2.TXT 1 9 2 2 3 9 4 6 5 0 6 5 7 7 8 2 9 7 10 6 11 0 12 6 13 0 14 3 15 4 16 1 17 13 18 8 19 0 20 6 21 11 22 8 23 8 24 0 . 100 7 101 10 102 2 103 6 104 0 105 8 106 4 107 0 108 10 109 3 110 2 111 5 112 10 113 0 114 0 115 0 116 0 (Rev. . . 10/2004) Page 3-41 .

117    0
118    8
119    9
120    0
121    2
122    8
123    4
124    6
125    2

Output/Input file PRIMRHC2.TXT

UNIV78    37    7    7    90   9
UNIV42    63   13   12    86   9
UNIV49    34    7    7    96   9
UNIV5     51   10   11    84   9
UNIV19    72   14   16    89   9
UNIV38    76   15   17    89   9
UNIV62    77   15   18    92   9
UNIV28    78   16   18   115   9
UNIV60    64   13   13    99   9
UNIV75    75   15   15   110   9

NOTE: File PRIMRHC2.TXT was created by adding the third column containing sample sizes to the output file created by the RHC Sample Selection program.

00 14 84.00 102 PRIMARY UNIT FILE USED: C:\TEMP\PRIMRHC2.00 11 54.142 1.857.00 8 67.TXT Time: 13:26 PRIMARY UNIT ======= 1 2 3 4 5 6 7 8 9 10 TOTALS SAMPLE SIZE ====== 7 13 7 10 14 15 15 16 13 15 125 ==DIFFERENCE=== NUMBER OF SAMPLE TOTAL NONZERO ITEMS ============= ============= 38. 10/2004) Page 3-43 .636 5.RAT-STATS Companion Manual RHC Two-Stage Variable Appraisal Date: 10/26/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG . 37. NBR ==== 1 2 3 4 5 6 7 8 9 10 --.073 P.27 77 * Note: 2.79 72 5. This is the point estimate for P.00 13 76.714 7.71 34 5.txt UNITS SECONDARY PRIMARY IN PRIMARY UNIT ID UNIVERSE UNIT SIZE GROUP SIZE GROUP ========================= ============= ============= ============= ===== UNIV78 37 7 90 9 UNIV42 63 12 86 9 UNIV49 34 7 96 9 UNIV5 51 11 84 9 UNIV19 72 16 89 9 UNIV38 76 17 89 9 UNIV62 77 18 92 9 UNIV28 78 18 115 9 UNIV60 64 13 99 9 UNIV75 75 15 110 9 TOTALS 627 134 950 90 P.198 2.563 5.016 2.43 37 4.00 10 33.111 POINT ESTIMATE ============= 2.945 2.00 4 55.43. 1.582 * 1.00 6 56.OFFICE OF AUDIT SERVICES RHC TWO-STAGE VARIABLE APPRAISAL AUDIT/REVIEW: RHC 2-Stage DATA FILE USED: C:\TEMP\DATARHC2.07 76 5.00 9 603.235 5.582 is the product of 5.857 7.00 13 61.31 63 4.U.U. SIZES RATIO ======== 12.167 13. NBR ==== 1 2 3 4 5 6 7 SAMPLE SIZE ====== 7 13 7 10 14 15 15 (Rev.917 2.POINT ESTIMATES — ==DIFFERENCE=== SECONDARY SAMPLE MEAN UNIVERSE ============== ============= 5.U.50 51 4. and 12.00 14 79.

VARIANCE COMPONENTS --P.615 7. 10/2004) .870 38.756 930 LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED Page 3-44 (Rev.570 523 1. of the universe total ( T --.756 is the point estimate $ ). Note: 865.107 29.530 7.933 23.38% 1.948 1.927 603.192 247 369 247.797 75.299 29.500 34.333 2.674 35.25 4.69 3.281551565545 90% CONFIDENCE LEVEL 20.287 1.U.564 22.644853626951 95% CONFIDENCE LEVEL 19.271 14.356 659 0 53.192 5.464 $) .286 1.443 19.616 2.389 7.980 21.074 279.230 15.579 1.392 BETWEEN VARIANCE 283.03% 1.366 262. NBR 1 2 3 4 5 6 7 8 9 10 TOTALS: WITHIN VARIANCE 23.60 78 64 75 627 6.797 22.704 31.529 38.464 is equal to v ( T PRIMARY UNITS SAMPLED: PRIMARY UNITS NOT SAMPLED: PRIMARY UNITS IN POPULATION: POINT ESTIMATE OF POPULATION TOTAL: STANDARD ERROR CONFIDENCE LIMITS 80% CONFIDENCE LEVEL 20.701 13.985 24.823 8.044 26.756 Note: 21.959963984540 10 80 90 21.689 25.977 33.226 23.072 TOTAL VARIANCE 307.48% 1.RHC Two-Stage Variable Appraisal RAT-STATS Companion Manual 8 9 10 TOTALS: 16 13 15 125 5.293 865.738 23.

Discussion:

The (highlighted) estimate of the universe total on the previous page obtained using formula 1 is:

\(\hat{T}\) = (Estimate of group 1 total) + ... + (Estimate of group 10 total)
  = (90/7)(5.43)(37) + (86/12)(4.31)(63) + ... + (110/15)(3.60)(75)
  = 2,582 + 1,945 + ... + 1,980
  = 21,756 ($21,756,000)

Using formula 2, the estimated variance of \(\hat{T}\) is:

v(\(\hat{T}\)) = 865,464

and the estimated standard error of \(\hat{T}\) is the square root of 865,464, that is, 930 (highlighted on the previous page).

The approximate 95% confidence interval is:

21,756 ± 1.959963984540(930)

that is, 21,756 ± 1,823

that is, 19,933 to 23,579 ($19,933,000 to $23,579,000).
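The interval arithmetic in this discussion is easy to check. The sketch below takes the point estimate and estimated variance quoted above and applies the Z-values printed in the program output; the helper name `confidence_interval` is hypothetical and is not part of RAT-STATS.

```python
from math import sqrt

# Z-values as printed in the program output for each confidence level.
Z_VALUES = {80: 1.281551565545, 90: 1.644853626951, 95: 1.959963984540}


def confidence_interval(point_estimate, variance, level=95):
    """Return (lower limit, upper limit, precision amount)."""
    standard_error = sqrt(variance)
    precision = Z_VALUES[level] * standard_error
    return point_estimate - precision, point_estimate + precision, precision


lower, upper, precision = confidence_interval(21_756, 865_464)
```

Rounded to whole numbers, `lower`, `upper`, and `precision` agree with the 19,933, 23,579, and 1,823 shown above.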

FORMULAS

Definitions

1. P.U. stands for "primary unit" and S.U. is "secondary unit"
2. Ai = size of i-th P.U.
3. Si = (size of i-th P.U.)/(size of entire population) = Ai/(size of entire population)
4. Bi = total size for i-th group
5. πi = (total size for i-th group)/(size of entire population) = Bi/(size of entire population)
6. N = number of P.U.s in the population
7. Ni = number of P.U.s in the i-th group
8. n = number of P.U.s in the sample
9. Mi = number of S.U.s in the i-th sampled P.U. (population)
10. mi = number of S.U.s in the i-th sampled P.U. (sample)

Estimate of population total (T)

\[ \hat{T} = \sum_{i=1}^{n} \left( \frac{B_i}{A_i} \right) M_i\, \bar{y}_i \]

where \(\bar{y}_i\) = average of mi sampled S.U.s and Bi/Ai is labeled SIZES RATIO in the computer output.

Estimated variance of \(\hat{T}\)

\[ v(\hat{T}) = V_1 + V_2 \]

where

\[ V_1 = \frac{\sum_{i=1}^{n} N_i^2 - N}{N^2 - \sum_{i=1}^{n} N_i^2} \sum_{i=1}^{n} \pi_i \left( \frac{M_i\, \bar{y}_i}{S_i} - \hat{T} \right)^2 \]

and

\[ V_2 = \sum_{i=1}^{n} \frac{\pi_i}{S_i}\, M_i \left( \frac{M_i - m_i}{m_i} \right) s_i^2 \]

where \(s_i^2\) = variance of the mi sampled S.U.s.

NOTE: The estimated standard error of \(\hat{T}\) is

\[ \sqrt{v(\hat{T})} \]

Approximate 95% confidence interval for the population total (T)

\[ \hat{T} \pm 1.959963984540 \sqrt{v(\hat{T})} \]

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval replace 1.959963984540 with 1.281551565545.
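The RHC two-stage formulas above translate directly into code. The following is a sketch under stated assumptions rather than the RAT-STATS implementation: the function name `rhc_two_stage` and the dictionary keys are made up for illustration, and the size of the entire population is taken to be the sum of the group sizes Bi, which holds when the groups partition the population.

```python
from statistics import mean, variance


def rhc_two_stage(sample):
    """Return (T-hat, V1, V2) for an RHC two-stage sample.

    sample -- one dict per selected primary unit (hypothetical layout):
        "A"   size of the selected primary unit
        "B"   total size of its group
        "N_i" number of primary units in its group
        "M"   number of secondary units in the primary unit (population)
        "y"   list of sampled values (at least two)
    """
    total_size = sum(s["B"] for s in sample)        # size of entire population
    N = sum(s["N_i"] for s in sample)
    sum_ni_sq = sum(s["N_i"] ** 2 for s in sample)
    unit_totals = [s["M"] * mean(s["y"]) for s in sample]   # M_i * ybar_i
    t_hat = sum(s["B"] / s["A"] * t for s, t in zip(sample, unit_totals))

    coefficient = (sum_ni_sq - N) / (N ** 2 - sum_ni_sq)
    v1 = coefficient * sum(
        (s["B"] / total_size) * (t / (s["A"] / total_size) - t_hat) ** 2
        for s, t in zip(sample, unit_totals)
    )
    # pi_i / S_i reduces to B_i / A_i because both share the same denominator.
    v2 = sum(
        s["B"] / s["A"] * s["M"] * (s["M"] - len(s["y"])) / len(s["y"]) * variance(s["y"])
        for s in sample
    )
    return t_hat, v1, v2


# Made-up sample with two groups of two primary units each.
toy = [
    {"A": 2, "B": 4, "N_i": 2, "M": 3, "y": [1, 3]},
    {"A": 3, "B": 6, "N_i": 2, "M": 2, "y": [4, 4]},
]
t_hat, v1, v2 = rhc_two_stage(toy)
```

In the worked example, the first selected university contributes (90/7)(37)(5.43) to the point estimate, which is the 2,582 shown in the program output up to rounding of the sample mean.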

RHC Three-Stage Variable Sampling

The RHC sampling procedure can be used for a three-stage design. The steps for such a procedure are the following:

1. A sample of primary units (clusters) is obtained as in the one- and two-stage procedures, where probability proportional to size (pps) sampling is used for each group of primary units. The size of the primary units is considered for this sample.
2. A sample of secondary units is obtained within each chosen primary unit by partitioning the primary unit into random groups. The group sizes are chosen to be as nearly equal as possible. Using pps sampling and the size of each secondary unit, one secondary unit is chosen from each of the secondary groups.
3. A random sample of third-stage units is obtained for each of the chosen secondary units. This is a random sample. No attention is paid to "size" here.

Prior to running the appraisal program, the user must run the RHC Sample Selection program in the OAS software.

Example 6. The situation discussed in Example 4 in the Three-Stage Unrestricted section will be appraised using the RHC methodology. For this example, the stages are:

Stage 1: REGION (select 4 out of 12 regions)
Stage 2: UNIVERSITY (select 10 from each selected region)
Stage 3: GRANT (select approximately 20% of all grants at each university)

Selection of Primary Units

A file must be constructed containing (for each region) (1) the number of secondary units (universities) in this region and (2) the size of this region (total dollars of grants). This file is GRANTSPU.TXT. The selected regions are 4, 6, 8, and 10.

NOTE: Seed values of 100 and 200 were used to select the primary units. In practice, it is recommended that the user not set these seed values.

--- Data set GRANTSPU.TXT ---

             (A)     (B)
REGION1      117    1250
REGION2       63     610
REGION3       91     720
REGION4      123    1320
REGION5      107    1160
REGION6      116    1240
REGION7      102     960
REGION8      118    1300
REGION9      122    1320
REGION10      85     640
REGION11      94     930
REGION12      62     550

Columns:
(A) number of universities (S.U.s)
(B) size of each P.U. (total grant amount x $100,000)

NOTE: It is okay to set the number of S.U.s [column (A)] equal to one in this file. The correct number of S.U.s must be known for the selected P.U.s. The actual number of S.U.s must then be inserted into file GRANTSPUOUT.TXT (the highlighted values).

--- Data set GRANTSPUOUT.TXT ---

REGION6      116    1240    3100    3
REGION4      123    1320    3410    3
REGION8      118    1300    3170    3
REGION10      85     640    2320    3

320 REGION5 1.00 In practice.00 ********* GROUP 2 ********* PRIMARY UNIT PRIMARY UNIT IDENTIFICATION SIZE ============================== ============= REGION4 <-. 10/2004) .100 SECONDARY UNIVERSE ============= 63 116 117 296 SECONDARY UNIVERSE ============= 123 107 94 324 SECONDARY UNIVERSE ============= 62 118 122 302 SECONDARY UNIVERSE ============= 91 102 85 278 200.RHC Three-Stage Variable Appraisal RAT-STATS Companion Manual Date: 10/22/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .Selected 1.Selected 640 GROUP TOTALS: FIRST SEED NUMBER: 3 2.OFFICE OF AUDIT SERVICES GENERATION OF PRIMARY UNIT SAMPLE NAME OF INPUT FILE: C:\TEMP\GRANTSPU.Selected 1.250 GROUP TOTALS: 3 3.160 REGION11 930 GROUP TOTALS: 3 3.300 REGION9 1.240 REGION1 1.320 SECOND SEED NUMBER: 100.320 GROUP TOTALS: 3 3.170 ********* GROUP 4 ********* PRIMARY UNIT PRIMARY UNIT IDENTIFICATION SIZE ============================== ============= REGION3 720 REGION7 960 REGION10 <-.410 ********* GROUP 3 ********* PRIMARY UNIT PRIMARY UNIT IDENTIFICATION SIZE ============================== ============= REGION12 550 REGION8 <-.Selected 1.TXT GROUPS OF PRIMARY UNITS Time: 15:09 ********* GROUP 1 ********* PRIMARY UNIT PRIMARY UNIT IDENTIFICATION SIZE ============================== ============= REGION2 610 REGION6 <-. NUMBER OF PRIMARY UNITS IN THE POPULATION: NUMBER OF PRIMARY UNITS SAMPLED: 12 4 Page 3-50 (Rev. do not set these seed values.

these files can be joined to form one of the input files (the one containing primary/secondary unit information) for the three-stage RHC program which calculates the confidence interval. 104. 43.RAT-STATS Companion Manual RHC Three-Stage Variable Appraisal < Program output . 111. 30. 78. A sample of 10 universities is selected for each region.170 2. REGION8.410 3. 65. 27. This input is shown in files REGION4. 93. here) and the number of third-stage units in the universe for each secondary unit. 66. 6. Regions 4. and REGION10. (Rev.TXT Selection of Secondary Units The input for three-stage RHC can be greatly simplified if you only obtain information for each selected primary unit (that is. 10.continued > PRIMARY UNIT ID ========================= REGION6 REGION4 REGION8 REGION10 SECONDARY UNIVERSE ============= 116 123 118 85 PRIMARY UNIT SIZE ============= 1. 59. 55. 75. 64. 7. 62. 73. and 10 here).TXT. 43. 39 The previous five program runs (one at the primary level and four at the secondary level) created five output files. 8. in that order. 99 112. 30.320 UNITS IN GROUP ===== 3 3 3 3 NOTE: The above four lines make up file GRANTSPUOUT.100 3. 10/2004) Page 3-51 . 99 78. 6. 3.TXT.TXT. 112. 7. 33. 82.000). The file for this example is PUSURHC3. Each line in the files contains the number of third-stage units (grants) in the universe and the size of that secondary unit (total grant dollars x 100. 65. 115. Using a word processor or spreadsheet.TXT.320 1. After each of these four files is the computer output using the RHC Sample Selection program. 70.240 1. 46.300 640 GROUP SIZE ============= 3. The information consists of the size of each secondary unit (university. 80 113.TXT. 89. REGION6. The results are: REGION 4 6 8 10 UNIVERSITIES 85. 34. 7.

... 10/2004) .continued .. Columns: (1) unit ID (2) number of grants Page 3-52 (Rev...RHC Three-Stage Variable Appraisal RAT-STATS Companion Manual Data set REGION4.> UNIV51 UNIV52 UNIV53 UNIV54 UNIV55 UNIV56 UNIV57 UNIV58 UNIV59 UNIV60 UNIV61 UNIV62 UNIV63 UNIV64 UNIV65 UNIV66 UNIV67 UNIV68 UNIV69 UNIV70 UNIV71 UNIV72 UNIV73 UNIV74 UNIV75 UNIV76 UNIV77 UNIV78 UNIV79 UNIV80 UNIV81 UNIV82 UNIV83 UNIV84 UNIV85 UNIV86 UNIV87 UNIV88 UNIV89 UNIV90 UNIV91 UNIV92 UNIV93 UNIV94 UNIV95 UNIV96 UNIV97 UNIV98 UNIV99 UNIV100 62 52 56 70 41 65 76 30 75 27 36 61 58 61 62 76 71 34 62 23 28 46 62 67 25 24 57 44 73 70 45 52 34 59 54 31 69 22 47 57 31 73 52 22 22 29 56 74 43 57 13 11 11 15 9 14 16 7 16 7 8 13 12 13 14 16 15 8 13 6 7 10 14 14 6 6 12 10 16 15 10 11 8 12 11 7 14 6 10 12 7 15 11 6 6 7 12 16 9 12 < .> UNIV101 UNIV102 UNIV103 UNIV104 UNIV105 UNIV106 UNIV107 UNIV108 UNIV109 UNIV110 UNIV111 UNIV112 UNIV113 UNIV114 UNIV115 UNIV116 UNIV117 UNIV118 UNIV119 UNIV120 UNIV121 UNIV122 UNIV123 34 28 73 65 68 28 55 37 54 47 44 24 50 52 66 50 66 34 73 37 42 59 45 8 7 15 14 14 7 11 9 11 10 9 6 10 11 14 10 14 8 16 8 9 12 11 (2) (3) 52 37 38 20 69 69 77 32 49 73 21 62 55 59 55 36 51 26 25 73 71 47 34 25 39 49 76 21 33 54 45 74 69 50 29 56 64 66 63 57 71 45 21 46 48 44 71 67 23 54 11 9 9 5 15 15 17 7 10 15 5 13 11 12 11 8 11 7 6 15 15 10 8 6 9 10 16 5 8 11 10 16 14 10 7 12 14 14 14 12 15 10 5 10 10 9 15 14 6 11 NOTE: This file has 123 lines.TXT (1) UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 UNIV11 UNIV12 UNIV13 UNIV14 UNIV15 UNIV16 UNIV17 UNIV18 UNIV19 UNIV20 UNIV21 UNIV22 UNIV23 UNIV24 UNIV25 UNIV26 UNIV27 UNIV28 UNIV29 UNIV30 UNIV31 UNIV32 UNIV33 UNIV34 UNIV35 UNIV36 UNIV37 UNIV38 UNIV39 UNIV40 UNIV41 UNIV42 UNIV43 UNIV44 UNIV45 UNIV46 UNIV47 UNIV48 UNIV49 UNIV50 < .continued ...

10/2004) Page 3-53 .TXT GROUPS OF SECONDARY UNITS ********* GROUP 1 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV57 16 UNIV48 14 UNIV35 7 UNIV107 11 UNIV85 <-.Selected 11 UNIV103 15 UNIV86 7 UNIV2 9 UNIV81 10 UNIV58 7 UNIV36 12 UNIV49 6 GROUP TOTALS: 12 125 3RD STAGE UNIVERSE ============= 76 67 29 55 54 73 31 37 45 30 56 23 576 3RD STAGE UNIVERSE ============= 52 69 44 62 37 46 54 42 52 21 69 63 611 Time: 14:21 ********* GROUP 2 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV52 11 UNIV6 15 UNIV46 <-.000) Date: 10/25/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .OFFICE OF AUDIT SERVICES GENERATION OF SECONDARY UNIT SAMPLE NAME OF INPUT FILE: C:\TEMP\REGION4.Selected 9 UNIV69 13 UNIV108 9 UNIV44 10 UNIV50 11 UNIV121 9 UNIV1 11 UNIV43 5 UNIV87 14 UNIV39 14 GROUP TOTALS: 12 131 < GROUPS 3 THROUGH 9 ARE OMITTED HERE > ********* GROUP 10 ********* SECONDARY UNIT SECONDARY UNIT IDENTIFICATION SIZE ============================== ============= UNIV53 11 UNIV24 6 UNIV42 10 UNIV120 8 UNIV105 14 UNIV97 12 3RD STAGE UNIVERSE ============= 56 25 45 37 68 56 (Rev.RAT-STATS Companion Manual RHC Three-Stage Variable Appraisal (3) size of university (grant amount x $100.

00 NUMBER OF SECONDARY UNITS IN THE POPULATION: NUMBER OF SECONDARY UNITS SAMPLED: 3RD STAGE UNIVERSE ============= 54 44 77 52 54 50 76 76 62 70 SECONDARY UNIT ID ========================= UNIV85 UNIV46 UNIV7 UNIV82 UNIV30 UNIV34 UNIV27 UNIV66 UNIV65 UNIV80 SECONDARY UNIT SIZE ============= 11 9 17 11 11 10 16 16 14 15 GROUP SIZE ============= 125 131 119 129 141 140 138 128 125 155 UNITS IN GROUP ===== 12 12 12 12 12 12 12 13 13 13 Page 3-54 (Rev.Selected UNIV96 UNIV13 UNIV62 UNIV59 GROUP TOTALS: 13 16 16 15 7 11 13 16 155 73 74 70 29 55 61 75 724 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .txt FIRST SEED NUMBER: 100.OFFICE OF AUDIT SERVICES Date: 10/25/2004 GENERATION OF SECONDARY UNIT SAMPLE NAME OF OUTPUT FILE: C:\TEMP\OutRegion4.RHC Three-Stage Variable Appraisal RAT-STATS Companion Manual UNIV119 UNIV32 UNIV80 <-. 10/2004) .00 SECOND SEED NUMBER: 123 10 Time: 14:21 200.

10/2004) Page 3-55 .RAT-STATS Companion Manual RHC Three-Stage Variable Appraisal UNIV1 UNIV2 UNIV3 UNIV4 UNIV5 UNIV6 UNIV7 UNIV8 UNIV9 UNIV10 UNIV11 UNIV12 UNIV13 UNIV14 UNIV15 UNIV16 UNIV17 UNIV18 UNIV19 UNIV20 UNIV21 UNIV22 UNIV23 UNIV24 UNIV25 UNIV26 UNIV27 UNIV28 UNIV29 UNIV30 UNIV31 UNIV32 UNIV33 UNIV34 UNIV35 UNIV36 UNIV37 UNIV38 UNIV39 UNIV40 UNIV41 UNIV42 UNIV43 UNIV44 UNIV45 UNIV46 UNIV47 UNIV48 UNIV49 UNIV50 UNIV51 UNIV52 UNIV53 UNIV54 UNIV55 UNIV56 UNIV57 UNIV58 56 27 56 23 72 24 61 65 68 40 64 66 80 53 36 53 47 73 41 58 45 43 56 35 34 65 78 35 31 58 29 76 57 42 69 58 31 33 40 51 60 78 39 46 58 59 53 57 28 63 31 60 30 30 40 26 24 44 10 5 11 5 13 5 11 12 13 8 12 13 14 9 7 10 9 14 8 11 9 8 10 7 7 13 14 7 6 11 6 14 10 8 13 11 6 6 8 9 11 14 7 9 11 11 10 10 6 12 6 11 6 6 8 5 5 8 <-.TXT NOTE: This file has 116 lines.continued --> UNIV59 UNIV60 UNIV61 UNIV62 UNIV63 UNIV64 UNIV65 UNIV66 UNIV67 UNIV68 UNIV69 UNIV70 UNIV71 UNIV72 UNIV73 UNIV74 UNIV75 UNIV76 UNIV77 UNIV78 UNIV79 UNIV80 UNIV81 UNIV82 UNIV83 UNIV84 UNIV85 UNIV86 UNIV87 UNIV88 UNIV89 UNIV90 UNIV91 UNIV92 UNIV93 UNIV94 UNIV95 UNIV96 UNIV97 UNIV98 UNIV99 UNIV100 UNIV101 UNIV102 UNIV103 UNIV104 UNIV105 UNIV106 UNIV107 UNIV108 UNIV109 UNIV110 UNIV111 UNIV112 UNIV113 UNIV114 UNIV115 UNIV116 67 56 33 40 68 70 57 40 54 65 62 28 56 41 31 31 46 38 62 63 50 53 39 39 39 25 67 47 54 50 35 66 65 71 29 74 66 71 43 62 80 57 22 33 78 25 76 39 48 54 63 28 69 27 33 52 33 23 13 10 7 8 13 13 10 7 10 12 12 5 10 8 6 6 9 7 12 12 9 9 7 7 7 5 13 9 10 9 7 13 12 13 6 14 13 13 8 11 14 11 5 6 5 9 8 5 5 7 12 8 8 10 8 7 15 10 Data set REGION6. (Rev.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 13:57
NAME OF INPUT FILE: C:\TEMP\REGION6.txt

GROUPS OF SECONDARY UNITS

********* GROUP 1 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV52                          11               60
UNIV45                          11               58
UNIV32                          14               76
UNIV86                           9               47
UNIV113 <-- Selected             8               33
UNIV85                          13               67
UNIV109                         12               63
UNIV87                          10               54
UNIV2                            5               27
UNIV80                           9               53
UNIV53                           6               30
GROUP TOTALS: 11               108              568

********* GROUP 2 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV33                          10               57
UNIV48                          10               57
UNIV6                            5               24
UNIV43 <-- Selected              7               39
UNIV68                          12               65
UNIV41                          11               60
UNIV46                          11               59
UNIV1                           10               56
UNIV40                           9               51
UNIV88                           9               50
UNIV36                          11               58
GROUP TOTALS: 11               105              576

< GROUPS 3 THROUGH 9 ARE OMITTED HERE >

********* GROUP 10 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV20                          11               58
UNIV22                           8               43
UNIV39                           8               40
UNIV111                          8               69
UNIV100                         11               57
UNIV29                           6               31
UNIV105                          8               76
UNIV79                           9               50
UNIV99 <-- Selected             14               80
UNIV13                          14               80
UNIV60                          10               56
UNIV54                           6               30
GROUP TOTALS: 12               113              670

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 13:57
NAME OF OUTPUT FILE: C:\TEMP\OutRegion6.txt

FIRST SEED NUMBER: 100.00     SECOND SEED NUMBER: 200.00
NUMBER OF SECONDARY UNITS IN THE POPULATION: 116
NUMBER OF SECONDARY UNITS SAMPLED: 10

SECONDARY UNIT ID           3RD STAGE      SECONDARY      GROUP          UNITS IN
                            UNIVERSE       UNIT SIZE      SIZE           GROUP
=========================   =============  =============  =============  =====
UNIV113                     33              8             108            11
UNIV43                      39              7             105            11
UNIV78                      63             12             104            11
UNIV104                     25              9              96            11
UNIV89                      35              7             124            12
UNIV112                     27             10             108            12
UNIV30                      58             11              95            12
UNIV65                      57             10             109            12
UNIV3                       56             11             115            12
UNIV99                      80             14             113            12
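The grouping and selection pattern visible in the listing above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RAT-STATS implementation: the function name rhc_select, the way the seed is used, and the frame built below are all hypothetical. The sketch assumes that the units are split at random into groups whose counts differ by at most one (smaller groups first, as in the UNITS IN GROUP column) and that one unit is drawn from each group with probability proportional to its size.

```python
import random

def rhc_select(units, n_groups, seed=0):
    """Sketch of RHC selection (assumed procedure, not RAT-STATS code).

    units: list of (unit_id, size) pairs.
    Returns one (unit_id, size, group_size, units_in_group) tuple per group.
    """
    rng = random.Random(seed)
    shuffled = units[:]
    rng.shuffle(shuffled)                      # random split into groups
    base, extra = divmod(len(shuffled), n_groups)
    # Group counts differ by at most one; smaller groups are listed first.
    counts = [base] * (n_groups - extra) + [base + 1] * extra
    picks, start = [], 0
    for count in counts:
        group = shuffled[start:start + count]
        start += count
        sizes = [size for _, size in group]
        # One unit per group, drawn with probability proportional to size.
        unit_id, size = rng.choices(group, weights=sizes, k=1)[0]
        picks.append((unit_id, size, sum(sizes), count))
    return picks

# Hypothetical frame of 116 units; the sizes are made up for illustration.
frame = [(f"UNIV{i}", 5 + i % 10) for i in range(1, 117)]
sample = rhc_select(frame, 10)
units_in_group = [p[3] for p in sample]
```

With 116 units and 10 groups, units_in_group comes out as four groups of 11 followed by six groups of 12, the same pattern as in the output file above. The units actually selected depend on the random number generator and will not match the RAT-STATS listing.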

Data set REGION8.TXT

UNIV1      72   15
UNIV2      44   10
UNIV3      43   10
UNIV4      55   12
UNIV5      27    7
UNIV6      34    8
UNIV7      51   11
UNIV8      42   10
UNIV9      54   12
UNIV10     25    6
UNIV11     82   17
UNIV12     65   14
UNIV13     33    8
UNIV14     48   10
UNIV15     32    8
UNIV16     82   17
UNIV17     35    8
UNIV18     54   12
UNIV19     34    8
UNIV20     62   14
UNIV21     26    6
UNIV22     31    7
UNIV23     58   13
UNIV24     61   13
UNIV25     61   14
UNIV26     54   12
UNIV27     53   11
UNIV28     56   12
UNIV29     57   12
UNIV30     26    6
UNIV31     25    5
UNIV32     37    9
UNIV33     79   16
UNIV34     60   13
UNIV35     57   12
UNIV36     27    7
UNIV37     31    7
UNIV38     75   15
UNIV39     26    6
UNIV40     36    9
UNIV41     36    9
UNIV42     49   10
UNIV43     83   17
UNIV44     71   15
UNIV45     31    7
UNIV46     42   10
UNIV47     62   14
UNIV48     54   11
UNIV49     31    7
UNIV50     80   16
UNIV51     77   16
UNIV52     36    9
UNIV53     75   16
UNIV54     68   15
UNIV55     34    8
UNIV56     55   12
UNIV57     42   10
UNIV58     36    9
UNIV59     36    9
UNIV60     66   15
UNIV61     61   13
UNIV62     64   14
UNIV63     72   15
UNIV64     65   14
UNIV65     58   13
UNIV66     49   11
UNIV67     30    7
UNIV68     75   16
UNIV69     33    8
UNIV70     65   14
UNIV71     55   12
UNIV72     38    9
UNIV73     36    9
UNIV74     60   13
UNIV75     52   11
UNIV76     65   14
UNIV77     49   10
UNIV78     27    7
UNIV79     48   10
UNIV80     36    9
UNIV81     66   15
UNIV82     62   14
UNIV83     70   15
UNIV84     68   15
UNIV85     53   11
UNIV86     38    9
UNIV87     35    8
UNIV88     36    9
UNIV89     26    6
UNIV90     26    6
UNIV91     51   11
UNIV92     25    5
UNIV93     54   11
UNIV94     56   12
UNIV95     81   17
UNIV96     73   15
UNIV97     44   10
UNIV98     50   11
UNIV99     60   13
UNIV100    31    7
UNIV101    24    5
UNIV102    26    6
UNIV103    40   10
UNIV104    77   16
UNIV105    27    6
UNIV106    65   15
UNIV107    61   13
UNIV108    36    9
UNIV109    26    6
UNIV110    38    9
UNIV111    84   17
UNIV112    75   16
UNIV113    26    6
UNIV114    45   10
UNIV115    59   13
UNIV116    59   13
UNIV117    57   12
UNIV118    58   12

NOTE: This file has 118 lines.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 14:03
NAME OF INPUT FILE: C:\TEMP\REGION8.TXT

GROUPS OF SECONDARY UNITS

********* GROUP 1 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV54                          15               68
UNIV46                          10               42
UNIV33                          16               79
UNIV86                           9               38
UNIV112 <-- Selected            16               75
UNIV85                          11               53
UNIV108                          9               36
UNIV87                           8               35
UNIV2                           10               44
UNIV55                           8               34
UNIV34                          13               60
GROUP TOTALS: 11               125              564

********* GROUP 2 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV47                          14               62
UNIV50                          16               80
UNIV6 <-- Selected               8               34
UNIV44                          15               71
UNIV68                          16               75
UNIV42                          10               49
UNIV48                          11               54
UNIV1                           15               72
UNIV41                           9               36
UNIV89                           6               26
UNIV37                           7               31
GROUP TOTALS: 11               127              590

< GROUPS 3 THROUGH 9 ARE OMITTED HERE >

********* GROUP 10 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV21                           6               26
UNIV23                          13               58
UNIV40                           9               36
UNIV110                          9               38
UNIV100                          7               31
UNIV30                           6               26
UNIV104                         16               77
UNIV81                          15               66
UNIV99 <-- Selected             13               60
UNIV13                           8               33
UNIV60                          15               66
UNIV56                          12               55
GROUP TOTALS: 12               129              572

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 14:03
NAME OF OUTPUT FILE: C:\TEMP\OutRegion8.txt

FIRST SEED NUMBER: 100.00     SECOND SEED NUMBER: 200.00
NUMBER OF SECONDARY UNITS IN THE POPULATION: 118
NUMBER OF SECONDARY UNITS SAMPLED: 10

SECONDARY UNIT ID           3RD STAGE      SECONDARY      GROUP          UNITS IN
                            UNIVERSE       UNIT SIZE      SIZE           GROUP
=========================   =============  =============  =============  =====
UNIV112                     75             16             125            11
UNIV6                       34              8             127            11
UNIV7                       51             11             120            12
UNIV93                      54             11             136            12
UNIV75                      52             11             126            12
UNIV111                     84             17             134            12
UNIV62                      64             14             123            12
UNIV115                     59             13             137            12
UNIV70                      65             14             143            12
UNIV99                      60             13             129            12

Data set REGION10.TXT

UNIV1      34    6
UNIV2      32    5
UNIV3      69   10
UNIV4      23    4
UNIV5      60    9
UNIV6      72   11
UNIV7      56    9
UNIV8      28    5
UNIV9      38    6
UNIV10     60    9
UNIV11     58    9
UNIV12     37    6
UNIV13     70   10
UNIV14     37    6
UNIV15     81   12
UNIV16     53    9
UNIV17     63   10
UNIV18     32    5
UNIV19     33    5
UNIV20     37    6
UNIV21     77   11
UNIV22     52    8
UNIV23     63   10
UNIV24     41    7
UNIV25     45    8
UNIV26     34    6
UNIV27     61   10
UNIV28     70   10
UNIV29     34    5
UNIV30     22    4
UNIV31     66   10
UNIV32     69   10
UNIV33     65   10
UNIV34     26    4
UNIV35     43    7
UNIV36     65   10
UNIV37     80   12
UNIV38     74   11
UNIV39     38    6
UNIV40     43    7
UNIV41     47    8
UNIV42     59    9
UNIV43     42    7
UNIV44     54    9
UNIV45     73   11
UNIV46     78   12
UNIV47     72   11
UNIV48     30    5
UNIV49     47    8
UNIV50     52    8
UNIV51     24    4
UNIV52     26    4
UNIV53     22    4
UNIV54     57    9
UNIV55     78   12
UNIV56     62   10
UNIV57     57    9
UNIV58     68   10
UNIV59     52    8
UNIV60     54    9
UNIV61     41    7
UNIV62     61   10
UNIV63     79   12
UNIV64     50    8
UNIV65     54    9
UNIV66     53    9
UNIV67     40    7
UNIV68     44    7
UNIV69     39    7
UNIV70     72   11
UNIV71     76   11
UNIV72     34    5
UNIV73     27    5
UNIV74     40    7
UNIV75     41    7
UNIV76     25    4
UNIV77     41    7
UNIV78     39    7
UNIV79     58    9
UNIV80     71   11
UNIV81     37    6
UNIV82     30    5
UNIV83     78   12
UNIV84     59    9
UNIV85     29    5

Note: This file has 85 lines.

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 13:49
NAME OF INPUT FILE: C:\TEMP\REGION10.TXT

GROUPS OF SECONDARY UNITS

********* GROUP 1 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV44                           9               54
UNIV32                          10               69
UNIV77                           7               41
UNIV78 <-- Selected              7               39
UNIV2                            5               32
UNIV50                           8               52
UNIV34                           4               26
UNIV46                          12               78
GROUP TOTALS: 8                 62              391

********* GROUP 2 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV6                           11               72
UNIV43 <-- Selected              7               42
UNIV62                          10               61
UNIV41                           8               47
UNIV1                            6               34
UNIV40                           7               43
UNIV79                           9               58
UNIV36                          10               65
GROUP TOTALS: 8                 68              422

< GROUPS 3 THROUGH 9 ARE OMITTED HERE >

********* GROUP 10 *********
SECONDARY UNIT                  SECONDARY UNIT   3RD STAGE
IDENTIFICATION                  SIZE             UNIVERSE
==============================  =============    =============
UNIV71                          11               76
UNIV9                            6               38
UNIV21                          11               77
UNIV23                          10               63
UNIV39 <-- Selected              6               38
UNIV29                           5               34
UNIV72                           5               34
UNIV13                          10               70
UNIV51                           4               24
GROUP TOTALS: 9                 68              454

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/25/2004     GENERATION OF SECONDARY UNIT SAMPLE     Time: 13:49
NAME OF OUTPUT FILE: C:\TEMP\OutRegion10.txt

FIRST SEED NUMBER: 100.00     SECOND SEED NUMBER: 200.00
NUMBER OF SECONDARY UNITS IN THE POPULATION: 85
NUMBER OF SECONDARY UNITS SAMPLED: 10

SECONDARY UNIT ID           3RD STAGE      SECONDARY      GROUP          UNITS IN
                            UNIVERSE       UNIT SIZE      SIZE           GROUP
=========================   =============  =============  =============  =====
UNIV78                      39              7              62             8
UNIV43                      42              7              68             8
UNIV7                       56              9              54             8
UNIV73                      27              5              63             8
UNIV55                      78             12              70             8
UNIV33                      65             10              77             9
UNIV10                      60              9              76             9
UNIV59                      52              8              71             9
UNIV64                      50              8              73             9
UNIV39                      38              6              68             9

Constructing the data file

The data file for this example (PUSURHC3.TXT) is shown on the next page. This file was constructed using the RHC Sample Selection program to select the primary units (regions) and, within each selected primary unit, the 10 secondary units (universities). The four lines beginning with REGIONx are from the output file created during the primary unit selection (GRANTSPUOUT.TXT). The 10 lines after each REGIONx line consist of the output file created when selecting the universities from each region (OUTREGION4.TXT, . . . , OUTREGION10.TXT). Using a word processor or spreadsheet, a column containing the sample sizes (highlighted) must be added to the files created by the five RHC Sample Selection programs.

--- Data set PUSURHC3.TXT ---

REGION4    123   10   1320   3410    3
UNIV85      54   11     11    125   12
UNIV46      44    9      9    131   12
UNIV7       77   15     17    119   12
UNIV82      52   10     11    129   12
UNIV30      54   11     11    141   12
UNIV34      50   10     10    140   12
UNIV27      76   15     16    138   12
UNIV66      76   15     16    128   13
UNIV65      62   12     14    125   13
UNIV80      70   14     15    155   13
REGION6    116   10   1240   3100    3
UNIV113     33    7      8    108   11
UNIV43      39    8      7    105   11
UNIV78      63   13     12    104   11
UNIV104     25    5      9     96   11
UNIV89      35    7      7    124   12
UNIV112     27    5     10    108   12
UNIV30      58   12     11     95   12
UNIV65      57   11     10    109   12
UNIV3       56   11     11    115   12
UNIV99      80   16     14    113   12
REGION8    118   10   1300   3170    3
UNIV112     75   15     16    125   11
UNIV6       34    7      8    127   11
UNIV7       51   10     11    120   12
UNIV93      54   11     11    136   12
UNIV75      52   10     11    126   12
UNIV111     84   17     17    134   12
UNIV62      64   13     14    123   12
UNIV115     59   12     13    137   12
UNIV70      65   13     14    143   12
UNIV99      60   12     13    129   12
REGION10    85   10    640   2320    3
UNIV78      39    8      7     62    8
UNIV43      42    8      7     68    8
UNIV7       56   11      9     54    8
UNIV73      27    5      5     63    8
UNIV55      78   16     12     70    8
UNIV33      65   13     10     77    9
UNIV10      60   12      9     76    9
UNIV59      52   10      8     71    9
UNIV64      50   10      8     73    9
UNIV39      38    8      6     68    9

Selection of Third-Stage Units

Since approximately 20% of the grants at each selected university are to be audited, the following sample sizes are determined:

Region 4:
University    Grants in universe    Number to be audited
UNIV85        54                    11
UNIV46        44                     9
UNIV7         77                    15
UNIV82        52                    10
UNIV30        54                    11
UNIV34        50                    10
UNIV27        76                    15
UNIV66        76                    15
UNIV65        62                    12
UNIV80        70                    14
                                   122

Region 6:
University    Grants in universe    Number to be audited
UNIV113       33                     7
UNIV43        39                     8
UNIV78        63                    13
UNIV104       25                     5
UNIV89        35                     7
UNIV112       27                     5
UNIV30        58                    12
UNIV65        57                    11
UNIV3         56                    11
UNIV99        80                    16
                                    95

Region 8:
University    Grants in universe    Number to be audited
UNIV112       75                    15
UNIV6         34                     7
UNIV7         51                    10
UNIV93        54                    11
UNIV75        52                    10
UNIV111       84                    17
UNIV62        64                    13
UNIV115       59                    12
UNIV70        65                    13
UNIV99        60                    12
                                   120

Region 10:
University    Grants in universe    Number to be audited
UNIV78        39                     8
UNIV43        42                     8
UNIV7         56                    11
UNIV73        27                     5
UNIV55        78                    16
UNIV33        65                    13
UNIV10        60                    12
UNIV59        52                    10
UNIV64        50                    10
UNIV39        38                     8
                                   101
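The sample sizes above are consistent with taking 20% of each university's grant universe and rounding to the nearest whole number. The following Python sketch makes that rule explicit. The rounding rule is inferred from the figures rather than stated in the manual, and the function name third_stage_sample_size is hypothetical.

```python
def third_stage_sample_size(universe, percent=20):
    """Nearest whole number to percent% of the universe (assumed rule).

    Integer arithmetic is used so that no floating-point rounding
    surprises can occur.
    """
    return (2 * percent * universe + 100) // 200

# Grants in universe for the sampled universities, taken from the tables above.
universes = {
    "Region 4": [54, 44, 77, 52, 54, 50, 76, 76, 62, 70],
    "Region 6": [33, 39, 63, 25, 35, 27, 58, 57, 56, 80],
    "Region 8": [75, 34, 51, 54, 52, 84, 64, 59, 65, 60],
    "Region 10": [39, 42, 56, 27, 78, 65, 60, 52, 50, 38],
}

sample_sizes = {
    region: [third_stage_sample_size(k) for k in values]
    for region, values in universes.items()
}
region_totals = {region: sum(sizes) for region, sizes in sample_sizes.items()}
grand_total = sum(region_totals.values())
```

Under this rule the regional totals are 122, 95, 120 and 101, for 438 audited grants in all, which agrees with the tables.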

The data file containing the errors for these 438 audited grants is DATARHC3.TXT, shown on the next page. Notice that each line begins with a counter. Each sample value is equal to the total charges after the scheduled completion of the grant (in thousands of dollars). The values for the first two universities in Region 4 and the last two universities in Region 10 are illustrated.

Finally, the RHC Three-Stage program is run, which generates a confidence interval for the universe total using input files PUSURHC3.TXT and DATARHC3.TXT. The output from this program is shown at the end of this section.

--- Data set DATARHC3.TXT ---

These are the sample values for UNIV85 and UNIV46 in Region 4:

1      8
2      0
3      6
4      6
5      0
6     13
7      1
8      7
9      2
10    13
11    13
12     4
13     6
14     0
15    15
16    12
17     9
18     0
19    13
20    10
.
.
.

These are the sample values for UNIV64 and UNIV39 in Region 10:

421    0
422    6
423   19
424   17
425   13
426   12
427   13
428   12
429   11
430   14
431   13
432    0
433    1
434    5
435   16
436    0
437    0
438    8

Summary of results. Referring to the last page in the computer output, the estimate of the universe total (all 12 regions) is the OVERALL POINT ESTIMATE of 463,526 ($463,526,000) with a corresponding estimated OVERALL STANDARD ERROR of 53,521 ($53,521,000).

NOTE: This estimate does not require knowing the number of grants in the universe.

The 80% confidence interval for the total error amount is 394,936 to 532,116 ($394,936,000 to $532,116,000). The PRECISION AMOUNT is the amount added and subtracted to the point estimate (463,526) to obtain the corresponding confidence interval. In the 80% confidence interval, the lower limit of 394,936 is obtained by subtracting the precision amount of 68,590 from 463,526. The PRECISION PERCENT is the precision amount divided by the point estimate, expressed as a percentage.

The program also provides estimates for the total error amount for each sampled P.U. (region) and for each of the groups of S.U.s (universities) within each sampled region. For example, the estimated error amount for Region 4 is 50,529 ($50,529,000) and the estimated error amount for the group of 12 universities containing UNIV85 is 3,849 ($3,849,000).

The SIZES RATIO refers to the ratio of the size of the group containing this university to the size of this university. To illustrate, UNIV85 in Region 4 has a size of 11 and is in a group of size 125 (look at file REGION4.TXT). The SIZES RATIO here is 125/11 = 11.3636.
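The arithmetic behind the confidence limits, the precision figures and the SIZES RATIO can be checked with a short Python sketch. The Z values are those printed in the program output; the function name confidence_interval is hypothetical and the code is an illustration, not part of RAT-STATS.

```python
# Z values as printed in the RAT-STATS output for each confidence level.
Z_VALUES = {80: 1.281551565545, 90: 1.644853626951, 95: 1.959963984540}

def confidence_interval(point_estimate, standard_error, level=80):
    """Return (lower limit, upper limit, precision amount)."""
    precision = Z_VALUES[level] * standard_error
    return point_estimate - precision, point_estimate + precision, precision

# Overall point estimate and standard error quoted in the summary above.
lower, upper, precision = confidence_interval(463_526, 53_521, level=80)
precision_percent = 100 * precision / 463_526

# SIZES RATIO for UNIV85: size of its group divided by its own size.
sizes_ratio = 125 / 11

print(round(lower), round(upper), round(precision))
print(round(precision_percent, 2), round(sizes_ratio, 4))
```

Rounded to whole thousands of dollars, the limits are 394,936 and 532,116 with a precision amount of 68,590, as reported for the 80% level.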

00 9 7 14 10 9 8 12 13 10 11 103 7 8 9 5 6 2 9 8 6 15 75 12 7 9 9 7 16 13 8 9 9 99 7 6 11 3 11 11 8 7 9 5 78 (Rev.00 56.00 77.OFFICE OF AUDIT SERVICES RHC THREE-STAGE VARIABLE APPRAISAL AUDIT/REVIEW: RHC 3-Stage Time: 16:27 DATA FILE USED: C:\TEMP\DATARHC3.00 82.00 32.00 36.00 82.00 154.00 69.00 83.RAT-STATS Companion Manual RHC Three-Stage Variable Appraisal Date: 10/26/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .00 123.00 83.00 91.00 113.00 78.00 85.00 153.00 104.00 61.00 48.00 90.00 29.00 63.00 71.00 117.00 81.00 134.00 97.00 164.00 67.00 65.00 75.txt **** SAMPLED UNITS **** PRIMARY / SECONDARY IDENTIFICATION ================================== REGION4 UNIV85 UNIV46 UNIV7 UNIV82 UNIV30 UNIV34 UNIV27 UNIV66 UNIV65 UNIV80 Total REGION6 UNIV113 UNIV43 UNIV78 UNIV104 UNIV89 UNIV112 UNIV30 UNIV65 UNIV3 UNIV99 Total REGION8 UNIV112 UNIV6 UNIV7 UNIV93 UNIV75 UNIV111 UNIV62 UNIV115 UNIV70 UNIV99 Total REGION10 UNIV78 UNIV43 UNIV7 UNIV73 UNIV55 UNIV33 UNIV10 UNIV59 UNIV64 UNIV39 Total THIRD STAGE UNIVERSE =========== 54 44 77 52 54 50 76 76 62 70 615 33 39 63 25 35 27 58 57 56 80 473 75 34 51 54 52 84 64 59 65 60 598 39 42 56 27 78 65 60 52 50 38 507 *****D I F F E R E N C E***** SAMPLE SAMPLE NONZERO SIZE VALUE COUNT ====== ============ ======== 11 9 15 10 11 10 15 15 12 14 122 7 8 13 5 7 5 12 11 11 16 95 15 7 10 11 10 17 13 12 13 12 120 8 8 11 5 16 13 12 10 10 8 101 69.00 145.00 83.00 43.00 123.00 72.TXT PRIMARY/SECONDARY UNIVERSE FILE USED: C:\TEMP\PUSURHC3. 10/2004) Page 3-69 .00 87.00 69.00 60.

57 4.50 6.998 3.691 2.42 7.5000 15.88 9.80 7.38 8.905 Estimate for Region 4 ý 50.671 5.09 10.753 5.69 7.30 9.38 5.57 4.8000 8.7273 UNIV30 universities show earlier 7.8333 5.626 2.0000 8.20 8.161 4.590 3.2143 9.90 14.9231 11.621 3.6667 17. 9.5385 10.498 Page 3-70 (Rev.238 3.20 11.047 4.713 4.7143 UNIV7 11.414.619 40.8182 UNIV34 in the output using data 6.632 3.566 6.80 5.45 6.9000 10.TXT.0000 UNIV65 8.93 8.744 3.223 4.73 6.6364 10.3636 UNIV46 for the group containing 7.8750 10.8824 8.0000 UNIV82 This group contained 12 9.4545 7.00 355 --.06 6.431 4.0000 UNIV73 6.443 4.0000 UNIV27 set REGION4.210 5.159 3.69 5.177 3.10 11.3636 11.67 7.55 12.18 6.6000 UNIV55 7.9091 12.5556 UNIV7 UNIV85 (not just UNIV85).913 POINT ESTIMATE ============= 3.0714 5.RHC Three-Stage Variable Appraisal RAT-STATS Companion Manual **** SAMPLED UNITS **** PRIMARY / SECONDARY IDENTIFICATION ================================== TOTALS THIRD STAGE UNIVERSE =========== 2.3333 TOTAL REGION6 UNIV113 UNIV43 UNIV78 UNIV104 UNIV89 UNIV112 UNIV30 UNIV65 UNIV3 UNIV99 TOTAL REGION8 UNIV112 UNIV6 UNIV7 UNIV93 UNIV75 UNIV111 UNIV62 UNIV115 UNIV70 UNIV99 TOTAL --.146 5.9286 UNIV80 10.25 POINT ESTIMATE ============= 3.193 *****D I F F E R E N C E***** SAMPLE SAMPLE NONZERO SIZE VALUE COUNT ====== ============ ======== 438 3.987 7.50 7.38 11. 10/2004) .08 8.27 11.7857 10.6250 UNIV66 8.849 4.4545 8. 4.67 14.08 6.71 5.6667 10.7143 10.155 2.830 3.465 40.8571 UNIV43 8.40 12.849 is the estimate 6.POINT ESTIMATES--*****D I F F E R E N C E***** **** SAMPLED UNITS **** SIZES PRIMARY / SECONDARY IDENTIFICATION SAMPLE MEAN RATIO ================================== =========== =========== REGION4 UNIV85 Note: 3.263 1.80 8.486 2.549 5.529 13.910 5.757 2.POINT ESTIMATES--*****D I F F E R E N C E***** **** SAMPLED UNITS **** SIZES PRIMARY / SECONDARY IDENTIFICATION SAMPLE MEAN RATIO ================================== =========== =========== REGION10 UNIV78 9.93 10.633 3.475 7.8125 15.

959963984540 LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED (Rev.590 14.8750 9.864.492 551.890 30.116 68.453.241.923 710.936 532.288 (Value of V1) (Value of V2) *****D I F F E R E N C E***** --.560 88.521 CONFIDENCE LIMITS 80% CONFIDENCE LEVEL 394.63% 1. 10/2004) Page 3-71 .526 OVERALL STANDARD ERROR 53.OVERALL VARIANCE COMPONENTS --STAGE 1 ======================= 2.065.627 568.631 781.1250 11.022.460.373 798.80% 1.7000 8.351 4.69 8.VARIANCE COMPONENTS FOR PRIMARY UNITS --**** SAMPLED UNITS **** PRIMARY UNIT IDENTIFICATION ============================== REGION4 REGION6 REGION8 REGION10 WITHIN VARIANCE ============== 757.SUMMARY OF APPRAISAL RESULTS --PRIMARY UNITS SAMPLED 4 PRIMARY UNITS NOT SAMPLED 8 TOTAL PRIMARY UNITS 12 OVERALL POINT ESTIMATE 463.034 18.554 5.3333 4.70 5.99% 1.707 TOTAL VARIANCE ============== 11.338 2.280 7.425 104.38 7.67 7.240 --.383.209.517 29.391 3.644853626951 95% CONFIDENCE LEVEL 358.466 TOTAL VARIANCE ======================= 2.4444 8.064 BETWEEN VARIANCE ============== 10.RAT-STATS Companion Manual RHC Three-Stage Variable Appraisal UNIV33 UNIV10 UNIV59 UNIV64 UNIV39 TOTAL 8.317 8.315 36.713.281551565545 90% CONFIDENCE LEVEL 375.822 STAGES 2 AND 3 ======================= 151.650 6.008.394 8.771 (Values of V4) (Values of V3) --.776.70 11.899 22.140.476.

Discussion. In general, you can expect greater precision with the RHC procedure, provided there is a significant correlation between NUMBER OF UNITS and SIZE OF UNIT. To illustrate, consider the file containing the primary unit information used in the three-stage RHC illustration:

             (A)     (B)
REGION1      117    1250
REGION2       63     610
REGION3       91     720
REGION4      123    1320
REGION5      107    1160
REGION6      116    1240
REGION7      102     960
REGION8      118    1300
REGION9      122    1320
REGION10      85     640
REGION11      94     930
REGION12      62     550

Columns: (A) number of universities (S.U.s)
         (B) size of each P.U. (grant amount x $100,000)

For this example, the correlation between columns (A) and (B) is .958, and we would expect a single- and two-stage RHC procedure to work quite well. For a three-stage procedure, this correlation rule must also apply within the sampled primary units, at the secondary unit level.

As mentioned earlier, the benefits of RHC sampling include the following:

• Precision is increased if the above correlation rule is satisfied.
• Computations are relatively simple and straightforward.
• The flavor of pps sampling is maintained, since pps sampling is used to select a unit from each random group.
• The point estimate ($\hat{T}$) is stable. This implies that when sampling indefinitely, the point estimate will exhibit relatively small variation.
• The point estimate of the variance of $\hat{T}$ is stable, producing more reliable confidence intervals. This implies that when sampling indefinitely, the lower confidence limits will exhibit relatively small variation.
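The correlation rule in the Discussion can be checked before committing to an RHC design. The sketch below computes the ordinary Pearson correlation between the number of universities and the size of each region for the twelve regions listed above. The helper name pearson is hypothetical, and it is an assumption that this is the correlation measure the manual has in mind.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Column (A): number of universities; column (B): size of each region.
number_of_units = [117, 63, 91, 123, 107, 116, 102, 118, 122, 85, 94, 62]
size_of_unit = [1250, 610, 720, 1320, 1160, 1240, 960, 1300, 1320, 640, 930, 550]

r = pearson(number_of_units, size_of_unit)
print(round(r, 2))
```

The value comes out at roughly 0.96, close to the .958 quoted in the Discussion and well within the range where the RHC procedure is expected to perform well.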

FORMULAS

Definitions

1. N = number of P.U.s (population)
2. n = number of P.U.s (sample)
3. $S_i$ = (size of i-th P.U.)/(size of entire population)
4. $\pi_i$ = $\sum S_i$ over the i-th group of P.U.s
5. $M_i$ = number of S.U.s in i-th sampled P.U. (population)
6. $m_i$ = number of S.U.s in i-th sampled P.U. (sample)
7. $S_{ij}$ = (size of j-th S.U. in the i-th sampled P.U.)/(size of i-th sampled P.U.)
   (Note: denominator of $S_{ij}$ = numerator of $S_i$)
8. $\pi_{ij}$ = $\sum S_{ij}$ over the j-th group in i-th sampled P.U.
9. $K_{ij}$ = number of third-stage units for j-th sampled S.U. in i-th sampled P.U. (population)
10. $k_{ij}$ = number of third-stage units for j-th sampled S.U. in i-th sampled P.U. (sample)

Estimator of population total (T)

$$ \hat{T} = \sum_{i=1}^{n} \pi_i \left( \frac{\hat{T}_i}{S_i} \right) $$

where $\hat{T}_i$ = estimator of total for i-th sampled P.U.

$$ \hat{T}_i = \sum_{j=1}^{m_i} \pi_{ij} \left( \frac{\hat{T}_{ij}}{S_{ij}} \right) \qquad \text{(equation 1)} $$

and $\hat{T}_{ij}$ = estimator of population total for j-th sampled S.U. in i-th sampled P.U.

$$ \hat{T}_{ij} = K_{ij} \, \bar{y}_{ij} $$

where $\bar{y}_{ij}$ = average of $k_{ij}$ units at the third stage.

NOTE: It can be shown that $\hat{T}$ is an unbiased estimator of T.

Estimated variance of $\hat{T}$

$$ v(\hat{T}) = V_1 + V_2 $$

where

$$ V_1 = \frac{\sum_{i=1}^{n} N_i^2 - N}{N^2 - \sum_{i=1}^{n} N_i^2} \; \sum_{i=1}^{n} \pi_i \left( \frac{\hat{T}_i}{S_i} - \hat{T} \right)^2 \qquad \text{(equation 2)} $$

and

$$ V_2 = \sum_{i=1}^{n} \left( \frac{\pi_i}{S_i} \right) v(\hat{T}_i) \qquad \text{(equation 3)} $$

and $N_i$ = number of P.U.s in the i-th group after the random split into n groups.

$v(\hat{T}_i)$ is obtained by applying the two-stage RHC procedure within the i-th sampled P.U.; i.e., the i-th P.U. is viewed as the entire population. Consequently,

$$ v(\hat{T}_i) = V_{3,i} + V_{4,i} $$

where

$$ V_{3,i} = \frac{\sum_{j=1}^{m_i} M_{ij}^2 - M_i}{M_i^2 - \sum_{j=1}^{m_i} M_{ij}^2} \; \sum_{j=1}^{m_i} \pi_{ij} \left( \frac{K_{ij} \, \bar{y}_{ij}}{S_{ij}} - \hat{T}_i \right)^2 $$

and

$$ V_{4,i} = \sum_{j=1}^{m_i} \frac{\pi_{ij}}{S_{ij}} \, K_{ij} \left( \frac{K_{ij} - k_{ij}}{k_{ij}} \right) s_{ij}^2 $$

and where
(1) $M_{ij}$ = the number of S.U.s in the j-th random group within the i-th sampled P.U.
(2) $\bar{y}_{ij}$ = average of the $k_{ij}$ items for the j-th sampled S.U. within the i-th sampled P.U.

Comments

1. $V_1$ is essentially the same expression obtained for the single-stage RHC procedure and will be referred to as the “between unit” variation.
2. $V_2$ is the contribution of the 2nd- and 3rd-stage variation and is obtained by treating each sampled P.U. as the population to be sampled using two (additional) stages.
3. The estimated standard error of $\hat{T}$ is $\sqrt{v(\hat{T})}$.

Approximate 95% confidence interval for the population total (T)

$$ \hat{T} \pm 1.959963984540 \sqrt{v(\hat{T})} $$

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval replace 1.959963984540 with 1.281551565545.
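As a check on the point-estimate formulas, the contribution of the group containing UNIV85 in Region 4 can be computed by hand. The Python sketch below is illustrative only and its function names are hypothetical. It relies on the fact that the ratio of π to S for a secondary unit reduces to (size of its group)/(size of the unit), which is the SIZES RATIO printed by the program, because both quantities share the same denominator.

```python
def su_estimate(K, sample_values):
    """Estimated total for one sampled S.U.: K times the sample mean."""
    return K * sum(sample_values) / len(sample_values)

def group_estimate(K, sample_values, unit_size, group_size):
    """Contribution of one random group to the P.U. estimate.

    group_size / unit_size plays the role of pi_ij / S_ij, since the
    size of the P.U. cancels from numerator and denominator.
    """
    return (group_size / unit_size) * su_estimate(K, sample_values)

# First 11 sample values in DATARHC3.TXT (UNIV85, Region 4).
univ85_values = [8, 0, 6, 6, 0, 13, 1, 7, 2, 13, 13]

# UNIV85: 54 grants in universe, size 11, in a random group of size 125.
estimate = group_estimate(54, univ85_values, unit_size=11, group_size=125)
print(round(estimate))
```

The result rounds to 3,849, the estimate reported for the group of universities containing UNIV85. Summing such contributions over the 10 sampled universities of a region gives the estimated total for that region, and weighting each regional total by its own π/S ratio and summing gives the overall estimate.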

Stratified Cluster Variable Appraisal

With this procedure, you first stratify, then obtain a cluster (single-stage) sample within each stratum. The estimate of a universe total is the sum of the estimates for each stratum. The estimated variance of this estimator is the sum of the estimated variances for each stratum.

Example 7. In a large section of the U.S., an audit was conducted for 583 universities with health-related research grants. It was decided to define two strata:

Stratum 1: state-supported universities (N1 = 415)
Stratum 2: private universities (N2 = 168)

Within each stratum, a single-stage cluster sample was obtained with n1 = 25 universities selected from Stratum 1 and n2 = 10 universities from Stratum 2. For each of the sampled universities, all health-related grants would be audited (since there weren't that many at each university) to determine the amount of charges improperly charged to these grants. The following data were obtained, where yj is the total of the improper charges (in thousands of dollars) for the j-th university (cluster) and Mj is the number of grants (universe) for this university.

NOTE: The number of grants audited at each university (the Mj values) is not used in the program calculations. They are supplied for informational purposes only. For example, if all the Mj values are set equal to 1, the resulting confidence intervals will be unchanged. This is motivated by the discussion in the RAT-STATS User’s Guide.

Stratum 1

Univ.    Mj     yj
1         8     96
2        12    121
3         4     42
4         5     65
5         6     52
6         6     40
7         7     75
8         5     65
9         8     45
10        3     50
11        2     85
12        6     43
13        5     54
14       10     49
15        9     53
16        3     50
17        6     32
18        5     22
19        5     45
20        4     37
21        6     51
22        8     30
23        7     39
24        3     47
25        8     41

ΣMj = 151     Σyj = 1,329

Stratum 2

Univ.    Mj     yj
1         2     18
2         5     52
3         7     68
4         4     36
5         3     45
6         8     96
7         6     64
8        10    115
9         3     41
10        1     12

ΣMj = 49      Σyj = 547

These values were stored in data file DATASTRCLUS.TXT. Immediately following the listing of this data file is the resulting computer output using the VARIABLE STRATIFIED CLUSTER program.

Data file DATASTRCLUS.TXT

STATE UNIVERSITIES     415   25
UNIV1      8    96
UNIV2     12   121
UNIV3      4    42
UNIV4      5    65
UNIV5      6    52
UNIV6      6    40
UNIV7      7    75
UNIV8      5    65
UNIV9      8    45
UNIV10     3    50
UNIV11     2    85
UNIV12     6    43
UNIV13     5    54
UNIV14    10    49

UNIV15     9    53
UNIV16     3    50
UNIV17     6    32
UNIV18     5    22
UNIV19     5    45
UNIV20     4    37
UNIV21     6    51
UNIV22     8    30
UNIV23     7    39
UNIV24     3    47
UNIV25     8    41
PRIVATE UNIVERSITIES   168   10
UNIV1      2    18
UNIV2      5    52
UNIV3      7    68
UNIV4      4    36
UNIV5      3    45
UNIV6      8    96
UNIV7      6    64
UNIV8     10   115
UNIV9      3    41
UNIV10     1    12

DEPARTMENT OF HEALTH & HUMAN SERVICES
OIG - OFFICE OF AUDIT SERVICES
Date: 10/23/2004     STRATIFIED CLUSTER VARIABLE APPRAISAL     Time: 14:36
AUDIT/REVIEW: Variable - Stratified Cluster
DATA FILE USED: C:\Temp\DATASTRCLUS.TXT

STRATUM IDENTIFICATION
CLUSTER IDENTIFICATION       SAMPLE       SAMPLE   SAMPLED         POINT
                             UNIVERSE     SIZE     VALUE           ESTIMATE
===========================  ===========  ======   =============   ==============
STATE UNIVERSITIES           415          25
UNIV1                          8           8           96.00
UNIV2                         12          12          121.00
UNIV3                          4           4           42.00
UNIV4                          5           5           65.00
UNIV5                          6           6           52.00
UNIV6                          6           6           40.00
UNIV7                          7           7           75.00
UNIV8                          5           5           65.00
UNIV9                          8           8           45.00
UNIV10                         3           3           50.00
UNIV11                         2           2           85.00
UNIV12                         6           6           43.00
UNIV13                         5           5           54.00
UNIV14                        10          10           49.00
UNIV15                         9           9           53.00
UNIV16                         3           3           50.00
UNIV17                         6           6           32.00
UNIV18                         5           5           22.00
UNIV19                         5           5           45.00
UNIV20                         4           4           37.00
UNIV21                         6           6           51.00
UNIV22                         8           8           30.00
UNIV23                         7           7           39.00
UNIV24                         3           3           47.00
UNIV25                         8           8           41.00
STRATUM TOTALS               151         151        1,329.00       22,061
PRIVATE UNIVERSITIES         168          10
UNIV1                          2           2           18.00
UNIV2                          5           5           52.00
UNIV3                          7           7           68.00
UNIV4                          4           4           36.00
UNIV5                          3           3           45.00
UNIV6                          8           8           96.00
UNIV7                          6           6           64.00
UNIV8                         10          10          115.00
UNIV9                          3           3           41.00
UNIV10                         1           1           12.00
STRATUM TOTALS                49          49          547.00        9,190

STRATA TOTALS                583          35
CLUSTER UNIT TOTALS          200         200        1,876.00

OVERALL POINT ESTIMATE       31,251
OVERALL STANDARD ERROR        2,418

CONFIDENCE LEVEL     ---80 PERCENT--   ---90 PERCENT--   ---95 PERCENT--
LOWER LIMIT          28,152            27,273            26,511
UPPER LIMIT          34,350            35,229            35,991
PRECISION AMOUNT      3,099             3,978             4,740
PRECISION PERCENT     9.92%            12.73%            15.17%
Z-VALUE USED         1.281551565545    1.644853626951    1.959963984540

Discussion. For stratum 1, the unbiased estimate of the universe total is

$$ \hat{T}_1 = (415/25)(1,329) = 22,061 \quad (\$22,061,000) $$

The unbiased estimate of the universe total for stratum 2 is

$$ \hat{T}_2 = (168/10)(547) = 9,190 \quad (\$9,190,000) $$

Consequently, an unbiased estimate of the universe total (highlighted) is

$$ \hat{T} = \hat{T}_1 + \hat{T}_2 = 31,251 \quad (\$31,251,000) $$

Using formula 2, the estimated variance of $\hat{T}$ is $v(\hat{T}) = v(\hat{T}_1) + v(\hat{T}_2) = 5,848,565$ and the corresponding standard error (highlighted) is 2,418. The approximate 95% confidence interval for the universe total is

31,251 ± 1.959963984540(2,418)

that is, 26,511 to 35,991 ($26,511,000 to $35,991,000).
FORMULAS

1. Estimated total in the universe (T)

$$ \hat{T} = \sum_{h=1}^{L} \frac{N_h}{n_h} \left( \sum_{j=1}^{n_h} y_{j,h} \right) = \sum_{h=1}^{L} N_h \, \bar{y}_h $$

where
L = number of strata
$N_h$ = number of clusters (universe) for stratum h
$n_h$ = number of clusters (sample) for stratum h
$y_{j,h}$ = total of the variable of interest (e.g., errors) for the j-th P.U. within stratum h
$\bar{y}_h$ = sample average for stratum h

NOTE: Let $\hat{T}_h$ = estimated total for stratum h. Then $\hat{T}_h = N_h \, \bar{y}_h$ and $\hat{T} = \sum_{h=1}^{L} \hat{T}_h$.

2. Estimated variance of $\hat{T}$

$$ v(\hat{T}) = \sum_{h=1}^{L} \frac{N_h (N_h - n_h)}{n_h (n_h - 1)} \sum_{j=1}^{n_h} \left( y_{j,h} - \bar{y}_h \right)^2 = \sum_{h=1}^{L} v(\hat{T}_h) $$

3. Approximate 95% confidence interval for T

$$ \hat{T} \pm 1.959963984540 \sqrt{v(\hat{T})} $$

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval replace 1.959963984540 with 1.281551565545.
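Formulas 1 through 3 can be applied directly to the Example 7 data. The Python sketch below is a hand check rather than the program's own code, and the function name stratum_estimate is hypothetical. It takes the cluster totals yj from data file DATASTRCLUS.TXT together with the stratum sizes.

```python
from math import sqrt

def stratum_estimate(N, y):
    """Return (estimated total, estimated variance) for one stratum.

    N is the number of clusters in the stratum universe and y holds the
    totals for the sampled clusters (formulas 1 and 2).
    """
    n = len(y)
    y_bar = sum(y) / n
    total = N * y_bar
    sum_sq = sum((value - y_bar) ** 2 for value in y)
    variance = N * (N - n) / (n * (n - 1)) * sum_sq
    return total, variance

# Cluster totals (yj) from DATASTRCLUS.TXT.
state = [96, 121, 42, 65, 52, 40, 75, 65, 45, 50, 85, 43, 54,
         49, 53, 50, 32, 22, 45, 37, 51, 30, 39, 47, 41]
private = [18, 52, 68, 36, 45, 96, 64, 115, 41, 12]

t1, v1 = stratum_estimate(415, state)
t2, v2 = stratum_estimate(168, private)

total = t1 + t2                      # formula 1
standard_error = sqrt(v1 + v2)       # square root of formula 2
z95 = 1.959963984540
lower, upper = total - z95 * standard_error, total + z95 * standard_error

print(round(t1), round(t2), round(total), round(standard_error))
print(round(lower), round(upper))
```

Rounded to whole numbers, the sketch gives stratum estimates of 22,061 and 9,190, an overall estimate of 31,251, a standard error of 2,418 and 95% limits of 26,511 and 35,991, in agreement with the computer output for this example. Because the Mj values never enter the calculation, the sketch also illustrates why they are supplied for information only.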


Stratified Multistage Variable Appraisal
As with the stratified cluster procedure, you must first stratify the universe. Rather than take a cluster (single-stage) sample within each stratum, you will obtain a multistage (two-stage or three-stage) sample within each stratum. These multistage samples may be random (using the Two-Stage Unrestricted or Three-Stage Unrestricted programs) or may be obtained using the RHC procedure and the RHC Two-Stage or RHC Three-Stage programs.

Unlike the Stratified Cluster program, this program requires that you first run the appropriate multistage program on each stratum and record the results. The output results are then used as input to the Stratified Multistage program. You may store the results from each stratum (point estimate, standard error) in a file or simply input these values interactively.
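How the per-stratum results might be combined can be sketched as follows. This is an assumption-laden illustration, not the Stratified Multistage program itself: it assumes the strata are sampled independently, so that the overall estimate is the sum of the stratum point estimates and the overall variance is the sum of the squared stratum standard errors, mirroring the rule stated for the stratified cluster procedure. The function name combine_strata and the input figures are hypothetical.

```python
from math import sqrt

def combine_strata(results, z=1.959963984540):
    """Combine per-stratum (point estimate, standard error) pairs.

    Assumes independent strata: estimates add and variances add.
    Returns (total, standard error, lower limit, upper limit).
    """
    total = sum(estimate for estimate, _ in results)
    standard_error = sqrt(sum(se ** 2 for _, se in results))
    return total, standard_error, total - z * standard_error, total + z * standard_error

# Hypothetical results recorded from two runs of a multistage program.
stratum_results = [(1000.0, 30.0), (500.0, 40.0)]
total, standard_error, lower, upper = combine_strata(stratum_results)
print(total, standard_error)
```

With these made-up inputs the combined estimate is 1,500 with a standard error of 50. Real inputs would be the OVERALL POINT ESTIMATE and OVERALL STANDARD ERROR recorded from the multistage run on each stratum.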

NOTE:

The “universe size” refers to the number of units at the most detailed level of the multistage sample. For example, if you are obtaining a three-stage sample within each stratum, then the “universe size” refers to the total number of third-stage units within this stratum.

Example 8. This example is similar to Example 7 in the Stratified Cluster section. In a particular region, the universe consisting of university grants is stratified by defining

Stratum 1: state-supported universities (N1 = 120 univ.) and
Stratum 2: private universities (N2 = 85 univ.)

Because these universities are so widespread, it was decided to employ a two-stage sample using 15 state-supported universities and 10 private universities. Rather than audit all grants at a selected university, it was decided (based on available resources) to audit roughly 20% of the grants at each selected university to estimate the amount of charges improperly charged to these grants. We know that there are a total of 5,800 grants within the universe of the 120 state-supported universities and 4,500 grants within the 85 private universities.

The following data were obtained, where $y_{i,j}$ is the amount (in thousands of dollars) of improper charges for the j-th grant within the i-th sampled university, $M_i$ is the total number of grants at the i-th university, and $m_i$ is the number of audited grants at the i-th university. Also, $\bar{y}_i$ and $s_i^2$ are the mean and variance of the sample values from the i-th university.

NOTE:

The 15 state-supported universities and 10 private universities were obtained using the Single-Stage Random Numbers program. For ease of illustration, they will be referred to as University 1, 2, 3, . . . within each stratum.

The corresponding data files are the input files for the Two-Stage Unrestricted program. These are files STRMULT1.TXT and STRMULT2.TXT. The files containing the universe sizes are UNIV1.TXT and UNIV2.TXT.


50 4.56 4.13 16.75 4.60 2.47 12.12 14.65 4 2 4 5 4 5 Private universities Univ. Mi 1 2 3 4 5 6 7 8 9 10 66 52 47 55 48 60 57 50 62 56 mi 13 10 9 11 10 12 11 10 12 11 Dollars (yi.36 11.62 13.25 4.53 12. Mi 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 60 50 45 40 55 58 62 52 50 45 40 48 57 60 54 mi 12 10 9 8 11 12 12 10 10 9 8 10 11 12 11 4 4 3 2 7 0 4 3 7 2 4 0 3 4 3 Dollars (yi.40 17.70 12.09 4.45 3.80 3.00 12.09 si2 6.07 10.61 14.63 9.88 3.j.66 11.02 11.in thousands) 0 7 1 7 1 0 1 8 1 1 7 0 0 1 0 0 0 0 10 6 6 0 0 4 0 3 6 1 0 1 6 0 5 0 0 5 3 6 0 0 8 5 10 0 1 7 6 5 6 0 12 10 2 2 0 0 2 5 0 4 11 10 8 5 0 8 7 10 6 5 1 3 0 6 8 0 3 10 4 5 2 6 0 0 8 6 8 6 8 6 5 3 0 0 12 0 6 0 0 6 10 0 6 10 9 4 2 4 8 7 0 5 0 5 10 12 0 7 3 8 4 3 2 5 0 3 0 0 2 7 5 2 8 2 y i 3.00 4.89 8.93 11.50 3.50 4.j.27 8.89 12. Page 3-84 (Rev.Stratified Multistage Variable Appraisal RAT-STATS Companion Manual State-supported universities Univ.49 7.49 The data files are shown on the next page.75 4.49 14.60 4.92 3.58 3.17 12.21 15.in thousands) 4 10 2 3 0 7 2 3 8 5 4 1 8 8 5 3 4 5 4 0 1 0 8 0 0 1 0 1 0 6 0 0 1 5 6 1 2 2 0 4 0 0 3 0 6 7 5 6 6 11 6 5 2 1 6 10 6 6 0 10 7 5 0 0 3 2 6 5 8 7 10 2 0 0 4 12 7 0 1 12 8 5 7 9 3 0 8 2 3 5 0 0 3 8 8 10 7 1 12 5 0 0 6 0 4 2 8 5 5 yi 2.00 3.33 4.69 5.17 3.55 5. 10/2004) .64 si2 13.09 3. The corresponding computer outputs immediately follow.75 14.30 4.

Data file STRMULT1.TXT

      1    4          133    4
      2    0          134    1
      3    0          135    0
      4    6          136    0
      5    7          137    0
      6   11          138    6
      7    0          139    8
      8    5          140   10
      9    4          141    4
     10    0          142    3
     11    8          143    2
     12    2          144    5
     13    4          145    3
     14    7          146    0
     15    0          147    1
     16    0          148    1
     17    6          149    4
     18   10          150    8
     19    3          151    6
     20    3          152    9
     21    2          153    5
     22    0          154    0
      .    .          155    3

Data file STRMULT2.TXT

      1    4           87    8
      2    4           88    4
      3    0           89    0
      4    0           90    2
      5    0           91    2
      6    3           92    6
      7    6           93    5
      8    7           94    1
      9    5           95   12
     10    2           96    5
     11    0           97    0
     12    0           98    0
     13    4           99    5
     14   10          100    0
     15    1          101    1
     16    6          102    2
     17    4          103    8
     18    0          104    7
     19    0          105   10
     20    5          106    6
     21    8          107    0
     22   12          108    4
     23    7          109    2
      .    .

800 24.17 3.60 2.27 8.D I F F E R E N C E ---------------------SAMPLE SIZE/ NONZERO ITEMS SAMPLE MEAN VARIANCE UNIVERSE SIZE POINT ESTIMATE 12/8 10/7 9/7 8/6 11/8 12/9 12/10 10/7 10/6 9/6 8/7 10/6 11/9 12/9 11/9 155/114 NOT SAMPLED 105 OVERALL TOTALS 120 STANDARD ERROR 3.70 12.OFFICE OF AUDIT SERVICES TWO-STAGE UNRESTRICTED VARIABLE APPRAISAL AUDIT/REVIEW: Stratum 1 DATA FILE USED: C:\TEMP\STRMULT1.024 5. 10/2004) .00 4.50 4.09 4.TXT 1 66 13 2 52 10 3 47 9 4 55 11 5 48 10 6 60 12 7 57 11 8 50 10 9 62 12 10 56 11 Output using the state-supported universities DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .58 3.93 11.63 9.64 4.12 16.53 12.36 11.50 3.45 3.50 4.077 1.00 12.92 3.88 3.65 60 50 45 40 55 58 62 52 50 45 40 48 57 60 54 776 5.010 Date: 10/22/2004 Time: 14:14 Page 3-86 (Rev.277 235 175 180 170 225 261 258 187 125 160 195 173 254 215 196 3.07 10.56 4.15 13.Stratified Multistage Variable Appraisal RAT-STATS Companion Manual Data file UNIV1.TXT 1 60 12 2 50 10 3 45 9 4 40 8 5 55 11 6 58 12 7 62 12 8 52 10 9 50 10 10 45 9 11 40 8 12 48 10 13 57 11 14 60 12 15 54 11 Data file UNIV2.49 14.49 7.61 14.25 4.TXT UNIT NBR 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 --------------------.60 4.21 15.17 12.

09 3.440 25.80 3.73% 1.12 14.80% 1.RAT-STATS Companion Manual Stratified Multistage Variable Appraisal LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED CONFIDENCE LIMITS 80% CONFIDENCE LEVEL 22.47 12.75 4.101 8.257 Date: 10/22/2004 Time: 14:13 (Rev.976 26.89 12.49 66 52 47 55 48 60 57 50 62 56 553 3.644853626951 95% CONFIDENCE LEVEL 21.33 4.69 5.580 2.OFFICE OF AUDIT SERVICES TWO-STAGE UNRESTRICTED VARIABLE APPRAISAL AUDIT/REVIEW: Stratum 2 DATA FILE USED: C:\TEMP\STRMULT2.503 10.66 11.40 17.00 3.178 2.D I F F E R E N C E ---------------------SAMPLE SIZE/ NONZERO ITEMS SAMPLE MEAN VARIANCE UNIVERSE SIZE POINT ESTIMATE 13/8 10/8 9/7 11/9 10/9 12/9 11/8 10/8 12/9 11/9 109/84 NOT SAMPLED 75 OVERALL TOTALS 85 STANDARD ERROR 2.26 6.30 4. 10/2004) Page 3-87 .500 19.55 5.09 4.02 11.947 4.637 6.714 1.182 873 178 276 204 250 240 225 233 190 232 229 2.573 26.89 8.281551565545 90% CONFIDENCE LEVEL 21.40% 1.TXT UNIT NBR 1 2 3 4 5 6 7 8 9 10 --------------------.62 13.959963984540 Output using the private universities DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .75 4.75 14.

                           CONFIDENCE LIMITS
                        80% CONFIDENCE    90% CONFIDENCE    95% CONFIDENCE
    LOWER LIMIT                 18,064            17,747            17,472
    UPPER LIMIT                 20,300            20,617            20,892
    PRECISION AMOUNT             1,118             1,435             1,710
    PRECISION PERCENT            5.83%             7.48%             8.92%
    Z-VALUE USED        1.281551565545    1.644853626951    1.959963984540

When running the STRATIFIED MULTISTAGE program, you should see the following input window containing values for the first stratum. The program output immediately follows.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    STRATIFIED MULTISTAGE VARIABLE APPRAISAL
    Date: 10/22/2004                                   Time: 14:41
    AUDIT/REVIEW: Combining the Strata

    THE ESTIMATORS ARE BASED ON THE FOLLOWING ENTRIES:
    STRATUM      POINT ESTIMATE      STANDARD ERROR
       1             24,077               1,277
       2             19,182                 873

    = = = = = = = = = = = = = = = RESULTS = = = = = = = = = = = = = = =
    POINT ESTIMATE      43,259
    STANDARD ERROR       1,547

    CONFIDENCE LEVEL     ---80 PERCENT---  ---90 PERCENT---  ---95 PERCENT---
    LOWER LIMIT                41,277            40,715            40,227
    UPPER LIMIT                45,241            45,803            46,291
    PRECISION AMOUNT            1,982             2,544             3,032
    PRECISION PERCENT           4.58%             5.88%             7.01%
    Z-VALUE USED       1.281551565545    1.644853626951    1.959963984540

Discussion. The point estimate for the universe total is the sum of the point estimates for the two strata, that is, \(\hat{T}\) = 24,077 + 19,182 = 43,259 ($43,259,000). The estimated variance of \(\hat{T}\) is \(v(\hat{T})\) = (1277)² + (873)² = 2,392,858 and the corresponding standard error is \(\sqrt{2{,}392{,}858}\) = 1,547. The approximate 95% confidence interval for the universe total is 43,259 ± 1.959963984540(1,547), that is, 40,227 to 46,291 ($40,227,000 to $46,291,000).

FORMULAS

1. Estimated total in the universe (T)

\[ \hat{T} = \sum_{h=1}^{L} \hat{T}_h \]

where \(\hat{T}_h\) is the point estimate for the universe total in stratum h and L is the number of strata.

2. Estimated variance of \(\hat{T}\)

\[ v(\hat{T}) = \sum_{h=1}^{L} v(\hat{T}_h) \]

where \(v(\hat{T}_h)\) is the estimated variance of \(\hat{T}_h\) and is equal to the square of the standard error of \(\hat{T}_h\).

3. Approximate 95% confidence interval for T

\[ \hat{T} \pm 1.959963984540\,\sqrt{v(\hat{T})} \]

NOTE: For a 90% confidence interval, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval replace 1.959963984540 with 1.281551565545.
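The combination of strata described in the formulas above can be sketched in a few lines of Python. This is only an illustrative sketch, not the RAT-STATS implementation; the function name `combine_strata` and the dictionary `Z` are hypothetical, and the inputs are the rounded point estimates and standard errors from the two-stratum example.

```python
import math

# z-values quoted in the manual for each confidence level
Z = {80: 1.281551565545, 90: 1.644853626951, 95: 1.959963984540}

def combine_strata(point_estimates, standard_errors, level=95):
    """Hypothetical helper: combine per-stratum totals.

    Returns (total, standard_error, lower_limit, upper_limit).
    """
    total = sum(point_estimates)                        # formula 1
    variance = sum(se ** 2 for se in standard_errors)   # formula 2
    se = math.sqrt(variance)
    half_width = Z[level] * se                          # formula 3
    return total, se, total - half_width, total + half_width

# Stratum 1 (state-supported) and stratum 2 (private) from the example
total, se, lower, upper = combine_strata([24077, 19182], [1277, 873])
print(round(total), round(se), round(lower), round(upper))
```

Because the inputs here are already rounded, the limits may differ by a unit or so from a run that carries full precision.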

Poststratification

Oftentimes sampling problems arise in which the user would like to stratify on a key variable but cannot place the sampling units into their correct strata until after the sample is selected. Another situation arises when an auditor does not recognize a need to stratify prior to obtaining a simple random sample and the sample items are evaluated. Poststratification is often appropriate when a simple random sample is not properly balanced according to major groupings of the population. The poststratification program is designed for such situations and provides reliable results if the overall sample size is large and the poststratified sample sizes are large (say, at least 20). It is, however, less efficient than using a prestratified sample; that is, it produces a slightly wider confidence interval for the same sample size.

A key thing to keep in mind here is that the universe strata sizes must be known. The program does not allow you to estimate these universe sizes. Consequently, before you define a set of strata, make sure that you know the number of universe items in each of the strata.

Example 9. In a recent hospital audit, the amount of unallowable bad debts was determined for a particular year. For the universe, it was known that there were N1 = 373 inpatient bad debts (Stratum 1) and N2 = 1,146 outpatient bad debts (Stratum 2). The total universe size is N = N1 + N2 = 1,519. So there are roughly 25% inpatient and 75% outpatient bad debts in the universe. Suppose that a simple random sample of 100 bad debts revealed:

    Inpatient bad debts:    n1 = 45    \(\bar{y}_1\) = $240.00    \(s_1^2\) = 22.04
    Outpatient bad debts:   n2 = 55    \(\bar{y}_2\) = $30.00     \(s_2^2\) = 198.56

The data files (POSTDATA.TXT and UNIVPOST.TXT) are shown on the next page and the resulting computer output immediately follows.

31 25.93 243.20 246.08 241.70 241.36 238.45 54.04 229.11 239.07 232.43 18.> Universe file UNIVPOST.continued .96 234.43 238.03 237.32 37.00 238.82 45.15 238.74 249.96 < .26 53.23 22.93 25.14 21.76 9.58 227.52 11.76 49.81 53.82 241.54 22.51 243.54 31.80 22.91 249.99 31.48 4.58 39.22 12.27 26.61 240.44 12.90 238.71 59.29 38.TXT 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 242.30 37.67 242.53 3.73 15.64 (Rev.91 231.75 42.48 15.84 22.00 29.59 31.27 240.88 241. 10/2004) Page 3-93 .43 241.30 58.53 232.21 245.29 21.92 44.61 243.43 9.05 25.71 40.TXT 1 2 373 1146 45 55 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 25.47 28.02 24.45 19.84 239.45 242.73 33.RAT-STATS Companion Manual Poststratification Data file POSTDATA.08 248.41 241.03 32.39 236.00 235.91 244.86 239.97 38.89 242.37 28.12 240.82 14.93 37.69 50.86 241.84 236.71 239.10 38.01 239.72 36.43 243.30 9.26 31.95 24.39 46.14 236.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    POSTSTRATIFIED VARIABLE APPRAISAL
    Date: 10/24/2004                                   Time: 9:51
    AUDIT/REVIEW: Poststratification
    DATA FILE USED: C:\Temp\POSTDATA.TXT

    ----------------------- D I F F E R E N C E -----------------------

    Stratum 1
    SAMPLE SIZE / UNIVERSE SIZE       45 / 373
    MEAN                              240.00
    STANDARD DEVIATION                  4.69
    STANDARD ERROR (TOTAL)            347.13
    POINT ESTIMATE                    89,520

                           CONFIDENCE LIMITS
                        80% CONFIDENCE    90% CONFIDENCE    95% CONFIDENCE
    LOWER LIMIT                 89,075            88,949            88,839
    UPPER LIMIT                 89,965            90,091            90,200
    PRECISION AMOUNT               445               571               680
    PRECISION PERCENT             .50%              .64%              .76%
    Z-VALUE USED        1.281551565545    1.644853626951    1.959963984540

    Stratum 2
    SAMPLE SIZE / UNIVERSE SIZE       55 / 1,146
    MEAN                               30.00
    STANDARD DEVIATION                 14.09
    STANDARD ERROR (TOTAL)          1,800.03
    POINT ESTIMATE                    34,379

                           CONFIDENCE LIMITS
                        80% CONFIDENCE    90% CONFIDENCE    95% CONFIDENCE
    LOWER LIMIT                 32,072            31,418            30,851
    UPPER LIMIT                 36,685            37,339            37,907
    PRECISION AMOUNT             2,307             2,961             3,528
    PRECISION PERCENT            6.71%             8.61%            10.26%
    Z-VALUE USED        1.281551565545    1.644853626951    1.959963984540

    OVERALL
    SAMPLE SIZE / UNIVERSE SIZE      100 / 1,519
    POINT ESTIMATE                   123,898
    STANDARD ERROR                     1,833

                           CONFIDENCE LIMITS
                        80% CONFIDENCE    90% CONFIDENCE    95% CONFIDENCE
    LOWER LIMIT                121,549           120,883           120,305
    UPPER LIMIT                126,248           126,914           127,491
    PRECISION AMOUNT             2,349             3,015             3,593
    PRECISION PERCENT            1.90%             2.43%             2.90%
    Z-VALUE USED        1.281551565545    1.644853626951    1.959963984540

Discussion. Using the usual estimator for a simple random sample, the estimate of the universe mean is

\[ \bar{y} = [(45)(240) + (55)(30)]/100 = \$124.50 \]

and the estimate of the universe total is

\[ \hat{T} = (1519)(124.50) = \$189{,}115.50 \]

Since there is an unusually high number of inpatient bad debts (and low outpatient), a better procedure would be to use the poststratified estimate of the universe total, namely

\[ \hat{T}_{pst} = (373)(240.00) + (1146)(30.00) = \$123{,}900 \]

898 (highlighted).360.611 = $1. The corresponding 95% confidence interval for the universe total is 123.305 to $127. Notice that T pst $123. Also.11144 . 10/2004) .09098) 2 ⎥ 2 2 100 100 ⎣ 100 ⎦ ⎣ 100 ⎦ = 120.379 = $123.491.499.00) = $34.111.88 = 347. $ is v (T $ ) = v (T $ ) + v (T $) The estimated variance of T pst 1 2 pst 1519 1519 ⎤ ⎡ 1419 ⎤ ⎡ 1419 =⎢ (373)(4.898).88 + 3. The value of Ni is multiplied by the sample mean yi to estimate Page 3-96 (Rev. the point estimate for the inpatient stratum is T 1 $ = (1146)(30. $120.898 ± 1. the for the outpatient stratum.69484)2 ⎥ + ⎢ (1146)(14.360.959963984540(1.240.499.380 (more precisely.833 (highlighted) NOTE: The estimated standard error for stratum 1 is estimated standard error for stratum 2 is 120.800.611 and the estimated standard error is 3.00) = $89.833) that is.520 (highlighted). Poststratification allows you to obtain a single simple random sample (easier than obtaining a simple random sample from each stratum) and then stratify provided the strata sizes in the universe (Ni) are known. Comments 1. this estimate is T 2 $ = 89.899 (more precisely.13 and the 3.Poststratification RAT-STATS Companion Manual The (more precise) computer-generated point estimate is $123. highlighted value of $34.69484) 2 + (1146)(4. = 1.379).520 + 34.44 = 3. $ = (373)(240.09098) 2 + (373)(14.03.240.

Estimate of the universe total for the i-th stratum (Ti) $ =N y T i i i where Ni = number of items (universe) in stratum i yi = average of sample items in i-th stratum L = number of strata 2. The total sample size should be at least 100. With stratified sampling. Estimated variance of T pst pst $ ) where ∑ v (T i i =1 L N − n⎞ N 2 $)=⎛ v (T N s + ( N − N i ) si2 ⎜ ⎟ i i i 2 ⎝ n ⎠ n where N = universe total = ENi and n is the total sample size. the sample sizes are fixed (nonrandom). (Rev. 3. Estimate of universe total (T) $ = $ T ∑ N i yi = ∑ T pst i i =1 i =1 L L $ = v (T $ )= 3. the sample sizes (ni) are unknown in advance (random variables). FORMULAS 1. A minimum of 20 sampling units per stratum is required as well as 6 nonzero items per stratum (OA Policy and Procedures). 2. With poststratification.RAT-STATS Companion Manual Poststratification the total for the i-th stratum. These estimates are then summed over all the strata to estimate the universe total. 10/2004) Page 3-97 .

4. Estimated standard error of \(\hat{T}_i\)

\[ \sqrt{v(\hat{T}_i)} \]

5. Approximate 95% confidence interval for stratum total (\(T_i\)):

\[ \hat{T}_i \pm 1.959963984540\,\sqrt{v(\hat{T}_i)} \]

6. Estimated standard error of \(\hat{T}_{pst}\) is

\[ \sqrt{v(\hat{T}_{pst})} \]

7. Approximate 95% confidence interval for universe total (T):

\[ \hat{T}_{pst} \pm 1.959963984540\,\sqrt{v(\hat{T}_{pst})} \]

NOTE: For a 90% confidence interval in equations 5 and 7, replace 1.959963984540 with 1.644853626951 and for an 80% confidence interval replace 1.959963984540 with 1.281551565545.
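As a check on the poststratification formulas, the following Python sketch recomputes the bad-debt example from the stratum sizes, sample means, and sample standard deviations. It is only a sketch under stated assumptions: the function name `poststratified_total` is hypothetical, and the variance expression is the one reconstructed from the worked example (it reproduces 120,499.88 and 3,240,111.44 to within rounding).

```python
import math

def poststratified_total(N_h, ybar_h, s_h, n):
    """Hypothetical helper for the poststratified estimator.

    N_h    : universe size of each stratum
    ybar_h : sample mean of each stratum
    s_h    : sample standard deviation of each stratum
    n      : total (simple random) sample size
    Returns (estimated total, per-stratum totals, per-stratum variances).
    """
    N = sum(N_h)
    totals = [Ni * yi for Ni, yi in zip(N_h, ybar_h)]
    variances = [
        (N - n) / n * Ni * si ** 2 + N / n ** 2 * (N - Ni) * si ** 2
        for Ni, si in zip(N_h, s_h)
    ]
    return sum(totals), totals, variances

# Example 9: inpatient (stratum 1) and outpatient (stratum 2) bad debts
total, totals, variances = poststratified_total(
    [373, 1146], [240.00, 30.00], [4.69484, 14.09098], 100
)
se = math.sqrt(sum(variances))
print(total, [round(v, 2) for v in variances], round(se))
```

The total printed here is the hand-calculated 123,900; the program's 123,898 comes from unrounded sample means.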

Unknown Universe Size

This program calculates a confidence interval for a universe total when using variable sampling, as does the Unrestricted Variable Appraisal program. When using the Unrestricted Variable Appraisal program, one of the user queries is for the universe size (N) and this value must be known. For situations where N is unknown, the Unknown Universe Size program can be used.

Use of this program requires that two random samples be used: one to estimate the universe size and the other to estimate one or more variable characteristics. Both samples must be appraised using the Unrestricted Variable Appraisal program prior to running this module since this program will ask for the mean and standard deviation of each sample.

The population of interest is a subset of some other universe. For example, the larger sampling frame might consist of 575 file drawers containing a mixture of dental claims and the population of interest consists of all claims related to a particular dental procedure (say, procedure ABC). The first step is to estimate the number of claims related to procedure ABC in all 575 drawers. Suppose we sample 70 of the file drawers and count the number of claims related to procedure ABC. The results on the next page were obtained and are stored in data file DATAUNIV.TXT.

    Sampled Drawer     Number of Claims Related to Procedure ABC
          1                              9
          2                             12
          3                              9
          .                              .
          .                              .
         70                             10
        Total                          723

Sample summary: \(\bar{x}\) = 10.33, s = 2.75

Data file DATAUNIV.TXT

      1    9           36    9
      2   12           37   12
      3    9           38    8
      4    6           39    6
      5   12           40   14
      6   13           41    9
      7   10           42   14
      8    9           43   14
      9   10           44    6
     10    6           45   14
     11   13           46   12
     12    7           47    9
     13   12           48   10
     14    7           49    9
     15   12           50    9
     16   12           51    8
     17   10           52   14
     18    9           53    9
     19   10           54   14
     20   10           55    8
     21   13           56   12
     22   10           57   14
     23   10           58   10
     24   10           59   12
     25    6           60    9
     26    8           61   12
     27   14           62   12
     28   22           63   14
     29   10           64   10
     30    8           65   10
     31    8           66    6
     32    8           67    9
     33   12           68    8
     34   10           69   12
     35    8           70   10

NOTE: This file contains 70 lines.

The following computer output was obtained using the Unrestricted Variable Appraisal program. In the Universe Size box, the size of the larger universe (575 file drawers here) was used. With this procedure, you are able to see the estimated size of the universe of interest (\(\hat{N}\) = 5,939) in this output.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    VARIABLE UNRESTRICTED APPRAISAL
    Date: 10/24/2004                                   Time: 12:12
    AUDIT/REVIEW: First Sample
    DATA FILE USED: C:\Temp\DATAUNIV.TXT

    SAMPLE SIZE               70
    VALUE OF SAMPLE       723.00
    NONZERO ITEMS             70

    ----------------------- E X A M I N E D ------------------------
    MEAN / UNIVERSE        10.33 / 575
    STANDARD DEVIATION      2.75
    (NOTE: The mean and standard deviation are input to the Unknown Universe Size program.)
    STANDARD ERROR           .31
    SKEWNESS                1.02
    KURTOSIS                5.95
    POINT ESTIMATE         5,939   <--estimated universe size

                           CONFIDENCE LIMITS
                        80% CONFIDENCE    90% CONFIDENCE    95% CONFIDENCE
    LOWER LIMIT                  5,710             5,644             5,586
    UPPER LIMIT                  6,168             6,234             6,292
    PRECISION AMOUNT               229               295               353
    PRECISION PERCENT            3.86%             4.97%             5.95%
    T-VALUE USED        1.293941609194    1.667238548669    1.994945415107

98 < .Unknown Universe Size RAT-STATS Companion Manual The next step is to independently obtain a random sample from the population of interest to appraise the variable(s) of interest.62 6.29 21.95 14.09 6.30 29.98 16.05 NOTE: This file contains 55 lines.95 19.38 16.89 15.85 19.00 13.98 16.75 6.55 13.97 17.50 14.97 15. Page 3-102 (Rev.97 18.97 22.51 6.98 20.> 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 16. 10/2004) .85 13.45 15.69 21.32 25. Data file DATAVAR.98 17. The results are stored in data file DATAVAR.17 23.07 20.12 15.TXT 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 15.34 6.62 6.10 17.23 15.75 19. It was decided to sample 55 claims related to procedure ABC and record the amount in error for each sampled claim.62 17.47 19.15 22.25 20.61 12.93 12.98 14.32 21.62 18.TXT.78 6.26 6.48 6.65 15.05 18.05 14.45 21.65 6.continued .60 6.17 14.

This file is used as input to the Unrestricted Variable Appraisal program. The user can enter any value in the Universe Size box since this value has no effect on the results produced by the Unknown Universe Size program. The following computer output was produced.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    VARIABLE UNRESTRICTED APPRAISAL
    Date: 10/24/2004                                   Time: 13:12
    AUDIT/REVIEW: Second Sample
    DATA FILE USED: C:\Temp\DATAVAR.TXT

    SAMPLE SIZE               55
    VALUE OF SAMPLE       860.76
    NONZERO ITEMS             55

    --------------------- D I F F E R E N C E ----------------------
    MEAN / UNIVERSE        15.65 / 1,000   (Any value can be used for universe)
    STANDARD DEVIATION      5.45
    (NOTE: The mean and standard deviation are input to the Unknown Universe Size program.)
    STANDARD ERROR           .71
    SKEWNESS                -.14
    KURTOSIS                2.63
    POINT ESTIMATE        15,650

                           CONFIDENCE LIMITS
                        80% CONFIDENCE    90% CONFIDENCE    95% CONFIDENCE
    LOWER LIMIT                 14,723            14,455            14,218
    UPPER LIMIT                 16,577            16,846            17,082
    PRECISION AMOUNT               927             1,196             1,432
    PRECISION PERCENT            5.92%             7.64%             9.15%
    T-VALUE USED        1.297426488209    1.673564906352    2.004879288188

33)(575)](15.152 86.957 5.75 5.560 6.OFFICE OF AUDIT SERVICES VARIABLE APPRAISAL WITH UNKNOWN UNIVERSE SIZE AUDIT/REVIEW: Variable Unknown Universe Size Time: 13:48 = = = = = = = = = = = = = = = = I N P U T = = = = = = = = = = = = = = = = SAMPLE TO SAMPLE FOR ESTIMATE POPULATION VARIABLE ATTRIBUTE UNIVERSE 575 SAMPLE 70 55 MEAN 10.098 10.118 1− 575 70 and the value of se2 is 5.957.45 55 1− =. the value of se1 is (2.355 99. This is T Using the formula section.7315 5939 55 Page 3-104 (Rev.431 8.75)(575) 70 = 177.959963984540 Discussion.957 5.65) = $92. 10/2004) . The estimated total dollars in the universe is equal to (estimated universe size)(mean of the variable sample) $.055 10.644853626951 92.957 5. = [(10.859 103. The following computer output is produced by the Unknown Universe Size program: Date: 10/24/2004 DEPARTMENT OF HEALTH & HUMAN SERVICES OIG .152 84.33 15.10% 1.Unknown Universe Size RAT-STATS Companion Manual Comment: This is the same example used in the RAT-STATS User’s Guide illustration.45 = = = = = = = = = = = = = E S T I M A T I O N = = = = = = = = = = = = = 80% CONFIDENCE 90% CONFIDENCE 95% CONFIDENCE POINT ESTIMATE STANDARD ERROR LOWER LIMIT UPPER LIMIT PRECISION AMOUNT PRECISION PERCENT Z-VALUE USED 92.474 9.603 7.65 STANDARD DEVIATION 2.152 82.86% 1.12% 1.281551565545 $ --> T 92.483 101.

The estimated variance of \(\hat{T}\) is

\[ [(15.65)(177.118)]^2 + [(5939)(.7315)]^2 - [(177.118)(.7315)]^2 = 26{,}540{,}250 \]

The estimated standard error of \(\hat{T}\) is \(\sqrt{26{,}540{,}250}\) = $5,152.

The PRECISION AMOUNT at the 90% confidence level is (5,152)(1.644853626951) = $8,474 (highlighted). This amount is 9.12% of the point estimate, since 8,474/92,957 = .0912. The corresponding confidence interval is 92,957 ± 8,474, that is, $84,483 to $101,431.

FORMULAS

Given:
Larger universe size: \(N_1\).
Sample to estimate universe size: sample size is \(n_1\), mean is \(\bar{x}_1\), and standard deviation is \(s_1\).
Sample to estimate variable of interest: sample size is \(n_2\), mean is \(\bar{x}_2\), and standard deviation is \(s_2\).

1. Estimate of universe (of interest) size

\[ \hat{N} = N_1 \cdot \bar{x}_1 \]

2. Overall estimate for variable total (e.g., total error amount)

\[ \hat{T} = \hat{N} \cdot \bar{x}_2 \]

3. Estimated variance of \(\hat{T}\)

\[ v(\hat{T}) = [\bar{x}_2 \cdot se_1]^2 + [\hat{N} \cdot se_2]^2 - [se_1 \cdot se_2]^2 \]

where

\[ se_1 = \frac{s_1 N_1}{\sqrt{n_1}} \sqrt{1 - \frac{n_1}{N_1}} \quad\text{and}\quad se_2 = \frac{s_2}{\sqrt{n_2}} \sqrt{1 - \frac{n_2}{\hat{N}}} \]

page 343. 10/2004) . exercise 10.959963984540 with 1. Page 3-106 (Rev.. Approximate 95% confidence interval for universe total (T): $ ± 1959963984540 $) T . Keith Ord. Alan Stuart and J. Reference: Estimated variance: Kendall's Advanced Theory of Statistics. 1987.959963984540 with 1. Volume 1.281551565545.23. replace 1.644853626951 and for an 80% confidence interval replace 1. v (T NOTE: For a 90% confidence interval. 5th ed.Unknown Universe Size RAT-STATS Companion Manual 4. New York: Oxford University Press.

SAMPLE SIZE DETERMINATION

A commonly encountered question in auditing is “How large a sample is necessary?” When using an unrestricted (simple random) sample, this depends on the desired precision of the point estimate. The programs in this section are listed below and are concerned with determining sample sizes for various data types and sample strategies.

Variable
• Unrestricted Using a Probe Sample
• Unrestricted Using Estimated Error Rate
• Stratified (Total Sample Size Known)
• Stratified (Total Sample Size Unknown)

Variable Sample Size Determination

This RAT-STATS module can be used for two situations.

Situation 1: The program will help select the necessary sample size for an unrestricted or stratified variable appraisal. The program output includes sample sizes for each stratum that will provide precision percentages of 1%, 2%, 5%, 10%, 15% and “Other.” When selecting “Other,” the user will be prompted to enter the desired precision percentage. The user may also select any combination of the following confidence levels: 80%, 90%, 95%, and 99%.

Situation 2: The program also allows the user to determine the optimum distribution of a sample among strata when the overall sample size has already been determined. It will allocate the larger samples to those strata that are larger in size and/or contain a larger amount of variation (are nonhomogeneous).

Variable Sample Size Determination - Unrestricted Using a Probe Sample

This program allows the user to estimate sample sizes for specified precision percentages and specified confidence levels. Any combination of the confidence levels 80%, 90%, 95%, and 99% can be selected. The user has the option of having the program read a probe sample file to obtain an estimate of the universe mean and standard deviation or input these two estimates directly without reading a probe sample file. The probe sample may be stored in a text file, an Excel spreadsheet, or an Access table.

Example 1. This example illustrates Situation 1. The audit objective was to determine the necessary sample sizes when estimating the total examined amount for the universe of 100,000 items. A probe sample of 25 examined values was obtained. The probe sample (SAMPDATA.TXT) is shown below.

    321  382  453  459  343
    388  313  420  407  395
    441  448  447  333  357
    395  477  391  356  368
    376  350  461  472  447

The sample mean is $400 and the sample standard deviation is $50. The input screen and resulting text file output are shown on the following page.

The following text file output is obtained using the previous screen. A sample size under 30 will be flagged using “(*)” and the note immediately following the sample sizes will appear.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    Date: 5/11/2004                                    Time: 21:52

    Sample Size Determination

                                   Confidence Level
                         80%        90%        95%        99%
                 1%      256        421        597       1026
                 2%       64        106        150        259
    Precision    5%       10 (*)     17 (*)     24 (*)     41
    Level       10%        3 (*)      4 (*)      6 (*)     10 (*)
                15%        1 (*)      2 (*)      3 (*)      5 (*)
                25%       --          1 (*)      1 (*)      2 (*)

    Estimated Mean: 400.00
    Estimated Std. Deviation: 50.00
    Universe Size: 100,000

    NOTE (*): One or more sample sizes were under 30. You may need to increase the sample sizes in order to be in compliance with organizational objectives. The generated sample sizes were the result of mathematical formulas and did not incorporate management decisions concerning the purpose of the sample or current organizational sampling policies.

Explanation of Output

The output for each cell in the output table will consist of (1) the necessary sample size or (2) the text “--”. The necessary sample size is the number of sample items necessary to obtain the specified sample precision at the specified confidence level. For example, in this illustration, a sample size of 106 is necessary to obtain a point estimate having a precision percentage of plus or minus 2% using a 90% confidence level. If the calculated sample size is 0, a text value of “--” will appear in this cell. This occurred in the lower left cell for the sample illustration. The output also contains the estimated mean and standard deviation, along with the specified universe size.

FORMULAS

Let
PREC = the precision percentage (e.g., 1 for 1%, 10 for 10%)
ZVAL = the value from the standard normal (Z) distribution having a right-tail area equal to (100 - Confidence Level)/2, where the right-tail area is expressed as a proportion between 0 and 1. ZVAL is 1.281551565545 (80%), 1.644853626951 (90%), 1.959963984540 (95%), and 2.575829303549 (99%).
N = the universe size
Mean = estimated universe mean obtained from the probe sample or specified by the user
StdDev = estimated universe standard deviation obtained from the probe sample or specified by the user
E = maximum error = (PREC/100) · Mean · N

For each selected value of PREC and ZVAL, the sample size is

\[ n = \frac{(StdDev \cdot N)^2}{(E / ZVAL)^2 + N \cdot (StdDev)^2} \]

The value of n is rounded up or down to the nearest integer.
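The sample size formula is easy to check against the Example 1 output table. The Python sketch below is an illustration only; the function name `sample_size` and the dictionary `Z` are hypothetical, and rounding to the nearest integer is done with `floor(n + 0.5)` as one plausible reading of "rounded up or down to the nearest integer".

```python
import math

# z-values quoted in the manual for each confidence level
Z = {
    80: 1.281551565545,
    90: 1.644853626951,
    95: 1.959963984540,
    99: 2.575829303549,
}

def sample_size(prec, level, mean, std_dev, N):
    """Hypothetical helper: unrestricted variable sample size.

    prec is the precision percentage (1 for 1%), level the confidence
    level in percent, N the universe size.
    """
    E = (prec / 100) * mean * N                      # maximum error
    n = (std_dev * N) ** 2 / ((E / Z[level]) ** 2 + N * std_dev ** 2)
    return math.floor(n + 0.5)                       # nearest integer

# Example 1: probe-sample mean 400, standard deviation 50, N = 100,000
for prec in (1, 2, 5, 10, 15, 25):
    print(prec, [sample_size(prec, lvl, 400, 50, 100000) for lvl in (80, 90, 95, 99)])
```

A returned value of 0 corresponds to the “--” cell in the program output.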

Variable Sample Size Determination - Unrestricted Using Expected Error Rate

This procedure estimates the mean and standard deviation of the difference (error) amounts by assuming (1) any item found to be in error is 100% in error and (2) the mean and standard deviation of the nonzero error amounts is the same as the mean and standard deviation of the reported (examined) amounts. Even though these assumptions may not be entirely true, this procedure will often give more reliable sample size estimates than those obtained using the Variable Unrestricted (Using Reported Amounts) module since the expected number of zero values in the error population is factored into the sample size calculation.

Example 2. This example illustrates another method of dealing with Situation 1. The estimated error rate is 15% for a universe of 10,000 transactions. The total reported amount is $3,000,000 and the standard deviation of the reported amounts is $125. Consequently, the mean reported amount is $300. Of interest is the required sample size necessary in order to obtain plus or minus 15% using a 90% confidence level.

Comment. The mean and standard deviation of the error amounts are estimated by assuming the percentage of nonzero errors in the error population is equal to the expected error rate (one of the input values) and the nonzero errors resemble the reported amounts, that is, the mean and standard deviation of the nonzero errors are equal to the mean and standard deviation of the reported amounts.

The corresponding input screen follows where 25% was specified for the “Other” precision level.

The text file output shown on the next page is obtained using this input screen. A sample size under 30 will be flagged using “(*)” in the program output and a note informing the user of this fact will also appear in the program output immediately following the sample sizes.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    Date: 12/22/2004                                   Time: 10:14

    Sample Size Determination

                                   Confidence Level
                         80%        90%        95%        99%
                 1%     9181       9486       9633       9784
                 2%     7370       8219       8676       9188
    Precision    5%     3095       4248       5119       6443
    Level       10%     1008       1559       2077       3117
                15%      474        758       1044       1675
                25%      176        287        403        675

    Universe Size: 10,000
    Anticipated Error Rate: 15%
    Reported Amounts -   Total Amount: 3,000,000.00
                         Standard Deviation: 125.00
    Difference Values -  Estimated Mean: 45.00
                         Estimated Standard Deviation: 117.55

Explanation of Output

The output for each cell in the output table will consist of (1) the necessary sample size or (2) the text “--”. The necessary sample size is the number of sample items necessary to obtain the specified sample precision at the specified confidence level. For example, in this illustration, a sample size of 758 is necessary to obtain a point estimate having a precision percentage of plus or minus 15% using a 90% confidence level. If the calculated sample size is 0, a text value of “--” will appear in this cell. The output also contains the estimated mean and standard deviation of the difference (error) values. For this illustration, the estimated mean and standard deviation are $45.00 and $117.55, respectively.

FORMULAS

Let
PREC = the precision percentage (e.g., 1 for 1%, 10 for 10%)
ZVAL = the value from the standard normal (Z) distribution having a right-tail area equal to (100 - Confidence Level)/2, where the right-tail area is expressed as a proportion between 0 and 1. ZVAL is 1.281551565545 (80%), 1.644853626951 (90%), 1.959963984540 (95%), and 2.575829303549 (99%).
N = the universe size (input)
TR = the total reported amount (input)
\(\mu_R\) = mean reported amount = TR / N
\(\sigma_R\) = standard deviation of reported amounts (input)
\(\hat{p}\) = the estimated error rate (input)
\(\hat{\mu}_D\) = the estimated mean of the difference (error) values = \(\hat{p}\,\mu_R\)
\(\hat{\sigma}_D\) = estimated standard deviation of the difference (error) values

\[ \hat{\sigma}_D = \sqrt{\hat{p}\,[\sigma_R^2 + (1 - \hat{p})\,\mu_R^2]} \]

E = maximum error = (PREC/100) · \(\hat{\mu}_D\) · N

For each selected value of PREC and ZVAL, the sample size is

\[ n = \frac{(\hat{\sigma}_D \cdot N)^2}{(E / ZVAL)^2 + N \cdot (\hat{\sigma}_D)^2} \]

The value of n is rounded up or down to the nearest integer.
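The error-rate version of the calculation can be sketched the same way. This Python fragment is an illustration under stated assumptions, not the program's code: the function name `error_rate_sample_size` and the dictionary `Z` are hypothetical, and the inputs are those of Example 2.

```python
import math

# z-values quoted in the manual for each confidence level
Z = {
    80: 1.281551565545,
    90: 1.644853626951,
    95: 1.959963984540,
    99: 2.575829303549,
}

def error_rate_sample_size(prec, level, total_reported, sd_reported, N, error_rate):
    """Hypothetical helper: sample size from an expected error rate.

    Returns (estimated mean of differences, estimated standard
    deviation of differences, sample size).
    """
    mu_R = total_reported / N                        # mean reported amount
    mu_D = error_rate * mu_R                         # mean of the differences
    sigma_D = math.sqrt(
        error_rate * (sd_reported ** 2 + (1 - error_rate) * mu_R ** 2)
    )
    E = (prec / 100) * mu_D * N                      # maximum error
    n = (sigma_D * N) ** 2 / ((E / Z[level]) ** 2 + N * sigma_D ** 2)
    return mu_D, sigma_D, math.floor(n + 0.5)        # nearest integer

# Example 2: $3,000,000 reported over 10,000 items, sd $125, 15% error rate
mu_D, sigma_D, n = error_rate_sample_size(15, 90, 3_000_000, 125, 10_000, 0.15)
print(round(mu_D, 2), round(sigma_D, 2), n)
```

Calling the same function with `prec=25` gives the "Other" row of the output table for the 90% column.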

Variable Sample Size Determination - Stratified

Stratified Sample Sizes - Total Sample Size is Unknown

Example 3. This example illustrates Situation 1. Of interest is the total audit (claimed) amount for the universe. Two strata have been defined: the high-income stratum (N1 = 100,000 items) and the low-income stratum (N2 = 500,000 items). For the high-income stratum, the estimated mean of the audited amounts is $10,000 and the estimated standard deviation is $5,000. These values for the low-income stratum are $5,000 (mean) and $4,000 (standard deviation). At a confidence level of 95%, what sample size is required to obtain a precision percentage of ± 10%?

Solution. The following input screen was used for this example.

The following output is obtained using the previous screen. If one or more of the sample sizes are under 30, the note immediately following the total sample sizes will appear.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    Sample Size Determination
    Date: 10/19/2004                               Time: 12:02

    THE ESTIMATES ARE BASED ON THE FOLLOWING ENTRIES:

    NBR  DESCRIPTION    -UNIVERSE-     -MEAN-     -STD.DEV.-   -RATIO-
     1   High Income       100,000   10,000.00      5,000.00    20.00%
     2   Low Income        500,000    5,000.00      4,000.00    80.00%
         -TOTALS-          600,000    5,833.33      4,579.54

    Sample Sizes for Stratum 1: High Income
                             Confidence Level
    Precision Level      80%       90%       95%       99%
          1%            1653      2699      3795      6406
          2%             418       687       972      1669
          5%              67       111       157       271
         10%              17 (*)    28 (*)    40        68
         15%               8 (*)    13 (*)    18 (*)    31
         25%               3 (*)     5 (*)     7 (*)    11 (*)

    Sample Sizes for Stratum 2: Low Income
                             Confidence Level
    Precision Level      80%       90%       95%       99%
          1%            6611     10793     15180     25624
          2%            1671      2745      3888      6676
          5%             268       442       627      1081
         10%              68       111       157       271
         15%              30        50        70       121
         25%              11 (*)    18 (*)    26 (*)    44

    Total Sample Sizes
                             Confidence Level
    Precision Level      80%       90%       95%       99%
          1%            8264     13492     18975     32030
          2%            2089      3432      4860      8345
          5%             335       553       784      1352
         10%              85       139       197       339
         15%              38        63        88       152
         25%              14 (*)    23 (*)    33        55

000 and $5. In the final portion of the output.000. respectively.000. If any of the calculated samples sizes exceeds the corresponding universe size. and 157 items from stratum 2 with a sample mean and standard deviation of $5. The program reduced the calculated sample size to the universe size. When this data set (named STRATA.500.863 and is in fact (approximately) 10% of the point estimate.000 and $4. Consequently. At the 95% confidence level. the precision amount is 349. the computer output on the next page was obtained.000. The additional sampling units were then distributed among the remaining strata based on optimal allocation formulas. For 10% precision and 95% confidence.000.TXT) was used as input to the STRATIFIED VARIABLE APPRAISAL module.RAT-STATS Companion Manual Stratified Variable Sample Size Determination NOTE (*): One or more sample sizes were under 30. a data set was constructed that contained 40 items from stratum 1 with a sample mean and standard deviation of $10. notice that the resulting point estimate for the universe total is 3. This assumes that the resulting sample means and standard deviations are the same as the values used as input to this program. (Rev. a 95% confidence interval based on these sample sizes should result in a precision percentage of ±10%. You may need to increase the sample sizes in order to be in compliance with organizational objectives.043. the program will conclude with the following reminder: NOTE (#): The formulas calculated a sample size greater than the universe size. 10/2004) Page 4-13 . the total sample size required is n = 197 with n1 = 40 items to be obtained from the high-income stratum and n2 = 157 from the lowincome stratum. To demonstrate this. respectively. Discussion. The generated sample sizes were the result of mathematical formulas and did not incorporate management decisions concerning the purpose of the sample or current organizational sampling policies.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    STRATIFIED VARIABLE APPRAISAL
    Date: 10/23/2004                               Time: 10:36
    AUDIT/REVIEW: Two Strata Example
    DATA FILE USED: C:\Temp\STRATA.TXT

    STRATUM NUMBER    SAMPLE SIZE    VALUE OF SAMPLE    NONZERO ITEMS
    1                      40             400,000.00           40
    2                     157             785,000.00          157
    TOTALS                197           1,185,000.00          197

    Stratum 1              ---------- D I F F E R E N C E ----------
    MEAN / UNIVERSE               10,000.00     100,000
    STANDARD DEVIATION             5,000.04     <- Approx. 5,000
    STANDARD ERROR                   790.42
    SKEWNESS                           -.22
    KURTOSIS                           2.30
    STANDARD ERROR (MEAN)            790.42
    STANDARD ERROR (TOTAL)       79,041,756
    POINT ESTIMATE            1,000,000,000
    CONFIDENCE LIMITS
      80% CONFIDENCE LEVEL
        LOWER LIMIT             896,958,117
        UPPER LIMIT           1,103,041,883
        PRECISION AMOUNT        103,041,883
        PRECISION PERCENT            10.30%
        T-VALUE USED         1.303638588621
      90% CONFIDENCE LEVEL
        LOWER LIMIT             866,824,512
        UPPER LIMIT           1,133,175,488
        PRECISION AMOUNT        133,175,488
        PRECISION PERCENT            13.32%
        T-VALUE USED         1.684875121711
      95% CONFIDENCE LEVEL
        LOWER LIMIT             840,122,958
        UPPER LIMIT           1,159,877,042
        PRECISION AMOUNT        159,877,042
        PRECISION PERCENT            15.99%
        T-VALUE USED         2.022690920037

    Stratum 2
    MEAN / UNIVERSE                5,000.00     500,000
    STANDARD DEVIATION             3,999.81     <- Approx. 4,000
    SKEWNESS                            .87
    KURTOSIS                           3.30
    STANDARD ERROR (MEAN)            319.17
    STANDARD ERROR (TOTAL)      159,584,887
    POINT ESTIMATE            2,500,000,000

    CONFIDENCE LIMITS
      80% CONFIDENCE LEVEL
        LOWER LIMIT           2,294,613,944
        UPPER LIMIT           2,705,386,056
        PRECISION AMOUNT        205,386,056
        PRECISION PERCENT             8.22%
        T-VALUE USED         1.287001917850
      90% CONFIDENCE LEVEL
        LOWER LIMIT           2,235,938,079
        UPPER LIMIT           2,764,061,921
        PRECISION AMOUNT        264,061,921
        PRECISION PERCENT            10.56%
        T-VALUE USED         1.654679995672
      95% CONFIDENCE LEVEL
        LOWER LIMIT           2,184,773,966
        UPPER LIMIT           2,815,226,034
        PRECISION AMOUNT        315,226,034
        PRECISION PERCENT            12.61%
        T-VALUE USED         1.975287507703

    OVERALL
    POINT ESTIMATE / UNIVERSE 3,500,000,000     600,000
    STANDARD ERROR              178,086,876
    CONFIDENCE LIMITS
      80% CONFIDENCE LEVEL
        LOWER LIMIT           3,271,772,485
        UPPER LIMIT           3,728,227,515
        PRECISION AMOUNT        228,227,515
        PRECISION PERCENT             6.52%
        Z-VALUE USED         1.281551565545
      90% CONFIDENCE LEVEL
        LOWER LIMIT           3,207,073,156
        UPPER LIMIT           3,792,926,844
        PRECISION AMOUNT        292,926,844
        PRECISION PERCENT             8.37%
        Z-VALUE USED         1.644853626951
      95% CONFIDENCE LEVEL
        LOWER LIMIT           3,150,956,137
        UPPER LIMIT           3,849,043,863
        PRECISION AMOUNT        349,043,863
        PRECISION PERCENT             9.97%
        Z-VALUE USED         1.959963984540

Comments

(1) When the sample of size n = 197 is obtained, the values of the sample mean and standard deviation will likely not be exactly those specified in the input to this program.

Consequently, the best the user can hope for is that the resulting precision percentage will be approximately 10%.

(2) For the preceding example, the specified precision was 10% of the point estimate. The point estimate for the universe total was 3,500,000,000. In the formula section, E is the desired precision amount expressed as a percentage of the point estimate for the universe total. Here this would be E = 350,000,000.

(3) For situations in which you do not have an estimate of the universe standard deviation (\( \sigma \)) from previous audit results, a rough approximation for \( \sigma \) can be obtained for each stratum by estimating (1) the largest value (L) that you expect to see in the sample for this stratum and (2) the smallest value (S) that you expect to see in this stratum. Then, the approximate value of \( \sigma \) for this stratum is

\[
\hat{\sigma} = \frac{L - S}{4}
\]

In the previous example, if the largest audit amount that you expect to see in the LOW INCOME stratum is L = $15,000 and the smallest value is S = $1,000, then the estimated standard deviation is \( \hat{\sigma} \) = (15,000 - 1,000)/4 = $3,500.

Stratified Sample Sizes - Total Sample Size is Known

Example 4. This is an illustration of situation 2. The situation is the same as that described in Example 3, which used two strata -- the high-income stratum and the low-income stratum. The total sample size is set at 500. Notice that the user is unable to set the precision percentages for this situation. The input screen on the following page was used for this example.

The following estimates were used as input to the program:

    Stratum        Estimated        Estimated    Estimated
                   Universe Size    Mean         Standard Deviation
    High Income       100,000        10,000          5,000
    Low Income        500,000         5,000          4,000

The program output on the next page is obtained. Notice that the resulting strata ratios (i.e., 20% and 80%) are identical to those obtained in Example 3.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    Sample Size Determination
    Date: 10/23/2004                               Time: 13:07

    THE ESTIMATES ARE BASED ON THE FOLLOWING ENTRIES:

    NBR  DESCRIPTION    -UNIVERSE-     -MEAN-     -STD.DEV.-
     1   High Income       100,000   10,000.00      5,000.00
     2   Low Income        500,000    5,000.00      4,000.00
         -TOTALS-          600,000    5,833.33      4,579.54

    The following sample sizes are based on a total sample size of 500.

    Stratum 1: High Income    Sample Size   100    Ratio   20.00%
    Stratum 2: Low Income     Sample Size   400    Ratio   80.00%

    Precision Values:
                      Confidence Level
        80%        90%        95%        99%
       4.09%      5.25%      6.26%      8.22%

    NOTE: See the Discussion section.

Discussion. For this example, \( \sum N_i \hat{\sigma}_i \) is (100,000)(5,000) + (500,000)(4,000) = 2,500,000,000. Call this SUM. The ratio value for stratum 1 is (100,000)(5,000) divided by SUM, that is, .2. So, 20% of the sample size is allocated to stratum 1. Similarly, the ratio for stratum 2 is .8. For this example, n1 is (500)(.2) = 100 and n2 is (500)(.8) = 400. NOTE: This same discussion applies to Example 3.

What is the precision amount for this sampling design? This will be the value obtained by the Stratified Variable Appraisal program using these sample sizes and estimated standard deviations. This formula (borrowed from the Stratified Variable Appraisal formula section) is contained in the formula section to follow. The two sample sizes are n1 = 100 and n2 = 400, which total n = 500. For this example, the precision amount will be

\[
1.95996 \sqrt{100{,}000^2 \left(\frac{100{,}000 - 100}{100{,}000}\right) \frac{5{,}000^2}{100} + 500{,}000^2 \left(\frac{500{,}000 - 400}{500{,}000}\right) \frac{4{,}000^2}{400}} = 219{,}038{,}136
\]

The estimated universe total is

\[
\hat{T} = \sum (\text{stratum mean})(\text{stratum size}) = (10{,}000)(100{,}000) + (5{,}000)(500{,}000) = 3{,}500{,}000{,}000
\]

The resulting precision percentage is 100 · (219,038,136 / 3,500,000,000) = 6.26%. This value is called PERC in the formula section to follow and matches the highlighted value (6.26% at the 95% confidence level) in the computer output.

FORMULAS

Total Sample Size (n) is Known

Notation
L = Number of strata
Ni = the universe size for the i-th stratum
(StdDev)i = estimated universe standard deviation for the i-th stratum

\[
\mathrm{SUM} = \sum_{i=1}^{L} N_i \cdot (\mathrm{StdDev})_i
\]

(Ratio)i = [Ni · (StdDev)i] / SUM

The resulting sample size allocated to the i-th stratum is ni = n · (Ratio)i.
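The allocation and the precision calculation for Example 4 can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the precision amount is computed with the finite-population-corrected expression used in the worked calculation above, not with code taken from RAT-STATS.

```python
import math

def allocate_known_total(n, sizes, std_devs):
    """Split a fixed total sample size n in proportion to N_i * (StdDev)_i."""
    weights = [N_i * s_i for N_i, s_i in zip(sizes, std_devs)]
    total = sum(weights)                       # SUM in the notation above
    return [n * w / total for w in weights]    # n_i = n * (Ratio)_i

def precision_amount(sizes, std_devs, samples, z=1.959963984540):
    """z times the standard error of the estimated universe total."""
    variance = sum(
        N_i**2 * ((N_i - n_i) / N_i) * s_i**2 / n_i
        for N_i, s_i, n_i in zip(sizes, std_devs, samples)
    )
    return z * math.sqrt(variance)

# Example 4 inputs: two strata, total sample size fixed at 500.
sizes, means, std_devs = [100_000, 500_000], [10_000, 5_000], [5_000, 4_000]
samples = allocate_known_total(500, sizes, std_devs)
univ_total = sum(N_i * m_i for N_i, m_i in zip(sizes, means))
perc = 100 * precision_amount(sizes, std_devs, samples) / univ_total
print(samples, round(perc, 2))
```

Passing a different Z value to `precision_amount` (for example 1.644853626951 for 90%) gives the precision at another confidence level. The sketch uses the full Z value, so the precision amount differs slightly from the 219,038,136 obtained above with 1.95996.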

Total Sample Size (n) is Unknown

Notation
L = Number of strata
Ni = the universe size for the i-th stratum
N = the total universe size = \( \sum_{i=1}^{L} N_i \)
(Mean)i = estimated universe mean for the i-th stratum
UnivTotal = estimated universe total = \( \sum_{i=1}^{L} N_i \cdot (\mathrm{Mean})_i \)
(StdDev)i = estimated universe standard deviation for the i-th stratum
SUM1 = \( \sum_{i=1}^{L} N_i \cdot (\mathrm{StdDev})_i \)
SUM2 = \( \sum_{i=1}^{L} N_i \cdot (\mathrm{StdDev})_i^2 \)
(Ratio)i = [Ni · (StdDev)i] / SUM1
PREC = the precision percentage (e.g., 1 for 1%, 10 for 10%)
ZVAL = the value from the standard normal (Z) distribution having a right-tail area equal to (100 - Confidence Level)/2, where the right-tail area is expressed as a proportion between zero and one. ZVAL is 1.281551565545 (80%), 1.644853626951 (90%), 1.959963984540 (95%), and 2.575829303549 (99%).
E = the precision amount = (PREC/100) · (UnivTotal)

For each selected value of PREC and ZVAL,

(1) the total sample size is

\[
n = \frac{(\mathrm{SUM1})^2}{(E/\mathrm{ZVAL})^2 + \mathrm{SUM2}}
\]

(2) the sample size allocated to the i-th stratum is ni = n · (Ratio)i

Comments

1. In the preceding calculation, the value of n is treated as a floating point number (e.g., n = 487.263) and the strata sample sizes (ni) are calculated using this value. The ni values are then rounded up to the nearest integer. After all strata sample sizes have been determined, n is reset to the sum of the ni.

2. If the computed sample size for stratum i (ni) is larger than the universe size Ni, then ni is set equal to Ni. The remaining sample sizes are then obtained by applying the above formula and (1) omitting the i-th stratum in the denominator and (2) replacing n with n - Ni (the total sample size for the remaining L-1 strata).

The precision percentage at the 95% confidence level is ± PERC, where

\[
\mathrm{PERC} = \frac{100 \cdot 1.959963984540}{\hat{T}} \sqrt{\sum_{i=1}^{L} N_i^2 \left(\frac{N_i - n_i}{N_i}\right) \frac{(\mathrm{StdDev})_i^2}{n_i}}
\]

and where

\[
\hat{T} = \sum_{i=1}^{L} N_i \hat{\mu}_i
\]

is the estimated total for the universe. The value of \( \hat{T} \) is obtained by multiplying Ni by the estimated mean for stratum i and summing over the L strata.

NOTE: Replace 1.959963984540 with 1.281551565545 for an 80% interval, 1.644853626951 for a 90% interval, and 2.575829303549 for a 99% interval.
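The unknown-total-sample-size formulas can be exercised on Example 3 as follows. This is a minimal sketch: the function name is hypothetical, and it deliberately omits the adjustment in Comment 2 (capping a stratum at its universe size and reallocating), which is not needed for these inputs.

```python
import math

def stratified_sample_sizes(sizes, means, std_devs, prec, z):
    """Return (per-stratum sample sizes, total) for precision prec (percent) and Z value z."""
    univ_total = sum(N * m for N, m in zip(sizes, means))
    sum1 = sum(N * s for N, s in zip(sizes, std_devs))
    sum2 = sum(N * s**2 for N, s in zip(sizes, std_devs))
    E = (prec / 100) * univ_total               # precision amount
    n = sum1**2 / ((E / z) ** 2 + sum2)         # kept as a floating-point value
    # n_i = n * (Ratio)_i, rounded up; the total is then reset to the sum of the n_i.
    per_stratum = [math.ceil(n * N * s / sum1) for N, s in zip(sizes, std_devs)]
    return per_stratum, sum(per_stratum)

# Example 3 inputs: 10% precision at the 95% confidence level.
sizes, means, std_devs = [100_000, 500_000], [10_000, 5_000], [5_000, 4_000]
print(stratified_sample_sizes(sizes, means, std_devs, prec=10, z=1.959963984540))
```

For these inputs the unrounded total is roughly 195.9, so rounding each stratum up separately is what produces 40 + 157 = 197 rather than 196.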

Attribute Sample Size Determination

This program determines the sample size for an attribute simple random sample. The sample size is determined for specified degrees of precision (using the desired width of the confidence intervals) and for various levels of confidence. The user may select any combination of the following confidence levels: 80%, 90%, 95%, and 99%. The resulting sample size is the smallest sample size capable of meeting the specified precision requirement at each of the specified confidence levels.

Confidence intervals for attribute sampling are exact and are based on the hypergeometric distribution. As a result, such confidence intervals are usually not symmetric about the point estimate. For example, the point estimate might be 3% and the corresponding 95% confidence interval might be 2% to 6%. For this illustration, the width of the confidence interval is 4% and the confidence level is 95%. Consequently, attribute confidence intervals differ from the usual interval obtained by deriving the point estimate plus or minus the estimated precision. Because of this, the “desired precision” for the attribute sampling procedure must be specified as the desired width (rather than the half-width) of the confidence interval.

An approximate confidence interval for a universe proportion (discussed in many introductory statistics textbooks) is based on the normal approximation. This particular interval follows the “usual” procedure where the confidence interval is equal to (point estimate) ± (estimated precision), where the estimated precision is half the width of the resulting confidence interval; that is, this interval is symmetric about the point estimate. However, this confidence interval is approximate and is unreliable whenever the estimated proportion is very small or very

large, unless the sample size is extremely large. The confidence interval using the RAT-STATS attribute sample size module discussed here is always exact.

The input screen includes (1) the size of the universe and (2) the anticipated rate of occurrence in the universe. This rate of occurrence is generally estimated from past experience, either from similar systems or a past review of this universe. If no information concerning the rate of occurrence is available, the most conservative procedure is to specify 50% for this value. This will produce the largest possible sample size (for fixed values of N and precision range) but the user will be guaranteed that the resulting confidence interval will meet the desired precision range. If the actual rate of occurrence differs from the user-specified rate of occurrence, this in no way affects the sample's validity but the resulting precision (confidence interval width) may not meet the desired precision requirement.

Example 5. An audit is to be carried out using a universe of N = 10,000 documents to determine what proportion (p) of the documents do not have the proper approval signature. It is estimated from previous audit experience that 20% of the documents will not have the proper signature. Consequently, the estimate of p is \( \hat{p} \) = .20. NOTE: This may be a rough guess if little information regarding this estimate is available. If the user has no idea as to the value of p, \( \hat{p} \) = .5 should be used.

A confidence level of 95% will be used. Suppose that the desired precision range is 6%. This is equal to the desired value of (upper confidence limit - lower confidence limit). If the confidence limits were symmetric about the point estimate, the user would have specified the precision as ± 3% for this situation, where 3% is half the width of the resulting confidence interval. Since the exact procedure used in this

program usually does not produce an interval symmetric about the point estimate, the user must specify the desired total width of the confidence interval.

The following input screen is used for this example:

The resulting computer output (saved to a text file) is shown on the next page.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    Sample Size Determination
    Date: 10/19/2004                               Time: 8:46

                          Confidence Level
                     80%      90%      95%      99%
    Sample Size      314      488      666    1,079

    Anticipated Rate of Occurrence: 20%
    Desired Precision Range: 6%
    Universe Size: 10,000

Explanation of Output

The output for each cell in the output table will consist of (1) the necessary sample size or (2) the text “.-”. The necessary sample size is the number of sample items necessary to obtain the specified sample precision at each confidence level. For example, in this illustration a sample size of 488 is necessary to obtain a confidence interval having a width of 6% using a 90% confidence level. If the calculated sample size is zero, a text value of “.-” will appear in this cell.

Discussion. The necessary sample size (highlighted) is n = 666. As a result, if the resulting point estimate is close to \( \hat{p} \) = .20 after the sample of 666 is obtained, then the resulting 95% confidence interval for p should have a width approximately equal to .06 (i.e., 6%). If the resulting sample produced 133 documents not containing the proper signature, then the rate of occurrence in this sample would be 133/666, that is, approximately 20%. The resulting confidence interval will have a width equal to .06 (such as .1710 to .2310, with a width of .2310 - .1710 = .06). This can be seen in the computer output below, obtained using the Unrestricted Attribute Appraisal module. The

width of this 95% confidence interval is 23.10% - 17.10% = 6% (the desired precision range). Notice that \( \hat{p} \) = .20 (i.e., 20%) is inside this interval (it always is), but it is not in the center.

    Department of Health and Human Services
    OIG - Office of Audit Services
    Single Stage Attribute Appraisal
    Date: 10/19/2004                               Time: 12:24
    AUDIT/REVIEW: Example

    UNIVERSE SIZE                                10,000
    SAMPLE SIZE                                     666

    CHARACTERISTIC(S) OF INTEREST
      QUANTITY IDENTIFIED IN SAMPLE                 133
      PROJECTED QUANTITY IN UNIVERSE              1,997
      PERCENT                                   19.970%
    STANDARD ERROR
      PROJECTED QUANTITY                            150
      PERCENT                                    1.498%

    CONFIDENCE LIMITS
      80% CONFIDENCE LEVEL
        LOWER LIMIT - QUANTITY                    1,805
                      PERCENT                   18.050%
        UPPER LIMIT - QUANTITY                    2,202
                      PERCENT                   22.020%
      90% CONFIDENCE LEVEL
        LOWER LIMIT - QUANTITY                    1,754
                      PERCENT                   17.540%
        UPPER LIMIT - QUANTITY                    2,259
                      PERCENT                   22.590%
      95% CONFIDENCE LEVEL
        LOWER LIMIT - QUANTITY                    1,710
                      PERCENT                   17.100%
        UPPER LIMIT - QUANTITY                    2,310
                      PERCENT                   23.100%

Example 6. Repeat Example 5 where no information is available regarding the proportion of documents not containing the proper signature.

Solution. Here, the user should enter 50% (\( \hat{p} \) = .5) in the Anticipated Error Rate box. The resulting computer output is shown below.

    DEPARTMENT OF HEALTH & HUMAN SERVICES
    OIG - OFFICE OF AUDIT SERVICES
    Sample Size Determination
    Date: 10/19/2004                               Time: 12:33

                          Confidence Level
                     80%      90%      95%      99%
    Sample Size      466      725      991    1,580

    Anticipated Rate of Occurrence: 50%
    Desired Precision Range: 6%
    Universe Size: 10,000

Discussion. The necessary sample size (highlighted) is now n = 991, approximately 50% larger than the previous sample size of 666. To illustrate, the computer output below was obtained when the sample of 991 documents produced 248 not containing the proper signature. Here, \( \hat{p} \) = 248/991 = .250 and the confidence interval width (using the highlighted values in the following computer output) is 5.21%. This value is less than 6%. Using \( \hat{p} \) = .5 produces a very large value of n, but the user did have the guarantee that the confidence interval width would be no more than 6%.

Example 6 illustrates how using \( \hat{p} \) = .5 is a very conservative procedure because with a sample of size n = 991, quite likely the resulting confidence interval will have a width considerably less than the desired precision range of 6%. The user should be encouraged to use even a rough guess for the value of \( \hat{p} \).

    Department of Health and Human Services
    OIG - Office of Audit Services
    Single Stage Attribute Appraisal
    Date: 10/19/2004                               Time: 12:43
    AUDIT/REVIEW: Example

    UNIVERSE SIZE                                10,000
    SAMPLE SIZE                                     991

    CHARACTERISTIC(S) OF INTEREST
      QUANTITY IDENTIFIED IN SAMPLE                 248
      PROJECTED QUANTITY IN UNIVERSE              2,503
      PERCENT                                   25.025%
    STANDARD ERROR
      PROJECTED QUANTITY                            131
      PERCENT                                    1.307%

    CONFIDENCE LIMITS
      80% CONFIDENCE LEVEL
        LOWER LIMIT - QUANTITY                    2,334
                      PERCENT                   23.340%
        UPPER LIMIT - QUANTITY                    2,678
                      PERCENT                   26.780%
      90% CONFIDENCE LEVEL
        LOWER LIMIT - QUANTITY                    2,288
                      PERCENT                   22.880%
        UPPER LIMIT - QUANTITY                    2,727
                      PERCENT                   27.270%
      95% CONFIDENCE LEVEL
        LOWER LIMIT - QUANTITY                    2,249
                      PERCENT                   22.490%
        UPPER LIMIT - QUANTITY                    2,770
                      PERCENT                   27.700%

FORMULAS

In the discussion to follow, a sample item having the attribute of interest will be referred to as an item “in error.” Consequently, the universe proportion, p, will be the “error rate.”

Consider the case where the specified confidence level is 95%. The lower limit of the 95% confidence interval is, say, k1, where k1 is the smallest value of k for which

\[
\sum_{i=x}^{n} \frac{\binom{k}{i}\binom{N-k}{n-i}}{\binom{N}{n}} > .025
\]

The upper limit of the 95% confidence interval for the universe total is, say, k2, where k2 is the largest value of k for which

\[
\sum_{i=0}^{x} \frac{\binom{k}{i}\binom{N-k}{n-i}}{\binom{N}{n}} > .025
\]

where
N = universe size
n = sample size
k = total number of universe items in error
x = number of sample items in error
.025 = [1 - (confidence level)]/2

NOTE: Here, the “confidence level” is expressed as .95.

The resulting 95% confidence interval for the total number of universe items in error is k1 to k2.
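The two inequalities above translate directly into a search over k. The sketch below is illustrative only: the function name is hypothetical, and the use of bisection (which relies on the upper-tail sum never decreasing, and the lower-tail sum never increasing, as k grows) is an assumption of this example rather than a description of how RAT-STATS performs the search.

```python
from math import comb

def exact_attribute_limits(N, n, x, conf_level=0.95):
    """Return (k1, k2), exact limits on the number of universe items in error."""
    alpha = (1 - conf_level) / 2
    denom = comb(N, n)

    def upper_tail(k):   # P(X >= x) when the universe holds k items in error
        return sum(comb(k, i) * comb(N - k, n - i) for i in range(x, n + 1)) / denom

    def lower_tail(k):   # P(X <= x) when the universe holds k items in error
        return sum(comb(k, i) * comb(N - k, n - i) for i in range(0, x + 1)) / denom

    k_min, k_max = x, N - (n - x)   # values of k compatible with the observed sample

    # k1: smallest k whose upper-tail probability exceeds alpha.
    lo, hi = k_min, k_max
    while lo < hi:
        mid = (lo + hi) // 2
        if upper_tail(mid) > alpha:
            hi = mid
        else:
            lo = mid + 1
    k1 = lo

    # k2: largest k whose lower-tail probability exceeds alpha.
    lo, hi = k_min, k_max
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if lower_tail(mid) > alpha:
            lo = mid
        else:
            hi = mid - 1
    k2 = lo
    return k1, k2

# Tiny universe that can be checked by hand: N = 10, n = 5, no sample items in error.
print(exact_attribute_limits(10, 5, 0))
```

In the hand-checkable case, P(X = 0) is 21/252 when k = 3 and 6/252 when k = 4, so the largest k with probability above .025 is 3. For comparison with the manual, the same function can be called with N = 10,000, n = 666 and x = 133; the appraisal output for that case reports limits of 1,710 and 2,310.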

The procedure used to derive this confidence interval can be found in John P. Buonaccorsi (1987), “A Note on Confidence Intervals for Proportions in Finite Populations,” The American Statistician, Vol. 41, No. 3, pp. 215-218.

Summary of program procedure. Suppose that the universe size is N = 10,000, the anticipated rate of occurrence (i.e., error rate) is 20%, and the desired precision range is 6%. Since (10,000)(.06) is 600, we know that k2 = k1 + 600; that is, the upper confidence limit must be 600 more than the lower limit, where, in general, 600 is equal to N · (desired precision range). For a specified confidence level of 95%, this program searches for the value of n that produces a confidence interval (k1 to k2) such that k1 and k2 satisfy the preceding two inequalities and k2 - k1 = 600.

The anticipated rate of occurrence is used to specify the number of sample items that contain the characteristic of interest. For the preceding example, it would be 20% of n, where n is the sample size determined by this program.

For example, suppose that n = 300 and (300)(.20) = 60 (call this x). If the values N = 10,000, n = 300, and x = 60 are used as input to the Unrestricted Attribute Appraisal program, the resulting 95% confidence interval for the universe proportion (p) has a lower limit of .1569 [i.e., k1 = (10,000)(.1569) = 1,569] and an upper limit of .2490 [i.e., k2 = (10,000)(.2490) = 2,490]. But this is not a satisfactory value of n since k2 - k1 = 2,490 - 1,569 = 921, which must equal 600 according to the previous discussion.

Here, if n = 666, then (666)(.20) ≈ 133. If the values N = 10,000, n = 666, and x = 133 are used as input to the Unrestricted Attribute Appraisal module, the resulting 95% confidence interval for the universe proportion (p) has a lower limit of .1710 [i.e., k1 = (10,000)(.1710) = 1,710] and an upper limit of .2310 [i.e.,

k2 = (10,000)(.2310) = 2,310]. This is satisfactory, since k2 - k1 = 600 and the difference of the two proportions is .06 (i.e., 6%).
