Uploaded by api-3696796


**5.4 Evaluation of Different Imputation Methods**

To determine the effect of the nonresponse rate on the results of each imputation method (IM), the different IMs were evaluated. The results of each IM are discussed independently, in the following order: (1) bias of the mean of the imputed data, (2) distribution of the imputed data using the Kolmogorov-Smirnov Goodness of Fit Test, and (3) other measures of variability, namely the mean deviation (MD), mean absolute deviation (MAD), and root mean square deviation (RMSD).

The table of results for each IM contains the following columns: (a) variable of interest (VI), (b) nonresponse rate (NRR), (c) the bias of the mean of the imputed data, BIAS(y'), (d) percentage of correct distribution (PCD), that is, the percentage of the 1000 trials in which the imputed data had the same distribution as the actual data set, (e) MD, (f) MAD, and (g) RMSD.
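The criteria in columns (c), (e), (f) and (g) can be sketched as follows. The formulas are not spelled out in this section, so the definitions below are assumptions: the bias is taken as the mean of the completed data minus the mean of the actual data, and MD, MAD and RMSD are taken over the imputed cases only. The function and argument names are hypothetical.

```python
import math


def evaluation_criteria(true_values, completed_values, imputed_idx):
    """Return bias of the mean, MD, MAD and RMSD for one imputed data set.

    true_values      -- the actual data before any value was deleted
    completed_values -- the same data after imputation
    imputed_idx      -- positions that were deleted and then imputed
    """
    n = len(true_values)
    # Assumed definition: mean of completed data minus mean of actual data.
    bias = sum(completed_values) / n - sum(true_values) / n
    # Assumed definition: deviations are computed on imputed cases only.
    devs = [completed_values[i] - true_values[i] for i in imputed_idx]
    m = len(devs)
    md = sum(devs) / m                              # mean (signed) deviation
    mad = sum(abs(d) for d in devs) / m             # mean absolute deviation
    rmsd = math.sqrt(sum(d * d for d in devs) / m)  # root mean square deviation
    return {"bias": bias, "MD": md, "MAD": mad, "RMSD": rmsd}


# Made-up example: four records, the last two were imputed.
result = evaluation_criteria([10.0, 20.0, 30.0, 40.0],
                             [10.0, 20.0, 36.0, 32.0],
                             [2, 3])
```

Under these assumed definitions the bias equals the MD multiplied by the nonresponse rate, which appears consistent with the HDI3 results in Table 9 (for example, 4919.40 × 0.10 ≈ 491.9 for TOTEX2 at the 10% NRR).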

5.4.1 Overall Mean Imputation

Table 8 shows the results of the different criteria in evaluating the imputed data using the OMI method.

**Table 8: Criteria results for the OMI method**

| (a) VI | (b) NRR | (c) BIAS(y') | (d) PCD | (e) MD | (f) MAD | (g) RMSD |
|---|---|---|---|---|---|---|
| TOTEX2 | 10% | 640.66 | 0.00% | -6406.60 | 56929.61 | 108547.82 |
| TOTEX2 | 20% | 499.43 | 0.00% | -2497.14 | 59555.36 | 119193.32 |
| TOTEX2 | 30% | -222.76 | 0.00% | 20310.91 | 90396.26 | 271775.35 |
| TOTIN2 | 10% | -597.84 | 0.00% | 5978.39 | 77502.27 | 167206.24 |
| TOTIN2 | 20% | -2855.49 | 0.00% | 14277.43 | 87469.87 | 244758.00 |
| TOTIN2 | 30% | -6093.27 | 0.00% | 742.53 | 62388.11 | 151740.94 |

Generated by Foxit PDF Creator © Foxit Software http://www.foxitsoftware.com For evaluation only.

1. Bias of the mean of the imputed data

In column (c) of Table 8, the bias of the mean of the imputed data for TOTEX2 slowly decreases in magnitude as the NRR increases. The rationale behind this is the decrease in magnitude of the respondents' mean as the NRR increases. As the respondents' mean decreases, the variability caused by imputing a single value (i.e. the mean of TOTEX1, the total expenditure from the first visit data, which is equal to 105566.9) that is higher than the mean of the actual data set also decreases.

On the other hand, the results for TOTIN2 are the opposite of those for TOTEX2. The bias of the mean of the imputed data for TOTIN2 rapidly increases in magnitude as the NRR increases. The rationale is again the decrease in magnitude of the respondents' mean as the NRR increases. However, unlike in TOTEX2, the imputed value (i.e. the mean of TOTIN1, the total income from the first visit data, which is equal to 121820.7) is much lower than the actual mean of the data set.

2. Distribution of the Imputed Data

Results in column (d) of Table 8 show that, for all NRRs and VIs, the OMI method failed to maintain the distribution of the actual data. This was expected, primarily because each missing observation of a VI was replaced by a single value, the overall mean of the first visit VI.

Related studies that performed OMI stated that this method is one of the worst among all IMs since it distorts the distribution of the data. The distribution becomes too peaked, which makes this method unsuitable for many post-analyses (Cheng & Sy, 1999).
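The peaking effect described above can be illustrated with a minimal sketch. It assumes that missing values are marked as `None` and that a single fill value is supplied by the caller (in this study, the first visit mean). The data and the helper names are made up for illustration.

```python
def overall_mean_impute(values, fill_value):
    """Replace every missing entry (None) with one fixed value."""
    return [fill_value if v is None else v for v in values]


def population_variance(xs):
    """Variance with divisor n, used here only to compare spreads."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


# Made-up data: two of six values were deleted and then imputed.
actual = [40.0, 55.0, 70.0, 90.0, 120.0, 200.0]
observed = [40.0, None, 70.0, 90.0, None, 200.0]
completed = overall_mean_impute(observed, fill_value=100.0)

# Every imputed case receives the same value, so the completed data
# pile up at one point and the spread shrinks relative to the actual data.
spread_actual = population_variance(actual)
spread_completed = population_variance(completed)
```

Because every imputed case lands on the same value, all imputed observations fall in a single frequency class, which is why a goodness of fit test against the actual data is expected to fail.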

3. Other measures of variability

Columns (e), (f) and (g) of Table 8 show the other measures of variability of the imputed data. For TOTEX2, the values of the MAD and RMSD increase in magnitude as the NRR increases. The data with the highest percentage of imputed values have the highest values for the three measures of variability in TOTEX2. A huge increase in magnitude is seen in all three criteria from the twenty to the thirty percent NRR for TOTEX2.

For TOTIN2, the data with twenty percent imputed observations have the highest values in all three measures of variability. Surprisingly, unlike for TOTEX2, the data under the highest NRR have the lowest values for the three measures of variability.

5.4.2 Hot Deck Imputation

Table 9 shows the results of the different criteria in evaluating the imputed data using the hot deck imputation method with three imputation classes (HDI3).

**Table 9: Criteria results for the HDI3 method**

| (a) VI | (b) NRR | (c) BIAS(y') | (d) PCD | (e) MD | (f) MAD | (g) RMSD |
|---|---|---|---|---|---|---|
| TOTEX2 | 10% | 491.91 | 100.00% | 4919.40 | 78071.61 | 79251.22 |
| TOTEX2 | 20% | 179.42 | 96.90% | 897.18 | 78292.63 | 67149.16 |
| TOTEX2 | 30% | -606.37 | 0.00% | -2021.19 | 81395.79 | 71390.65 |
| TOTIN2 | 10% | -717.52 | 100.00% | -7175.25 | 105369.15 | 242022.99 |
| TOTIN2 | 20% | -3095.41 | 100.00% | -15477.09 | 111748.04 | 297151.50 |
| TOTIN2 | 30% | -6508.65 | 1.00% | -21695.52 | 115087.13 | 313814.92 |

1. Bias of the mean of the imputed data

Similar to the results of the OMI method for the TOTIN2 variable, the bias of the mean of the imputed data rapidly increases in magnitude as the NRR increases. For the TOTEX2 variable, the bias fluctuates as the NRR increases. For both TOTEX2 and TOTIN2, the data with the highest NRR have the largest bias. For TOTEX2, the data with twenty percent NRR gave the least bias, while for TOTIN2 the data with the lowest NRR yielded the smallest bias.

2. Distribution of the Imputed Data

Results in column (d) show that for TOTIN2, the data with ten and twenty percent imputed observations maintained the distribution of the actual data. For TOTEX2, only the data with ten percent imputed observations maintained the distribution of the actual data in all one thousand data sets. With twenty percent imputed observations, 969 out of the 1000 data sets maintained the distribution of the actual data set.

For both TOTEX2 and TOTIN2, the data with the highest number of imputed observations failed to maintain the distribution of the actual data. None of the simulated data sets for TOTEX2 registered the same distribution as the actual data, and for TOTIN2 only a lone data set did. The researchers are looking into the possibility that more than one recipient has the same donor.
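A minimal sketch of hot deck imputation within imputation classes is given below. The donor selection rule is not described in this section, so the sketch assumes that a donor is drawn at random, with replacement, from the respondents in the recipient's class. Under that assumption one donor can serve several recipients, which is the situation the researchers suspect. All names are hypothetical.

```python
import random


def hot_deck_impute(values, classes, rng=None):
    """Fill each missing value (None) with the value of a randomly
    chosen respondent (donor) from the same imputation class."""
    rng = rng or random.Random()
    # Collect the observed values (potential donors) of each class.
    donors = {}
    for v, c in zip(values, classes):
        if v is not None:
            donors.setdefault(c, []).append(v)
    completed = []
    for v, c in zip(values, classes):
        if v is None:
            # Assumption: sampling with replacement, so the same donor
            # may be reused. A class without donors raises KeyError.
            v = rng.choice(donors[c])
        completed.append(v)
    return completed


# Made-up example with two imputation classes.
values = [10.0, None, 12.0, None, 50.0, 55.0, None]
classes = [1, 1, 1, 1, 2, 2, 2]
completed = hot_deck_impute(values, classes, rng=random.Random(1))
```

Repeating the call with different seeds gives the kind of repeated simulation (1000 data sets in this study) over which the PCD is computed.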

3. Other measures of variability

Columns (e), (f) and (g) of Table 9 show the other measures of variability of the imputed data. For the variable TOTEX2, the following results were obtained: (i) the data with twenty percent imputed values yielded the smallest values for the MD and RMSD, (ii) the data with the lowest number of imputations yielded the largest values for the MD and RMSD, and (iii) the MAD is the only criterion whose values increase as the NRR increases.

For the variable TOTIN2, the following results were obtained: (i) all three criteria increase in magnitude as the NRR increases, (ii) the results for the three criteria were larger in magnitude than for TOTEX2, and (iii) the data with the largest number of imputations generated the highest values in the three criteria.

5.4.3 Deterministic Regression Imputation

Table 10 shows the results of the different criteria in evaluating the imputed data using the deterministic regression imputation method with three imputation classes (DRI3).

**Table 10: Criteria results for the DRI3 method**

| (a) VI | (b) NRR | (c) BIAS(y') | (d) PCD | (e) MD | (f) MAD | (g) RMSD |
|---|---|---|---|---|---|---|
| TOTEX2 | 10% | 536.32 | 100.00% | 5363.47 | 33683.48 | 70553.64 |
| TOTEX2 | 20% | 1080.12 | 98.40% | 5400.71 | 33782.60 | 72487.39 |
| TOTEX2 | 30% | 398.39 | 100.00% | 1328.06 | 32449.49 | 72803.60 |
| TOTIN2 | 10% | 897.11 | 100.00% | 9043.98 | 51363.17 | 106374.39 |
| TOTIN2 | 20% | -1815.39 | 100.00% | -9076.98 | 57429.24 | 148278.49 |
| TOTIN2 | 30% | 356.50 | 100.00% | 1188.31 | 51886.73 | 131429.61 |

1. Bias of the mean of the imputed data

In column (c) of Table 10, the bias increases in magnitude as the NRR increases for both TOTEX2 and TOTIN2. Compared to OMI and HDI3, where the bias increases tremendously as the NRR increases, the increase in bias for DRI3 is much slower. The bias of the data with twenty percent NRR is just twice the bias of the data with ten percent NRR. For TOTEX2, this method produces a larger bias of the mean of the imputed data at all NRRs than OMI and HDI3.

2. Distribution of the Imputed Data

Contrary to the results of the OMI method under this criterion, results in column (d) show that the imputed data maintained the distribution of the actual data in all NRRs and VIs. DRI3 is even better than HDI3, since all of the imputed data sets under all NRRs and VIs preserved the same distribution as the actual data. It is interesting to note that the regression models used in this study did not show the results expected from the related literature. Earlier studies that used categorical auxiliary variables, that is, matching variables transformed into dummy variables, concluded that DRI is just the same as mean imputation. In this study, however, the independent variable was the first visit VI, and each imputation class has its own fitted model, which registered a good R².
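The idea of one fitted model per imputation class, with the first visit VI as the independent variable, can be sketched as follows. The exact model form is not given in this section, so a simple linear regression with an intercept is assumed, and all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def deterministic_regression_impute(first_visit, second_visit, classes):
    """Impute missing second visit values (None) with the prediction
    of a line fitted on the respondents of the same imputation class."""
    models = {}
    for c in set(classes):
        pairs = [(x, y)
                 for x, y, k in zip(first_visit, second_visit, classes)
                 if k == c and y is not None]
        models[c] = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    completed = []
    for x, y, c in zip(first_visit, second_visit, classes):
        if y is None:
            a, b = models[c]
            y = a + b * x  # predicted value, no residual added
        completed.append(y)
    return completed


# Made-up example: class 1 follows y = 1 + 2x, class 2 follows y = 0.5x.
x1 = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0]
y2 = [3.0, 5.0, None, 9.0, 5.0, None, 15.0]
cls = [1, 1, 1, 1, 2, 2, 2]
completed = deterministic_regression_impute(x1, y2, cls)
```

Unlike mean imputation, the imputed value varies with each record's first visit value, which is one way to read why the distribution is better preserved here than under OMI.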

3. Other measures of variability

Columns (e), (f) and (g) of Table 10 show the other measures of variability of the imputed data. For these criteria, the following results were obtained. First, the results for the three criteria are almost stable as the NRR increases for TOTEX2 and TOTIN2. The rate of change of the values of the MD, MAD and RMSD is minimal compared to OMI and HDI3. Second, the MAD and RMSD have smaller values than under OMI and HDI3 for both TOTEX2 and TOTIN2. Fitting models with a high R² was the key factor that made this method better than the two IMs previously evaluated.

5.4.4 Stochastic Regression Imputation

Table 11 shows the results of the different criteria in evaluating the imputed data using the stochastic regression imputation method with three imputation classes (SRI3).

**Table 11: Criteria results for the SRI3 method**

| (a) VI | (b) NRR | (c) BIAS(y') | (d) PCD | (e) MD | (f) MAD | (g) RMSD |
|---|---|---|---|---|---|---|
| TOTEX2 | 10% | 536.32 | 100.00% | 5363.47 | 33683.48 | 70553.64 |
| TOTEX2 | 20% | 1080.12 | 98.40% | 5400.71 | 33782.60 | 72487.39 |
| TOTEX2 | 30% | 398.39 | 100.00% | 1328.06 | 32449.49 | 72803.60 |
| TOTIN2 | 10% | 897.11 | 100.00% | 9043.98 | 51363.17 | 106374.39 |
| TOTIN2 | 20% | -1815.39 | 100.00% | -9076.98 | 57429.24 | 148278.49 |
| TOTIN2 | 30% | 356.50 | 100.00% | 1188.31 | 51886.73 | 131429.61 |

1. Bias of the mean of the imputed data

In column (c) of Table 11, the values for both TOTEX2 and TOTIN2 are much better than those for DRI3. Unlike in the previous three methods, the bias for TOTEX2 and TOTIN2 does not increase as the NRR increases. Instead, the biases fluctuate from one NRR to another. Compared to the three methods previously evaluated, this method gave the least bias at the highest NRR for both TOTEX2 and TOTIN2. While the other methods reached a four-digit bias, SRI3 generated only a three-digit bias. Moreover, there is a huge disparity at the third NRR, where SRI3 produced less than twenty percent of the bias produced by its deterministic counterpart.

2. Distribution of the imputed data

SRI3 performed better than HDI3, which was also simulated 1000 times. Unlike HDI3, SRI3 maintained the same distribution in all imputed data sets for the first and third nonresponse rates. SRI3 also outperformed HDI3 at the twenty percent NRR. In earlier studies, stochastic regression imputation performed better than any of the three other methods used here. The random residual was added to the deterministic predicted value to preserve the distribution of the data.
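The step of adding a random residual to the deterministic prediction can be sketched as below. How the residual is generated is not stated in this section, so the sketch assumes it is drawn at random from the observed residuals of the fitted line. Drawing from a normal distribution with the residual variance would be another common choice. For brevity the sketch handles a single imputation class, and the function name is hypothetical.

```python
import random


def stochastic_regression_impute(xs, ys, rng=None):
    """Impute missing ys (None) as fitted value plus a random residual.

    Assumption: the residual is resampled from the residuals of the
    respondents; one imputation class only.
    """
    rng = rng or random.Random()
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    n = len(obs)
    mx = sum(x for x, _ in obs) / n
    my = sum(y for _, y in obs) / n
    # Least squares slope and intercept of y = a + b * x.
    b = (sum((x - mx) * (y - my) for x, y in obs)
         / sum((x - mx) ** 2 for x, _ in obs))
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in obs]
    # Deterministic prediction plus one randomly drawn residual.
    return [a + b * x + rng.choice(residuals) if y is None else y
            for x, y in zip(xs, ys)]


# Made-up example: noisy linear data with two missing values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, None, 8.2, None, 11.8]
completed = stochastic_regression_impute(xs, ys, rng=random.Random(0))
```

The added residual restores some of the scatter around the fitted line that purely deterministic predictions remove.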

3. Other measures of variability

Columns (e), (f) and (g) of Table 11 show the other measures of variability of the imputed data. For these criteria, the following results were obtained. First, similar to the results for the bias of the mean of the imputed data, the results for TOTIN2 under all the criteria fluctuate from one NRR to another. Second, for TOTEX2, only the RMSD increases as the NRR increases, while the MAD and MD fluctuate from one NRR to another. Third, the data with the highest NRR yielded the lowest values for the MD criterion. Fourth, for TOTIN2, the data with twenty percent NRR yielded the largest values in magnitude for the three criteria.

**5.5 Distribution of the True vs. Imputed Values**

To provide additional information on the distribution of the imputed data discussed previously, the distributions of the true (deleted) values (TVs) and of the imputed values (IVs) from each of the IMs were obtained for all the VIs and NRRs. Tables 12, 13, and 14 show the frequency distributions of the methods with their corresponding relative frequencies (RFs) for the first, second, and third NRR respectively. The RFs for the 1000 simulated data sets from HDI3 and SRI3 were averaged. The first column contains the frequency classes (FCs) of each VI. These are the same classes that were used in the Kolmogorov-Smirnov Goodness of Fit Test to determine the estimated percentage of similar distributions of the imputed data. For each NRR, the table containing the distribution of the actual and imputed values has the following columns: (a) VIs, (b) FCs, (c) RFs of the TVs (TV), (d) RFs of the OMI (OMI), (e) RFs of the HDI3 (HDI3), (f) RFs of the DRI3 (DRI3), and (g) RFs of the SRI3 (SRI3).
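The relative frequencies per class, and the largest gap between two cumulative distributions that a Kolmogorov-Smirnov type comparison on grouped data relies on, can be sketched as follows. This is only an illustration with hypothetical function names. The critical value and decision rule used in the study are not stated in this section, so the sketch stops at the statistic.

```python
from bisect import bisect_right


def relative_frequencies(values, cut_points):
    """Share of values in each class defined by sorted cut points.

    Class 0 lies below the first cut point and the last class lies at
    or above the last cut point (values on a boundary go to the upper
    class, which is an assumption).
    """
    counts = [0] * (len(cut_points) + 1)
    for v in values:
        counts[bisect_right(cut_points, v)] += 1
    return [c / len(values) for c in counts]


def max_cumulative_gap(rf_a, rf_b):
    """Largest absolute difference between two cumulative distributions."""
    gap = cum_a = cum_b = 0.0
    for a, b in zip(rf_a, rf_b):
        cum_a += a
        cum_b += b
        gap = max(gap, abs(cum_a - cum_b))
    return gap


# RFs of the true values and of OMI for TOTEX2 at the 10% NRR (Table 12).
tv = [0.109, 0.097, 0.097, 0.114, 0.087, 0.097, 0.109, 0.111, 0.090, 0.089]
omi = [0.0] * 7 + [1.0] + [0.0, 0.0]
gap = max_cumulative_gap(tv, omi)
```

For these two columns the gap is about 0.71, reached just before the single class that holds all OMI values, which illustrates how far the OMI distribution departs from that of the true values.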

**Table 12: Distribution of the TVs and IVs: 10% NRR**

| (a) VI | (b) FCs | (c) TV | (d) OMI | (e) HDI3* | (f) DRI3 | (g) SRI3* |
|---|---|---|---|---|---|---|
| TOTEX2 | <37869.5 | 10.90% | 0.00% | 13.90% | 7.70% | 9.50% |
| TOTEX2 | 37869.5 - 47056.5 | 9.70% | 0.00% | 10.20% | 8.70% | 8.70% |
| TOTEX2 | 47056.5 - 54922.0 | 9.70% | 0.00% | 9.70% | 11.40% | 6.10% |
| TOTEX2 | 54922.0 - 62365.0 | 11.40% | 0.00% | 8.90% | 12.30% | 9.50% |
| TOTEX2 | 63265.0 - 73868.0 | 8.70% | 0.00% | 9.10% | 11.10% | 11.40% |
| TOTEX2 | 73868.0 - 86103.0 | 9.70% | 0.00% | 9.40% | 12.60% | 11.10% |
| TOTEX2 | 86103.0 - 101947.0 | 10.90% | 0.00% | 9.40% | 8.00% | 11.10% |
| TOTEX2 | 101947.0 - 126254.5 | 11.10% | 100.00% | 8.90% | 11.40% | 8.50% |
| TOTEX2 | 126254.5 - 169964.0 | 9.00% | 0.00% | 8.90% | 9.00% | 12.20% |
| TOTEX2 | >169964 | 8.90% | 0.00% | 11.60% | 7.70% | 12.10% |
| TOTIN2 | <40570 | 9.70% | 0.00% | 15.10% | 6.10% | 9.10% |
| TOTIN2 | 40570.0 - 51564.0 | 10.20% | 0.00% | 11.90% | 8.70% | 7.90% |
| TOTIN2 | 51564.0 - 62006.5 | 9.40% | 0.00% | 10.10% | 14.50% | 8.30% |
| TOTIN2 | 62006.5 - 73900.5 | 10.20% | 0.00% | 9.50% | 10.70% | 10.00% |
| TOTIN2 | 73900.5 - 88127.0 | 9.00% | 0.00% | 9.60% | 12.80% | 12.40% |
| TOTIN2 | 88127.0 - 104801.0 | 10.90% | 0.00% | 9.30% | 9.20% | 9.00% |
| TOTIN2 | 104801.0 - 128000.0 | 11.90% | 100.00% | 9.80% | 9.90% | 10.50% |
| TOTIN2 | 128000.0 - 161669.0 | 11.40% | 0.00% | 7.80% | 11.10% | 9.30% |
| TOTIN2 | 161669.0 - 233907.0 | 7.70% | 0.00% | 8.00% | 10.70% | 11.20% |
| TOTIN2 | >233907 | 9.90% | 0.00% | 8.90% | 6.30% | 12.30% |

\* RF for each class was obtained by taking the average over the 1000 simulated data sets.

**Table 13: Distribution of the TVs and IVs: 20% NRR**

| (a) VI | (b) FCs | (c) TV | (d) OMI | (e) HDI3* | (f) DRI3 | (g) SRI3* |
|---|---|---|---|---|---|---|
| TOTEX2 | <37869.5 | 9.40% | 0.00% | 14.30% | 7.40% | 8.20% |
| TOTEX2 | 37869.5 - 47056.5 | 9.70% | 0.00% | 10.40% | 9.60% | 7.60% |
| TOTEX2 | 47056.5 - 54922.0 | 11.60% | 0.00% | 9.70% | 9.00% | 8.20% |
| TOTEX2 | 54922.0 - 62365.0 | 10.00% | 0.00% | 9.00% | 11.00% | 7.90% |
| TOTEX2 | 63265.0 - 73868.0 | 9.60% | 0.00% | 9.20% | 12.30% | 10.30% |
| TOTEX2 | 73868.0 - 86103.0 | 8.40% | 0.00% | 9.40% | 12.50% | 11.90% |
| TOTEX2 | 86103.0 - 101947.0 | 9.60% | 0.00% | 9.30% | 9.90% | 10.30% |
| TOTEX2 | 101947.0 - 126254.5 | 11.30% | 100.00% | 8.70% | 10.80% | 11.80% |
| TOTEX2 | 126254.5 - 169964.0 | 9.70% | 0.00% | 8.70% | 8.80% | 11.70% |
| TOTEX2 | >169964 | 10.70% | 0.00% | 11.30% | 8.70% | 12.10% |
| TOTIN2 | <40570 | 10.00% | 0.00% | 15.70% | 4.80% | 11.80% |
| TOTIN2 | 40570.0 - 51564.0 | 10.30% | 0.00% | 12.10% | 11.90% | 12.20% |
| TOTIN2 | 51564.0 - 62006.5 | 11.70% | 0.00% | 10.10% | 10.20% | 11.30% |
| TOTIN2 | 62006.5 - 73900.5 | 10.20% | 0.00% | 9.60% | 11.70% | 9.90% |
| TOTIN2 | 73900.5 - 88127.0 | 8.60% | 0.00% | 9.50% | 11.90% | 8.50% |
| TOTIN2 | 88127.0 - 104801.0 | 9.40% | 0.00% | 9.30% | 9.60% | 10.10% |
| TOTIN2 | 104801.0 - 128000.0 | 9.10% | 100.00% | 9.70% | 11.70% | 9.00% |
| TOTIN2 | 128000.0 - 161669.0 | 9.20% | 0.00% | 7.60% | 9.80% | 8.30% |
| TOTIN2 | 161669.0 - 233907.0 | 11.30% | 0.00% | 7.80% | 9.70% | 8.90% |
| TOTIN2 | >233907 | 10.20% | 0.00% | 8.70% | 8.70% | 10.10% |

\* RF for each class was obtained by taking the average over the 1000 simulated data sets.

**Table 14: Distribution of the TVs and IVs: 30% NRR**

| (a) VI | (b) FCs | (c) TV | (d) OMI | (e) HDI3* | (f) DRI3 | (g) SRI3* |
|---|---|---|---|---|---|---|
| TOTEX2 | <37869.5 | 9.80% | 0.00% | 14.30% | 7.80% | 10.30% |
| TOTEX2 | 37869.5 - 47056.5 | 8.80% | 0.00% | 10.40% | 9.00% | 9.60% |
| TOTEX2 | 47056.5 - 54922.0 | 9.60% | 0.00% | 9.70% | 9.40% | 8.30% |
| TOTEX2 | 54922.0 - 62365.0 | 9.50% | 0.00% | 8.90% | 10.80% | 9.30% |
| TOTEX2 | 63265.0 - 73868.0 | 11.00% | 0.00% | 9.20% | 12.70% | 10.10% |
| TOTEX2 | 73868.0 - 86103.0 | 10.70% | 0.00% | 9.40% | 11.50% | 10.60% |
| TOTEX2 | 86103.0 - 101947.0 | 10.70% | 0.00% | 9.40% | 12.10% | 9.80% |
| TOTEX2 | 101947.0 - 126254.5 | 9.40% | 100.00% | 8.70% | 8.80% | 10.10% |
| TOTEX2 | 126254.5 - 169964.0 | 11.00% | 0.00% | 8.70% | 9.00% | 8.10% |
| TOTEX2 | >169964 | 9.50% | 0.00% | 11.30% | 9.00% | 13.70% |
| TOTIN2 | <40570 | 9.40% | 0.00% | 15.60% | 6.50% | 8.90% |
| TOTIN2 | 40570.0 - 51564.0 | 9.00% | 0.00% | 12.10% | 10.40% | 8.20% |
| TOTIN2 | 51564.0 - 62006.5 | 9.90% | 0.00% | 10.10% | 10.80% | 8.80% |
| TOTIN2 | 62006.5 - 73900.5 | 10.70% | 0.00% | 9.60% | 11.50% | 10.10% |
| TOTIN2 | 73900.5 - 88127.0 | 10.20% | 0.00% | 9.50% | 12.20% | 11.00% |
| TOTIN2 | 88127.0 - 104801.0 | 10.30% | 0.00% | 9.30% | 10.70% | 10.20% |
| TOTIN2 | 104801.0 - 128000.0 | 10.30% | 100.00% | 9.70% | 10.50% | 10.40% |
| TOTIN2 | 128000.0 - 161669.0 | 9.80% | 0.00% | 7.60% | 11.20% | 10.80% |
| TOTIN2 | 161669.0 - 233907.0 | 10.70% | 0.00% | 7.70% | 8.20% | 10.30% |
| TOTIN2 | >233907 | 9.90% | 0.00% | 8.70% | 8.00% | 11.30% |

\* RF for each class was obtained by taking the average over the 1000 simulated data sets.

At all NRRs, the results clearly illustrate the distortion of the distribution under OMI. Since the OMI method assigns the mean of the first visit VI to all the missing cases, all the imputed values are concentrated in one frequency class. The three other methods, which implemented imputation classes, gave a better outcome than OMI by spreading out the distribution of the imputed data.

For the HDI method, at all nonresponse rates, most of the imputed observations clustered in the first frequency class, that is, less than 37869.5 for TOTEX2 and less than 40570.0 for TOTIN2. Clustering also formed in the last frequency class for TOTEX2 at the first and third nonresponse rates, and in the second frequency class for TOTIN2 at all nonresponse rates. The percentage of the data in the lowest class for TOTEX2 and TOTIN2 ranges from 14-16% at all nonresponse rates, compared to the actual percentage, which only ranges from 9-11%.

While some classes are over-represented under HDI3, under-representation was observed in the interval 86103-126254.5 for the 10% and 20% nonresponse imputed data sets and in the interval 63265-101947 for the 30% nonresponse imputed data sets. For the 10% and 20% NRR, the percentage of the actual data in the indicated interval totaled about 30%, while the imputed data totaled less than 30%.

The two regression imputation methods, unlike hot deck and OMI which had major clusters, produced a more spread-out distribution, although some areas are under-represented. The failure to include a random residual term in deterministic regression resulted in a severe under-representation of the data, in particular in the first frequency class. On the other hand, SRI, which includes a random residual, gave better results than DRI. However, in some areas the added random residual produced a significant excess, mostly in the last frequency class.

**5.6 Choosing the best imputation method**

In this section, the rankings across all the tests are the basis for determining which IM is the best for this particular study and data. The best method is selected independently for each VI and NRR. The rankings are based on a four-point system in which a rank of 4 denotes the worst IM for a specific criterion and 1 denotes the best. In case of ties, the average ranks are substituted. The IM with the smallest rank total is declared the best IM for the particular VI and NRR. The ranking of the IMs covers the following criteria: (a) bias of the mean of the imputed data (N.B.), (b) percentage of correct distributions (PCD), and (c) other measures of variability, namely the MD, MAD and RMSD. All in all, each IM is ranked on five criteria.
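The ranking scheme can be sketched as follows. The sketch uses the standard average rank for ties, so three methods tied for first each receive rank 2. The ranking tables in this section show 1.3 for a three-way tie, so the tie rule actually applied in the study may differ from this one. The function names and the numbers in the example are made up.

```python
def average_ranks(scores):
    """Rank scores from best (1) to worst, smaller score being better.
    Tied scores share the average of the positions they occupy."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the run of scores tied with position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_totals(criteria_scores):
    """Sum the ranks of each method over all criteria.
    criteria_scores: one list of scores per criterion, one score per method."""
    n_methods = len(criteria_scores[0])
    totals = [0.0] * n_methods
    for scores in criteria_scores:
        for m, r in enumerate(average_ranks(scores)):
            totals[m] += r
    return totals


# Made-up scores for four methods on two criteria. For PCD a higher
# value is better, so its negative is ranked.
abs_bias = [3.0, 1.0, 2.0, 1.0]
pcd = [0.0, 100.0, 100.0, 100.0]
totals = rank_totals([abs_bias, [-p for p in pcd]])
```

For criteria that can be negative, such as the bias and the MD, the absolute value would be ranked, on the assumption that a smaller magnitude is better.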

Tables 15, 16 and 17 show the ranking of the different imputation methods for the 10%, 20% and 30% NRR respectively. For each NRR, the table containing the rankings of the IMs will go as follows: (a) VIs, (b) Criteria, (c) OMI, (d) HDI3, (e) DRI3, and (f) SRI3.

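**Table 15: Ranking of the Different IMs: 10% NRR**

| VI | CRITERIA | OMI | HDI3 | DRI3 | SRI3 |
|---|---|---|---|---|---|
| TOTEX2 | N.B. | 3 | 1 | 4 | 2 |
| TOTEX2 | PCD | 4 | 1.3 | 1.3 | 1.3 |
| TOTEX2 | MD | 3 | 1 | 4 | 2 |
| TOTEX2 | MAD | 3 | 4 | 1 | 2 |
| TOTEX2 | RMSD | 4 | 3 | 1 | 2 |
| TOTEX2 | TOTAL | 17 | 10.3 | 11.3 | 9.3 |
| TOTEX2 | Category Rank | 4th | 2nd | 3rd | 1st |
| TOTIN2 | N.B. | 1 | 2 | 4 | 3 |
| TOTIN2 | PCD | 4 | 1.3 | 1.3 | 1.3 |
| TOTIN2 | MD | 1 | 2 | 4 | 3 |
| TOTIN2 | MAD | 3 | 4 | 1 | 2 |
| TOTIN2 | RMSD | 3 | 4 | 1 | 2 |
| TOTIN2 | TOTAL | 12 | 13.3 | 11.3 | 11.3 |
| TOTIN2 | Category Rank | 3rd | 4th | 1st | 1st |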

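**Table 16: Ranking of the Different IMs: 20% NRR**

| VI | CRITERIA | OMI | HDI3 | DRI3 | SRI3 |
|---|---|---|---|---|---|
| TOTEX2 | N.B. | 2 | 1 | 4 | 3 |
| TOTEX2 | PCD | 4 | 3 | 1 | 2 |
| TOTEX2 | MD | 2 | 1 | 4 | 3 |
| TOTEX2 | MAD | 3 | 4 | 1 | 2 |
| TOTEX2 | RMSD | 4 | 2 | 1 | 3 |
| TOTEX2 | TOTAL | 15 | 11 | 11 | 13 |
| TOTEX2 | Category Rank | 4th | 1st | 1st | 3rd |
| TOTIN2 | N.B. | 3 | 4 | 2 | 1 |
| TOTIN2 | PCD | 4 | 1.3 | 1.3 | 1.3 |
| TOTIN2 | MD | 3 | 4 | 2 | 1 |
| TOTIN2 | MAD | 3 | 4 | 1 | 2 |
| TOTIN2 | RMSD | 3 | 4 | 1 | 2 |
| TOTIN2 | TOTAL | 16 | 17.3 | 7.3 | 7.3 |
| TOTIN2 | Category Rank | 3rd | 4th | 1st | 1st |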

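**Table 17: Ranking of the Different IMs: 30% NRR**

| VI | CRITERIA | OMI | HDI3 | DRI3 | SRI3 |
|---|---|---|---|---|---|
| TOTEX2 | N.B. | 1 | 3 | 4 | 2 |
| TOTEX2 | PCD | 4 | 3 | 1.5 | 1.5 |
| TOTEX2 | MD | 1 | 3 | 4 | 2 |
| TOTEX2 | MAD | 3 | 4 | 1 | 2 |
| TOTEX2 | RMSD | 4 | 2 | 1 | 3 |
| TOTEX2 | TOTAL | 13 | 15 | 11.5 | 10.5 |
| TOTEX2 | Category Rank | 3rd | 4th | 2nd | 1st |
| TOTIN2 | N.B. | 3 | 4 | 2 | 1 |
| TOTIN2 | PCD | 4 | 3 | 1.5 | 1.5 |
| TOTIN2 | MD | 3 | 4 | 2 | 1 |
| TOTIN2 | MAD | 3 | 4 | 1 | 2 |
| TOTIN2 | RMSD | 3 | 4 | 1 | 2 |
| TOTIN2 | TOTAL | 16 | 19 | 7.5 | 7.5 |
| TOTIN2 | Category Rank | 3rd | 4th | 1st | 1st |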

The rankings show that the two regression IMs gave better results than their model-free counterparts. For TOTIN2, the two regression imputation methods tied as the best IM at all nonresponse rates, and surprisingly HDI finished as the worst IM, behind OMI. For TOTEX2, the rankings were mixed across nonresponse rates, although the regression methods still gave good results. The SRI method finished first at the 10% and 30% NRR and third at the 20% NRR, while the DRI method finished third, first and second at the 10%, 20% and 30% NRR respectively. While HDI was the worst IM for TOTIN2, OMI was the worst IM for TOTEX2, ranking last at both the 10% and 20% NRR and third at the 30% NRR.

In conclusion, the best imputation method for this study, which used the 1997 FIES data, is SRI3, very closely followed by DRI3. SRI3 never ranked last in any criterion, NRR, or VI, unlike DRI3, which was the worst IM in the bias of the mean of the imputed data and the MD criteria. The researchers selected HDI3 as the worst IM in this study. It fared the worst in most of the criteria, in particular the other measures of variability at the 20% and 30% NRR.
