## Are you sure?

This action might not be possible to undo. Are you sure you want to continue?

Ratings:

316 pages2 hours

* *

A New Concept for Tuning Design Weights in Survey Sampling: Jackknifing in Theory and Practice introduces the new concept of tuning design weights in survey sampling by presenting three concepts: calibration, jackknifing, and imputing where needed. This new methodology allows survey statisticians to develop statistical software for analyzing data in a more precisely and friendly way than with existing techniques.

Explains how to calibrate design weights in survey sampling Discusses how Jackknifing is needed in design weights in survey sampling Describes how design weights are imputed in survey samplingPublisher: Academic PressReleased: Nov 17, 2015ISBN: 9780081005958Format: book

**A New Concept for Tuning Design Weights in Survey Sampling **

**Jackknifing in Theory and Practice **

First Edition

Sarjinder Singh

*Texas A&M University-Kingsville, TX, USA *

Stephen A. Sedory

*Texas A&M University-Kingsville, TX, USA *

Maria del Mar Rueda

*University of Granada, Spain *

Antonio Arcos

*University of Granada, Spain *

Raghunath Arnab

*University of Botswana and University of KwaZulu-Natal, S. Africa *

**Cover image **

**Title page **

**Copyright **

**Dedication **

**Preface **

**Further studies **

**Acknowledgments **

**1: Problem of estimation **

**Abstract **

**1.1 Introduction **

**1.2 Estimation problem and notation **

**1.3 Modeling of jumbo pumpkins **

**1.4 The concept of jackknifing **

**1.5 Jackknifing the sample mean **

**1.6 Doubly jackknifed sample mean **

**1.7 Jackknifing a sample proportion **

**1.8 Jackknifing of a double suffix variable sum **

**1.9 Frequently asked questions **

**1.10 Exercises **

**2: Tuning of jackknife estimator **

**Abstract **

**2.1 Introduction **

**2.2 Notation **

**2.3 Tuning with a chi-square type distance function **

**2.4 Tuning with dell function **

**2.5 An important remark **

**2.6 Exercises **

**3: Model assisted tuning of estimators **

**Abstract **

**3.1 Introduction **

**3.2 Model assisted tuning with a chi-square distance function **

**3.3 Model assisted tuning with a dual-to-empirical log-likelihood (dell) function **

**3.4 Exercises **

**4: Tuned estimators of finite population variance **

**Abstract **

**4.1 Introduction **

**4.2 Tuned estimator of finite population variance **

**4.3 Tuning with a chi-square distance **

**4.4 Tuning of estimator of finite population variance with a dual-to-empirical log-likelihood (dell) function **

**4.5 Alternative tuning with a chi-square distance **

**4.6 Alternative tuning with a dell function **

**4.7 Exercises **

**5: Tuned estimators of correlation coefficient **

**Abstract **

**5.1 Introduction **

**5.2 Correlation coefficient **

**5.3 Tuned estimator of correlation coefficient **

**5.4 Exercises **

**6: Tuning of multicharacter survey estimators **

**Abstract **

**6.1 Introduction **

**6.2 Transformation on selection probabilities **

**6.3 Tuning with a chi-square distance function **

**6.4 Tuning of the multicharacter estimator of population total with dual-to-empirical log-likelihood function **

**6.5 Exercises **

**7: Tuning of the Horvitz–Thompson estimator **

**Abstract **

**7.1 Introduction **

**7.2 Jackknifed weights in the Horvitz–Thompson estimator **

**7.3 Tuning with a chi-square distance function while using jackknifed sample means **

**7.4 Tuning of the Horvitz–Thompson estimator with a displacement function **

**7.5 Exercises **

**8: Tuning in stratified sampling **

**Abstract **

**8.1 Introduction **

**8.2 Stratification **

**8.3 Tuning with a chi-square distance function using stratum-level known population means of an auxiliary variable **

**8.4 Tuning with dual-to-empirical log-likelihood function using stratum-level known population means of an auxiliary variable **

**8.5 Exercises **

**9: Tuning using multiauxiliary information **

**Abstract **

**9.1 Introduction **

**9.2 Notation **

**9.3 Tuning with a chi-square distance function **

**9.4 Tuning with empirical log-likelihood function **

**9.5 Exercises **

**10: A brief review of related work **

**Abstract **

**10.1 Introduction **

**10.2 Calibration **

**10.3 Jackknifing **

**Bibliography **

**Author Index **

**Subject Index **

**Academic Press is an imprint of Elsevier **

125 London Wall, London, EC2Y 5AS, UK

525 B Street, Suite 1800, San Diego, CA 92101-4495, USA

225 Wyman Street, Waltham, MA 02451, USA

The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, UK

Copyright © 2016 Elsevier Ltd. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: **www.elsevier.com/permissions. **

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

**Notices **

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary.

Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

ISBN: 978-0-08-100594-1

**British Library Cataloguing in Publication Data **

A catalogue record for this book is available from the British Library

**Library of Congress Cataloging-in-Publication Data **

A catalog record for this book is available from the Library of Congress

For information on all Academic Press publications visit our website at **http://store.elsevier.com/ **

**Dedication **

**Sarjinder Singh; Stephen A. Sedory; Maria del Mar Rueda; Antonio Arcos Arcos; Raghunath Arnab **

This monograph, A New Concept for Tuning Design Weights in Survey Sampling: Jackknifing in Theory and Practice,

introduces a fresh survey methodology that can be used to estimate population parameters and that, with the help of a computer, yet requiring minimal effort, leads to the construction of confidence intervals. The most important motivation for using the newly tuned estimation survey methodology is that it simplifies the estimation of the variance of those estimators used for estimating population parameters such as the mean, the total, the finite population variance, and the correlation coefficient, under various sampling schemes. The main ideas derive from the standard survey sampling methods of jackknifing, calibration, and imputation, but lead to an important modification of these existing methods. This new methodology should prove helpful to all government organizations, private organizations, and academic institutions that engage in survey sampling. The proposed new concept of tuning design weights leads to a computer friendly estimation survey methodology. Thus, it encourages development of a new statistical analysis software package in a language such as R, SAS, FORTRAN, or C++. Recall that the sampling design and estimation process has two main aspects: a selection process that includes the rules and operations by which some members of the population are included in the sample, and an estimation process for computing the sample statistics, which are sample estimates of population values. A good sample design and estimation methodology requires the balance of several important criteria: goal orientation, measurability, practicality, economy, time, simplicity, and reliability.

One need not take our word concerning the virtues of the new concept of tuning design weights which leads to the newly tuned estimation survey methodology developed here. One can easily test/apply these methods by executing the R codes that are provided. The authors are confident that one will find the performance of the developed theory amazing. In the 1940s (**Cochran, 1940, ratio estimator), survey methodologists began looking to the future for a reliable, trustworthy, and easily applicable estimation methodology. It appears that for students currently doing a Ph.D. in survey sampling, the future is here. **

Professor W.G. Cochran (1909–1980) **http://isi.cbs.nl/Nlet/images/N87_CochranWG.jpg **

In **Chapter 1, we discuss the problem of estimation in survey sampling. The Statistical Jumbo Pumpkin Model (SJPM) is developed, which can produce very light to very heavy pumpkins, and is naturally correlated with their known circumferences. R code, used for generating the SJPM, is provided. A population of pumpkins is generated using the SJPM and is displayed. A sample of pumpkins is taken from the generated population and is also displayed. The idea of jackknifing a sample statistic is discussed. Examples of jackknifing a sample mean and a sample proportion are provided. The effect of double jackknifing on a sample mean is discussed. The concept of jackknifing a sum of doubly subscripted variables is also introduced. At the end of the chapter, a few unsolved exercises, which include jackknifing of sample geometric mean, sample harmonic mean, and sample median, are provided for practice and further investigation. **

In **Chapter 2, we introduce the new concept of tuning design weights, which, in turn, leads to a computer friendly tuning methodology. The proposed methodology results in an estimator that is equivalent to the linear regression estimator of the population mean or total. The newly tuned estimation methodology is able to efficiently and effectively estimate the variance of the estimator of the population mean. The estimation of variance of the estimator is a key feature of the proposed newly tuned estimation methodology. In Chapter 2, we restrict ourselves to the use of a simple random sampling (SRS) scheme. We show that under an SRS scheme, the linear regression estimator is a result of tuning the jackknife sample mean estimator with the chi-squared type distance function. Tuning of the sample jackknife estimator under a dual-to-empirical log-likelihood (dell) function leads to a newly tuned dell-estimator of the population mean. The coverage by the newly tuned estimators, under the chi-square distance and the dell function, is investigated for the SJPM. The R codes for studying the reliability of the tuned estimators under the chi-squared distance function and the dell function are also provided. Numerical illustrations for both methods are also presented. At the end of the chapter, a few unsolved exercises are given for practice and further investigation. The tuning of a nonresponse in sampling theory is addressed in one of the unsolved exercises. The tuning of a sensitive variable, in the context of estimating the population mean of a sensitive variable, is also introduced through an unsolved exercise. In addition, tuned estimators of geometric mean and harmonic mean are also suggested for further investigation. **

In **Chapter 3, the problem of estimating a population mean with the help of the newly tuned model assisted estimation methodology is developed by assuming that the auxiliary information is available at the unit level in the population. The newly tuned model assisted estimator does an excellent job for small samples in the case of both the chi-squared type distance function, and the dual-to-empirical log-likelihood (dell) function, insofar as the weight of pumpkins is concerned. The R codes for studying the reliability of the tuned model assisted estimators for the case of the chi-squared and the dell functions are also provided. At the end of the chapter, a few unsolved model assisted exercises are given for practice and further investigation. A new model assisted tuning of a nonresponse along with the concept of imputation is addressed in one of the exercises. **

In **Chapter 4, we present the problem of estimating the finite population variance with the help of the newly tuned estimation methodology. The newly tuned model assisted estimator of finite population variance is studied under both the chi-squared and the dual-to-empirical log-likelihood (dell) functions. The R code for studying the reliability of estimating the finite population variance of the weight of pumpkins, using circumference as an auxiliary variable, is provided. Here, it is suggested that only a slice of a pumpkin be removed from the sample instead of a complete pumpkin. Such a proposed method of jackknifing has been called partial jackknifing. An alternative tuned estimator of the finite population variance and the finite population variance estimator are also introduced and studied. At the end of the chapter, a few interesting, unsolved exercises for estimating the finite population variance are given for practice and further investigation. The exercises include new model assisted tuned estimators of finite population variance, as well as a few new tuning constraints that make use of jackknifed sample geometric mean and jackknifed sample harmonic mean. **

In **Chapter 5, the problem of tuning the estimator of the correlation coefficient using empirical log-likelihood style techniques is considered. In practice, it is difficult to estimate the variance of the estimator of the correlation coefficient. However, the proposed newly tuned method leads to a very successful estimator of variance of the estimator of the correlation coefficient when auxiliary information is available. A numerical illustration is also provided to estimate the correlation between the weight and circumference of pumpkins. The R code, used in the simulation study of reliability of the estimators, is also provided. At the end of the chapter, several exercises involving estimation of the correlation coefficient are provided for practice and further investigation. In addition, exercises leading toward tuned estimators of the ratio of two population means, two population variances, and the population regression coefficient are included for further investigation. **

Until this point, we have considered only the simple random sampling (SRS) scheme at the selection stage of the sample. In **Chapter 6, the problem of estimating a population total using the concept of a multi-character survey is addressed. The newly tuned multi-character survey estimators that we introduce are studied for the probability proportional to size and with replacement (PPSWR) sampling scheme. Simulation results with the relevant R code are reported. The code is also utilized to compute numerical illustrations. At the end of the chapter, a few unsolved exercises for further investigation are provided. **

In **Chapter 7, we consider the tuning of the estimator of population total using probability proportional to size and without replacement (PPSWOR) sampling scheme. A new linear regression type estimator under PPSWOR sampling scheme is proposed. The proposed tuned estimator is free of second order inclusion probabilities and provides a simple and safe solution to the important problem of estimating variance. The tuned regression type estimator and an estimator obtained by optimizing a displacement function are tested through simulation studies. The R code used in the simulation study of the reliability of the estimators is also provided. The tuned estimators are also supported by two numerical illustrations that consider the problem of estimating the weights of pumpkins using their known average circumference as a tuning variable and their known top size as an auxiliary variable at the selection stage. At the end of the chapter, a few exercises concerning the estimation of population parameters under the PPSWOR sampling scheme are provided for further investigation. **

In **Chapter 8, we consider the problem of tuning the estimators of population mean in stratified random sampling design, using both the chi-squared type, and the dual-to-empirical log-likelihood (dell) type distance functions, when the population mean of the auxiliary variable at the stratum level is available or known. The tuned estimators are supported with a simulation that considers the problem of estimating the average weights of pumpkins. Here, the pumpkins are divided into three different strata, consisting of Sumbo, Mumbo, and Jumbo pumpkins, based on their known circumferences. The adjusted tuned estimator of variance of the proposed estimator is found to be very effective in the case of small samples. The R code used in the simulation study for the case of chi-squared distance and dual to log-likelihood functions is also provided. Numerical illustrations for determining the average weight of the Sumbo, Mumbo, and Jumbo pumpkins using proportional allocation and newly tuned stratified sampling estimation methodology, along with R code, are presented. In the exercises, we consider the situation in which the pooled mean of the auxiliary variable across all strata is known, but the means at stratum level are unknown. An idea for extending the work to multistage stratified random sampling is introduced in an exercise at the end of the chapter. **

In **Chapter 9, we introduce the idea of using multiauxiliary information for semituning the estimators of population mean and finite population variance under simple random sampling. The use of multiauxiliary information for fully tuning estimators is addressed in the exercises. A new set of three tuning constraints is also introduced. These constraints make use of jackknifed sample arithmetic mean, jackknifed sample geometric mean, and jackknifed sample harmonic mean of three different auxiliary variables. **

In **Chapter 10, a very short review of the relevant literature is given. The concept of calibration and jackknifing is discussed in layman language. **

Author and selected subject indexes have been also included at the end.

Tuning of estimators of any parameter, such as median, mode, distribution function, for any sampling design, such as small area estimation, post-stratification, adaptive clustering, panel-surveys, systematic sampling, dual or multiple frame surveys, and longitudinal surveys etc., can also be investigated, if needed.

The authors would like to thank Mr. Steven Mathews, Academic Press, for his guidance and help during the preparation of final version of the monograph. The authors would like to thank Mr. Paul A. Rios, Graphic Designer, Marketing and Communications, Texas A&M University-Kingsville for his kind help in designing the cover page and creating the pumpkin in **Figure 1.1. The authors are also thankful to a student at TAMUK, Lane Christiansen, for his help in going through the entire manuscript. We duly acknowledge the use of R-software to the, R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www.R-project.org. Permission from Daniel Breze (Director) and Liliana Pinkasovych (Editorial Officer) International Statistical Institute, The Netherlands, to print the picture of Professor W.G. Cochran is duly acknowledged. Support for this research for two of the authors, Maria del Mar Rueda and Antonio Arcos, was provided in part by grant MTM2009-10055 from MEC, and is also acknowledged. The authors are also thankful to Dr. Glyn Jones and the reviewers for their recommendations to the publisher. Special thanks are due to Mr. Poulouse Joseph, Project Manager, Elsevier for his great patience during the editing process. **

Finally, the authors would like to thank their department colleagues, their family members, and their other friends who always inspired them to complete this monograph.

*Remark*: The first author would like to mention that the first draft of this monograph was written while he was seeking employment and staying at the home of one of his friends, Mr. Sher Singh Sidhu, Melbourne, Australia, for a period of about

You've reached the end of this preview. Sign up to read more!

Page 1 of 1

Close Dialog## Are you sure?

This action might not be possible to undo. Are you sure you want to continue?

Loading