A Modern Approach
ISBN-13: 9788170751021
ISBN-10: 8170751020
Published by:
Hindustan Publishing Corporation (India), 4805/24 Bharat Ram Road, Daryaganj, Delhi-110002
Printed at:
Compudata Services, Okhla Phase - II, New Delhi 110020 (India)
ACKNOWLEDGEMENT
The journey of writing this book has been more difficult and gratifying than I could have imagined. Without
the best life partner, colleagues, members of staff, friends, and well-wishers, none of this would have been
possible. It is therefore my foremost duty to acknowledge their kind support and contributions.
My first and foremost sincere acknowledgements go to Prof. (Dr.) Kuldeep Kumar, M.Sc., Ph.D. (Kent),
FSS, C. Stat, Professor of Statistics, Centre for Data Analytics, Faculty of Business, Bond University, Gold
Coast, Queensland 4229, Australia and the Editor (Book Review), Journal of Royal Statistical Society (Series
A) & Significance. Without his motivation, review of chapters, support, and insights into the subjects, the
writing of this book would not have materialized.
I am highly obliged to Anna Heath, Membership Manager, Royal Statistical Society, LONDON
EC1Y 8LX, for making inquiries amongst the members and authorities of the Society to identify
a suitable person to write the Foreword of the book.
I acknowledge the regular help extended by Dr. Uday Shankar Sinha, Dr. Banshi Prasad Bhagat, Dr.
Dhyanendra Kumar, Dr. Raja Ram Singh and Dr. Suresh Prasad Srivastava, Former Heads, Department of
Zoology, Veer Kunwar Singh University, Arrah as well as Dr. Ram Randhir Singh, Department of Zoology,
Veer Kunwar Singh University, Arrah. Whenever required, they spared time from their busy schedules,
which enabled me to bring out this work.
It is my proud privilege to express my gratitude and regards to Dr. Anil Kumar Sinha, Professor and Head,
Department of Zoology, Veer Kunwar Singh University, Arrah, for his blessings and good wishes, and for
providing me with every help as well as valuable suggestions for this work.
I also acknowledge Ms. Deepa Sonal, Assistant Professor, Department of Computer Science, Patna
Women’s College, Patna, for writing part of the chapter ‘Fundamentals of Computers and Introduction
to C++’ of this book. I appreciate her efforts.
I extend special thanks to my colleagues Dr. Kizar Ahmed Sumon, Associate Professor,
Department of Fisheries Management, Bangladesh Agricultural University, Mymensingh-2202, Bangladesh;
Ilham Zulfahmi of the Department of Biology, Faculty of Science and Technology, Ar-Raniry State Islamic
University, Koperma Darussalam, Banda Aceh City-23111, Indonesia; and Dr. Muhammad Akram, Associate
Professor and Chairperson, Department of Eastern Medicine, Government College University, Faisalabad,
Pakistan, for their valuable suggestions, which have improved the quality of the book.
The kind support of Dr. Phillip Robinson, Associate Professor and Head (Coordinator, DBT Star Scheme
and DBT PG Programme sponsored by DST and DBT, New Delhi), Department of Biotechnology, K.S.
Rangasamy College of Technology, Tiruchengode, Tamil Nadu is sincerely acknowledged. His constant
supervision and availability at all times have been phenomenal.
(vi) Statistics: A Modern Approach
The efforts of Dr. Kumar Satyendra Yadav, Associate Professor, Department of Statistics, Patna University,
Patna, have proven highly fruitful for this project. His sincere efforts in cross-checking all the references and
preparing the lists of entries have been a great help. His contributions toward making this
book worth reading are hereby sincerely acknowledged.
I shall fail in my duty if I do not acknowledge the endeavours of Dr. Mohita Sardana and Dr. Amit
Priyadarshi of the Department of Zoology, Veer Kunwar Singh University, Arrah; Dr. Vijay Raj Kumawat,
Assistant Professor, Department of English, Veer Kunwar Singh University, Arrah, and Dr. Md. Alimul
Haque, Assistant Professor, Department of Computer Science, Veer Kunwar Singh University, Arrah. Their
efforts towards materializing this book shall always remain in my heart. I shall always fondly cherish their
unfailing commitment to their vision of academic excellence among the teaching-learning fraternity.
In this series of thanksgiving, I tender my sincerest thanks to all my family members and all my dear ones too.
Their support, warmth, and affection, even beyond working hours, cannot go unacknowledged.
I wish to acknowledge the efforts made by various Institutions/Organizations/Societies in developing
and providing Open Access Resources/External Resources for the benefit of teacher/student communities
and the general public at large.
Last but not least, I tender my sincerest thanks to my publishers, Hindustan Publishing
Corporation (India), New Delhi. Without their support, bringing out this book would not have been possible.
Their team was always at my disposal whenever required. I convey my sincere thanks to them too.
PREFACE
Statistics, a multidisciplinary branch of knowledge, finds many applications through its principles, concepts,
methods and values in Mathematical Sciences, Life Sciences, Research Methodology, Health Sciences, Medical
Sciences, Planning, Machine Learning, Artificial Intelligence, Internet of Things and, most recently, Programming
and Data Analysis, as well as in Environmental Sciences, Meteorology, Economics, Psychology, Social Sciences,
Commerce, Chartered Accountancy, Geography, Education, Management, Mass Communication and Journalism, Trade
and other disciplines.
Statistics is useful for research in any discipline. It functions as a tool in designing
research, analysing the data and drawing conclusions. Descriptive statistics helps to develop indices from the raw
data, whereas inferential statistics is concerned with generalizing from a sample to the population.
In the changing world scenario, interdisciplinary and multidisciplinary learning must be prioritized, along
with the development of multiple abilities and the inclusion of missing views. The University Grants Commission (UGC) has implemented several
initiatives to improve learning and education efficiency and academic excellence. Implementation of a Choice Based
Credit System (CBCS) is one of them. CBCS is an internationally recognized system that provides opportunities
to learn as well as additional outlets for the holistic development of a student.
Moreover, the New Education Policy (NEP), approved by the Union Cabinet in July 2020, is expected to
bring a spate of major reforms. It may be implemented by 2022 or 2023. Holistic and multidisciplinary education
across the sciences, social sciences, arts, humanities and sports will ensure unity and integrity of knowledge. A
higher proportion of students will receive vocational education and there will be a shift toward multidisciplinary
institutions. At all Higher Education Institutions, a Department of Statistics shall be established and strengthened.
I am fascinated by Statistics and I enjoy working in this field. I like teaching statistics as well, and I hope that I
can communicate some of my enthusiasm to my students, the majority of whom are compelled to take my classes
as part of their studies. It is often a losing battle, however: some of them come in with a negative attitude toward
statistics, possibly exacerbated by the belief that statistics is some kind of magical procedure that will do their
thinking for them, or a set of tricks and manipulations whose purpose is to twist reality to deceive others.
Some people find it challenging because they lack basic mathematical skills. Although an introductory-level
statistics course does not need mathematics beyond what a high school freshman or sophomore should be able
to achieve, many adults and college students lack even that level of proficiency in mathematics. Others struggle
because they attempt to pass the course through rote memorization. Statistics is a problem-solving subject, and
memorizing will not help you when you need problem-solving skills.
Statistics is a subject that aims to encourage students to think in new ways and see the world around them
from fresh angles. Once you have mastered numbers, you will be able to see the world in ways you did not know
existed before.
Proficiency in statistics is quickly becoming a requirement in many domains of work. It is also becoming a
prerequisite for being a thoughtful member of modern society, as we are constantly assaulted with statistical data and
arguments, many of which are debatable. Much of modern finance relies on Statistics and Probability. Without
statistical analysis, much of modern science, both physical and social, would come to a halt. I have attempted to address
statistics as both a professional requirement and a component of the intellectual content.
Contents
Acknowledgements (v)
Preface (vii)
1. Introduction 1–31
1.1 The history of Statistics 1
1.2 Need of Statistics 8
1.3 Definitions of Statistics and Biostatistics 9
1.4 Branches of Statistics 13
1.5 Scope of Statistics/Biostatistics 19
1.6 Internet of Things and Biostatistics 20
1.7 Machine Learning and Biostatistics 21
1.8 Artificial Intelligence and Statistics 24
1.9 Research Methodology and Statistics 24
1.10 Applications of Statistics 25
1.11 Constraints and Limitations of Statistics 28
2. Concept 32–56
2.1 Concept of Biostatistics 32
2.2 Terminology used in Statistics 33
2.3 Notations used in Statistics 35
2.4 Logarithms 38
2.5 Set Theory 41
2.6 Permutation and Combination 44
2.7 Level of Significance and Confidence Limit 48
2.8 Statistical Error 51
Author Index
Dunnett C 393
Dyken M 50

E
Edgeworth FY 3, 304
Edward W 501
Ellison AM 70
Enus P 9
Ercan I 8
Erlang AK 225
Excoffier L 460
Ezra RA 45

F
Fano U 286
Farr W 66
Fazl A 2
Feigelson ED 14
Feller W 213
Fermat de P 2, 45, 204
Fibonacci L 45
Fine 205
Finley AO 7
Fisher RA 4, 218, 370, 392, 437
Fiske M 79
Fletsch CE 96
Fontana A 106
Forsythe AB 423
Frank ES 32
Franklin J 2
Frey JH 106
Friden 8
Friedman M 199, 454
Freud S 96

G
Galton F 3, 12, 98, 213, 304
Garrett HE 204
Garwood F 381
Gauss JCF 2, 213, 242
Gayon J 5
Geary RC 212
Geiringer H 32
Gentleman R 162
Gergen M 6, 32
Gibbs GR 111
Glivenko VI 258
Goodman M 192
Gosset WS 5, 225, 422
Gotelli NJ 70
Graunt J 2
Green PE 201
Guilford JP 397

H
Haenszel W 410
Halley E 233
Hansen MH 73
Hardy A 4
Harper WM 358
Harvey AS 291
Harvey C 90
Hay I 112
Hayden EC 1, 6, 32
Heath TH 234
Heikalabad SR 140
Helmert 423
Herzberg FS 91
Hichcock DB 5
Hidalgo B 192
Hirach 358
Hochberg Y 393
Holm S 391
Horvitz DG 78
Hotelling H 185, 187, 322, 350
Hurwitz WN 73

I
Iamblichus 234
Ihaka R 162
Indrayan A 6, 32
Ioannidis JPA 232
Irwin JO 408
Iverson MG 6, 32

J
Jacobson JO 78
Jeffreys H 227
Johannsen 7

K
Kafla F 67, 233
Kahane J-P 297
Kahn R 145
Kaism 92
Kelly 292
Kendall M 11, 308, 381
Kendall MG 73, 191, 269, 291
Keuls M 392
Kiaer AN 3, 67
Kim K 168
King WI 10, 28, 304
Kish L 51
Kolmogorov AN 204, 425, 516
Kozachecko 297
Krammer 392
Kruskal W 194, 197, 514
Kumar R 367

L
Landwehr JM 32
Laplace P 204, 213
Laplace PS 2, 66, 367, 443
Lauro C 16
Lazarsfeld P 79
Legendre AM 2, 337
Lehman 437
Levene 423
Subject Index
Absolute dispersion 269
Accidental sampling 77
Active Learning 22, 23
Actuarial science 17
Additive laws of probability 209
Additive Rule 209
Advance Survey Statistics 15
Algebraic dispersion 269
Alternative Hypothesis 370
AMOVA 460
Analysis of Nominal scale data 176
Analytical Statistics 13
ANCOVA 460
Anderson–Darling test 502
ANORVA 460
ANOVA 451
Anti-logarithm 40
Applications of Statistics 25
Applied statistics 14
A-priory 112
Area chart 135
Array data 34
Artificial Intelligence and Statistics 24
Assumed mean method 236
Astrostatistics 14
Automatic editing 109
Averages or means 235
Avian point count 78
Axiomatic definition of probability 205
Bar Graph 128
Basic Concept of Computer 138
Basic rules of classification 116
Bayes’ theorem 217
Bean box plot 105
Benjamini-Hochberg procedure 393
Between Group Variability 453
Biased error 71
Bi-logarithmic graph 127
Binary data 90
Binomial distributions 218
Binomial test 410, 447
Biographical data 35
Biological hypothesis 370
Biometric monitoring 16
Biometrics 16
Biostatistics 12, 14
Bivariate analysis 181
Bivariate data 87
Bivariate Frequency Table 126
Bivariate sample 33
Bivariate Statistics 14
Blocking 452
Bonferroni Test 391
Bowley's measure of skewness 292
Box Plot 102, 129
Branches of Statistics 13
British Imperial System 59
Bubble chart 135
Business analytics 17
Business Statistics 14
C coefficient 414
Call playback responses 78
Candlestick chart 139
Canonical Correlation Analysis 193
Cardinality number 43
Cartesian graph 128
Cartesian products of set 43
Case study method of data collection 96
Categorical data 399
Causal hypothesis 370
Central Editing 111
Central Limit Theorem 443
Centroid method 186
Characteristic (Index) of log 40
Check list Scale Questions 113
Chemometrics 17
Chi square test 178, 396
Chronological classification 117
Chronological data 35
Class boundary 123
Class interval 122
Class limits 123
Classical Probability 208
Classification of Data 108, 116
Classification of Scales 61
Closed end coded structured Questions 113
Closed-Ended Class Interval 122
Cluster method 75, 192
Cluster sampling method 75
Cochran–Mantel–Haenszel χ² test 410
Cochran's Q Test 177
Coding method 472
Coding of data 111
Coefficient of Alienation 310
Coefficient of determination 310
Coefficient of non-determination 310
Coefficient of Quartile deviation 272
Coefficient of variation 269, 286
Collection of data 94
Column Graph 129
Combination 44
Combined arithmetic mean 239
Combined geometric mean 244
Common statistical Software 167
Comparative rating scale 102
Comparative Scaling Technique 62
Complex Hypothesis 369
Computer Software 167
Concept of probability 204
Concept of Statistics 32
Concordant (Agreeable) orders 332
Confidence Limit 48, 380
Confirmatory factor analysis 188
Conjoint analysis 192, 201
Conservative test 368
Constant Sum Scaling 63
Constraints of Statistics 28
Continuous data 35, 86, 236
Continuous frequency distribution 126
Continuous variable 34, 329
Control Statements 148
Correlation Coefficient 304
Correspondence Analysis 192
Cramér correlation 414
Critical value 368
Cross tabulations 182
Crossover error rate 52
Cross-sectional data 88
Crowd-sourced data 88
Cumulative frequency table 127
Cumulative sampling table 71
Cumulative frequency distribution 125
Curve fitting theory 361
Curvilinear correlation 308
Curvilinear regression 337
Dark data 86
Data 34, 83
Deciles 260
Deductive Inference 23
Definitions of Statistics 9
Degree of freedom 399, 435
Demography 17
Dependent t-test 431
Derived variables 34
Descriptive data 87
Descriptive rating scale 102
Descriptive Statistics 13
Determination of Sample size 70
Device Management 141
Dichotomous Question 113
Direct sampling method 77
Directional hypothesis 370
Discontinuous variables 34
Discordant (Non-agreeable) orders 332
Discrete data 35, 85, 236
Discrete frequency distribution 125
Discrete variable 117
Discriminant Analysis 192
Discriminate regression 343
Distance dispersion 269
DMCT 392, 476
DMRT 392, 476
Document schedule 100
Dot plot chart 132
Ecological Statistics 16
Econometrics 17
Economical Statistics 15
Editing of Data 108
Elastic Net regression 343
Elements of an ideal table 118
Ensemble Learning 23
Environmental monitoring 21
Environmental statistics 17
Epidemiology 17
Estimation of population mean 79
Estimation of population proportion 79
Exact test 368
Exclusive Class Interval 122
Experimental Design 15, 376, 452
Experimental method: data collection 97
Exploratory factor analysis 188
Exponential graph 127
External Hardware 138
Extrapolation 358
Factor Analysis 186
FANOVA 460
Fiducial limits 380, 381
Field Editing 110
File Management 140
Finite data 87
Finite Population 33
First Coefficient of Skewness 291
First order correlation 309
First-order central tendency 234
First-order partial correlation 321
Fisher Irwin Test 408
Fisher’s approach 418
Fisher’s LSD test 392, 408
Fisher's exact test 178, 408
Fixed-effects model 453
Flow chart 133
Focus group method: data collection 95
Forensic statistics 17
Four-way ANOVA 455
Fractional Scale 62
Frequency distribution 121, 124
Frequency Distribution Graph 129
Frequency of class interval 124
Frequency Polygon 129
Friedman two-way ANOVA 199
Fundamentals of Computer 138
Funnel chart 131
Gantt chart 133
Generations of Computers 139
Genetic Epidemiology 15
Geographical classification 118
Geographical data 88
Geometric mean 242
Geometric probability distribution 212
Geostatistics 17
GPS telemetry 82
Graphic rating scale 102
Graphical editing 110
Grouped frequency distribution 125
G-test 409
Harmonic mean 246
Harmonic sampling 82
Health Statistics 14
Heronian mean 240
Hierarchy chart 134
Histogram 129
History of Statistics 1
Holm-Bonferroni Method 391, 443, 446
Horvitz-Thompson estimator 76
Hypergeometric probability distribution 212
Hypothesis 367
Hypothesis for population mean 381
Hypothesis for proportion of population 387
Hypothesis of correlation coefficient 378
I/O System Management 141
Image factoring 187
In house editing 111
In vivo coding 112
Inclusive Class Interval 122
Independent two-sample t-test 425
Individual data 35
Inductive Learning 23
Inductive Statistics 13
Inferential Statistics 13
Infinite data 87
Infinite Population 33
Interactive editing 110
Intercept 337
Internal Hardware 138
Internet 145
Internet of Things and Statistics 20
Interpolation 358
Interquartile mean 240
Interval data 86
Interval Scale 62, 101
Interview method of data collection 95
Introduction to C++ 146
Introduction to Excel 156
Introduction to GraphPad Prism 157
Introduction to Python 165
Introduction to R 162
Introduction to SPSS 160
Judgment sampling method 77
Jurimetrics 18
Kelly's coefficient of skewness 292
Kendall correlation 308
Kendall τ 331
Kolmogorov-Smirnov Test 516
Kruskal Wallis Test 194, 371, 514
Kruskal–Wallis one-way ANOVA 502
Kurtosis 296
Lack-of-fit error 452
LASSO regression 343
Latin square design 473
Laws of probability 209
Least square method 360
Leptokurtic curve 296
Level of significance 48, 367
Likelihood-ratio test 413
Likert Scale Questions 99, 113
Limitations of Statistics 28, 29
Line Graph 130
Line intercepts method 78
Line transect 78
Linear correlation 308
Linear regression 337
Linear scale 61
Linux 144
Logarithm 38
Logical Hypothesis 370
Logistic regression analysis 192, 338
Lorenz curve 135
Machine data 89
Machine learning 18
Machine Learning and Statistics 21
MANCOVA 464
Mann Whitney Wilcoxon Test 194
Mann–Whitney U test 194, 427, 508
MANOVA 460
Mantissa of log 40
Manual editing 109
Marine radar sampling 82
Maximum likelihood method 187
McNemar's test 177, 412
Mean deviation 275
Meaning of Statistics 9
Measurement 57
Measures of averages 233
Measures of central Tendency 233
Measures of dispersion 268
Median 249, 259
Median skewness 292
Medical Statistics 14
Memory Management 140
Statistical Lab 171
Statistical learning 23
Statistical mechanics 18
Statistical physics 18
Statistical signal processing 19
Statistical thermodynamics 19
StatTools 168
Stem plot 122, 130
Step deviation method 236
Steps in testing of hypothesis 372
Stepwise regression 337
Stock chart 133
Stratified sampling method 74
Stratum 35
Structural equation modelling 188, 193
Sub-coding 112
Supervised Learning 22
Survey schedule 100
Survey Statistics 15
Survival analysis 15
Systematic (determinate) error 54
Systematic Sample 73
Tabulation of Data 116
Temporal classification 117
Terminology used in Statistics 33
Test Statistics 16
Testing of hypotheses 366
The Facet Satisfaction Scale 61
The International System of Units 57
The modern form of the metric system 58
The Sign Test 502
The US Customary System 59
The portmanteau test 413
Third-order central tendency 231
Three-way ANOVA 455
Three-way Table 120
Time plot 130
Time-series data 88
Tools of measurement 59
Transactional data 88
Transductive Learning 23
Transfer Learning 23
Trellis chart 134
Trimean 240
Trimmed mean 240
Trivariate Frequency Table 126
Truncated mean 240
t-test 422
Tukey median 241
Tukey’s HSD test 392, 448, 476
Tukey–Duckworth test 502
Two-dimensional array 156
Two-sample z-test 432
Two-way ANOVA 455
Two-Way Table 120
Types of Correlation Coefficient 307
Types of data 83
Types of measurement 57
Types of probability distribution 211
Types of regression analysis 341
Types of sets 43
Types of table 119
UMPT 369
Unbiased character 68
Unbiased error 71
Unequal Class Interval 122
Ungrouped frequency distribution 124
Univariate analysis 176
Univariate data 87
Univariate Frequency Table 126
Univariate sample 33
Univariate Statistics 14
Unsupervised Learning 22
Value coding 112
Variables 33
Variance 286
Variants of Chi square test 408
Vector data 88
Venn chart 135
Verbal scale 64
Versus coding 112
Violin box plot 105
Visual search 78
Visualization of data 127
Vital Statistics 2, 14
Voluntary Sampling 78
Wald–Wolfowitz run test 510
Warner proposed sample model 76
Waterfall chart 133
Weighted arithmetic mean 239
Welch's t-test 371, 429
Wilcoxon rank-sum test 502
Wilcoxon signed-rank test 194, 425
Wilcoxon-Mann-Whitney U Test 508
Yates correction 406
Yule's coefficient 292
Zero correlation 309
Zero order correlation 309
z-test 437
ϕ coefficient 308, 414