5 Sampling Technique in Python
Population
• The population is the set of all observations (individuals, objects, events, or procedures) and is usually very large and diverse.
Sample
• A sample is a subset of observations drawn from the population that ideally represents the population accurately.
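To make the two definitions concrete, here is a minimal sketch that draws a sample from a population and compares their means. The `population` frame, its `age` column, and the sizes used are hypothetical synthetic data, not the Titanic dataset shown in the outputs.

```python
import numpy as np
import pandas as pd

# Hypothetical population: 10,000 synthetic ages (assumption, not the Titanic data).
rng = np.random.default_rng(0)
population = pd.DataFrame({"age": rng.integers(1, 80, size=10_000)})

# Draw a simple random sample of 500 rows without replacement.
sample = population.sample(n=500, random_state=0)

# A representative sample should have a mean close to the population mean.
population_mean = population["age"].mean()
sample_mean = sample["age"].mean()
print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two means will rarely match exactly; the gap tends to shrink as the sample size grows.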
1 of 8 02-03-2023, 14:35
5_Sampling_Technique_in_Python http://localhost:8888/nbconvert/html/Test/5_Sampling_Technique_in_Python.ipynb?download...
Out[2]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
290 1 1 female 26.0 0 0 78.8500 S First woman False NaN Southampton yes True
261 1 3 male 3.0 4 2 31.3875 S Third child False NaN Southampton yes False
623 0 3 male 21.0 0 0 7.8542 S Third man True NaN Southampton no True
866 1 2 female 27.0 1 0 13.8583 C Second woman False NaN Cherbourg yes False
572 1 1 male 36.0 0 0 26.3875 S First man True E Southampton yes True
318 1 1 female 31.0 0 2 164.8667 S First woman False C Southampton yes False
199 0 2 female 24.0 0 0 13.0000 S Second woman False NaN Southampton no True
186 1 3 female NaN 1 0 15.5000 Q Third woman False NaN Queenstown yes False
565 0 3 male 24.0 2 0 24.1500 S Third man True NaN Southampton no False
696 0 3 male 44.0 0 0 8.0500 S Third man True NaN Southampton no True
Out[3]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
556 1 1 female 48.0 1 0 39.6000 C First woman False A Cherbourg yes False
289 1 3 female 22.0 0 0 7.7500 Q Third woman False NaN Queenstown yes True
289 1 3 female 22.0 0 0 7.7500 Q Third woman False NaN Queenstown yes True
876 0 3 male 20.0 0 0 9.8458 S Third man True NaN Southampton no True
200 0 3 male 28.0 0 0 9.5000 S Third man True NaN Southampton no True
418 0 2 male 30.0 0 0 13.0000 S Second man True NaN Southampton no True
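The input cell for the output above is not shown. Index 289 appears twice, which is the kind of result that sampling with replacement can produce; whether the notebook used `replace=True` is an assumption. A minimal sketch of the difference, using a hypothetical `toy` frame:

```python
import pandas as pd

# Hypothetical toy frame standing in for the dataset (assumption).
toy = pd.DataFrame({"value": range(10)})

# replace=True lets the same row be drawn more than once,
# so n may even exceed the number of rows.
with_replacement = toy.sample(n=20, replace=True, random_state=0)

# Without replacement (the default), every drawn index is distinct.
without_replacement = toy.sample(n=6, random_state=0)

print(with_replacement.index.duplicated().any())
print(without_replacement.index.is_unique)
```

Drawing 20 rows from 10 guarantees repeats here; with a smaller `n`, repeats are possible but not certain.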
2. Systematic sampling: selecting observations from the head or tail of the data, or at fixed intervals
In [4]: df.head(5)
Out[4]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
2 1 3 female 26.0 0 0 7.9250 S Third woman False NaN Southampton yes True
In [5]: df.tail(5)
Out[5]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
886 0 2 male 27.0 0 0 13.00 S Second man True NaN Southampton no True
887 1 1 female 19.0 0 0 30.00 S First woman False B Southampton yes True
888 0 3 female NaN 1 2 23.45 S Third woman False NaN Southampton no False
889 1 1 male 26.0 0 0 30.00 C First man True C Cherbourg yes True
890 0 3 male 32.0 0 0 7.75 Q Third man True NaN Queenstown no True
In [13]: select_index = list(range(1, len(df), 50))
df.iloc[select_index]
# rows are taken at a fixed interval of 50 positions, starting from position 1
Out[13]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
101 0 3 male NaN 0 0 7.8958 S Third man True NaN Southampton no True
151 1 1 female 22.0 1 0 66.6000 S First woman False C Southampton yes False
201 0 3 male NaN 8 2 69.5500 S Third man True NaN Southampton no False
301 1 3 male NaN 2 0 23.2500 Q Third man True NaN Queenstown yes False
401 0 3 male 26.0 0 0 8.0500 S Third man True NaN Southampton no True
451 0 3 male NaN 1 0 19.9667 S Third man True NaN Southampton no False
501 0 3 female 21.0 0 0 7.7500 Q Third woman False NaN Queenstown no True
551 0 2 male 27.0 0 0 26.0000 S Second man True NaN Southampton no True
601 0 3 male NaN 0 0 7.8958 S Third man True NaN Southampton no True
651 1 2 female 18.0 0 1 23.0000 S Second woman False NaN Southampton yes False
701 1 1 male 35.0 0 0 26.2875 S First man True E Southampton yes True
751 1 3 male 6.0 0 1 12.4750 S Third child False E Southampton yes False
801 1 2 female 31.0 1 1 26.2500 S Second woman False NaN Southampton yes False
851 0 3 male 74.0 0 0 7.7750 S Third man True NaN Southampton no True
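The fixed-interval selection above always starts at position 1. A common variant picks the starting position at random within the first interval, so every row has a chance of being chosen. The sketch below assumes a hypothetical helper `systematic_sample` and a synthetic 891-row frame rather than the notebook's `df`.

```python
import numpy as np
import pandas as pd

def systematic_sample(frame, step, seed=None):
    """Return every `step`-th row, starting at a random offset in [0, step)."""
    rng = np.random.default_rng(seed)
    start = int(rng.integers(0, step))
    return frame.iloc[start::step]

# Hypothetical stand-in with 891 rows (assumption: same length as index 0..890).
toy = pd.DataFrame({"value": range(891)})
picked = systematic_sample(toy, step=50, seed=42)
print(picked.index.tolist())
```

Slicing with `iloc[start::step]` avoids building an explicit index list and gives the same rows as `iloc[list(range(start, len(frame), step))]`.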
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
2 1 3 female 26.0 0 0 7.9250 S Third woman False NaN Southampton yes True
9 1 2 female 14.0 1 0 30.0708 C Second child False NaN Cherbourg yes False
19 1 3 female NaN 0 0 7.2250 C Third woman False NaN Cherbourg yes True
22 1 3 female 15.0 0 0 8.0292 Q Third child False NaN Queenstown yes True
In [8]: df.iloc[::5].head(10)
# iloc selects rows by integer position; the slice ::5 keeps every 5th row, and head(10) shows the first 10 of them
Out[8]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
15 1 2 female 55.0 0 0 16.0000 S Second woman False NaN Southampton yes True
25 1 3 female 38.0 1 5 31.3875 S Third woman False NaN Southampton yes False
In [9]: df.sample(5,random_state=1)
Out[9]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
862 1 1 female 48.0 0 0 25.9292 S First woman False D Southampton yes True
223 0 3 male NaN 0 0 7.8958 S Third man True NaN Southampton no True
84 1 2 female 17.0 0 0 10.5000 S Second woman False NaN Southampton yes True
680 0 3 female NaN 0 0 8.1375 Q Third woman False NaN Queenstown no True
535 1 2 female 7.0 0 2 26.2500 S Second child False NaN Southampton yes False
In [10]: df.sample(5, random_state=1)
# random_state fixes the seed, so the same random sample is returned every time the cell is run
Out[10]:
survived pclass sex age sibsp parch fare embarked class who adult_male deck embark_town alive alone
862 1 1 female 48.0 0 0 25.9292 S First woman False D Southampton yes True
223 0 3 male NaN 0 0 7.8958 S Third man True NaN Southampton no True
84 1 2 female 17.0 0 0 10.5000 S Second woman False NaN Southampton yes True
680 0 3 female NaN 0 0 8.1375 Q Third woman False NaN Queenstown no True
535 1 2 female 7.0 0 2 26.2500 S Second child False NaN Southampton yes False
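The reproducibility shown in the two outputs above can be checked programmatically. This is a minimal sketch on a hypothetical `toy` frame; the `frac` argument is included only to show that `sample` can also take a fraction of the rows instead of a count.

```python
import pandas as pd

# Hypothetical toy frame standing in for the dataset (assumption).
toy = pd.DataFrame({"value": range(100)})

# Same seed, same rows in the same order.
first = toy.sample(5, random_state=1)
second = toy.sample(5, random_state=1)
print(first.equals(second))

# Without a seed, the rows usually differ from run to run.
unseeded = toy.sample(5)

# frac draws a proportion of the rows rather than a fixed count.
tenth = toy.sample(frac=0.1, random_state=1)
print(len(tenth))
```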