Workshop 3
Tamur Khan 1608275
In the cell below, change the value of the id variable to the last 3 digits of your student number (or the last 4 if that sequence has a leading zero). Follow the instructions that are given when the cell is run.
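In [2]: import numpy as np

id = 275  # last 3 digits of the student number (note: this name shadows the built-in id())
np.random.seed(id)  # seed the generator so the digit selection is reproducible
# Draw 5 distinct digits from 0-9 and sort them
numbers = np.sort(np.random.choice(range(10), size = 5, replace=False))
print('Modify the dataset so that it only contains records for the following digits:')
print(numbers)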
Modify the dataset so that it only contains records for the following digits:
[3 4 5 6 9]
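The digits are tied to the student number because the cell seeds NumPy's global random generator before sampling. A minimal sketch of that behaviour follows; the helper name pick_digits is hypothetical and only wraps the same calls used in the cell above.

```python
import numpy as np

def pick_digits(seed):
    """Hypothetical helper: repeat the sampling from the cell above for a given seed."""
    np.random.seed(seed)  # reset the global generator to a known state
    # 5 distinct digits from 0-9, sorted ascending
    return np.sort(np.random.choice(range(10), size=5, replace=False))

first = pick_digits(275)
second = pick_digits(275)

# Re-seeding with the same value should reproduce the same selection.
print(np.array_equal(first, second))
```

Because replace=False is passed, the five digits are always distinct, so the filtered dataset ends up with exactly five classes.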
Out[3]:
f000 f001 f002 f003 f004 f005 f006 f007 f008 f009 ... f775 f776 f777 f778 f779 f780 f781 f782 f783 target
0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
1 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
3 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
5 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
6 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
7 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
8 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
9 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
10 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
11 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
12 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
13 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
14 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
15 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 7
16 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
17 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 8
18 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
19 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
20 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
21 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
22 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
23 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
24 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
25 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
26 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
27 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
28 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
29 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 7
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
59970 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
59971 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
59972 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
59973 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59974 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
59975 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
59976 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59977 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 7
59978 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59979 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
59980 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59981 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59982 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59983 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
59984 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
59985 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
59986 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59987 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
59988 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 7
59989 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 8
59990 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59991 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 2
59992 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59993 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59994 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 1
59995 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 8
59996 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59997 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59998 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59999 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 8
In [4]: df1.columns
Out[4]: Index(['f000', 'f001', 'f002', 'f003', 'f004', 'f005', 'f006', 'f007', 'f008',
'f009',
...
'f775', 'f776', 'f777', 'f778', 'f779', 'f780', 'f781', 'f782', 'f783',
'target'],
dtype='object', length=785)
In [5]: df1.shape
Out[6]:
f000 f001 f002 f003 f004 f005 f006 f007 f008 f009 ... f775 f776 f777 f778 f779 f780 f781 f782 f783 target
0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
2 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
7 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
9 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
10 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
11 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
12 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
13 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
18 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
19 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
20 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
22 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
26 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
27 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
30 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
32 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
33 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
35 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
36 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
39 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
43 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
44 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
45 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
47 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
48 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
49 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
50 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
53 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
54 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
59942 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59943 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
59945 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59947 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59948 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59951 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
59955 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59956 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59957 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59959 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59960 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59961 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59964 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59966 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59968 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59969 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59973 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59975 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 4
59976 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59978 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59980 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59981 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59982 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59986 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
59990 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59992 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 9
59993 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59996 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 3
59997 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 5
59998 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 6
In [7]: list(df2['target'].unique())  # confirm that the new DataFrame contains only the specified digits
Out[7]: [5, 4, 9, 3, 6]
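The cell that produced df2 is not shown in this export. One common way to do this kind of filtering in pandas is a boolean mask built with Series.isin; the sketch below assumes that approach and uses a small made-up frame (df1_demo, df2_demo are hypothetical names) in place of the real 60000-row dataset.

```python
import pandas as pd

# Hypothetical stand-in for df1: two pixel columns plus a 'target' label column.
df1_demo = pd.DataFrame({
    "f000": [0] * 20,
    "f001": [0] * 20,
    "target": list(range(10)) * 2,  # labels 0..9, repeated twice
})

numbers = [3, 4, 5, 6, 9]  # the digits printed by the seeded cell

# Keep only the rows whose label is one of the assigned digits.
# The original row index is preserved, as in the Out[6] table above.
df2_demo = df1_demo[df1_demo["target"].isin(numbers)]

# Check: the labels that remain are exactly the assigned digits.
remaining = sorted(df2_demo["target"].unique().tolist())
print(remaining)
```

Comparing the sorted unique labels against the assigned list is a slightly stronger check than inspecting the output by eye, since it also catches a digit that was dropped by mistake.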