P02 Introduction To Numpy Ans

P02: INTRODUCTION TO NUMPY
1. ARRAY CREATION
A. Creating an array from a list or tuple
B. Creating arrays initialized with zeros or ones
C. Generating number sequences
D. Random number generator
E. Seeding the random generator
F. Exercise 1
2. SECTION 2: NUMERICAL OPERATIONS
A. Exercise 2
3. SECTION 3: INDEXING, SLICING AND ITERATING
A. Simple Indexing and Slicing
B. Indexing with Integer Arrays
C. Indexing with Boolean Arrays
D. Exercise 3
4. SECTION 4: MATRIX COMPUTATION
A. Exercise 4
5. SECTION 5: COPIES vs VIEWS
A. Reference copy
B. View copy
C. Copy
6. SECTION 6: MANIPULATING ARRAYS
A. Exercise 5
7. SECTION 7: ARRAY BROADCASTING
A. Exercise 6
Reference:
The numpy reference and user guide are available in this link
This tutorial is adapted from SciPy Quickstart Tutorial. If you need more exercises, you can work on
100 numpy exercises
INTRODUCTION
What is numpy?
NumPy is the fundamental package required for high performance scientific computing and data
analysis. It issued by practically all other scientific tools, such as Panda and Scikit. The following code
shows how to import the numpy library.
In [1]: import numpy as np # import numpy library
Why numpy?
Numpy is much faster for scientific computation compared to conventional Python methods.
Provides standard mathematical functions for fast operations on entire arrays of data without
having to write loops (easier to write)
Provides tools for reading / writing array data to disk
Linear algebra, random number generation capabilities
SECTION 1: ARRAY CREATION

Creating an array from a list or tuple
np.array :
Creates a numpy array using a normal python list or tuple.
Creating 1-D array

In [2]:
a = np.array([2,3,4])
Out[2]: array([2, 3, 4])
In [3]:
list1 = [2,4,6,8]
a = np.array(list1)
Out[3]: array([2, 4, 6, 8])
Displaying the information about the array
In [4]:
print('Shape of a =', a.shape) # number items in each dimension
print('Number of dimensions of a =', a.ndim) # number of dimensions
print('Number of items of vector a =', len(a)) # number of items in 1st dimension. S

# this corresponds to number of i
Shape of a = (4,)
Number of dimensions of a = 1
Number of items of vector a = 4
Displaying the type of the array or its elements
In [5]:
print('Type of a =', type(a)) # a is a list
print('Type of items in a =', a.dtype) # the items stored in a is an integer
Type of a = <class 'numpy.ndarray'>
Type of items in a = int32
Creating 2-D array

In [6]:
a = np.array([[1, 2, 3, 4, 5],[6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
Out[6]: array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15]])
In [7]:
print('Shape of a =', a.shape) # number of items in each dimension
print('Number of dimensions of a =', a.ndim) # number of dimensions
print('Number of rows of matrix a =', len(a)) # number of items in 1st dimension. S

# this corresponds to number of r
Shape of a = (3, 5)
Number of dimensions of a = 2
Number of rows of matrix a = 3
Arrays of different type

The type is automatically inferred by the values in the list
In [8]:
a = np.array([(1.5, 2, 3), (4, 5, 6)])
print(a)
print(a.dtype)
[[1.5 2. 3. ]
[4. 5. 6. ]]
float64
The type of the array can also be explicitly specified at creation time:
In [9]:
a = np.array( [ [1,1], [0,1] ], dtype=bool)
Out[9]: array([[ True, True],
[False, True]])
Creating arrays initialized with zeros or ones

np.ones and np.zeros :
Creates array initialized to ones and zeros, respectively
In [10]:
all_zeros = np.zeros((3, 5))
all_zeros
Out[10]: array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])
In [11]:
all_ones = np.ones(3)
all_ones
Out[11]: array([1., 1., 1.])

Generating number sequences
Generating evenly spaced values within an interval
np.arange
np.arange returns evenly spaced values within a given interval. Values are generated within the
half-open interval [start, stop] . This means the interval includes start but excluding stop .
For integer arguments, the function is equivalent to the Python built-in range function, but returns
an ndarray (numpy array type)rather than a list .
In [12]:
a = np.arange( 10, 30, 5 ) # the parameters are start, end, step. Note
print(a)
[10 15 20 25]
np.arange can also work with float argument as well
In [13]:
b = np.arange( 0, 2, 0.3 ) # it accepts float arguments as well
print(b)
[0. 0.3 0.6 0.9 1.2 1.5 1.8]
np.arange also accepts one parameter ( end ). By default, start = 0 and step = 1
In [14]:
c = np.arange(10) # starts = 0 and step = 1
Out[14]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Generating a certain number of items within an interval

np.linspace
Returns a particular number of numbers that are evenly spaced over a specified interval.
In [15]:
from numpy import pi
a = np.linspace( 0, 2, 9 ) # generates a total of 9 numbers ranging from

print('a = ', a)
a = [0. 0.25 0.5 0.75 1. 1.25 1.5 1.75 2. ]
In [16]:
x = np.linspace( 0, 2*pi, 10) # useful to evaluate function at lots of points
print('x = ', x)
f = np.sin(x)
print('f = ', f)
x = [0. 0.6981317 1.3962634 2.0943951 2.7925268 3.4906585
4.1887902 4.88692191 5.58505361 6.28318531]
f = [ 0.00000000e+00 6.42787610e-01 9.84807753e-01 8.66025404e-01
3.42020143e-01 -3.42020143e-01 -8.66025404e-01 -9.84807753e-01
-6.42787610e-01 -2.44929360e-16]
Random number generator

np.random.rand() :
Creates an array of the given shape and populate it with random samples from a uniform
distribution over [0, 1] (i.e., inclusive of 0 but exclusive of 1)
In [17]:
np.random.rand() # generate a random scalar number uniform in [0,1)
Out[17]: 0.013906036200248773
In [18]:
np.random.rand(4) # generates an array of size 4
Out[18]: array([0.46231009, 0.87923559, 0.96660348, 0.65176363])
In [19]:
np.random.rand(3,2) # generates a matrix of size 3x2
Out[19]: array([[0.88904638, 0.69065972],
[0.62905978, 0.48270902],
[0.80594488, 0.56769997]])
np.random.randint(low, high=None, size=None) :
Returns an array of size size . The array is populated with random integers from the discrete
uniform distribution in the interval [low, high] (excluding high ). If high is None (the
default), then results are from [0, low] .
In [20]:
np.random.randint(0, 20+1) # generate a random integer scalar between 0 and 20
Out[20]: 0
In [21]:
np.random.randint(0, 21, (3,2))# generates a matrix of size (3,2)
Out[21]: array([[ 7, 18],
[11, 14],
[11, 11]])
np.random.randn :
Returns a sample (or samples) from the standard normal distribution．
In [22]:
np.random.randn() # univariate 'normal' (Gaussian distribution, with a mean = 0, vari
Out[22]: 1.4924468880710682
The following code show how to create a 2x2 matrix whose numbers are sampled from a normal
distribution
In [23]:
np.random.randn(2, 2)
Out[23]: array([[ 0.40004349, -0.01968118],

[-1.09945574, -0.73972181]])
Seeding the random generator

np.random.seed()
By default, random number generators uses the system time for random seeding. This allows you to
generate different numbers each time you run your simultation.
But, sometimes you may want repeatable result. To produce repeatable results, seed your generator
using a constant number. Try running the following two cells. Notice that when we apply the same
seed, the random number generator will generate the same number sequences.
In [24]:
np.random.seed(4)
print(np.random.rand(4))
[0.96702984 0.54723225 0.97268436 0.71481599]
[0.69772882 0.2160895 0.97627445 0.00623026]
In [25]:
np.random.seed(4)
[0.96702984 0.54723225 0.97268436 0.71481599]
[0.69772882 0.2160895 0.97627445 0.00623026]
Exercise 1
Q1. Create a vector containing the first 10 odd integers using np.arange .
Answer:
array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])
In [26]:
np.arange(1, 20, 2)
Out[26]: array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])
Q2. Create a uniform subdivision of the interval -1.3 to 2.5 with 64 subdivisions using
np.linspace .
Answer:
array([-1.3 , -1.23968254, -1.17936508, -1.11904762, -1.05873016,
-0.9984127 , -0.93809524, -0.87777778, -0.81746032, -0.75714286,
-0.6968254 , -0.63650794, -0.57619048, -0.51587302, -0.45555556,
-0.3952381 , -0.33492063, -0.27460317, -0.21428571, -0.15396825,
-0.09365079, -0.03333333, 0.02698413, 0.08730159, 0.14761905,
0.20793651, 0.26825397, 0.32857143, 0.38888889, 0.44920635,
0.50952381, 0.56984127, 0.63015873, 0.69047619, 0.75079365,
0.81111111, 0.87142857, 0.93174603, 0.99206349, 1.05238095,
1.11269841, 1.17301587, 1.23333333, 1.29365079, 1.35396825,
1.41428571, 1.47460317, 1.53492063, 1.5952381 , 1.65555556,
1.71587302, 1.77619048, 1.83650794, 1.8968254 , 1.95714286,
2.01746032, 2.07777778, 2.13809524, 2.1984127 , 2.25873016,
2.31904762, 2.37936508, 2.43968254, 2.5 ])
In [27]:
np.linspace(-1.3, 2.5, 64)
Out[27]: array([-1.3 , -1.23968254, -1.17936508, -1.11904762, -1.05873016,
-0.9984127 , -0.93809524, -0.87777778, -0.81746032, -0.75714286,
-0.6968254 , -0.63650794, -0.57619048, -0.51587302, -0.45555556,
-0.3952381 , -0.33492063, -0.27460317, -0.21428571, -0.15396825,
-0.09365079, -0.03333333, 0.02698413, 0.08730159, 0.14761905,
0.20793651, 0.26825397, 0.32857143, 0.38888889, 0.44920635,
0.50952381, 0.56984127, 0.63015873, 0.69047619, 0.75079365,
0.81111111, 0.87142857, 0.93174603, 0.99206349, 1.05238095,
1.11269841, 1.17301587, 1.23333333, 1.29365079, 1.35396825,
1.41428571, 1.47460317, 1.53492063, 1.5952381 , 1.65555556,
1.71587302, 1.77619048, 1.83650794, 1.8968254 , 1.95714286,
2.01746032, 2.07777778, 2.13809524, 2.1984127 , 2.25873016,
2.31904762, 2.37936508, 2.43968254, 2.5 ])

Q3. Create a matrix of size (4, 5). Each item is a random integer between 0 and 10. Use
np.random.randint .
Sample solution (Numbers in matrix must be integers between 0 and 10):
array([[ 7, 9, 8, 4, 2],
[ 6, 10, 4, 3, 0],
[ 7, 5, 5, 9, 6],
[ 6, 8, 2, 5, 8]])
In [28]:
np.random.randint(0, 11,(4,5))
Out[28]: array([[ 7, 9, 8, 4, 2],
[ 6, 10, 4, 3, 0],
[ 7, 5, 5, 9, 6],
[ 6, 8, 2, 5, 8]])
Q4. Create a matrix of numbers randomly distributed between 0 and 1, having a size of (3,4). Use
np.random.rand .
Sample solution (numbers in matrix must be between 0 and 1):
array([[ 0.11798675, 0.88034312, 0.57913209, 0.06656284],
[ 0.66977722, 0.36144647, 0.7793515 , 0.85881387],
[ 0.39195763, 0.64347722, 0.17478051, 0.30934148]])
In [29]:
np.random.rand(3,4)
Out[29]: array([[0.01902474, 0.03978036, 0.94500385, 0.4463492 ],
[0.44134853, 0.06570954, 0.17586123, 0.86588276],
[0.84352812, 0.92515316, 0.87030648, 0.83525671]])

Q5. Create a dataset by sampling from a normal distribution. The dataset contains 1200 samples
(rows). Each point has 2 attributes (columns). Use np.random.randn .
Sample solution (the number in matrix is sampled from a normal distribution):
array([[-0.08722439, 0.2021376 ],
[-0.68195243, 0.3402292 ],
[ 0.92526745, -1.51547866],
...,
[-0.81260973, 0.45430727],
[-0.11204656, 0.72001995],
[-0.88341863, -0.43912629]])
In [30]:
a = np.random.randn(1200, 2)
SECTION 2: NUMERICAL OPERATIONS

Arithmetic operations
Typically, arithmetic operators on arrays apply element-wise. A new array is created and filled with
the result.
In [31]:
a = np.array ([ 2, 4, 6, 8])
b = np.array ([ 1, 2, 3, 1])
In [32]:
c = a - b # element-wise subtraction, a and b will remains unchanged
print(a)
print(b)
print(c)
[2 4 6 8]
[1 2 3 1]
[1 2 3 7]
In [33]:
a + b # element-wise addition
Out[33]: array([3, 6, 9, 9])
In [34]:
a + 2 # element-wise addition with constant
Out[34]: array([ 4, 6, 8, 10])
In [35]:
a / b # element-wise division
Out[35]: array([2., 2., 2., 8.])
In [36]:
a / 2 # element-wise division with constant
Out[36]: array([1., 2., 3., 4.])
In [37]:
a * b # element-wise multiplication
Out[37]: array([ 2, 8, 18, 8])
In [38]:
a ** b # Element-wise power
Out[38]: array([ 2, 16, 216, 8], dtype=int32)
In [39]:
a += b # this will change the content of a
Out[39]: array([3, 6, 9, 9])
In [40]:
a *= b # this will change the content of a
Out[40]: array([ 3, 12, 27, 9])
Unlike in many matrix languages, the product operator * performs elementwise multiplication,
NOT matrix multiplication
In [41]:
A = np.array( [[1,1], [0,1]] )
B = np.array( [[2,0], [3,4]] )
A*B
Out[41]: array([[2, 0],
[0, 4]])
Comparison and Logical operations

In [42]:
a1 = np.array([13, 22, 37, 42])
a2 = np.array([13, 22, 37, 42])
b = np.array([2, 4, 7, 5])
In [43]:
a1 < 35 # element-wise comparison
Out[43]: array([ True, True, False, False])
In [44]:
a1 == 42
Out[44]: array([False, False, False, True])
In [45]:
a1 == a2
Out[45]: array([ True, True, True, True])

In [46]:
np.array_equal(a1, a2)
Out[46]: True
In [47]:
np.array_equal(a1, b)
Out[47]: False
In [48]:
bool1 = np.array([True, True, False, False])
bool2 = np.array([False, True, False, True])
all_true = np.array([True, True, True, True])
all_false = np.array([False, False, False, False])
In [49]:
np.all(bool1) # Checks if every value is True
Out[49]: False
In [50]:
np.all(all_true)
Out[50]: True
In [51]:
np.any(bool1) # Checks if any value is True
Out[51]: True
In [52]:
np.any(all_false)
Out[52]: False
In [53]:
bool1 & bool2 # Element-wise AND operation
Out[53]: array([False, True, False, False])
In [54]:
bool1 | bool2 # Element-wise OR operation
Out[54]: array([ True, True, False, True])
Comparing arrays
In [55]:
a = np.array([1, 2, 3, 4])
b = np.array([4, 2, 2, 4])
In [56]:
a == b
Out[56]: array([False, True, False, True])
In [57]:
np.maximum (a, b) # Get the element-wise maximum of all corresponding items in a and
Out[57]: array([4, 2, 3, 4])
In [58]:
np.minimum (a, b) # Get the element-wise minimum of all corresponding items in a and
Out[58]: array([1, 2, 2, 4])
Applying numpy functions

In [59]:
a = np.random.randint(-10, 10, 5)
Out[59]: array([-1, 6, -6, 9, 5])
In [60]:
np.sin(a)
Out[60]: array([-0.84147098, -0.2794155 , 0.2794155 , 0.41211849, -0.95892427])
In [61]:
np.exp(a)
Out[61]: array([3.67879441e-01, 4.03428793e+02, 2.47875218e-03, 8.10308393e+03,
1.48413159e+02])
In [62]:
np.abs(a)
Out[62]: array([1, 6, 6, 9, 5])
In [63]:
np.max(a) # get the item with the biggest value
Out[63]: 9
In [64]:
np.min(a) # get the item with the smallest value
Out[64]: -6
ndarray functions
All ndarrays come with a set of functions that can be invoked to operate on its data.
In [65]:
a = np.random.randint(0, 20, 10)
Out[65]: array([10, 7, 7, 2, 9, 11, 4, 17, 1, 2])

In [66]: print('sum: ', a.sum()) # compute the sum
print('mean: ', a.mean()) # compute the mean
print('std: ', a.std()) # compute the standard deviation
print('min: ', a.min()) # get the smallest value in array
print('argmin: ', a.argmin()) # find the index of the item with the smalles
print('max: ', a.max()) # get the maximum value
print('argmax: ', a.argmax()) # find the index of the item with the biggest
sum: 70
mean: 7.0
std: 4.732863826479693
min: 1
argmin: 8
max: 17
argmax: 7
ndarray operations on 2-D arrays
In [67]:
a = np.random.randint (0, 20, (3, 2))
Out[67]: array([[ 7, 1],
[ 5, 7],
[19, 19]])
In [68]:
print('sum: ', a.sum()) # compute the sum
print('mean: ', a.mean()) # compute the mean
print('std: ', a.std()) # compute the standard deviation
print('min: ', a.min()) # get the smallest value in array
print('argmin: ', a.argmin()) # find the index of the item with the smalles
print('max: ', a.max()) # get the maximum value
print('argmax: ', a.argmax()) # find the index of the item with the biggest
sum: 58
mean: 9.666666666666666
std: 6.896053621859068
min: 1
argmin: 1
max: 19
argmax: 4
By default, these operations apply to the 2-D array as though it were a list of numbers, regardless of
its shape.
However, by specifying the axis parameter you can apply an operation along the specified axis of an
array:
In [69]:
a.sum(axis=0) # axis = 0 refers to the column dimension (sum all items in eac
Out[69]: array([31, 27])
In [70]:
a.sum(axis=1) # axis = 1 refers to the row dimension (sum all items in each row
Out[70]: array([ 8, 12, 38])

Exercise 2
Q1. Create a random vector with values [0, 1) of size 30. Then compute its summation, mean and
standard deviation values.
Sample solution:
[ 0.31826591 0.64980313 0.9074009 0.01855129 0.1471393 0.21421904
0.35889039 0.26098793 0.67206902 0.22928994 0.67655462 0.46969651
0.59789036 0.77442506 0.786646 0.30591469 0.36607411 0.08582968
0.82389183 0.93058502 0.25422148 0.24471819 0.90520293 0.48908817
0.55304988 0.73125874 0.84016257 0.18487058 0.80443297 0.69220994]
summation = 15.2933401657
mean = 0.509778005522
std = 0.273489774136
In [71]:
Z = np.random.rand(30)
print(Z)
print('summation = ', Z.sum())
print('mean = ', Z.mean())
print('std = ', Z.std())
[0.31826591 0.64980313 0.9074009 0.01855129 0.1471393 0.21421904
0.35889039 0.26098793 0.67206902 0.22928994 0.67655462 0.46969651
0.59789036 0.77442506 0.786646 0.30591469 0.36607411 0.08582968
0.82389183 0.93058502 0.25422148 0.24471819 0.90520293 0.48908817
0.55304988 0.73125874 0.84016257 0.18487058 0.80443297 0.69220994]
summation = 15.293340165651928
mean = 0.5097780055217309
std = 0.2734897741357628
Q2. Create a 3x3 array with random integer values [0, 10) . Then, find the following:
1. maximum value in the whole matrix

2. minimum value of each row of the matrix
3. sum of each column of the matrix
Sample solution:
[[0 6 6]
[3 7 4]
[3 6 5]]
maximum value in the whole matrix: 7
minimum value of each row of the matrix: [0 3 3]
sum of each column of the matrix: [ 6 19 15]
In [72]:
Z = np.random.randint(0, 10, (3,3))
print(Z, '\n')
print('maximum value in the whole matrix:', Z.max())
print('minimum value of each row of the matrix:', Z.min(axis=1))
print('sum of each column of the matrix:', Z.sum(axis=0))
[[0 6 6]
[3 7 4]
[3 6 5]]
maximum value in the whole matrix: 7
minimum value of each row of the matrix: [0 3 3]
sum of each column of the matrix: [ 6 19 15]
Q3. Generate the array containing the value of f (x) = sin(x) for −2π ≤ x ≤ 2π . Compute
f (x)
for 100 evenly spaced points in x.
Then, plot the values using the following command:
import matplotlib.pyplot as plt
plt.plot(x, y)
plt.show()
In [73]:
angles = np.linspace(-2*np.pi, 2*np.pi, 100)
sinvalues = np.sin(angles)
plt.plot(angles, sinvalues)
plt.show()
2
x
Q4. Generate the array containing the value of f (x) for −5 . The array should
1 −
= e 2
≤ x ≤ 5
√2π
contain f (x) for 100 evenly spaced points in x. Then, plot out the computed values. Refer to Q3 on
how to plot the graph.
In [74]:
x = np.linspace(-5, 5, 100)
fx = np.exp(-x**2/2)/np.sqrt(2*np.pi)
plt.plot(x, fx)
plt.show()
Q5. Given the vector a = np.array([12, 14, 20, 26, 39, 49]) , find the closest value to
query = 27 . The algorithm to achieve this is as follows:
compute the absolute difference between all items with the query item (Useful command:
np.abs )
get the item with the smallest absolute distance (Useful command: np.argmin )
Ans:
The closest number in a to 27 is 26
In [75]:
a = np.array([12, 14, 20, 26, 39, 49])
query = 27
index = (np.abs(a-query)).argmin()
print('The closest number in a to', query, 'is', a[index])
The closest number in a to 27 is 26
Q6. Randomly generate a vector a of size 10 with values betwen 100 and 200. Compute the L2
norm of a Then normalize the vector a to get an by dividing each item with the L2 norm.
For example, given a = [4, 3] , the L2 norm of a is √4 and an = [0.8, 0.6]

2 2
+ 3 = 5
In [76]:
import numpy as np
a = np.array([4, 3])
print('a: ', a)
L2ofA = np.sqrt(np.sum(a**2))
print('Norm 2 = ', L2ofA)
print('Normalized a = ', a/L2ofA)
a: [4 3]
Norm 2 = 5.0
Normalized a = [0.8 0.6]
In [77]:
print('a: ', a)
L2ofA = np.sqrt(np.sum(a**2))
print('Norm 2 = ', L2ofA)
print('Normalized a = ', a/L2ofA)
a: [178 181 132 136 166 180 136 128 197 186]
Norm 2 = 518.2721292911668
Normalized a = [0.34344891 0.34923738 0.25469245 0.26241041 0.32029505 0.34730789
0.26241041 0.2469745 0.38010919 0.35888482]
Q7. Given two vectors d1 = (5, 0, 3, 0, 2, 0, 0, 2, 0, 0) and d2 = (3, 0, 2, 0, 1,

1, 0, 1, 0, 1) , compute the cosine similarity between d1 and d2 .
n
∑ a i bi
i=1
cos(a, b) =
n n
2 2
√∑ a √∑ b
i=1 i i=1 i
The steps are as follows:
1. Compute A = ∑ = sum(5(3) + 0(0) + ... (0)(1)) = 25

n
ai b i
i=1
2. Compute B = √∑ = sqrt(sum(5(5) + 0(0) + ... + 0(0))) = 6.48

n 2
a
i=1 i
3. Compute C = √∑ = sqrt(sum(3(3) + 0(0) + ... + 1(1))) = 4.12

n 2
b
i=1 i
4. cos(d1, d2) = A / (B*C) = 0.9356
In [78]:
d1 = np.array([5, 0, 3, 0, 2, 0, 0, 2, 0, 0])
d2 = np.array([3, 0, 2, 0, 1, 1, 0, 1, 0, 1])
A = np.sum(d1*d2)
B = np.sqrt(np.sum(d1*d1))
C = np.sqrt(np.sum(d2*d2))
cos = A/(B*C)
print(cos)
0.9356014857063997
SECTION 3: INDEXING, SLICING AND ITERATING

Simple Indexing and Slicing
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python
sequences.
In [79]:
print(a)
[65 60 37 21 18 96 26 24 39 83]
In [80]:
a[2] # get item 2
Out[80]: 37
In [81]:
a[2:5] # start = 2, end = 5 (not inclusive). This gets item 2 up to item 4
Out[81]: array([37, 21, 18])
In [82]:
a[:3] # equivalent to a[0:3]
Out[82]: array([65, 60, 37])
In [83]:
a[7:] # gets item at position 7 to the last position
Out[83]: array([24, 39, 83])
We can change the content of the array using slicing.
In [84]:
a[2] = -1 # overwritting the third item
Out[84]: array([65, 60, -1, 21, 18, 96, 26, 24, 39, 83])
In [85]:
a[::2] = -1000 # equivalent to a[0:10:2] = -1000
Out[85]: array([-1000, 60, -1000, 21, -1000, 96, -1000, 24, -1000,
83])
Negative indexing
Numpy array also supports negative indexing where we index from the end of the list.
In [86]:
print(a)
[24 19 66 85 65 23 56 14 10 5]
In [87]:
a[-1]
Out[87]: 5
In [88]:
a[-3:] # get the last three items
Out[88]: array([14, 10, 5])
Multidimensional indexing
The following codes show how to index a 2-D array.
In [89]:
a = np.random.randint(0, 100, (3, 5))
Out[89]: array([[18, 25, 92, 31, 75],
[31, 98, 27, 99, 37],
[39, 87, 49, 25, 58]])
In [90]:
a[2, 3] # item at row 2, column 3 (third row, fourth column)
Out[90]: 25
In [91]:
a[:, 1] # get the column 1 of a
Out[91]: array([25, 98, 87])
In [92]:
a[:2, 1] # first two items in column 1 of a
Out[92]: array([25, 98])
In [93]:
a[:, 1:3] # get column 1 and 2 (not inclusive of 3)
Out[93]: array([[25, 92],
[98, 27],
[87, 49]])
When fewer indices are provided than the number of axes, the missing indices are considered
complete slices:
In [94]:
a[2] # get row 2
Out[94]: array([39, 87, 49, 25, 58])
In [95]:
a[1:] # get row 1 to the last row
Out[95]: array([[31, 98, 27, 99, 37],
[39, 87, 49, 25, 58]])
Indexing with Integer Arrays

Indexing a 1-D array
In [96]:
a = np.array([10, 11, 13, 0, 10, 10, 16, 12, 18, 9])
selected = [2, 3, 5] #select third, fourth and sixth item from array
b = a[selected]
Out[96]: array([13, 0, 10])
Indexing a 2-D array
In [97]:
a = np.array([[2, 5, 1, 1, 2],
[5, 7, 2, 7, 35],
[1, 9, 8, 49, 9]])
selected_row = (2, 1)
selected_col = (3, 4)
b = a[selected_row, selected_col] # extracts item at location (2, 3) and (

b
Out[97]: array([49, 35])
Indexing with Boolean Arrays

We can also index an array using boolean arrays.
1-D indexing using Boolean Arrays
In [98]:
a = np.array([23, -56, 2, 3, -57])
a
Out[98]: array([ 23, -56, 2, 3, -57])
In [99]:
selected = [True, False, True, False, False]
a[selected]
Out[99]: array([23, 2])
In [100…
all_pos = a > 0
a[all_pos] # extract all positive numbers from a
Out[100… array([23, 2, 3])
In [101…
isEven = a % 2 == 0
a [isEven] # extract all even numbers from a
Out[101… array([-56, 2])
2-D indexing using Boolean Arrays
In [102…
a = np.array([[2, 5, 1, 1, 2],
[5, 7, 2, 7, 35],
[1, 9, 8, 49, 9]])
In [103…
sel_rows = [False, True, True]
a[sel_rows] # select the last two rows
Out[103… array([[ 5, 7, 2, 7, 35],
[ 1, 9, 8, 49, 9]])
In [104… sel_cols = [True, True, False, False, False]
a[:, sel_cols] # select the first two columns
Out[104… array([[2, 5],
[5, 7],
[1, 9]])
Exercise 3
Q1. Create a vector of zeros (integers) of size 10. Use slicing to set the first two items with 2, the
next 3 items with 4 and the remaining items with 6.
In [105…
a = np.zeros(10, dtype = 'int')
a[:2] = 2
a[2:5] = 4
a[5:] = 6
Out[105… array([2, 2, 4, 4, 4, 6, 6, 6, 6, 6])
Q2. Generate a random matrix of size 6x6. Then write the codes to extract the following items from
the matrix.
In [106…
a = np.random.randint(1,100, (6,6))
print(a, '\n')
print ('Matrix (a):')
print(a[2], '\n')
print ('Matrix (b):')
print(a[:,2], '\n')
print ('Matrix (c):')
print(a[0,3:5], '\n')
print ('Matrix (d):')
print(a[2:5], '\n')
print ('Matrix (e):')
print(a[:, [2, 4]], '\n')
print ('Matrix (f):')
print(a[4:,4:])
[[87 2 90 53 59 14]
[92 98 88 66 53 73]
[56 63 49 24 95 60]
[32 95 35 55 69 24]
[ 9 35 63 30 79 89]
[98 90 37 94 75 49]]
Matrix (a):
[56 63 49 24 95 60]
Matrix (b):
[90 88 49 35 63 37]
Matrix (c):
[53 59]
Matrix (d):
[[56 63 49 24 95 60]
[32 95 35 55 69 24]
[ 9 35 63 30 79 89]]
Matrix (e):
[[90 59]
[88 53]
[49 95]
[35 69]
[63 79]
[37 75]]
Matrix (f):
[[79 89]
[75 49]]
Q3. Randomly generate an array comprising all numbers of size 20 with numbers between 0 and
100. Then, extract the array of numbers that are both even and divisible by 3.
In [107…
a = np.random.randint(0,101,20)
print(a)
selected = np.logical_and(a%2 == 0, a%3 == 0)
print(a[selected])
[89 45 55 72 31 49 60 98 91 28 97 68 72 88 95 62 41 38 1 95]
[72 60 72]
SECTION 4: MATRIX COMPUTATION

Basic matrix computation
In [108…
A = np.array([[1, 0], [2, 5], [3, 1]], dtype = 'int')
print(A)
B = np.array([[4, 0.5], [2, 5], [0, 1]], dtype = 'int')
print(B)
[[1 0]
[2 5]
[3 1]]
[[4 0]
[2 5]
[0 1]]
In [109…
A + B # A plus B
Out[109… array([[ 5, 0],
[ 4, 10],
[ 3, 2]])
In [110…
A - B # A subtract B
Out[110… array([[-3, 0],
[ 0, 0],
[ 3, 0]])
Matrix multiplication
np.dot :
In numpy, matrix multiplication is performed through np.dot . The operator * only performs
element-wise multiplication.
In [111…
m1 = np.array([[1, 2],[1, 4]])
print (m1)
m2 = np.array([[0, 1],[1, 2]])
print (m2)
v1 = np.array([2, 3])
print(v1)
v2 = np.array([1, 5])
print(v2)
[[1 2]
[1 4]]
[[0 1]
[1 2]]
[2 3]
[1 5]
In [112…
np.dot(m1, m2) # matrix matrix multiplication
Out[112… array([[2, 5],
[4, 9]])
In [113…
m1.dot(m2) # matrix matrix multiplication
Out[113… array([[2, 5],
[4, 9]])
In [114…
np.dot(m1, v1) # matrix vector multiplication
Out[114… array([ 8, 14])
In [115…
np.dot(v1, v2) # vector-vector multiplication (or dot product)
Out[115… 17
Transpose a matrix
Transpose of a matrix is obtained by changing rows to columns and columns to rows. In other
words, transpose of A[][] is obtained by changing A[i][j] to A[j][i].
In [116…
A.T
Out[116… array([[1, 2, 3],
[0, 5, 1]])
Invert a matrix
Inversion can only work on square matrix
In [117…
b = np.random.randint(0, 10, (3, 3))
print(b)
[[8 0 4]
[5 9 8]
[5 6 2]]
In [118…
np.linalg.inv(b)
Out[118… array([[ 0.1 , -0.08 , 0.12 ],
[-0.1 , 0.01333333, 0.14666667],
[ 0.05 , 0.16 , -0.24 ]])
Exercise 4
Q1. First, set the seed for your random generator to 42. Then, randomly sample from the standard
normal distribution: (a) predicted output vector h and (b) actual output vector y , both having a
size of (m,) where m = 50 (Use np.random.randn ).
Then, compute the RMSE.
1 m
(i) (i) 2
RM SE = √ ∑ (h − y )
m i=1
1
T
= √ (h − y) (h − y)
m
Expected answer: 1.2192

In [119… np.random.seed(42)
m = 50
h = np.random.randn(m)
y = np.random.randn(m)
e = h - y
rmse = np.sqrt(np.dot(e.T, e)/m)
rmse
Out[119… 1.2192273122578396
Q2. Create the matrix X = np.array([[1, 1],[2, 1],[3, 1]]) and y = np.array([1, 3,
7]) . Then, compute the normal equation.
T −1 T
h = (X X) X y
Expected answer: [3, -2.33333333]
In [120…
X = np.array([[1, 1],[2, 1],[3, 1]])
y = np.array([1, 3, 7])
h = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y)
Out[120… array([ 3. , -2.33333333])
SECTION 5: COPIES vs VIEWS

When operating and manipulating arrays, there are three possible scenarios on how the data is
updated or returned:
1. Reference copy: Returns the reference to the same data.

2. View copy: Similar to reference copy where it returns the reference to the same data. However,
it may change the view (namely the shape) of the data. However, the underlying data are the
same.
3. Copy: Copies the data into a new array. Any changes to the new array would not affect the old
one.
This may be quite confusing for beginners. But, it is important to understand this concepts.
Reference copy
Returns the reference to the same data. This is done through the operator = .
In [121…
a = np.arange(12)
print("a = ", a)
a = [ 0 1 2 3 4 5 6 7 8 9 10 11]
In [122… b = a
print("b = ", b)
b is a # for simple assignment, a and b are two names for the same nd
b = [ 0 1 2 3 4 5 6 7 8 9 10 11]
Out[122… True
Changing the content of a affects the content of b
In [123…
a[0] = -99 # change the content of a affects b as well
print(a)
print(b)
[-99 1 2 3 4 5 6 7 8 9 10 11]
[-99 1 2 3 4 5 6 7 8 9 10 11]
In [124…
b.shape = 3,4 # change the shape of b
print("b = ")
print(b)
print("a = ") # changes to b affects a as well

print(a)
b =
[[-99 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
a =
[[-99 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
View copy
Different array objects can share the same data but present it differently as follows:
Display it in a different shape

Display only portion of it
The following code creates three different view of the same data a .
First, we create our array a .
In [125…
a = np.array([[1, 2, 3],[4, 5, 6]])
Out[125… array([[1, 2, 3],
[4, 5, 6]])
When we flatten an array using .ravel() , it returns a view of the original data.
In [126…
b = a.ravel()
Out[126… array([1, 2, 3, 4, 5, 6])
When we reshape an array using .reshape() , it returns a view of the original data.
In [127…
c = a.reshape(3, 2)
Out[127… array([[1, 2],
[3, 4],
[5, 6]])
When we slice an array, we return a view of it:
In [128…
d = a[ : , 1:3] # spaces added for clarity; could also be written "s = a[:,1:3]"
Out[128… array([[2, 3],
[5, 6]])
Changing the content in b or c or d will change a as well. The changes will be reflected in all
views as well.
In [129… b[-1] = -9999
print('a:\n', a)
print('\nb:\n', b)
print('\nc:\n', c)
print('\nd:\n', d)
a:
[[ 1 2 3]
[ 4 5 -9999]]
b:
[ 1 2 3 4 5 -9999]
c:
[[ 1 2]
[ 3 4]
[ 5 -9999]]
d:
[[ 2 3]
[ 5 -9999]]
Copy
The copy method makes a complete copy of the array and its data.
In [130…
a = np.arange(15).reshape(3, 5)
d = a.copy() # a new array object with new data is created
In [131…
d is a # d is not the same as a
Out[131… False
In [132…
d[0,0] = 9999 # Changing the content of d does not affect a
print("a = ")
print(a)
print("d = ")
print(d)
a =
[[ 0 1 2 3 4]
[ 5 6 7 8 9]
[10 11 12 13 14]]
d =
[[9999 1 2 3 4]
[ 5 6 7 8 9]
[ 10 11 12 13 14]]
SECTION 6: MANIPULATING ARRAYS

We have already seen how we can manipulate shape of the arrays through the commands
.ravel() and .reshape() .
In this section, we further see how to:

1. combine matrices and
2. create a new dimensions to an array.
In [133…
a = np.array([[1,2,3], [4,5,6]])
Out[133… array([[1, 2, 3],
[4, 5, 6]])
In [134…
b = np.array([[23,23,34], [43,54,63]])
Out[134… array([[23, 23, 34],
[43, 54, 63]])

hstack
Stacks two arrays horizontally. The two arrays must have the same number of rows.
In [135…
c = np.hstack((a,b))
Out[135… array([[ 1, 2, 3, 23, 23, 34],
[ 4, 5, 6, 43, 54, 63]])

vstack
Stacks two arrays vertically. The two arrays must have the same number of columns.
In [136…
d = np.vstack((a,b))
Out[136… array([[ 1, 2, 3],
[ 4, 5, 6],
[23, 23, 34],
[43, 54, 63]])

Notes: numpy array is not designed to dynamically add or remove items. If you need to expand or
shrink your arrays consistently, it is better to represent the data using a normal Python list.
newaxis
We can use newaxis to convert an array (vector) of size n into a 2-D array (matrix) of size nx1 .
In [137…
a = np.array([4, 2])
print(a)
print(a.shape)
[4 2]
(2,)
In [138…
b = a[:, np.newaxis]
print(b)
print('shape =', b.shape)
[[4]
[2]]
shape = (2, 1)
Exercise 5
Q1. Create the following matrix:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
Use the command arange and reshape to do this.
In [139…
np.arange(9).reshape(3,3)
Out[139… array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
Q2. Create the matrix a as follows: generate a 10x10 matrix of zeros. Then, frame it with a border
of ones.
In [140…
a = np.zeros((10,10))
a = np.hstack((a, np.ones((10,1))))
a = np.hstack((np.ones((10,1)), a))
a = np.vstack((a, np.ones((1,12))))
a = np.vstack((np.ones((1,12)), a))
print(a)
[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
Q3. Create a 8x8 matrix and the fill it with a checkerboard pattern.
First, create a zero matrix of size 8x8

Set the odd rows, even numbers to 1 using slicing
Set the even rows, odd numbers to 1 using slicing
In [141…
A = np.zeros((8,8))
A[1::2,::2] = 1
A[::2,1::2] = 1
print(A)
[[0. 1. 0. 1. 0. 1. 0. 1.]
[1. 0. 1. 0. 1. 0. 1. 0.]
[0. 1. 0. 1. 0. 1. 0. 1.]
[1. 0. 1. 0. 1. 0. 1. 0.]
[0. 1. 0. 1. 0. 1. 0. 1.]
[1. 0. 1. 0. 1. 0. 1. 0.]
[0. 1. 0. 1. 0. 1. 0. 1.]
[1. 0. 1. 0. 1. 0. 1. 0.]]
SECTION 7: ARRAY BROADCASTING

By default, numpy performs element-wise operations. This requires the input arrays to be of the
same shape.
When the input arrays have different shape, it may perform broadcasting before carrying out the
operation. Subject to certain constraints, the smaller array is "broadcast" across the larger array so
that they have compatible shapes.
2-D array and scalar

Let's take an example when we want multiply an array with a scalar:
In [142…
a = np.array([1, 2, 3])
b = 2
a * b
Out[142… array([2, 4, 6])
Note that numpy does not actually stretches b to create a new array but is smart enough to use
the original scalar value without actually making copies so that broadcasting operations are as
memory and computationally efficient as possible.
2-D array and 1-D array

Now, let's try to add a 2-D array with a 1-D array. If the size of the last dimension of the 2-D array is
the same as the number of items in 1-D array, then numpy will peform broadcasting by default.
In the following example,

Note that the shape of a is (4, 3) and the shape of b is (3,). Since the size of last dimension (i.e., 3)
is the same as the number of items in b , numpy will broadcast b to be added to each row of a .
In [143…
a = np.array([[ 0, 0, 0],
[10, 10, 10],
[20, 20, 20],
[30, 30, 30]])
b = np.array([0, 1, 2])
a + b
Out[143… array([[ 0, 1, 2],
[10, 11, 12],
[20, 21, 22],
[30, 31, 32]])

For broadcasting to work, the number of dimensions must be consistent, i.e., the size of the last
dimension of the 2-D array must be the same as the number of elements in the 1-D array.
2-D array and 2-D array
If you want to broadcast the column-wise addition of a matrix of shape (4,3) with a vector (4,),
ensure that you convert the vector into a 2-D array of shape (4, 1) to align the dimensions. Numpy
will then broadcast the operation across all columns.
In [144…
a = np.array([[ 0, 0, 0],
[10, 10, 10],
[20, 20, 20],
[30, 30, 30]])
print(a)
b = np.array([0, 10, 20, 30]).reshape(-1, 1)
print()
print(b)
a + b
[[ 0 0 0]
[10 10 10]
[20 20 20]
[30 30 30]]
[[ 0]
[10]
[20]
[30]]
Out[144… array([[ 0, 0, 0],
[20, 20, 20],
[40, 40, 40],
[60, 60, 60]])
Exercise 6
Q1. Randomly generate a matrix of size (5, 3). Normalize the columns such that the sum of each
column is equal to 1. Use broadcasting in your code to simplify your task.
In [145…
X = np.random.randn(5, 3)
colsum = X.sum(axis = 0) # get the sum of all columns
Xn = X / colsum # broadcasting: divide each column with its sum
print(Xn)
print(Xn.sum(axis = 0)) # show the sum of all columns
[[ 4.09447953 -3.06776877 -3.4267377 ]
[ 2.32088156 -1.17625763 4.04002818]
[ -5.45648524 1.273197 2.5751977 ]
[ 0.21536214 -13.99360956 -0.26510723]
[ -0.174238 17.96443895 -1.92338094]]
[1. 1. 1.]

P02 Introduction To Numpy Ans

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

P02 Introduction To Numpy Ans

Uploaded by

Copyright:

Available Formats

P02: INTRODUCTION TO NUMPY

SECTION 1: ARRAY CREATION

Creates a numpy array using a normal python list or tuple.

Creating 1-D array

Out[2]: array([2, 3, 4])

Out[3]: array([2, 4, 6, 8])

Displaying the information about the array

print('Number of dimensions of a =', a.ndim) # number of dimensions

print('Number of items of vector a =', len(a)) # number of items in 1st dimension. S

Number of items of vector a = 4

Displaying the type of the array or its elements

print('Type of items in a =', a.dtype) # the items stored in a is an integer

Type of a = <class 'numpy.ndarray'>

Type of items in a = int32

Creating 2-D array

Out[6]: array([[ 1, 2, 3, 4, 5],

[11, 12, 13, 14, 15]])

print('Number of dimensions of a =', a.ndim) # number of dimensions

print('Number of rows of matrix a =', len(a)) # number of items in 1st dimension. S

Number of rows of matrix a = 3

Arrays of different type

Out[9]: array([[ True, True],

Creating arrays initialized with zeros or ones

Creates array initialized to ones and zeros, respectively

Out[10]: array([[0., 0., 0., 0., 0.],

[0., 0., 0., 0., 0.],

[0., 0., 0., 0., 0.]])

Out[11]: array([1., 1., 1.])

np.arange can also work with float argument as well

[0. 0.3 0.6 0.9 1.2 1.5 1.8]

Out[14]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Generating a certain number of items within an interval

a = np.linspace( 0, 2, 9 ) # generates a total of 9 numbers ranging from

a = [0. 0.25 0.5 0.75 1. 1.25 1.5 1.75 2. ]

x = [0. 0.6981317 1.3962634 2.0943951 2.7925268 3.4906585

4.1887902 4.88692191 5.58505361 6.28318531]

f = [ 0.00000000e+00 6.42787610e-01 9.84807753e-01 8.66025404e-01

3.42020143e-01 -3.42020143e-01 -8.66025404e-01 -9.84807753e-01

Random number generator

Out[18]: array([0.46231009, 0.87923559, 0.96660348, 0.65176363])

Out[19]: array([[0.88904638, 0.69065972],

Out[21]: array([[ 7, 18],

Returns a sample (or samples) from the standard normal distribution．

Out[23]: array([[ 0.40004349, -0.01968118],

Seeding the random generator

[0.96702984 0.54723225 0.97268436 0.71481599]

[0.69772882 0.2160895 0.97627445 0.00623026]

[0.96702984 0.54723225 0.97268436 0.71481599]

[0.69772882 0.2160895 0.97627445 0.00623026]

array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])

Out[26]: array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])

array([-1.3 , -1.23968254, -1.17936508, -1.11904762, -1.05873016,

-0.9984127 , -0.93809524, -0.87777778, -0.81746032, -0.75714286,

-0.6968254 , -0.63650794, -0.57619048, -0.51587302, -0.45555556,

-0.3952381 , -0.33492063, -0.27460317, -0.21428571, -0.15396825,

-0.09365079, -0.03333333, 0.02698413, 0.08730159, 0.14761905,

0.20793651, 0.26825397, 0.32857143, 0.38888889, 0.44920635,

0.50952381, 0.56984127, 0.63015873, 0.69047619, 0.75079365,

0.81111111, 0.87142857, 0.93174603, 0.99206349, 1.05238095,

1.11269841, 1.17301587, 1.23333333, 1.29365079, 1.35396825,

1.41428571, 1.47460317, 1.53492063, 1.5952381 , 1.65555556,

1.71587302, 1.77619048, 1.83650794, 1.8968254 , 1.95714286,

2.01746032, 2.07777778, 2.13809524, 2.1984127 , 2.25873016,