You are on page 1of 32

P02: INTRODUCTION TO NUMPY

1. ARRAY CREATION
A. Creating an array from a list or tuple
B. Creating arrays initialized with zeros or ones
C. Generating number sequences
D. Random number generator
E. Seeding the random generator
F. Exercise 1
2. SECTION 2: NUMERICAL OPERATIONS
A. Exercise 2
3. SECTION 3: INDEXING, SLICING AND ITERATING
A. Simple Indexing and Slicing
B. Indexing with Integer Arrays
C. Indexing with Boolean Arrays
D. Exercise 3
4. SECTION 4: MATRIX COMPUTATION
A. Exercise 4
5. SECTION 5: COPIES vs VIEWS
A. Reference copy
B. View copy
C. Copy
6. SECTION 6: MANIPULATING ARRAYS
A. Exercise 5
7. SECTION 7: ARRAY BROADCASTING
A. Exercise 6

Reference:

The numpy reference and user guide are available in this link

This tutorial is adapted from SciPy Quickstart Tutorial. If you need more exercises, you can work on
100 numpy exercises

INTRODUCTION
What is numpy?
NumPy is the fundamental package required for high performance scientific computing and data
analysis. It issued by practically all other scientific tools, such as Panda and Scikit. The following code
shows how to import the numpy library.
In [1]: import numpy as np # import numpy library

Why numpy?

Numpy is much faster for scientific computation compared to conventional Python methods.
Provides standard mathematical functions for fast operations on entire arrays of data without
having to write loops (easier to write)
Provides tools for reading / writing array data to disk
Linear algebra, random number generation capabilities

SECTION 1: ARRAY CREATION


Creating an array from a list or tuple
np.array :

Creates a numpy array using a normal python list or tuple.

Creating 1-D array


In [2]:
a = np.array([2,3,4])

Out[2]: array([2, 3, 4])

In [3]:
list1 = [2,4,6,8]

a = np.array(list1)

Out[3]: array([2, 4, 6, 8])

Displaying the information about the array

In [4]:
print('Shape of a =', a.shape) # number items in each dimension

print('Number of dimensions of a =', a.ndim) # number of dimensions

print('Number of items of vector a =', len(a)) # number of items in 1st dimension. S


# this corresponds to number of i

Shape of a = (4,)

Number of dimensions of a = 1

Number of items of vector a = 4

Displaying the type of the array or its elements

In [5]:
print('Type of a =', type(a)) # a is a list

print('Type of items in a =', a.dtype) # the items stored in a is an integer

Type of a = <class 'numpy.ndarray'>

Type of items in a = int32

Creating 2-D array


In [6]:
a = np.array([[1, 2, 3, 4, 5],[6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])

Out[6]: array([[ 1, 2, 3, 4, 5],

[ 6, 7, 8, 9, 10],

[11, 12, 13, 14, 15]])

In [7]:
print('Shape of a =', a.shape) # number of items in each dimension

print('Number of dimensions of a =', a.ndim) # number of dimensions

print('Number of rows of matrix a =', len(a)) # number of items in 1st dimension. S


# this corresponds to number of r

Shape of a = (3, 5)

Number of dimensions of a = 2

Number of rows of matrix a = 3

Arrays of different type


The type is automatically inferred by the values in the list

In [8]:
a = np.array([(1.5, 2, 3), (4, 5, 6)])

print(a)

print(a.dtype)

[[1.5 2. 3. ]

[4. 5. 6. ]]

float64

The type of the array can also be explicitly specified at creation time:

In [9]:
a = np.array( [ [1,1], [0,1] ], dtype=bool)

Out[9]: array([[ True, True],

[False, True]])

Creating arrays initialized with zeros or ones


np.ones and np.zeros :

Creates array initialized to ones and zeros, respectively

In [10]:
all_zeros = np.zeros((3, 5))

all_zeros

Out[10]: array([[0., 0., 0., 0., 0.],

[0., 0., 0., 0., 0.],

[0., 0., 0., 0., 0.]])

In [11]:
all_ones = np.ones(3)

all_ones

Out[11]: array([1., 1., 1.])


Generating number sequences
Generating evenly spaced values within an interval
np.arange

np.arange returns evenly spaced values within a given interval. Values are generated within the
half-open interval [start, stop] . This means the interval includes start but excluding stop .
For integer arguments, the function is equivalent to the Python built-in range function, but returns
an ndarray (numpy array type)rather than a list .

In [12]:
a = np.arange( 10, 30, 5 ) # the parameters are start, end, step. Note
print(a)

[10 15 20 25]

np.arange can also work with float argument as well

In [13]:
b = np.arange( 0, 2, 0.3 ) # it accepts float arguments as well

print(b)

[0. 0.3 0.6 0.9 1.2 1.5 1.8]

np.arange also accepts one parameter ( end ). By default, start = 0 and step = 1

In [14]:
c = np.arange(10) # starts = 0 and step = 1

Out[14]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Generating a certain number of items within an interval


np.linspace

Returns a particular number of numbers that are evenly spaced over a specified interval.

In [15]:
from numpy import pi

a = np.linspace( 0, 2, 9 ) # generates a total of 9 numbers ranging from


print('a = ', a)

a = [0. 0.25 0.5 0.75 1. 1.25 1.5 1.75 2. ]

In [16]:
x = np.linspace( 0, 2*pi, 10) # useful to evaluate function at lots of points

print('x = ', x)

f = np.sin(x)

print('f = ', f)

x = [0. 0.6981317 1.3962634 2.0943951 2.7925268 3.4906585

4.1887902 4.88692191 5.58505361 6.28318531]

f = [ 0.00000000e+00 6.42787610e-01 9.84807753e-01 8.66025404e-01

3.42020143e-01 -3.42020143e-01 -8.66025404e-01 -9.84807753e-01

-6.42787610e-01 -2.44929360e-16]

Random number generator


np.random.rand() :

Creates an array of the given shape and populate it with random samples from a uniform
distribution over [0, 1] (i.e., inclusive of 0 but exclusive of 1)

In [17]:
np.random.rand() # generate a random scalar number uniform in [0,1)

Out[17]: 0.013906036200248773

In [18]:
np.random.rand(4) # generates an array of size 4

Out[18]: array([0.46231009, 0.87923559, 0.96660348, 0.65176363])

In [19]:
np.random.rand(3,2) # generates a matrix of size 3x2

Out[19]: array([[0.88904638, 0.69065972],

[0.62905978, 0.48270902],

[0.80594488, 0.56769997]])
np.random.randint(low, high=None, size=None) :

Returns an array of size size . The array is populated with random integers from the discrete
uniform distribution in the interval [low, high] (excluding high ). If high is None (the
default), then results are from [0, low] .

In [20]:
np.random.randint(0, 20+1) # generate a random integer scalar between 0 and 20

Out[20]: 0

In [21]:
np.random.randint(0, 21, (3,2))# generates a matrix of size (3,2)

Out[21]: array([[ 7, 18],

[11, 14],

[11, 11]])
np.random.randn :

Returns a sample (or samples) from the standard normal distribution.

In [22]:
np.random.randn() # univariate 'normal' (Gaussian distribution, with a mean = 0, vari

Out[22]: 1.4924468880710682

The following code show how to create a 2x2 matrix whose numbers are sampled from a normal
distribution

In [23]:
np.random.randn(2, 2)

Out[23]: array([[ 0.40004349, -0.01968118],


[-1.09945574, -0.73972181]])

Seeding the random generator


np.random.seed()

By default, random number generators uses the system time for random seeding. This allows you to
generate different numbers each time you run your simultation.

But, sometimes you may want repeatable result. To produce repeatable results, seed your generator
using a constant number. Try running the following two cells. Notice that when we apply the same
seed, the random number generator will generate the same number sequences.

In [24]:
np.random.seed(4)

print(np.random.rand(4))

print(np.random.rand(4))

[0.96702984 0.54723225 0.97268436 0.71481599]

[0.69772882 0.2160895 0.97627445 0.00623026]

In [25]:
np.random.seed(4)

print(np.random.rand(4))

print(np.random.rand(4))

[0.96702984 0.54723225 0.97268436 0.71481599]

[0.69772882 0.2160895 0.97627445 0.00623026]

Exercise 1
Q1. Create a vector containing the first 10 odd integers using np.arange .

Answer:

array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])

In [26]:
np.arange(1, 20, 2)

Out[26]: array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])

Q2. Create a uniform subdivision of the interval -1.3 to 2.5 with 64 subdivisions using
np.linspace .

Answer:

array([-1.3 , -1.23968254, -1.17936508, -1.11904762, -1.05873016,

-0.9984127 , -0.93809524, -0.87777778, -0.81746032, -0.75714286,

-0.6968254 , -0.63650794, -0.57619048, -0.51587302, -0.45555556,

-0.3952381 , -0.33492063, -0.27460317, -0.21428571, -0.15396825,

-0.09365079, -0.03333333, 0.02698413, 0.08730159, 0.14761905,

0.20793651, 0.26825397, 0.32857143, 0.38888889, 0.44920635,

0.50952381, 0.56984127, 0.63015873, 0.69047619, 0.75079365,

0.81111111, 0.87142857, 0.93174603, 0.99206349, 1.05238095,

1.11269841, 1.17301587, 1.23333333, 1.29365079, 1.35396825,

1.41428571, 1.47460317, 1.53492063, 1.5952381 , 1.65555556,

1.71587302, 1.77619048, 1.83650794, 1.8968254 , 1.95714286,

2.01746032, 2.07777778, 2.13809524, 2.1984127 , 2.25873016,

2.31904762, 2.37936508, 2.43968254, 2.5 ])

In [27]:
np.linspace(-1.3, 2.5, 64)

Out[27]: array([-1.3 , -1.23968254, -1.17936508, -1.11904762, -1.05873016,

-0.9984127 , -0.93809524, -0.87777778, -0.81746032, -0.75714286,

-0.6968254 , -0.63650794, -0.57619048, -0.51587302, -0.45555556,

-0.3952381 , -0.33492063, -0.27460317, -0.21428571, -0.15396825,

-0.09365079, -0.03333333, 0.02698413, 0.08730159, 0.14761905,

0.20793651, 0.26825397, 0.32857143, 0.38888889, 0.44920635,

0.50952381, 0.56984127, 0.63015873, 0.69047619, 0.75079365,

0.81111111, 0.87142857, 0.93174603, 0.99206349, 1.05238095,

1.11269841, 1.17301587, 1.23333333, 1.29365079, 1.35396825,

1.41428571, 1.47460317, 1.53492063, 1.5952381 , 1.65555556,

1.71587302, 1.77619048, 1.83650794, 1.8968254 , 1.95714286,

2.01746032, 2.07777778, 2.13809524, 2.1984127 , 2.25873016,

2.31904762, 2.37936508, 2.43968254, 2.5 ])


Q3. Create a matrix of size (4, 5). Each item is a random integer between 0 and 10. Use
np.random.randint .

Sample solution (Numbers in matrix must be integers between 0 and 10):

array([[ 7, 9, 8, 4, 2],

[ 6, 10, 4, 3, 0],

[ 7, 5, 5, 9, 6],

[ 6, 8, 2, 5, 8]])

In [28]:
np.random.randint(0, 11,(4,5))

Out[28]: array([[ 7, 9, 8, 4, 2],

[ 6, 10, 4, 3, 0],

[ 7, 5, 5, 9, 6],

[ 6, 8, 2, 5, 8]])
Q4. Create a matrix of numbers randomly distributed between 0 and 1, having a size of (3,4). Use
np.random.rand .

Sample solution (numbers in matrix must be between 0 and 1):

array([[ 0.11798675, 0.88034312, 0.57913209, 0.06656284],

[ 0.66977722, 0.36144647, 0.7793515 , 0.85881387],

[ 0.39195763, 0.64347722, 0.17478051, 0.30934148]])

In [29]:
np.random.rand(3,4)

Out[29]: array([[0.01902474, 0.03978036, 0.94500385, 0.4463492 ],

[0.44134853, 0.06570954, 0.17586123, 0.86588276],

[0.84352812, 0.92515316, 0.87030648, 0.83525671]])


Q5. Create a dataset by sampling from a normal distribution. The dataset contains 1200 samples
(rows). Each point has 2 attributes (columns). Use np.random.randn .

Sample solution (the number in matrix is sampled from a normal distribution):

array([[-0.08722439, 0.2021376 ],

[-0.68195243, 0.3402292 ],

[ 0.92526745, -1.51547866],

...,

[-0.81260973, 0.45430727],

[-0.11204656, 0.72001995],

[-0.88341863, -0.43912629]])

In [30]:
a = np.random.randn(1200, 2)

SECTION 2: NUMERICAL OPERATIONS


Arithmetic operations
Typically, arithmetic operators on arrays apply element-wise. A new array is created and filled with
the result.

In [31]:
a = np.array ([ 2, 4, 6, 8])

b = np.array ([ 1, 2, 3, 1])

In [32]:
c = a - b # element-wise subtraction, a and b will remains unchanged

print(a)

print(b)

print(c)

[2 4 6 8]

[1 2 3 1]

[1 2 3 7]

In [33]:
a + b # element-wise addition

Out[33]: array([3, 6, 9, 9])

In [34]:
a + 2 # element-wise addition with constant

Out[34]: array([ 4, 6, 8, 10])

In [35]:
a / b # element-wise division

Out[35]: array([2., 2., 2., 8.])

In [36]:
a / 2 # element-wise division with constant

Out[36]: array([1., 2., 3., 4.])

In [37]:
a * b # element-wise multiplication

Out[37]: array([ 2, 8, 18, 8])

In [38]:
a ** b # Element-wise power

Out[38]: array([ 2, 16, 216, 8], dtype=int32)

In [39]:
a += b # this will change the content of a

Out[39]: array([3, 6, 9, 9])

In [40]:
a *= b # this will change the content of a

Out[40]: array([ 3, 12, 27, 9])

Unlike in many matrix languages, the product operator * performs elementwise multiplication,
NOT matrix multiplication

In [41]:
A = np.array( [[1,1], [0,1]] )

B = np.array( [[2,0], [3,4]] )

A*B

Out[41]: array([[2, 0],

[0, 4]])

Comparison and Logical operations


In [42]:
a1 = np.array([13, 22, 37, 42])

a2 = np.array([13, 22, 37, 42])

b = np.array([2, 4, 7, 5])

In [43]:
a1 < 35 # element-wise comparison

Out[43]: array([ True, True, False, False])

In [44]:
a1 == 42

Out[44]: array([False, False, False, True])

In [45]:
a1 == a2

Out[45]: array([ True, True, True, True])


In [46]:
np.array_equal(a1, a2)

Out[46]: True

In [47]:
np.array_equal(a1, b)

Out[47]: False

In [48]:
bool1 = np.array([True, True, False, False])

bool2 = np.array([False, True, False, True])

all_true = np.array([True, True, True, True])

all_false = np.array([False, False, False, False])

In [49]:
np.all(bool1) # Checks if every value is True

Out[49]: False

In [50]:
np.all(all_true)

Out[50]: True

In [51]:
np.any(bool1) # Checks if any value is True

Out[51]: True

In [52]:
np.any(all_false)

Out[52]: False

In [53]:
bool1 & bool2 # Element-wise AND operation

Out[53]: array([False, True, False, False])

In [54]:
bool1 | bool2 # Element-wise OR operation

Out[54]: array([ True, True, False, True])

Comparing arrays
In [55]:
a = np.array([1, 2, 3, 4])

b = np.array([4, 2, 2, 4])

In [56]:
a == b

Out[56]: array([False, True, False, True])

In [57]:
np.maximum (a, b) # Get the element-wise maximum of all corresponding items in a and

Out[57]: array([4, 2, 3, 4])

In [58]:
np.minimum (a, b) # Get the element-wise minimum of all corresponding items in a and

Out[58]: array([1, 2, 2, 4])

Applying numpy functions


In [59]:
a = np.random.randint(-10, 10, 5)

Out[59]: array([-1, 6, -6, 9, 5])

In [60]:
np.sin(a)

Out[60]: array([-0.84147098, -0.2794155 , 0.2794155 , 0.41211849, -0.95892427])

In [61]:
np.exp(a)

Out[61]: array([3.67879441e-01, 4.03428793e+02, 2.47875218e-03, 8.10308393e+03,

1.48413159e+02])

In [62]:
np.abs(a)

Out[62]: array([1, 6, 6, 9, 5])

In [63]:
np.max(a) # get the item with the biggest value

Out[63]: 9

In [64]:
np.min(a) # get the item with the smallest value

Out[64]: -6

ndarray functions
All ndarrays come with a set of functions that can be invoked to operate on its data.

In [65]:
a = np.random.randint(0, 20, 10)

Out[65]: array([10, 7, 7, 2, 9, 11, 4, 17, 1, 2])


In [66]: print('sum: ', a.sum()) # compute the sum

print('mean: ', a.mean()) # compute the mean

print('std: ', a.std()) # compute the standard deviation

print('min: ', a.min()) # get the smallest value in array

print('argmin: ', a.argmin()) # find the index of the item with the smalles
print('max: ', a.max()) # get the maximum value

print('argmax: ', a.argmax()) # find the index of the item with the biggest

sum: 70

mean: 7.0

std: 4.732863826479693

min: 1

argmin: 8

max: 17

argmax: 7

ndarray operations on 2-D arrays

In [67]:
a = np.random.randint (0, 20, (3, 2))

Out[67]: array([[ 7, 1],

[ 5, 7],

[19, 19]])

In [68]:
print('sum: ', a.sum()) # compute the sum

print('mean: ', a.mean()) # compute the mean

print('std: ', a.std()) # compute the standard deviation

print('min: ', a.min()) # get the smallest value in array

print('argmin: ', a.argmin()) # find the index of the item with the smalles
print('max: ', a.max()) # get the maximum value

print('argmax: ', a.argmax()) # find the index of the item with the biggest

sum: 58

mean: 9.666666666666666

std: 6.896053621859068

min: 1

argmin: 1

max: 19

argmax: 4

By default, these operations apply to the 2-D array as though it were a list of numbers, regardless of
its shape.

However, by specifying the axis parameter you can apply an operation along the specified axis of an
array:

In [69]:
a.sum(axis=0) # axis = 0 refers to the column dimension (sum all items in eac

Out[69]: array([31, 27])

In [70]:
a.sum(axis=1) # axis = 1 refers to the row dimension (sum all items in each row

Out[70]: array([ 8, 12, 38])


Exercise 2
Q1. Create a random vector with values [0, 1) of size 30. Then compute its summation, mean and
standard deviation values.

Sample solution:

[ 0.31826591 0.64980313 0.9074009 0.01855129 0.1471393 0.21421904

0.35889039 0.26098793 0.67206902 0.22928994 0.67655462 0.46969651

0.59789036 0.77442506 0.786646 0.30591469 0.36607411 0.08582968

0.82389183 0.93058502 0.25422148 0.24471819 0.90520293 0.48908817

0.55304988 0.73125874 0.84016257 0.18487058 0.80443297 0.69220994]

summation = 15.2933401657

mean = 0.509778005522

std = 0.273489774136

In [71]:
Z = np.random.rand(30)

print(Z)

print('summation = ', Z.sum())

print('mean = ', Z.mean())

print('std = ', Z.std())

[0.31826591 0.64980313 0.9074009 0.01855129 0.1471393 0.21421904

0.35889039 0.26098793 0.67206902 0.22928994 0.67655462 0.46969651

0.59789036 0.77442506 0.786646 0.30591469 0.36607411 0.08582968

0.82389183 0.93058502 0.25422148 0.24471819 0.90520293 0.48908817

0.55304988 0.73125874 0.84016257 0.18487058 0.80443297 0.69220994]

summation = 15.293340165651928

mean = 0.5097780055217309

std = 0.2734897741357628

Q2. Create a 3x3 array with random integer values [0, 10) . Then, find the following:

1. maximum value in the whole matrix


2. minimum value of each row of the matrix
3. sum of each column of the matrix

Sample solution:

[[0 6 6]

[3 7 4]

[3 6 5]]

maximum value in the whole matrix: 7

minimum value of each row of the matrix: [0 3 3]

sum of each column of the matrix: [ 6 19 15]

In [72]:
Z = np.random.randint(0, 10, (3,3))

print(Z, '\n')

print('maximum value in the whole matrix:', Z.max())

print('minimum value of each row of the matrix:', Z.min(axis=1))

print('sum of each column of the matrix:', Z.sum(axis=0))

[[0 6 6]

[3 7 4]

[3 6 5]]

maximum value in the whole matrix: 7

minimum value of each row of the matrix: [0 3 3]

sum of each column of the matrix: [ 6 19 15]

Q3. Generate the array containing the value of f (x) = sin(x) for −2π ≤ x ≤ 2π . Compute
f (x)
for 100 evenly spaced points in x.

Then, plot the values using the following command:

import matplotlib.pyplot as plt

plt.plot(x, y)

plt.show()

In [73]:
angles = np.linspace(-2*np.pi, 2*np.pi, 100)

sinvalues = np.sin(angles)

import matplotlib.pyplot as plt

plt.plot(angles, sinvalues)

plt.show()

2
x

Q4. Generate the array containing the value of f (x) for −5 . The array should
1 −
= e 2
≤ x ≤ 5
√2π

contain f (x) for 100 evenly spaced points in x. Then, plot out the computed values. Refer to Q3 on
how to plot the graph.

In [74]:
x = np.linspace(-5, 5, 100)

fx = np.exp(-x**2/2)/np.sqrt(2*np.pi)

import matplotlib.pyplot as plt

plt.plot(x, fx)

plt.show()

Q5. Given the vector a = np.array([12, 14, 20, 26, 39, 49]) , find the closest value to
query = 27 . The algorithm to achieve this is as follows:

compute the absolute difference between all items with the query item (Useful command:
np.abs )
get the item with the smallest absolute distance (Useful command: np.argmin )

Ans:

The closest number in a to 27 is 26

In [75]:
a = np.array([12, 14, 20, 26, 39, 49])

query = 27

index = (np.abs(a-query)).argmin()

print('The closest number in a to', query, 'is', a[index])

The closest number in a to 27 is 26

Q6. Randomly generate a vector a of size 10 with values betwen 100 and 200. Compute the L2
norm of a Then normalize the vector a to get an by dividing each item with the L2 norm.

For example, given a = [4, 3] , the L2 norm of a is √4 and an = [0.8, 0.6]


2 2
+ 3 = 5

In [76]:
import numpy as np

a = np.array([4, 3])

print('a: ', a)

L2ofA = np.sqrt(np.sum(a**2))

print('Norm 2 = ', L2ofA)

print('Normalized a = ', a/L2ofA)

a: [4 3]

Norm 2 = 5.0

Normalized a = [0.8 0.6]

In [77]:
a = np.random.randint(100, 201, 10)

print('a: ', a)

L2ofA = np.sqrt(np.sum(a**2))

print('Norm 2 = ', L2ofA)

print('Normalized a = ', a/L2ofA)

a: [178 181 132 136 166 180 136 128 197 186]

Norm 2 = 518.2721292911668

Normalized a = [0.34344891 0.34923738 0.25469245 0.26241041 0.32029505 0.34730789

0.26241041 0.2469745 0.38010919 0.35888482]

Q7. Given two vectors d1 = (5, 0, 3, 0, 2, 0, 0, 2, 0, 0) and d2 = (3, 0, 2, 0, 1,


1, 0, 1, 0, 1) , compute the cosine similarity between d1 and d2 .
n
∑ a i bi
i=1
cos(a, b) =
n n
2 2
√∑ a √∑ b
i=1 i i=1 i

The steps are as follows:

1. Compute A = ∑ = sum(5(3) + 0(0) + ... (0)(1)) = 25


n
ai b i
i=1

2. Compute B = √∑ = sqrt(sum(5(5) + 0(0) + ... + 0(0))) = 6.48


n 2
a
i=1 i

3. Compute C = √∑ = sqrt(sum(3(3) + 0(0) + ... + 1(1))) = 4.12


n 2
b
i=1 i

4. cos(d1, d2) = A / (B*C) = 0.9356

In [78]:
d1 = np.array([5, 0, 3, 0, 2, 0, 0, 2, 0, 0])

d2 = np.array([3, 0, 2, 0, 1, 1, 0, 1, 0, 1])

A = np.sum(d1*d2)

B = np.sqrt(np.sum(d1*d1))

C = np.sqrt(np.sum(d2*d2))

cos = A/(B*C)

print(cos)

0.9356014857063997

SECTION 3: INDEXING, SLICING AND ITERATING


Simple Indexing and Slicing
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python
sequences.

In [79]:
a = np.random.randint(0, 100, 10)

print(a)

[65 60 37 21 18 96 26 24 39 83]

In [80]:
a[2] # get item 2

Out[80]: 37

In [81]:
a[2:5] # start = 2, end = 5 (not inclusive). This gets item 2 up to item 4

Out[81]: array([37, 21, 18])

In [82]:
a[:3] # equivalent to a[0:3]

Out[82]: array([65, 60, 37])

In [83]:
a[7:] # gets item at position 7 to the last position

Out[83]: array([24, 39, 83])

We can change the content of the array using slicing.

In [84]:
a[2] = -1 # overwritting the third item

Out[84]: array([65, 60, -1, 21, 18, 96, 26, 24, 39, 83])

In [85]:
a[::2] = -1000 # equivalent to a[0:10:2] = -1000

Out[85]: array([-1000, 60, -1000, 21, -1000, 96, -1000, 24, -1000,

83])

Negative indexing
Numpy array also supports negative indexing where we index from the end of the list.

In [86]:
a = np.random.randint(0, 100, 10)

print(a)

[24 19 66 85 65 23 56 14 10 5]

In [87]:
a[-1]

Out[87]: 5

In [88]:
a[-3:] # get the last three items

Out[88]: array([14, 10, 5])

Multidimensional indexing
The following codes show how to index a 2-D array.

In [89]:
a = np.random.randint(0, 100, (3, 5))

Out[89]: array([[18, 25, 92, 31, 75],

[31, 98, 27, 99, 37],

[39, 87, 49, 25, 58]])

In [90]:
a[2, 3] # item at row 2, column 3 (third row, fourth column)

Out[90]: 25

In [91]:
a[:, 1] # get the column 1 of a

Out[91]: array([25, 98, 87])

In [92]:
a[:2, 1] # first two items in column 1 of a

Out[92]: array([25, 98])

In [93]:
a[:, 1:3] # get column 1 and 2 (not inclusive of 3)

Out[93]: array([[25, 92],

[98, 27],

[87, 49]])
When fewer indices are provided than the number of axes, the missing indices are considered
complete slices:

In [94]:
a[2] # get row 2

Out[94]: array([39, 87, 49, 25, 58])

In [95]:
a[1:] # get row 1 to the last row

Out[95]: array([[31, 98, 27, 99, 37],

[39, 87, 49, 25, 58]])

Indexing with Integer Arrays


Indexing a 1-D array

In [96]:
a = np.array([10, 11, 13, 0, 10, 10, 16, 12, 18, 9])

selected = [2, 3, 5] #select third, fourth and sixth item from array

b = a[selected]

Out[96]: array([13, 0, 10])

Indexing a 2-D array

In [97]:
a = np.array([[2, 5, 1, 1, 2],

[5, 7, 2, 7, 35],

[1, 9, 8, 49, 9]])

selected_row = (2, 1)

selected_col = (3, 4)

b = a[selected_row, selected_col] # extracts item at location (2, 3) and (


b

Out[97]: array([49, 35])

Indexing with Boolean Arrays


We can also index an array using boolean arrays.

1-D indexing using Boolean Arrays

In [98]:
a = np.array([23, -56, 2, 3, -57])
a

Out[98]: array([ 23, -56, 2, 3, -57])

In [99]:
selected = [True, False, True, False, False]

a[selected]

Out[99]: array([23, 2])

In [100…
all_pos = a > 0

a[all_pos] # extract all positive numbers from a

Out[100… array([23, 2, 3])

In [101…
isEven = a % 2 == 0

a [isEven] # extract all even numbers from a

Out[101… array([-56, 2])

2-D indexing using Boolean Arrays

In [102…
a = np.array([[2, 5, 1, 1, 2],

[5, 7, 2, 7, 35],

[1, 9, 8, 49, 9]])

In [103…
sel_rows = [False, True, True]

a[sel_rows] # select the last two rows

Out[103… array([[ 5, 7, 2, 7, 35],

[ 1, 9, 8, 49, 9]])
In [104… sel_cols = [True, True, False, False, False]

a[:, sel_cols] # select the first two columns

Out[104… array([[2, 5],

[5, 7],

[1, 9]])

Exercise 3
Q1. Create a vector of zeros (integers) of size 10. Use slicing to set the first two items with 2, the
next 3 items with 4 and the remaining items with 6.

In [105…
a = np.zeros(10, dtype = 'int')

a[:2] = 2

a[2:5] = 4

a[5:] = 6

Out[105… array([2, 2, 4, 4, 4, 6, 6, 6, 6, 6])

Q2. Generate a random matrix of size 6x6. Then write the codes to extract the following items from
the matrix.

In [106…
a = np.random.randint(1,100, (6,6))

print(a, '\n')

print ('Matrix (a):')

print(a[2], '\n')

print ('Matrix (b):')

print(a[:,2], '\n')

print ('Matrix (c):')

print(a[0,3:5], '\n')

print ('Matrix (d):')

print(a[2:5], '\n')

print ('Matrix (e):')

print(a[:, [2, 4]], '\n')

print ('Matrix (f):')

print(a[4:,4:])

[[87 2 90 53 59 14]

[92 98 88 66 53 73]

[56 63 49 24 95 60]

[32 95 35 55 69 24]

[ 9 35 63 30 79 89]

[98 90 37 94 75 49]]

Matrix (a):

[56 63 49 24 95 60]

Matrix (b):

[90 88 49 35 63 37]

Matrix (c):

[53 59]

Matrix (d):

[[56 63 49 24 95 60]

[32 95 35 55 69 24]

[ 9 35 63 30 79 89]]

Matrix (e):

[[90 59]

[88 53]

[49 95]

[35 69]

[63 79]

[37 75]]

Matrix (f):

[[79 89]

[75 49]]

Q3. Randomly generate an array comprising all numbers of size 20 with numbers between 0 and
100. Then, extract the array of numbers that are both even and divisible by 3.

In [107…
a = np.random.randint(0,101,20)

print(a)

selected = np.logical_and(a%2 == 0, a%3 == 0)

print(a[selected])

[89 45 55 72 31 49 60 98 91 28 97 68 72 88 95 62 41 38 1 95]

[72 60 72]

SECTION 4: MATRIX COMPUTATION


Basic matrix computation

In [108…
A = np.array([[1, 0], [2, 5], [3, 1]], dtype = 'int')

print(A)

B = np.array([[4, 0.5], [2, 5], [0, 1]], dtype = 'int')

print(B)

[[1 0]

[2 5]

[3 1]]

[[4 0]

[2 5]

[0 1]]

In [109…
A + B # A plus B

Out[109… array([[ 5, 0],

[ 4, 10],

[ 3, 2]])

In [110…
A - B # A subtract B

Out[110… array([[-3, 0],

[ 0, 0],

[ 3, 0]])
Matrix multiplication

np.dot :

In numpy, matrix multiplication is performed through np.dot . The operator * only performs
element-wise multiplication.

In [111…
m1 = np.array([[1, 2],[1, 4]])

print (m1)

m2 = np.array([[0, 1],[1, 2]])

print (m2)

v1 = np.array([2, 3])

print(v1)

v2 = np.array([1, 5])

print(v2)

[[1 2]

[1 4]]

[[0 1]

[1 2]]

[2 3]

[1 5]

In [112…
np.dot(m1, m2) # matrix matrix multiplication

Out[112… array([[2, 5],

[4, 9]])

In [113…
m1.dot(m2) # matrix matrix multiplication

Out[113… array([[2, 5],

[4, 9]])

In [114…
np.dot(m1, v1) # matrix vector multiplication

Out[114… array([ 8, 14])

In [115…
np.dot(v1, v2) # vector-vector multiplication (or dot product)

Out[115… 17

Transpose a matrix

Transpose of a matrix is obtained by changing rows to columns and columns to rows. In other
words, transpose of A[][] is obtained by changing A[i][j] to A[j][i].

In [116…
A.T

Out[116… array([[1, 2, 3],

[0, 5, 1]])
Invert a matrix

Inversion can only work on square matrix

In [117…
b = np.random.randint(0, 10, (3, 3))

print(b)

[[8 0 4]

[5 9 8]

[5 6 2]]

In [118…
np.linalg.inv(b)

Out[118… array([[ 0.1 , -0.08 , 0.12 ],

[-0.1 , 0.01333333, 0.14666667],

[ 0.05 , 0.16 , -0.24 ]])

Exercise 4
Q1. First, set the seed for your random generator to 42. Then, randomly sample from the standard
normal distribution: (a) predicted output vector h and (b) actual output vector y , both having a
size of (m,) where m = 50 (Use np.random.randn ).

Then, compute the RMSE.

1 m
(i) (i) 2
RM SE = √ ∑ (h − y )
m i=1

1
T
= √ (h − y) (h − y)
m

Expected answer: 1.2192


In [119… np.random.seed(42)

m = 50

h = np.random.randn(m)

y = np.random.randn(m)

e = h - y

rmse = np.sqrt(np.dot(e.T, e)/m)

rmse

Out[119… 1.2192273122578396

Q2. Create the matrix X = np.array([[1, 1],[2, 1],[3, 1]]) and y = np.array([1, 3,
7]) . Then, compute the normal equation.

T −1 T
h = (X X) X y

Expected answer: [3, -2.33333333]

In [120…
X = np.array([[1, 1],[2, 1],[3, 1]])

y = np.array([1, 3, 7])

h = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y)

Out[120… array([ 3. , -2.33333333])

SECTION 5: COPIES vs VIEWS


When operating and manipulating arrays, there are three possible scenarios on how the data is
updated or returned:

1. Reference copy: Returns the reference to the same data.


2. View copy: Similar to reference copy where it returns the reference to the same data. However,
it may change the view (namely the shape) of the data. However, the underlying data are the
same.
3. Copy: Copies the data into a new array. Any changes to the new array would not affect the old
one.

This may be quite confusing for beginners. But, it is important to understand this concepts.

Reference copy
Returns the reference to the same data. This is done through the operator = .

In [121…
a = np.arange(12)

print("a = ", a)

a = [ 0 1 2 3 4 5 6 7 8 9 10 11]

In [122… b = a

print("b = ", b)

b is a # for simple assignment, a and b are two names for the same nd

b = [ 0 1 2 3 4 5 6 7 8 9 10 11]

Out[122… True

Changing the content of a affects the content of b

In [123…
a[0] = -99 # change the content of a affects b as well

print(a)

print(b)

[-99 1 2 3 4 5 6 7 8 9 10 11]

[-99 1 2 3 4 5 6 7 8 9 10 11]

In [124…
b.shape = 3,4 # change the shape of b

print("b = ")

print(b)

print("a = ") # changes to b affects a as well


print(a)

b =

[[-99 1 2 3]

[ 4 5 6 7]

[ 8 9 10 11]]

a =

[[-99 1 2 3]

[ 4 5 6 7]

[ 8 9 10 11]]

View copy
Different array objects can share the same data but present it differently as follows:

Display it in a different shape


Display only portion of it

The following code creates three different view of the same data a .
First, we create our array a .

In [125…
a = np.array([[1, 2, 3],[4, 5, 6]])

Out[125… array([[1, 2, 3],

[4, 5, 6]])
When we flatten an array using .ravel() , it returns a view of the original data.

In [126…
b = a.ravel()

Out[126… array([1, 2, 3, 4, 5, 6])

When we reshape an array using .reshape() , it returns a view of the original data.

In [127…
c = a.reshape(3, 2)

Out[127… array([[1, 2],

[3, 4],

[5, 6]])
When we slice an array, we return a view of it:

In [128…
d = a[ : , 1:3] # spaces added for clarity; could also be written "s = a[:,1:3]"

Out[128… array([[2, 3],

[5, 6]])
Changing the content in b or c or d will change a as well. The changes will be reflected in all
views as well.
In [129… b[-1] = -9999

print('a:\n', a)

print('\nb:\n', b)

print('\nc:\n', c)

print('\nd:\n', d)

a:

[[ 1 2 3]

[ 4 5 -9999]]

b:

[ 1 2 3 4 5 -9999]

c:

[[ 1 2]

[ 3 4]

[ 5 -9999]]

d:

[[ 2 3]

[ 5 -9999]]

Copy
The copy method makes a complete copy of the array and its data.

In [130…
a = np.arange(15).reshape(3, 5)

d = a.copy() # a new array object with new data is created

In [131…
d is a # d is not the same as a

Out[131… False

In [132…
d[0,0] = 9999 # Changing the content of d does not affect a

print("a = ")

print(a)

print("d = ")

print(d)

a =

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]]

d =

[[9999 1 2 3 4]

[ 5 6 7 8 9]

[ 10 11 12 13 14]]

SECTION 6: MANIPULATING ARRAYS


We have already seen how we can manipulate shape of the arrays through the commands
.ravel() and .reshape() .

In this section, we further see how to:


1. combine matrices and
2. create a new dimensions to an array.

In [133…
a = np.array([[1,2,3], [4,5,6]])

Out[133… array([[1, 2, 3],

[4, 5, 6]])

In [134…
b = np.array([[23,23,34], [43,54,63]])

Out[134… array([[23, 23, 34],

[43, 54, 63]])


hstack
Stacks two arrays horizontally. The two arrays must have the same number of rows.

In [135…
c = np.hstack((a,b))

Out[135… array([[ 1, 2, 3, 23, 23, 34],

[ 4, 5, 6, 43, 54, 63]])


vstack

Stacks two arrays vertically. The two arrays must have the same number of columns.

In [136…
d = np.vstack((a,b))

Out[136… array([[ 1, 2, 3],

[ 4, 5, 6],

[23, 23, 34],

[43, 54, 63]])


Notes: numpy array is not designed to dynamically add or remove items. If you need to expand or
shrink your arrays consistently, it is better to represent the data using a normal Python list.

newaxis

We can use newaxis to convert an array (vector) of size n into a 2-D array (matrix) of size nx1 .

In [137…
a = np.array([4, 2])

print(a)

print(a.shape)

[4 2]

(2,)

In [138…
b = a[:, np.newaxis]

print(b)

print('shape =', b.shape)

[[4]

[2]]

shape = (2, 1)

Exercise 5
Q1. Create the following matrix:

array([[0, 1, 2],

[3, 4, 5],

[6, 7, 8]])

Use the command arange and reshape to do this.

In [139…
np.arange(9).reshape(3,3)

Out[139… array([[0, 1, 2],

[3, 4, 5],

[6, 7, 8]])
Q2. Create the matrix a as follows: generate a 10x10 matrix of zeros. Then, frame it with a border
of ones.

In [140…
a = np.zeros((10,10))

a = np.hstack((a, np.ones((10,1))))

a = np.hstack((np.ones((10,1)), a))

a = np.vstack((a, np.ones((1,12))))

a = np.vstack((np.ones((1,12)), a))

print(a)

[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]

Q3. Create a 8x8 matrix and the fill it with a checkerboard pattern.

First, create a zero matrix of size 8x8


Set the odd rows, even numbers to 1 using slicing
Set the even rows, odd numbers to 1 using slicing

In [141…
A = np.zeros((8,8))

A[1::2,::2] = 1

A[::2,1::2] = 1

print(A)

[[0. 1. 0. 1. 0. 1. 0. 1.]

[1. 0. 1. 0. 1. 0. 1. 0.]

[0. 1. 0. 1. 0. 1. 0. 1.]

[1. 0. 1. 0. 1. 0. 1. 0.]

[0. 1. 0. 1. 0. 1. 0. 1.]

[1. 0. 1. 0. 1. 0. 1. 0.]

[0. 1. 0. 1. 0. 1. 0. 1.]

[1. 0. 1. 0. 1. 0. 1. 0.]]

SECTION 7: ARRAY BROADCASTING


By default, numpy performs element-wise operations. This requires the input arrays to be of the
same shape.

When the input arrays have different shape, it may perform broadcasting before carrying out the
operation. Subject to certain constraints, the smaller array is "broadcast" across the larger array so
that they have compatible shapes.

2-D array and scalar


Let's take an example when we want multiply an array with a scalar:

In [142…
a = np.array([1, 2, 3])

b = 2

a * b

Out[142… array([2, 4, 6])

Note that numpy does not actually stretches b to create a new array but is smart enough to use
the original scalar value without actually making copies so that broadcasting operations are as
memory and computationally efficient as possible.

2-D array and 1-D array


Now, let's try to add a 2-D array with a 1-D array. If the size of the last dimension of the 2-D array is
the same as the number of items in 1-D array, then numpy will peform broadcasting by default.

In the following example,


Note that the shape of a is (4, 3) and the shape of b is (3,). Since the size of last dimension (i.e., 3)
is the same as the number of items in b , numpy will broadcast b to be added to each row of a .

In [143…
a = np.array([[ 0, 0, 0],

[10, 10, 10],

[20, 20, 20],

[30, 30, 30]])

b = np.array([0, 1, 2])

a + b

Out[143… array([[ 0, 1, 2],

[10, 11, 12],

[20, 21, 22],

[30, 31, 32]])


For broadcasting to work, the number of dimensions must be consistent, i.e., the size of the last
dimension of the 2-D array must be the same as the number of elements in the 1-D array.

2-D array and 2-D array

If you want to broadcast the column-wise addition of a matrix of shape (4,3) with a vector (4,),
ensure that you convert the vector into a 2-D array of shape (4, 1) to align the dimensions. Numpy
will then broadcast the operation across all columns.

In [144…
a = np.array([[ 0, 0, 0],

[10, 10, 10],

[20, 20, 20],

[30, 30, 30]])

print(a)

b = np.array([0, 10, 20, 30]).reshape(-1, 1)

print()

print(b)

a + b

[[ 0 0 0]

[10 10 10]

[20 20 20]

[30 30 30]]

[[ 0]

[10]

[20]

[30]]

Out[144… array([[ 0, 0, 0],

[20, 20, 20],

[40, 40, 40],

[60, 60, 60]])

Exercise 6
Q1. Randomly generate a matrix of size (5, 3). Normalize the columns such that the sum of each
column is equal to 1. Use broadcasting in your code to simplify your task.

In [145…
X = np.random.randn(5, 3)

colsum = X.sum(axis = 0) # get the sum of all columns

Xn = X / colsum # broadcasting: divide each column with its sum

print(Xn)

print(Xn.sum(axis = 0)) # show the sum of all columns

[[ 4.09447953 -3.06776877 -3.4267377 ]

[ 2.32088156 -1.17625763 4.04002818]

[ -5.45648524 1.273197 2.5751977 ]

[ 0.21536214 -13.99360956 -0.26510723]

[ -0.174238 17.96443895 -1.92338094]]

[1. 1. 1.]

You might also like