You are on page 1of 1

SLICING (INDEXING/SUBSETTING)

Numpy (Numerical Python)


Numpy Cheat Sheet Setting data with assignment :

ndarray1[ndarray1 < 0] = 0 *
5. Boolean arrays methods

Count # of ‘Trues’ (ndarray1 > 0).sum()


Python Package in boolean array

Created By: Arianne Colton and Sean Chen If ndarray1 is two-dimensions, ndarray1 < 0 If at least one ndarray1.any()
*
creates a two-dimensional boolean array. value is ‘True’
If all values are ndarray1.all()
‘True’
Numpy (Numerical Python) COMMON OPERATIONS
1. Transposing Note: These methods also work with non-boolean
What is NumPy? Default data type is ‘np.float64’. This is • A special form of reshaping which returns a ‘view’ arrays, where non-zero elements evaluate to True.
** equivalent to Python’s float type which is 8 on the underlying data without copying anything.
Foundation package for scientific computing in Python bytes (64 bits); thus the name ‘float64’.
ndarray1.transpose() or 6. Sorting
Why NumPy? If casting were to fail for some reason,
*** ‘TypeError’ will be raised. ndarray1.T or Inplace sorting ndarray1.sort()
• Numpy ‘ndarray’ is a much more efficient way
of storing and manipulating “numerical data” ndarray1.swapaxes(0, 1)
than the built-in Python data structures. Return a sorted sorted1 =
SLICING (INDEXING/SUBSETTING) 2. Vectorized wrappers (for functions that np.sort(ndarray1)
• Libraries written in lower-level languages, such copy instead of
as C, can operate on data stored in Numpy • Slicing (i.e. ndarray1[2:6]) is a ‘view’ on take scalar values) inplace
‘ndarray’ without copying any data. the original array. Data is NOT copied. Any • math.sqrt() works on only a scalar
modifications (i.e. ndarray1[2:6] = 8) to the
N-DIMENSIONAL ARRAY (NDARRAY) np.sqrt(seq1) # any sequence (list, 7. Set methods
‘view’ will be reflected in the original array.
ndarray, etc) to return a ndarray
What is NdArray? • Instead of a ‘view’, explicit copy of slicing via : Return sorted np.unique(ndarray1)

3. Vectorized expressions unique values


Fast and space-efficient multidimensional array ndarray1[2:6].copy()
(container for homogeneous data) providing vectorized • np.where(cond, x, y) is a vectorized version Test membership resultBooleanArray =
arithmetic operations of ndarray1 values np.in1d(ndarray1, [2,
• Multidimensional array indexing notation : of the expression ‘x if condition else y’ 3, 6])
in [2, 3, 6]
Create NdArray np.array(seq1) ndarray1[0][2] or ndarray1[0, 2] np.where([True, False], [1, 2],
# seq1 - is any sequence like object, [2, 3]) => ndarray (1, 3)
• Other set methods : intersect1d(),union1d(),
i.e. [1, 2, 3]
setdiff1d(), setxor1d()
Create Special 1, np.zeros(10) * Boolean indexing :
• Common Usages :
NdArray # one dimensional ndarray with 10 ndarray1[(names == ‘Bob’) | (names == 8. Random number generation (np.random)
elements of value 0 ‘Will’), 2:] np.where(matrixArray > 0, 1, -1)
2, np.ones(2, 3) • Supplements the built-in Python random * with
# ‘2:’ means select from 3rd column on => a new array (same shape) of 1 or -1 values
# two dimensional ndarray with 6
functions for efficiently generating whole arrays
elements of value 1 np.where(cond, 1, 0).argmax() * of sample values from many kinds of probability
3, np.empty(3, 4, 5) * Selecting data by boolean indexing => Find the first True element distributions.
*
# three dimensional ndarray of ALWAYS creates a copy of the data.
samples = np.random.normal(size =(3, 3))
uninitialized values argmax() can be used to find the
4, np.eye(N) or The ‘and’ and ‘or’ keywords do NOT work index of the maximum element.
* * Example usage is find the first
np.identity(N) with boolean arrays. Use & and |. Python built-in random ONLY samples
# creates N by N identity matrix element that has a “price > number” *
in an array of price data. one value at a time.
NdArray version of np.arange(1, 10)
Python’s range * Fancy indexing (aka ‘indexing using integer arrays’)
Select a subset of rows in a particular order : 4. Aggregations/Reductions Methods
Get # of Dimension ndarray1.ndim (i.e. mean, sum, std)
Get Dimension Size dim1size, dim2size, .. = ndarray1[ [3, 8, 4] ]
ndarray1.shape Compute mean ndarray1.mean() or
ndarray1[ [-1, 6] ]
Get Data Type ** ndarray1.dtype np.mean(ndarray1)
# negative indices select rows from the end Created by Arianne Colton and Sean Chen
Explicit Casting ndarray2 = ndarray1. Compute statistics ndarray1.mean(axis = 1)
astype(np.int32) *** www.datasciencefree.com
Fancy indexing ALWAYS creates a over axis * ndarray1.sum(axis = 0)
* Based on content from
copy of the data.
Cannot assume empty() will return all zeros. ‘Python for Data Analysis’ by Wes McKinney
* * axis = 0 means column axis, 1 is row axis.
It could be garbage values.
Updated: August 18, 2016

You might also like