FR

FABRIKAM RESIDENCES
Data Mining
Introduction to Python
Introduction

NumPy

(Slide figures: array examples whose outputs were annotated "row" and "column"; the code and output values were not recoverable.)
Data type
Several Python types are equivalent to a corresponding array scalar when used to generate a dtype object, for example int, float, bool, and complex.
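
As a minimal sketch of the statement above, the built-in Python types can be passed directly as the dtype argument. The array contents and variable names are made up for illustration; the exact integer width chosen for int depends on the platform, so it is not printed here.

```python
import numpy as np

# Built-in Python types passed as dtype are mapped to NumPy scalar types.
ints = np.array([1, 2, 3], dtype=int)
floats = np.array([1, 2, 3], dtype=float)
flags = np.array([0, 1, 2], dtype=bool)
cplx = np.array([1, 2], dtype=complex)

print(floats.dtype)                   # float64
print(cplx.dtype)                     # complex128
print(flags.tolist())                 # [False, True, True]
print(np.dtype(float) == np.float64)  # True
```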


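(Slide figures: the code and most output values were images and were not recoverable. The surviving annotations are listed below.)
▪ Range arguments: start, end, step
▪ Number of rows (Output: 3)
▪ Number of columns (Output: 5)
▪ (Now if you call the variable "x", it will turn into a 2-dimensional array)
▪ Indexing example annotated "2nd row" and "4th column" (Output: 3)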
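
The annotations above (start, end, step, number of rows and columns, 2nd row and 4th column) can be sketched as follows. The original code was not recoverable, so the use of np.arange and the concrete values are assumptions chosen to match the 3-by-5 shape used elsewhere in the slides; the printed element is therefore not meant to reproduce the slide's output.

```python
import numpy as np

# Assumed example: start=0, end=15 (exclusive), step=1.
x = np.arange(0, 15, 1)
ndim_before = x.ndim      # 1: x starts as a flat array

# Assigning to .shape reshapes x in place into 3 rows and 5 columns.
x.shape = (3, 5)

print(x.shape[0])   # 3  (number of rows)
print(x.shape[1])   # 5  (number of columns)
print(x.ndim)       # 2  (x is now 2-dimensional)
print(x[1, 3])      # 8  (2nd row, 4th column; indices start at 0)
```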


Random data uniform distribution
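
A minimal sketch of drawing uniformly distributed random data. The slide's own code was not recoverable; this uses NumPy's Generator interface (np.random.default_rng), and the seed, bounds, and shape are assumptions for illustration.

```python
import numpy as np

# Seeded generator so the run is repeatable.
rng = np.random.default_rng(seed=0)

# 2 x 3 array of samples drawn uniformly from [0.0, 1.0).
u = rng.uniform(low=0.0, high=1.0, size=(2, 3))

print(u.shape)                              # (2, 3)
print(bool((u >= 0.0).all() and (u < 1.0).all()))  # True
```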


Random data normal distribution
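
A minimal sketch of drawing normally distributed random data, again assuming NumPy's Generator interface; the mean, standard deviation, seed, and sample size are illustrative choices, not values from the slide.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# 1000 samples from a normal distribution with mean 0 and standard deviation 1.
z = rng.normal(loc=0.0, scale=1.0, size=1000)

print(z.shape)           # (1000,)
print(z.mean(), z.std()) # close to 0 and 1, but not exactly
```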

Try It!
1. x = np.arange(15)
2. x.shape = (3,5)
3. x[0,:]
4. x[:1,4]
5. y = x[1]
6. y[1:2:3]
7. y[:-2]
8. y[:2]
9. y[2:]
10. a = np.arange(12)
11. b = np.arange(12).reshape(3,4)
12. c = np.arange(12).reshape(6,2)
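
One way to check the exercise is sketched below. It assumes x starts as the 15 integers from np.arange(15), since a 3-by-5 shape needs exactly 15 elements, and it uses np.arange (single "r"), which is the actual NumPy function name.

```python
import numpy as np

x = np.arange(15)        # assumed starting array: 0, 1, ..., 14
x.shape = (3, 5)         # 3 rows, 5 columns

first_row = x[0, :]      # whole first row
corner = x[:1, 4]        # rows before index 1, column index 4
y = x[1]                 # second row as a 1-D array

print(first_row.tolist())  # [0, 1, 2, 3, 4]
print(corner.tolist())     # [4]
print(y.tolist())          # [5, 6, 7, 8, 9]
print(y[1:2:3].tolist())   # [6]
print(y[:-2].tolist())     # [5, 6, 7]
print(y[:2].tolist())      # [5, 6]
print(y[2:].tolist())      # [7, 8, 9]

c = np.arange(12).reshape(6, 2)
print(c.shape)             # (6, 2)
```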

Arithmetic
▪ Addition: +
▪ Subtraction: -
▪ Exponent: **
▪ Elementwise product: *
▪ Matrix product: @
▪ Bitwise XOR: ^ (not exponentiation; use ** for powers)

Logical Comparison
▪ < for less than
▪ > for greater than
▪ <= for less than or equal to
▪ >= for greater than or equal to
▪ == for equal to each other
▪ != not equal to each other
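
The operators can be compared side by side on two small arrays. This is a sketch using the same 2-by-2 arrays as the exercise on the next slide; note that in Python and NumPy the ^ operator is bitwise XOR, while ** raises to a power.

```python
import numpy as np

a = np.array([[1, 1], [0, 1]])
b = np.array([[2, 0], [3, 4]])

elementwise = a * b   # multiplies matching entries
matrix = a @ b        # row-by-column matrix product
squared = b ** 2      # each entry raised to the power 2
xor = b ^ 2           # bitwise XOR of each entry with 2
less = a < b          # elementwise comparison, gives booleans

print(elementwise.tolist())  # [[2, 0], [0, 4]]
print(matrix.tolist())       # [[5, 4], [3, 4]]
print(squared.tolist())      # [[4, 0], [9, 16]]
print(xor.tolist())          # [[0, 2], [1, 6]]
print(less.tolist())         # [[True, False], [True, True]]
```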

Try It!
import numpy as np
a = np.array( [[1,1],[0,1]] )
b = np.array( [[2,0],[3,4]] )
c = a-b
d = a+b
e = b**2
f = np.sin(a) # trigonometric function
g = a < 35
h = a*b # elementwise product
i = a@b # matrix product
j = a.dot(b) # another matrix product

Try It!
b.sum(axis=0) # sum of each column
b.min(axis=1) # min of each row
b.cumsum(axis=1) # cumulative sum along each row
k = np.exp(b) # exponential
l = np.sqrt(a) # square root
m = np.add(a,b)
n = np.linalg.inv(b)
o = np.transpose(b)
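
The axis-based reductions and the linear algebra calls from the exercise can be checked with a short sketch like this one. The variable names are chosen here for readability and are not from the slide.

```python
import numpy as np

b = np.array([[2, 0], [3, 4]])

col_sums = b.sum(axis=0)     # axis=0 collapses the rows: one sum per column
row_mins = b.min(axis=1)     # axis=1 collapses the columns: one min per row
running = b.cumsum(axis=1)   # running total along each row
b_inv = np.linalg.inv(b)     # matrix inverse (b must be square and non-singular)
b_t = np.transpose(b)        # rows and columns swapped

print(col_sums.tolist())   # [5, 4]
print(row_mins.tolist())   # [0, 3]
print(running.tolist())    # [[2, 2], [3, 7]]
print(b_t.tolist())        # [[2, 3], [0, 4]]
print(np.allclose(b @ b_inv, np.eye(2)))  # True
```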

Pandas
Header setting when reading the data:
• None, if there is no header
• 0, to set the 1st row as the header
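
The two header settings can be illustrated with a short sketch. It assumes the annotations refer to the header argument of pandas.read_csv (the slide's code was not recoverable), and it reads from an in-memory string with made-up contents instead of a file.

```python
import io
import pandas as pd

# Hypothetical CSV text: one header line followed by two data lines.
csv_text = "sepal_length,species\n5.1,setosa\n7.0,versicolor\n"

# header=0: the first line supplies the column names.
with_header = pd.read_csv(io.StringIO(csv_text), header=0)

# header=None: every line is data and columns are numbered 0, 1, ...
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(list(with_header.columns))  # ['sepal_length', 'species']
print(with_header.shape)          # (2, 2)
print(list(no_header.columns))    # [0, 1]
print(no_header.shape)            # (3, 2)
```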


• This function returns the first `n` rows for the object based on position.
• Default: n = 5
• Example: data.head(2) to return the first 2 rows for the object based on position.
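
A minimal sketch of head on a small, made-up DataFrame (the slide's dataset is not reproduced here):

```python
import pandas as pd

# Hypothetical data: ten rows holding the values 0..9.
data = pd.DataFrame({"value": range(10)})

default_head = data.head()    # no argument: first 5 rows
first_two = data.head(2)      # first 2 rows

print(len(default_head))            # 5
print(first_two["value"].tolist())  # [0, 1]
```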

(Slide outputs: 150, 150, and 5; the code that produced them was not recoverable.)
• This function returns the last `n` rows for the object based on position.
• Default: n = 5
• Example: data.tail(2) to return the last 2 rows for the object based on position.
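
A matching sketch for tail, on the same kind of made-up DataFrame. The shape attribute is included as one common way to get row and column counts; whether the slide used it is an assumption.

```python
import pandas as pd

# Hypothetical data: ten rows holding the values 0..9.
data = pd.DataFrame({"value": range(10)})

default_tail = data.tail()   # no argument: last 5 rows
last_two = data.tail(2)      # last 2 rows

print(len(default_tail))           # 5
print(last_two["value"].tolist())  # [8, 9]
print(data.shape)                  # (10, 1): (rows, columns)
```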

Descriptive Statistics for categorical data

Descriptive Statistics for numerical data
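
Both kinds of summary can be sketched with describe, assuming that is the method the slide showed. The column names and values are hypothetical. On a text column, describe reports count, unique, top, and freq; on numeric columns it reports count, mean, std, min, quartiles, and max.

```python
import pandas as pd

# Hypothetical data with one numeric and one categorical (text) column.
data = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4],
    "species": ["setosa", "setosa", "versicolor", "versicolor"],
})

cat_stats = data["species"].describe()   # summary for the text column
num_stats = data.describe()              # summary for numeric columns only

print(cat_stats["count"], cat_stats["unique"])  # 4 2
print(list(num_stats.columns))                  # ['sepal_length']
print(num_stats.loc["max", "sepal_length"])     # 7.0
```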


Variable selection
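
Common ways to select variables (columns) are sketched below. The slide's own examples were not recoverable, so the DataFrame and the particular selection styles shown are assumptions.

```python
import pandas as pd

# Hypothetical data.
data = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0],
    "petal_length": [1.4, 1.4, 4.7],
    "species": ["setosa", "setosa", "versicolor"],
})

one_col = data["species"]                               # single column: a Series
two_cols = data[["sepal_length", "species"]]            # list of names: a DataFrame
by_label = data.loc[:, "sepal_length":"petal_length"]   # label slice, end included
by_pos = data.iloc[:, 0:2]                              # position slice, end excluded

print(type(one_col).__name__)   # Series
print(list(two_cols.columns))   # ['sepal_length', 'species']
print(list(by_label.columns))   # ['sepal_length', 'petal_length']
print(list(by_pos.columns))     # ['sepal_length', 'petal_length']
```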

Change row names

Change column names
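
Renaming rows and columns can be sketched with DataFrame.rename, assuming that is the approach the slide took. The old and new names below are made up.

```python
import pandas as pd

# Hypothetical data with default row labels 0 and 1.
data = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# rename takes old-to-new mappings and returns a new DataFrame.
renamed = data.rename(
    index={0: "first", 1: "second"},
    columns={"a": "alpha", "b": "beta"},
)

print(list(renamed.index))    # ['first', 'second']
print(list(renamed.columns))  # ['alpha', 'beta']
print(list(data.columns))     # ['a', 'b']  (the original is unchanged)
```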

Delete column “Trialx”
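
A minimal sketch of adding and then deleting a column named "Trialx". How the slide created the column and which deletion syntax it used are not recoverable; the del statement below is one option, and the column contents are hypothetical.

```python
import pandas as pd

# Hypothetical data.
data = pd.DataFrame({"value": [1, 2, 3]})

# Add a column called "Trialx" (contents made up for illustration).
data["Trialx"] = data["value"] * 2
columns_before = list(data.columns)

# del removes the column from data in place.
del data["Trialx"]

print(columns_before)       # ['value', 'Trialx']
print(list(data.columns))   # ['value']
```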


Drop row
Drop column

data_slice never changes: drop returns a new object and leaves data_slice as it was.


Create a variable to store the changes. You can also assign the result back to data_slice to replace the original variable.
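
The behaviour described here can be sketched as follows, assuming DataFrame.drop with its default inplace=False. The contents of data_slice are hypothetical.

```python
import pandas as pd

# Hypothetical contents for data_slice.
data_slice = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Calling drop without storing the result leaves data_slice untouched.
data_slice.drop(index=0)       # drop a row (result discarded)
data_slice.drop(columns="b")   # drop a column (result discarded)
shape_after_calls = data_slice.shape

# Store the result in a new variable to keep the change.
dropped = data_slice.drop(index=0).drop(columns="b")

# Or assign back to data_slice to replace the original.
data_slice = data_slice.drop(columns="b")

print(shape_after_calls)          # (3, 2)
print(dropped.shape)              # (2, 1)
print(list(data_slice.columns))   # ['a']
```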

Logical Comparison
▪ < for less than
▪ > for greater than
▪ <= for less than or equal to
▪ >= for greater than or equal to
▪ == for equal to each other
▪ != not equal to each other
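
In pandas these comparisons are typically used to filter rows: a comparison on a column yields a boolean Series, which then selects the matching rows. The sketch below uses made-up data and thresholds.

```python
import pandas as pd

# Hypothetical data.
data = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4],
    "species": ["setosa", "setosa", "versicolor", "versicolor"],
})

mask = data["sepal_length"] > 5.0              # boolean Series, one value per row
long_rows = data[mask]                         # rows where the mask is True
not_setosa = data[data["species"] != "setosa"]

print(mask.tolist())                   # [True, False, True, True]
print(len(long_rows))                  # 3
print(not_setosa["species"].tolist())  # ['versicolor', 'versicolor']
```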

Factor in Python

Change data type of “species” column in variable data
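
A minimal sketch of this conversion, assuming the target type is the pandas category dtype (the closest pandas counterpart to a factor). The rows in data are hypothetical.

```python
import pandas as pd

# Hypothetical data with a text "species" column.
data = pd.DataFrame({"species": ["setosa", "versicolor", "setosa"]})

# astype returns a converted column; assign it back to change data.
data["species"] = data["species"].astype("category")

print(data["species"].dtype)                 # category
print(list(data["species"].cat.categories))  # ['setosa', 'versicolor']
print(data["species"].cat.codes.tolist())    # [0, 1, 0]
```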

Function:

Output: 112

Function:
