
Statistics with Python

import matplotlib
import matplotlib.pyplot as plt
import scipy.stats as stats
import pandas as pd
import numpy as np

These notes cover the binomial, normal, and chi-square distributions.


We can sample from a binomial distribution using np.random.binomial(n, p, size)
We can sample from a normal distribution using np.random.normal(mu, sigma, n)
We can find the standard deviation of a list of numbers using np.std(distribution)
We can find the kurtosis of data using stats.kurtosis(distribution)
We can find the skewness of data using stats.skew(distribution)
We can sample from a chi-square distribution using np.random.chisquare(df, sample size)
We can run a t-test on two independent samples using stats.ttest_ind

Binomial distribution

np.random.binomial(n = number of operations per simulation, p = probability of the event in a single operation, number of simulations)

A single operation has only two possible outcomes: the event occurs or it does not (True/False). In the example the probability of the event is 50%, a single simulation consists of 2 operations, and the simulation is run 5 times. A value of 2, 1, or 0 means the event occurred 2, 1, or 0 times in that simulation.

    x = np.random.binomial(2, 0.5, 5)
    print(x)                 # number of occurrences in each simulation
    print(x == 0)            # True where the event never occurred
    print((x == 0).sum())    # how many simulations had zero occurrences
    print((x == 0).mean())   # fraction of simulations with zero occurrences

Results:
    [2 1 1 2 1]
    [False False False False False]
    0
    0.0
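With only 5 simulations the fraction printed above is very noisy. As a rough sketch, the same calls with a larger sample can be compared against the exact binomial probability of zero occurrences, (1 - 0.5)**2 = 0.25. The seed, the sample size of 100,000, and the use of stats.binom.pmf are choices made for this illustration, not part of the original notes.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # fixed seed, chosen here only so the run is repeatable

# 100,000 simulations of 2 operations each, event probability 0.5
x = np.random.binomial(2, 0.5, 100_000)

# Fraction of simulations in which the event never occurred
empirical_p0 = (x == 0).mean()

# Exact probability of zero occurrences from the binomial PMF
exact_p0 = stats.binom.pmf(0, 2, 0.5)

print(empirical_p0, exact_p0)
```

The two numbers should agree closely, and the gap shrinks as the number of simulations grows.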

Determining the chance of tornadoes occurring 2 days in a row

    chance_of_tornado = 0.01
    # Each element is one simulated day: 1 = tornado, 0 = no tornado
    tornado_events = np.random.binomial(1, chance_of_tornado, 1000000)

    two_days_in_a_row = 0
    # Start at 1 so that j-1 is valid; run to the last day so no pair is skipped
    for j in range(1, len(tornado_events)):
        if tornado_events[j] == 1 and tornado_events[j-1] == 1:
            two_days_in_a_row += 1

    print('{} tornadoes back to back in {} years'.format(two_days_in_a_row, 1000000/365))

Result (one run; the count varies from run to run):
    111 tornadoes back to back in 2739.72602739726 years
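The loop above can also be written as a single array comparison. The sketch below is one possible vectorized version, checked against the loop; the seed and the names loop_count, vector_count, and expected_count are introduced here for illustration. Since each day is independent, roughly (number of days - 1) * 0.01**2, about 100, back-to-back pairs are expected.

```python
import numpy as np

np.random.seed(0)  # fixed seed, chosen here only so the run is repeatable

chance_of_tornado = 0.01
tornado_events = np.random.binomial(1, chance_of_tornado, 1_000_000)

# Loop version: compare each day with the day before it
loop_count = 0
for j in range(1, len(tornado_events)):
    if tornado_events[j] == 1 and tornado_events[j - 1] == 1:
        loop_count += 1

# Vectorized version: line up "today" (from day 1) with "yesterday" (up to the second-to-last day)
today = tornado_events[1:]
yesterday = tornado_events[:-1]
vector_count = int(np.sum((today == 1) & (yesterday == 1)))

# Rough expectation under independence: one chance per adjacent pair of days
expected_count = (len(tornado_events) - 1) * chance_of_tornado ** 2

print(loop_count, vector_count, expected_count)
```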

Normal distribution

np.random.normal(mu, standard deviation, sample size) returns an array whose length equals the sample size.

    np.random.normal(1, 0.25, 1000)
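As a quick check of what the call returns, the sketch below draws the same sample and inspects its length, mean, and standard deviation. The seed and the variable name sample are assumptions made for this example.

```python
import numpy as np

np.random.seed(0)  # fixed seed, chosen here only so the run is repeatable

# 1000 draws from a normal distribution with mean 1 and standard deviation 0.25
sample = np.random.normal(1, 0.25, 1000)

sample_mean = sample.mean()
sample_std = np.std(sample)

print(len(sample))    # number of draws
print(sample_mean)    # should be close to 1
print(sample_std)     # should be close to 0.25
```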

Standard deviation

    np.std(distribution)

Here distribution is a list of numbers.
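One detail worth knowing: by default np.std divides by the number of values (the population formula). Passing ddof=1 divides by one less than that, which gives the sample standard deviation. The small list below is made up for illustration.

```python
import numpy as np

# Made-up data: mean is 5, squared deviations sum to 32
distribution = [2, 4, 4, 4, 5, 5, 7, 9]

# Default (ddof=0): sqrt(32 / 8) = 2.0
population_std = np.std(distribution)

# ddof=1: sqrt(32 / 7), slightly larger
sample_std = np.std(distribution, ddof=1)

print(population_std, sample_std)
```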
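
Kurtosis

    stats.kurtosis(distribution)

Here distribution is a list of numbers. By default stats.kurtosis returns excess kurtosis, so a normal distribution scores 0.
k > 0 means the distribution is more sharply peaked, with heavier tails, than a normal distribution
k = 0 means it matches a normal distribution
k < 0 means the distribution is less sharply peaked, with lighter tails, than a normal distribution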
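The three cases can be illustrated with samples whose excess kurtosis is known: about 0 for a normal distribution, -1.2 for a uniform distribution, and 3 for a Laplace distribution. The choice of these distributions, the seed, and the sample size are assumptions made for this sketch.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # fixed seed, chosen here only so the run is repeatable
n = 100_000

normal_sample = np.random.normal(0, 1, n)     # reference shape
uniform_sample = np.random.uniform(-1, 1, n)  # flat, light tails
laplace_sample = np.random.laplace(0, 1, n)   # sharp peak, heavy tails

k_normal = stats.kurtosis(normal_sample)    # close to 0
k_uniform = stats.kurtosis(uniform_sample)  # negative
k_laplace = stats.kurtosis(laplace_sample)  # positive

print(k_normal, k_uniform, k_laplace)
```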

Skewness

    stats.skew(distribution)

Here distribution is a list of numbers.
positive skewness means skewed to the right (the longer tail is on the right)
negative skewness means skewed to the left (the longer tail is on the left)
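To see the sign convention in practice, the sketch below uses an exponential sample, which has a long right tail, and its mirror image, which has a long left tail. The exponential distribution, the seed, and the variable names are choices made for this example.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # fixed seed, chosen here only so the run is repeatable

# Exponential data: most values are small, with a long tail to the right
right_tailed = np.random.exponential(1.0, 100_000)

# Mirroring the data flips the long tail to the left
left_tailed = -right_tailed

skew_right = stats.skew(right_tailed)  # positive
skew_left = stats.skew(left_tailed)    # negative, same magnitude

print(skew_right, skew_left)
```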

Chi-square distribution

np.random.chisquare(df, sample size)

The chi-square distribution is skewed to the right (positive skewness), but as the degrees of freedom increase the skewness shrinks and the shape approaches a normal distribution.

    chi_squared_df2 = np.random.chisquare(2, 10000)
    stats.skew(chi_squared_df2)
    1.9321803412553158
    chi_squared_df5 = np.random.chisquare(5, 10000)
    stats.skew(chi_squared_df5)
    1.282805101471107
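The theoretical skewness of a chi-square distribution is sqrt(8 / df), which gives 2.0 for df = 2 and about 1.26 for df = 5, in line with the sample values above. The sketch below compares sample and theoretical skewness for a few degrees of freedom; the seed, the sample size, the extra df = 50 case, and the results dictionary are assumptions made for this illustration.

```python
import numpy as np
from scipy import stats

np.random.seed(0)  # fixed seed, chosen here only so the run is repeatable

results = {}
for df in (2, 5, 50):
    sample = np.random.chisquare(df, 100_000)
    sample_skew = stats.skew(sample)   # skewness measured from the sample
    theory_skew = np.sqrt(8 / df)      # theoretical skewness of chi-square
    results[df] = (sample_skew, theory_skew)
    print(df, sample_skew, theory_skew)
```

The skewness stays positive but shrinks toward 0 as df grows, which is the drift toward a normal shape described above.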

t-test

    stats.ttest_ind(early['assignment1_grade'], late['assignment1_grade'])

Here early['assignment1_grade'] is a column in the early table.
Here late['assignment1_grade'] is a column in the late table.
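The notes do not include the early and late tables, so the sketch below builds two hypothetical DataFrames with made-up grades just to show the call and how to read its result. The grade values, group sizes, and seed are invented for illustration only.

```python
import numpy as np
import pandas as pd
from scipy import stats

np.random.seed(0)  # fixed seed, chosen here only so the run is repeatable

# Hypothetical stand-ins for the early and late tables (made-up grades)
early = pd.DataFrame({'assignment1_grade': np.random.normal(75, 10, 200)})
late = pd.DataFrame({'assignment1_grade': np.random.normal(74, 10, 200)})

# Independent two-sample t-test on the two grade columns
result = stats.ttest_ind(early['assignment1_grade'], late['assignment1_grade'])
t_stat, p_value = result.statistic, result.pvalue

print(t_stat, p_value)
```

A small p-value (for example below 0.05) would suggest the two group means differ; a large one means the test found no evidence of a difference.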
