NumPy, SciPy, Pandas, Quandl Cheat Sheet

Published by seanrwcrawford
A quick reference for data gathering and analysis using the Python packages: NumPy, SciPy, Pandas, and Quandl.

Published by: seanrwcrawford on Oct 01, 2013
Copyright: Attribution Non-commercial

 
NumPy / SciPy / Pandas Cheat Sheet

NumPy / SciPy

arr = array([])       Create a NumPy array.
arr.shape             Shape of an array.
convolve(a, b)        Linear convolution of two sequences.
arr.reshape()         Reshape array.
sum(arr)              Sum all elements of array.
mean(arr)             Compute mean of array.
std(arr)              Compute standard deviation of array.
dot(arr1, arr2)       Compute inner product of two arrays.
vectorize()           Turn a scalar function into one which accepts and returns vectors.

Pandas

Create Structures

s = Series(data, index)                           Create a Series.
df = DataFrame(data, index, columns)              Create a DataFrame.
p = Panel(data, items, major_axis, minor_axis)    Create a Panel.

DataFrame commands

df[col]                            Select column.
df.loc[label]                      Select row by label.
df.index                           Return DataFrame index.
df.drop()                          Delete given row or column. Pass axis=1 for columns.
df1 = df1.reindex_like(df2)        Reindex df1 with index of df2.
df.reset_index()                   Reset index, putting old index in column named index.
df.reindex()                       Change DataFrame index; rows for new index labels are set to NaN.
df.head(n)                         Show first n rows.
df.tail(n)                         Show last n rows.
df.sort_index()                    Sort index.
df.sort_index(axis=1)              Sort columns.
df.pivot(index, columns, values)   Pivot DataFrame, using the given columns as the new index, column labels, and values.
df.T                               Transpose DataFrame.
df.stack()                         Change lowest level of column labels into innermost row index.
df.unstack()                       Change innermost row index into lowest level of column labels.
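As a rough illustration of the DataFrame commands listed above, the sketch below builds a small frame and applies a few of them. The sample data, the variable names, and the `pd` import alias are assumptions made for the example; the cheat sheet itself writes the calls without a module prefix.

```python
import pandas as pd

# Hypothetical sample data, indexed by unsorted labels
df = pd.DataFrame(
    {"price": [10.0, 12.0, 11.0], "volume": [100, 150, 120]},
    index=["c", "a", "b"],
)

prices = df["price"]                     # select a column (returns a Series)
row_a = df.loc["a"]                      # select a row by its label
by_label = df.sort_index()               # rows reordered to a, b, c
no_volume = df.drop("volume", axis=1)    # axis=1 drops a column instead of a row
top_two = df.head(2)                     # first two rows, in stored order
transposed = df.T                        # rows become columns

# reindex keeps matching labels; the unknown label "z" becomes a row of NaN
expanded = df.reindex(["a", "b", "z"])

# reset_index moves the old index into a regular column named "index"
flat = df.reset_index()
```

Label-based selection uses `df.loc`; `df.iloc` selects by integer position instead.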
 
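The NumPy / SciPy entries above can be tried out with a short sketch like the following. The input values and the helper `clip_negative` are hypothetical, and the calls are written with the conventional `np.` prefix rather than the bare names used on the sheet.

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0, 4.0])      # create a NumPy array
shape = arr.shape                          # tuple describing the dimensions
grid = arr.reshape(2, 2)                   # same data arranged as 2 x 2
total = np.sum(arr)                        # sum of all elements
average = np.mean(arr)                     # arithmetic mean
spread = np.std(arr)                       # population standard deviation (ddof=0)
inner = np.dot(arr, arr)                   # inner product of two 1-D arrays
smoothed = np.convolve(arr, [0.5, 0.5])    # full linear convolution, length 4 + 2 - 1


def clip_negative(x):
    """Scalar function: return x, or 0.0 when x is not positive."""
    return x if x > 0 else 0.0


# vectorize wraps the scalar function so it accepts and returns arrays
clip_all = np.vectorize(clip_negative)
clipped = clip_all(np.array([-1.0, 2.0, -3.0]))
```

`np.vectorize` is a convenience wrapper around a Python loop, so it helps with readability rather than speed.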
Pandas Time Series

date_range(start, end, freq)    Create a time series index.

Freq has many options, including:

B    Business day
D    Calendar day
W    Weekly
M    Monthly
Q    Quarterly
A    Annual
H    Hourly
Any Structure with a datetime index
Groupby

groupby()         Split DataFrame by columns. Creates a GroupBy object (gb).
gb.agg()          Apply function (single or list) to a GroupBy object.
gb.transform()    Applies function and returns object with same index as one being grouped.
gb.filter()       Filter GroupBy object by a given function.
gb.groups         Return dict whose keys are the unique groups, and values are axis labels belonging to each group.
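A minimal sketch of the Groupby entries above, assuming a hypothetical frame with a `team` column to group on and a `score` column to aggregate:

```python
import pandas as pd

# Hypothetical data: two groups of different sizes
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "score": [1, 3, 2, 4, 6],
})

gb = df.groupby("team")                       # GroupBy object keyed on the team column

# agg with a list of functions gives one row per group, one column per function
totals = gb["score"].agg(["sum", "mean"])

# transform returns a result aligned with the original index (here: score minus group mean)
centered = gb["score"].transform(lambda s: s - s.mean())

# filter keeps the rows of every group for which the function returns True
big_groups = gb.filter(lambda g: len(g) > 2)

# groups maps each group key to the row labels that belong to it
members = gb.groups
```

Here `agg` shrinks the data to one row per group, while `transform` and `filter` return rows carrying the original index labels.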
I/O

df.to_csv('foo.csv')                                                  Save to CSV.
read_csv('foo.csv')                                                   Read CSV into DataFrame.
df.to_excel('foo.xlsx', sheet_name)                                   Save to Excel.
read_excel('foo.xlsx', 'sheet1', index_col=None, na_values=['NA'])    Read Excel into DataFrame.

DataFrame commands (continued)

df.dropna()      Drop rows where any data is missing.
df.count()       Return Series with the number of non-missing values in every column.
df.min()         Return minimum of every column.
df.max()         Return maximum of every column.
df.describe()    Generate various summary statistics for every column.
concat()         Concatenate DataFrame or Series objects.
df.applymap()    Apply function to every element in DataFrame.
df.apply()       Apply function along a given axis.
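The summary and I/O entries above might be exercised as in the sketch below. The sample frame is hypothetical, and an in-memory `io.StringIO` buffer stands in for the file `'foo.csv'` so the example does not touch the disk.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column "a"
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

complete = df.dropna()            # drops the row that contains NaN
counts = df.count()               # non-missing values per column
lows, highs = df.min(), df.max()  # column-wise minimum and maximum
summary = df.describe()           # count, mean, std, min, quartiles, max

# apply works column by column by default; axis=1 switches to row by row
col_range = df.apply(lambda col: col.max() - col.min())
row_sum = df.apply(lambda row: row.sum(), axis=1)   # sum() skips NaN

# concat places the rows of the second frame after those of the first
doubled = pd.concat([df, df])

# CSV round trip through a text buffer (assumption: replaces 'foo.csv')
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)
loaded = pd.read_csv(buffer, index_col=0)
```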
ts.resample()    Resample data with new frequency.
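To show how `date_range` and `resample` fit together, here is a small sketch. The dates and values are made up, and the example assumes a recent pandas release in which `resample` returns an object that still needs an aggregation such as `.mean()`.

```python
import numpy as np
import pandas as pd

# 14 consecutive calendar days ("D"), and the business days ("B") in the same span
idx = pd.date_range(start="2013-01-01", end="2013-01-14", freq="D")
bdays = pd.date_range(start="2013-01-01", end="2013-01-14", freq="B")

# Hypothetical daily series holding the values 0.0 through 13.0
ts = pd.Series(np.arange(14.0), index=idx)

# Resample to 7-day bins and average the values that fall in each bin
weekly_mean = ts.resample("7D").mean()
```

The business-day index skips weekends only; it has no knowledge of public holidays unless a custom calendar is supplied.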
 
Plotting

plot()                          Plot data or plot a function against a range.
subplot(nrows, ncols, index)    Create multiple plots; nrows - number of rows, ncols - number of columns, index - position of the current plot, counting from 1.
xlabel()                        Label the x-axis.
ylabel()                        Label the y-axis.
title()                         Title the graph.
xticks([], [])                  Set tick values for x-axis. First array for values, second for labels.
yticks([], [])                  Set tick values for y-axis. First array for values, second for labels.
ax = gca()                      Select current axis.
ax.spines[].set_color()         Change axis color, 'none' to remove.
ax.spines[].set_position()      Change axis position. Can change coordinate space.
legend(loc=' ')                 Create legend. Set to 'best' for auto placement.
savefig('foo.png')              Save plot.

Matplotlib is an extremely powerful module. See the Matplotlib documentation for complete details.

Quandl

Quandl is a search engine for numerical data, allowing easy access to financial, social, and demographic data from hundreds of sources.

The Quandl package enables Quandl access from within Python, which makes acquiring and manipulating numerical data as quick and easy as possible.

In your first Quandl function call you should specify your authtoken (found on Quandl's website after signing up) to avoid certain API call limits. See the Quandl documentation for more.
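The plotting calls listed above could be combined roughly as follows. This is only a sketch: the sine and cosine data are invented, the non-interactive `Agg` backend is selected so the code runs without a display, and the figure is saved to an in-memory buffer where the sheet would write `'foo.png'`.

```python
import io

import matplotlib
matplotlib.use("Agg")   # assumption: no display available, so render off-screen
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

plt.subplot(2, 1, 1)                      # 2 rows, 1 column, first plot
plt.plot(x, np.sin(x), label="sin")
plt.title("Sine and cosine")
plt.ylabel("sin(x)")
plt.legend(loc="best")                    # let Matplotlib choose the legend position

plt.subplot(2, 1, 2)                      # second plot in the same grid
plt.plot(x, np.cos(x), label="cos")
plt.xlabel("x")
plt.ylabel("cos(x)")
plt.xticks([0, np.pi, 2 * np.pi], ["0", "pi", "2 pi"])   # tick values, then labels

ax = plt.gca()                                    # axes of the second plot
ax.spines["top"].set_color("none")                # hide the top border
ax.spines["bottom"].set_position(("data", 0))     # move the x-axis line to y = 0

image = io.BytesIO()
plt.savefig(image)                        # savefig('foo.png') would write a file instead
```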
Pandas Time Series (continued)

ts.loc[start:end]    Return data for the time interval from start to end.
ts[]                 Return data for specific time.
ts.between_time()    Return data between two specific times of day.
to_pydatetime()      Convert Pandas DatetimeIndex to datetime.datetime objects.
to_datetime()        Convert a list of date-like objects (strings, epochs, etc.) to a DatetimeIndex.
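A brief sketch of the selection and conversion commands above, using three made-up timestamps. Label-based slicing is written with `.loc`, which current pandas releases use in place of the older `.ix` indexer.

```python
import pandas as pd

# to_datetime turns a list of date-like strings into a DatetimeIndex
idx = pd.to_datetime(["2013-10-01 09:00", "2013-10-01 15:00", "2013-10-02 09:00"])
ts = pd.Series([1.0, 2.0, 3.0], index=idx)   # hypothetical values

# Slicing with date strings keeps every entry that falls on those days
first_day = ts.loc["2013-10-01":"2013-10-01"]

# Indexing with an exact timestamp returns the single value stored there
one_point = ts[pd.Timestamp("2013-10-01 15:00")]

# between_time filters on the time of day, regardless of the date
mornings = ts.between_time("08:00", "10:00")

# to_pydatetime converts the index to an array of datetime.datetime objects
py_dates = idx.to_pydatetime()
```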
