Python Cheat Sheet For Excel Users

Python
Cheat Sheet
Python | Pandas
Data Analysis
Data Visualization
by Frank Andrade
Python Basics

Here you will find all the Python core concepts you need to know before learning any third-party library.

Variables
Variable assignment:
message_1 = "I'm learning Python"
message_2 = "and it's fun!"

String concatenation (+ operator):
message_1 + ' ' + message_2

String concatenation (f-string):
f'{message_1} {message_2}'

Creating a new list:
numbers = [4, 3, 10, 7, 1, 2]

Sorting a list:
>>> numbers.sort()
>>> numbers
[1, 2, 3, 4, 7, 10]
>>> numbers.sort(reverse=True)
>>> numbers
[10, 7, 4, 3, 2, 1]
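The sorting calls shown above change the list in place and return None, which is easy to miss when coming from a spreadsheet. The sketch below contrasts that with the built-in sorted(), which is not part of the cheat sheet itself and is included here only as a point of comparison.

```python
numbers = [4, 3, 10, 7, 1, 2]

# list.sort() reorders the existing list and returns None
result = numbers.sort()
print(result)
print(numbers)

# sorted() (not listed in the cheat sheet) returns a new list
# and leaves the original untouched
original = [4, 3, 10, 7, 1, 2]
descending = sorted(original, reverse=True)
print(descending)
print(original)
```

Use sort() when the original order is no longer needed, and sorted() when it should be kept.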
Update value on a list:
>>> numbers[0] = 1000
>>> numbers
[1000, 7, 4, 3, 2, 1]

Data Types
Integers (int): 1
Float (float): 1.2
String (str): "Hello World"
Boolean: True/False
List: [value1, value2]
Dictionary: {key1:value1, key2:value2, ...}

List
Creating a list:
countries = ['United States', 'India',
             'China', 'Brazil']

Copying a list:
new_list = countries[:]
new_list_2 = countries.copy()

Create an empty list:
my_list = []

Indexing:
>>> countries[0]
'United States'
>>> countries[3]
'Brazil'
>>> countries[-1]
'Brazil'

Slicing:
>>> countries[0:3]
['United States', 'India', 'China']
>>> countries[1:]
['India', 'China', 'Brazil']
>>> countries[:2]
['United States', 'India']

Adding elements to a list:
countries.append('Canada')
countries.insert(0, 'Canada')

Nested list:
nested_list = [countries, countries_2]

Remove element:
countries.remove('United States')
countries.pop(0)  # removes and returns value
del countries[0]

Built-in Functions
Print an object:
print("Hello World")

Return the length of x:
len(x)

Return the minimum value:
min(x)

Return the maximum value:
max(x)

Return a sequence of numbers:
range(x1, x2, n)  # from x1 up to, but not including, x2 (increments by n)

Convert x to a string:
str(x)

Convert x to an integer/float:
int(x)
float(x)

Convert x to a list:
list(x)

Numeric Operators
+    Addition
-    Subtraction
*    Multiplication
/    Division
**   Exponent
%    Modulus
//   Floor division

Comparison Operators
==   Equal to
!=   Not equal to
>    Greater than
<    Less than
>=   Greater than or equal to
<=   Less than or equal to

String methods
string.upper(): converts to uppercase
string.lower(): converts to lowercase
string.title(): converts to title case
string.count('l'): counts how many times "l" appears
string.find('h'): position of the first occurrence of "h"
string.replace('o', 'u'): replaces "o" with "u"
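To see what the string methods and range() listed above actually return, here is a small sketch. The sample text "hello world" and the range arguments are assumptions chosen for illustration; they do not come from the cheat sheet.

```python
string = "hello world"  # sample value, chosen for illustration

upper = string.upper()               # 'HELLO WORLD'
title = string.title()               # 'Hello World'
l_count = string.count('l')          # 3
h_pos = string.find('h')             # 0 (positions start at 0)
missing = string.find('z')           # -1 when the character is absent
replaced = string.replace('o', 'u')  # 'hellu wurld'

# range() stops before its second argument
steps = list(range(1, 10, 3))        # [1, 4, 7]

print(upper, title, l_count, h_pos, missing, replaced, steps)
```

None of these methods change the original string; each returns a new value.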
Dictionary
Creating a dictionary:
my_data = {'name':'Frank', 'age':26}

Create an empty dictionary:
my_dict = {}

Get value of key "name":
>>> my_data["name"]
'Frank'

Get the keys:
>>> my_data.keys()
dict_keys(['name', 'age'])

Get the values:
>>> my_data.values()
dict_values(['Frank', 26])

Get the key-value pairs:
>>> my_data.items()
dict_items([('name', 'Frank'), ('age', 26)])

Adding/updating items in a dictionary:
my_data['height']=1.7
my_data.update({'height':1.8,
                'languages':['English', 'Spanish']})
>>> my_data
{'name': 'Frank',
 'age': 26,
 'height': 1.8,
 'languages': ['English', 'Spanish']}

Remove an item:
my_data.pop('height')
del my_data['languages']
my_data.clear()  # removes all items

Copying a dictionary:
new_dict = my_data.copy()

If Statement
Conditional test:
if <condition>:
    <code>
elif <condition>:
    <code>
...
else:
    <code>

Example:
if age>=18:
    print("You're an adult!")

Conditional test with list:
if <value> in <list>:
    <code>

Functions
Create a function:
def function(<params>):
    <code>
    return <data>

Modules
Import module:
import module
module.method()

OS module:
import os
os.getcwd()
os.listdir()
os.makedirs(<path>)

Special Characters
#    Comment
\n   New Line

Boolean Operators
and   logical AND
or    logical OR
not   logical NOT

Boolean Operators (Pandas)
&   logical AND
|   logical OR
~   logical NOT

Loops
For loop:
for <variable> in <list>:
    <code>

For loop and enumerate list elements:
for i, element in enumerate(<list>):
    <code>

For loop and obtain dictionary elements:
for key, value in my_dict.items():
    <code>

While loop:
while <condition>:
    <code>

Loop control statement:
break: stops loop execution
continue: jumps to next iteration
pass: does nothing

Data Validation
Try-except:
try:
    <code>
except <error>:
    <code>

Below are my guides, tutorials and complete Python courses:
- Medium Guides
- YouTube Tutorials
- Udemy Courses
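The loop, loop-control and try-except templates on this page can be combined. The following is a minimal sketch, assuming a hypothetical list raw_values of text entries (not from the cheat sheet) that should be converted to integers until the word 'stop' appears.

```python
my_data = {'name': 'Frank', 'age': 26}

# Hypothetical input, invented for illustration
raw_values = ['10', 'abc', '7', 'stop', '3']

converted = []
skipped_positions = []
for i, element in enumerate(raw_values):
    if element == 'stop':
        break                        # stops loop execution
    try:
        converted.append(int(element))
    except ValueError:
        skipped_positions.append(i)  # remember where conversion failed
        continue                     # jumps to next iteration

print(converted)          # [10, 7]
print(skipped_positions)  # [1]

# For loop over dictionary elements
pairs = []
for key, value in my_data.items():
    pairs.append(f'{key}={value}')
print(pairs)              # ['name=Frank', 'age=26']
```

Catching the specific ValueError, rather than every error, keeps unrelated problems visible.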
Made by Frank Andrade frank-andrade.medium.com
Pandas Cheat Sheet

Pandas provides data analysis tools for Python. All of the following code examples refer to the dataframe below.

df =
       col1  col2
   A      1     4
   B      2     5
   C      3     6

(axis 0: rows, axis 1: columns)

Getting Started
Import pandas:
import pandas as pd

Create a series:
s = pd.Series([1, 2, 3],
              index=['A', 'B', 'C'],
              name='col1')

Create a dataframe:
data = [[1, 4], [2, 5], [3, 6]]
index = ['A', 'B', 'C']
df = pd.DataFrame(data, index=index,
                  columns=['col1', 'col2'])

Read a csv file with pandas:
df = pd.read_csv('filename.csv')

Advanced parameters:
df = pd.read_csv('filename.csv', sep=',',
                 names=['col1', 'col2'],
                 index_col=0,
                 encoding='utf-8',
                 nrows=3)

Selecting rows and columns
Select single column:
df['col1']

Select multiple columns:
df[['col1', 'col2']]

Show first n rows:
df.head(2)

Show last n rows:
df.tail(2)

Select rows by index values:
df.loc['A']
df.loc[['A', 'B']]

Select rows by position:
df.iloc[1]
df.iloc[1:]

Data wrangling
Filter by value:
df[df['col1'] > 1]

Sort by one column:
df.sort_values('col1')

Sort by columns:
df.sort_values(['col1', 'col2'],
               ascending=[False, True])

Identify duplicate rows:
df.duplicated()

Identify unique values in a column:
df['col1'].unique()

Swap rows and columns:
df = df.transpose()
df = df.T

Drop a column:
df = df.drop('col1', axis=1)

Clone a data frame:
clone = df.copy()

Connect multiple data frames vertically:
df2 = df + 5  # new dataframe
pd.concat([df,df2])

Merge multiple data frames horizontally:
df3 = pd.DataFrame([[1, 7],[8,9]],
                   index=['B', 'D'],
                   columns=['col1', 'col3'])
# df3: new dataframe

Only merge complete rows (INNER JOIN):
df.merge(df3)

Left data frame stays complete (LEFT OUTER JOIN):
df.merge(df3, how='left')

Right data frame stays complete (RIGHT OUTER JOIN):
df.merge(df3, how='right')

Preserve all values (OUTER JOIN):
df.merge(df3, how='outer')

Merge rows by index:
df.merge(df3,left_index=True,
         right_index=True)

Fill NaN values:
df.fillna(0)

Apply your own function:
def func(x):
    return 2**x
df.apply(func)

Arithmetics and statistics
Add to all values:
df + 10

Sum over columns:
df.sum()

Cumulative sum over columns:
df.cumsum()

Mean over columns:
df.mean()

Standard deviation over columns:
df.std()

Count unique values:
df['col1'].value_counts()

Summarize descriptive statistics:
df.describe()
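The merge variants listed on this page differ mainly in which rows survive. The sketch below rebuilds df and df3 exactly as defined in the cheat sheet and runs each variant so the row counts can be compared. The variable names inner, left, right, outer and by_index are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 4], [2, 5], [3, 6]],
                  index=['A', 'B', 'C'],
                  columns=['col1', 'col2'])
df3 = pd.DataFrame([[1, 7], [8, 9]],
                   index=['B', 'D'],
                   columns=['col1', 'col3'])

# Without further arguments, merge joins on the shared column 'col1'
inner = df.merge(df3)                # only col1 == 1 appears in both
left = df.merge(df3, how='left')     # all 3 rows of df, NaN where df3 has no match
right = df.merge(df3, how='right')   # both rows of df3
outer = df.merge(df3, how='outer')   # every col1 value from either side

# Joining on the index instead: only label 'B' is shared, and the
# overlapping 'col1' columns get the suffixes _x and _y
by_index = df.merge(df3, left_index=True, right_index=True)

for name, frame in [('inner', inner), ('left', left), ('right', right),
                    ('outer', outer), ('by_index', by_index)]:
    print(name, frame.shape)
```

Column-based merges return a fresh integer index, so the labels A, B, C are kept only in the index-based variant.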

Hierarchical indexing
Create hierarchical index:
df.stack()

Dissolve hierarchical index:
df.unstack()

Aggregation
Create group object:
g = df.groupby('col1')

Iterate over groups:
for i, group in g:
    print(i, group)

Aggregate groups:
g.sum()
g.prod()
g.mean()
g.std()
g.describe()

Select columns from groups:
g['col2'].sum()
g[['col2', 'col3']].sum()

Transform values:
import numpy as np
g.transform(np.log)

Apply a list function on each group:
def strsum(group):
    return ''.join([str(x) for x in group.values])
g['col2'].apply(strsum)

Data export
Data as NumPy array:
df.values

Save data as CSV file:
df.to_csv('output.csv', sep=",")

Format a dataframe as tabular string:
df.to_string()

Convert a dataframe to a dictionary:
df.to_dict()

Save a dataframe as an Excel table:
df.to_excel('output.xlsx')

Pivot and Pivot Table
Read csv file:
df_gdp = pd.read_csv('gdp.csv')

The pivot() method:
df_gdp.pivot(index="year",
             columns="country",
             values="gdppc")

Read Excel file:
df_sales = pd.read_excel(
    'supermarket_sales.xlsx')

Make pivot table:
df_sales.pivot_table(index='Gender',
                     aggfunc='sum')

Make a pivot table that shows how much males and females spend in each category:
df_sales.pivot_table(index='Gender',
                     columns='Product line',
                     values='Total',
                     aggfunc='sum')

Visualization
The plots below are made with a dataframe with the shape of df_gdp (pivot() method).

Import matplotlib:
import matplotlib.pyplot as plt

Start a new diagram:
plt.figure()

Scatter plot:
df.plot(kind='scatter', x='col1', y='col2')

Bar plot:
df.plot(kind='bar',
        xlabel='data1',
        ylabel='data2')

Lineplot:
df.plot(kind='line',
        figsize=(8,4))

Boxplot:
df['col1'].plot(kind='box')

Histogram over one column:
df['col1'].plot(kind='hist',
                bins=3)

Piechart:
df.plot(kind='pie',
        y='col1',
        title='Population')

Set tick marks:
labels = ['A', 'B', 'C', 'D']
positions = [1, 2, 3, 4]
plt.xticks(positions, labels)
plt.yticks(positions, labels)

Label diagram and axes:
plt.title('Correlation')
plt.xlabel('Nunstück')
plt.ylabel('Slotermeyer')

Save most recent diagram:
plt.savefig('plot.png')
plt.savefig('plot.png',dpi=300)
plt.savefig('plot.svg')
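For Excel users, the pivot_table call in the Pivot and Pivot Table section is probably the closest match to a spreadsheet pivot table. The sketch below can be run without the spreadsheet: the column names 'Gender', 'Product line' and 'Total' come from the cheat sheet, while the rows are invented placeholder values standing in for supermarket_sales.xlsx.

```python
import pandas as pd

# Hypothetical stand-in for supermarket_sales.xlsx; values are invented
df_sales = pd.DataFrame({
    'Gender': ['Female', 'Male', 'Female', 'Male', 'Female'],
    'Product line': ['Food', 'Food', 'Fashion', 'Fashion', 'Food'],
    'Total': [10.0, 20.0, 30.0, 40.0, 5.0],
})

# Rows: Gender, columns: Product line, cells: sum of Total
table = df_sales.pivot_table(index='Gender',
                             columns='Product line',
                             values='Total',
                             aggfunc='sum')
print(table)

# The same totals per gender, computed with groupby
totals = df_sales.groupby('Gender')['Total'].sum()
print(totals)
```

Each cell of table is the sum of 'Total' for one gender and one product line, in the same way a spreadsheet pivot table fills its value area.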
