
Data Science with Python Workflow
If you are an R user and want to learn Python, join our waitlist:
R/Python Teams Course Waitlist.

Click the links for


CS = Cheat Sheet

Import
Pandas - data structures, I/O tools

Tidy
Pandas - group by, joins & merge, reshape (pivot)

Transform
Pandas (CS) - time series, categorical, missing
NumPy

Visualize
matplotlib
seaborn
plotnine
plotly (CS)

Model
Scikit-Learn
Statsmodels
TensorFlow
Keras

Communicate
JupyterLab
Dash
Streamlit
Flask

Jupyter | PyCharm | VS Code
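
A minimal sketch of the Pandas tidy operations named in the workflow (group by, joins & merge, reshape via pivot). The `orders` and `customers` tables, their column names, and their values are hypothetical, made up only for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100.0, 150.0, 80.0, 120.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2],
    "region": ["East", "West"],
})

# Group by: total revenue per customer.
totals = orders.groupby("customer_id", as_index=False)["revenue"].sum()

# Joins & merge: attach each customer's region to its orders.
merged = orders.merge(customers, on="customer_id", how="left")

# Reshape (pivot): one row per region, one column per quarter.
wide = merged.pivot(index="region", columns="quarter", values="revenue")

print(totals)
print(wide)
```

A left merge keeps every order row even if a customer is missing from the lookup table. `pivot` requires each region/quarter pair to be unique; for data with duplicate pairs, `pivot_table` with an aggregation function would be the usual alternative.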

Important Resources
Anaconda Distribution:
Python Documentation:
Python Standard Library:

Business Science University

Join the R/Python Teams Course Waitlist

version: 1.0
Data Science with Python Special Topics

Text Analysis & NLP
NLTK - Text Tokenization & Modeling
spaCy - NLP using Cython for Speed
fuzzywuzzy - Fuzzy String Matching

Machine Learning
Scikit-Learn - ML in Python
H2O - Scalable & AutoML
TPOT - Automated ML Tool
PyCaret - Low Code ML
Dask ML - Scalable ML with Dask
ML Packages: CatBoost

Time Series Forecasting
sktime - Scikit-Learn Extension for Time Series
statsmodels - Time Series Analysis
GluonTS - MXNet/Gluon Deep Learning for Time Series

Time Series Features
TSFresh - Time Series Feature Engineering
tslearn - Time Series Features
Pandas Time Series
Arrow - Human-Friendly Time

Recommendation Systems
Annoy - Approximate Nearest Neighbors
LightFM - Popular Recommendation Algorithms

Feature Engineering
Featuretools - Automated Feature Engineering
sklearn-pandas - Sklearn Extension for Pandas
category_encoders - Categorical Encoding
imbalanced-learn - Resampling for Imbalanced Data

Apps & APIs
FastAPI - Web Framework for Building APIs in Python
Flask - Web Development
Dash & Streamlit - DS Web Frameworks

Deep Learning
TensorFlow & Keras
PyTorch
MXNet, Gluon, & GluonTS
OpenAI Gym - Reinforcement Learning

Web
beautifulsoup - Extract Data from HTML
requests-html - HTML Parsing
scrapy - Web Crawling

Image & Comp Vision
OpenCV - Open Source Computer Vision
Scikit Image - Image Processing
Pillow - Python Imaging Library

MLOps
MLFlow - Machine Learning Lifecycle, Tracking
MetaFlow - Scalable AWS Jobs for Data

Speed & Scale
datatable - C++ Speed Up
Dask (CS) - Parallel Pandas
RAPIDS (CS) - GPU Accelerated Pandas
PySpark - Spark Clusters
Optimus - PySpark Extension for Humans

Cloud
boto3 (AWS) - AWS Python SDK
Google Cloud - GCP Python SDK
Azure - Azure Python SDK

MS Office & PDF
XlsxWriter - Create Excel Workbooks
pyexcel - Read/Write Excel
xlwings - Call Python from Excel
python-docx - Word Documents
python-pptx - PowerPoint Documents
pdfminer - Text Extraction from PDF
textract - Extract Text from Any Document
PyPDF2 - Create PDF Documents
gspread - Google Sheets

ETL & Automations
Airflow - Workflow Scheduling & Monitoring
Luigi - Batch Job Tool, Scheduling, Monitoring
Ansible - Deployment Automation
JobLib - Run Python Jobs

Coming from R?
R-to-Pandas Comparison
siuba & plydata - dplyr/tidyr ports
datatable - data.table port
plotnine - ggplot2 port
