
Introduction

Installations etc.

Scikit-learn
• Scikit-learn is a Python library that provides many unsupervised and
supervised learning algorithms. It is built upon NumPy, SciPy, and Matplotlib
• Specialized in “machine learning” (as opposed to “deep learning”)
1. Regression, including Linear and Logistic Regression, SVM, Trees, NN etc.
2. Classification, including K-Nearest Neighbors
3. Clustering, including K-Means (with K-Means++ initialization)
4. Model validation, cross-validation, leave-one-out, stratified, shuffle-split
5. Model improvement, including hyperparameter optimization via grid search
6. Data preprocessing, including Min-Max Normalization
7. Feature engineering, including iterative or model-based selection, binning,
one-hot encoding, interactions, principal component analysis, etc.
8. Model combination, pipelines
9. Prepared data sets, real and artificial
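
Several items in the list above (prepared data sets, Min-Max normalization, K-Nearest Neighbors, pipelines, stratified cross-validation, and grid search) can be combined in a few lines. The following is only a minimal sketch: the choice of the iris data set, the candidate values for `n_neighbors`, and the step names `scale` and `knn` are assumptions made for illustration and are not taken from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Prepared data set shipped with scikit-learn (illustrative choice).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Pipeline: Min-Max normalization followed by a K-Nearest Neighbors classifier.
pipe = Pipeline([
    ("scale", MinMaxScaler()),
    ("knn", KNeighborsClassifier()),
])

# Grid search over the number of neighbors, scored by stratified 5-fold CV.
param_grid = {"knn__n_neighbors": [1, 3, 5, 7, 9]}  # assumed candidate values
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv)
search.fit(X_train, y_train)

# Accuracy of the best pipeline on the held-out test split.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Putting the scaler inside the pipeline means it is refit on each training fold during cross-validation, so no information from the validation fold leaks into the preprocessing step.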

NumPy, pandas, and Matplotlib
• NumpyPandasMatplotlib.pdf

Cheat Sheets
1. CheatSheetImportingDataDataCamp.pdf
2. CheatSheetPandasBasicsDataCamp.pdf
3. CheatSheetPandasDataFrameMarkGraph.pdf
4. CheatSheetPandasDataquest.pdf
5. CheatSheetPandasDataWrangling1&2Princeton.pdf
6. CheatSheetPythonJupyterNotebookDataCamp.pdf
7. CheatSheetSeabornDataCamp.pdf
8. CheatSheetConda.pdf

Pandas and Machine Learning
• Rob Hicks's Pandas Primer
• http://rlhick.people.wm.edu/stories/python-pandas-primer.html

• List of the best cheat sheets
• https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463

References
Main references:
1. Data Science from Scratch: First Principles with Python by Joel Grus
2. Michael Nielsen’s book: http://neuralnetworksanddeeplearning.com/chap1.html
3. Python Data Science Handbook by Jake VanderPlas
Secondary references (financial indicators):
4. Evidence-Based Technical Analysis by David Aronson
5. Statistically Sound Machine Learning for Algorithmic Trading of
Financial Instruments: Developing Predictive-Model-Based Trading
Systems Using TSSB by David Aronson and Timothy Masters:
http://www.evidencebasedta.com/

Technical Indicators and TA-Lib
1. https://www.kaggle.com/kratisaxena/lstm-gru-models-for-stock-movement-analysis
2. https://www.quantopian.com/posts/technical-analysis-indicators-without-talib-code
3. https://cryptotrader.org/talib
4. http://tadoc.org/
5. https://stackoverflow.com/questions/59745818/how-talib-is-detecting-patterns/59756160#59756160

Keras
• Keras is an open-source neural network library written in Python. It is
capable of running on top of TensorFlow, Microsoft Cognitive Toolkit,
or Theano
• We will be using TensorFlow
• It works directly with NumPy arrays and pairs well with pandas and Matplotlib
• An interface that offers a higher-level, more intuitive set of abstractions
that make it easy to develop deep learning models regardless of the
computational backend used
• Specialized in “deep learning”:
1. Deep Neural Networks
2. Recurrent Neural Networks
3. Convolutional Neural Networks
4. LSTMs
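
To give a feel for the higher-level abstractions described above, here is a minimal sketch of a small dense network defined through the Keras interface on the TensorFlow backend. The synthetic data, the layer sizes, and the training settings are all assumptions chosen for illustration; they are not taken from the slides.

```python
import numpy as np
from tensorflow import keras

# Synthetic data (assumed for illustration): 64 samples with 4 features;
# the label is 1 when the feature sum is positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# A small fully connected network for binary classification.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Compile with a standard optimizer and loss, then train briefly.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit(X, y, epochs=3, batch_size=16, verbose=0)

# Predicted probabilities, one per sample.
probs = model.predict(X, verbose=0)
print(probs.shape)
```

The same define, compile, fit, predict pattern carries over to recurrent and convolutional models by swapping in the corresponding layer types.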

Anaconda Installation Instructions
• AnacondaInstallationInstructions.txt

Online Platforms for Machine Learning and Deep Learning
• Kaggle
• https://www.kaggle.com/
• Google Colab
• https://colab.research.google.com/notebooks/welcome.ipynb#recent=true
• Amazon Web Services
• https://machinelearningmastery.com/develop-evaluate-large-deep-learning-models-keras-amazon-web-services/

WRDS

https://wrds-www.wharton.upenn.edu/
