
!"#$%"$&'()*($+$,-../
+-*"0)*(1$+2$3)4(1
%')1567$8/4*(0$95*:
;/*:"6
!"#$%&$'"#()*&+',%-(.'%(%/'(#0%01(%*02"(%/'(3&#'41()40,'(%/'
&*#'*1(.'%("&%252'#

678é"(6&3'*& 9(#0:(0.& · 9;(32"(*'0#


Photo by Nicholas Cappello on Unsplash

A couple of weeks ago I was casually chatting with a friend,
masks on, social distance, the usual stuff. He was telling me
how he was trying to, and I quote, detox from the broker app he
was using. I asked him about the meaning of the word detox in
this particular context, worrying that he might go broke, but
nah: he told me that he was constantly trading. “If a particular
stock has been going up for more than one hour or so and I’m
already over the 1% profit threshold then I sell”, he said,
“among other personal rules I’ve been following”. Leaving aside
the slight pseudoscientific aspect of those rules, I understood
what he meant by detox: following them implied checking the
phone an astronomically high number of times.

So I started wondering: would it be possible to automate the set
of rules this guy has in mind? And actually — would it be
possible to automate a saner set of rules, so I let the system do
the trading for me? Since you’re reading this I assume you got
caught by the title, so you’ve probably already guessed that the
answer is yes. Let’s elaborate on that, but first of all: time is gold
and I don’t want to clickbait anyone. This is what we’re going to
do:
1. Get some real-time, granular stock price data: ideally, in
one minute intervals. The richer the better — we’re going to
use Yahoo! Finance for that, more details to follow.

2. Instead of a personal set of rules, we are going to add some
AI flavour to the system. Full disclosure: I’m by no means an
expert in time series analysis, there are already lots of
tutorials out there about how to train neural networks to
trade, and I don’t really want to overengineer a toy system
like this, so let’s keep it simple: a very basic ARIMA model
will do for now.

3. At this point we’ll have the data and the prediction coming
from the algorithm, so we should be able to decide whether
to sell, buy or hold; we need to connect with our broker to
actually perform the action. We are going to use RobinHood
and Alpaca.

4. That’s pretty much it — the system is finished. The last thing
we need is to deploy it somewhere, in our case AWS, and
monitor the activity. I’ve chosen to send a Telegram message
to a group every time an action is performed by my system.

And what are we going to need?

Python 3.6 with some libraries.

An AWS account with admin rights, for storage and deployment.

Node.js, just to set up the serverless framework for deployment.
A Telegram account, for monitoring.

Everything I’ve coded is available here. Okay! So, without
further ado, let’s go for the first part: getting the data.

!"##$%&'#("')*#*
Getting the data is not easy. Some years ago there was an
official Yahoo! Finance API, as well as alternatives like Google
Finance — sadly, both have been discontinued for years now.
But don’t worry, there are still plenty of alternatives on the
market. My personal requirements were:

Free of charge: for a production system I would definitely
settle for a cheap alternative, but for a toy system, proof of
concept, or whatever you want to call it, I want it free.

High rate limit: ideally no limit, but anything above 500-ish
hits per minute is more than enough.

Real-time data: some APIs provide data with a slight delay,
let’s say 15 minutes. I want the real deal — the closest I can
get to the real-time price of the stock.

Ease of use: again — this is just a POC. I want the easiest one.

With that list in mind, I went for yfinance — the unofficial
alternative to the old Yahoo! Finance API. Bear in mind that for
a real system, and based on the awesome list provided by
Patrick Collins, I would definitely choose the Alpha Vantage API
— but let’s keep it simple for now.

The yfinance library was developed by Ran Aroussi to get
access to the Yahoo! Finance data when the official API was
shut down. Quoting from the GitHub repository,

Ever since Yahoo! Finance decommissioned their historical data
API, many programs that relied on it stopped working.

yfinance aims to solve this problem by offering a reliable,
threaded, and Pythonic way to download historical market data
from Yahoo! Finance.

Sweet, good enough for me. How does it work? First we need to
install it:

$ pip install yfinance --user

And then we can access everything using the Ticker object:

import yfinance as yf

google = yf.Ticker("GOOG")

That method is quite fast, slightly above 0.005 seconds on
average, and returns LOTS of info about the stock; for instance,
google.info contains 123 fields, including the following:

52WeekChange: 0.3531152
SandP52WeekChange: 0.17859101
address1: 1600 Amphitheatre Parkway
algorithm: None
annualHoldingsTurnover: None
annualReportExpenseRatio: None
ask: 1815
askSize: 1100

...

twoHundredDayAverage: 1553.0764
volume: 1320946
volume24Hr: None
volumeAllCurrencies: None
website: http://www.abc.xyz
yield: None
ytdReturn: None
zip: 94043
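Since google.info is just a Python dictionary, individual fields can be read directly by key. A minimal sketch (the field names below are taken straight from the dump above):

import yfinance as yf

google = yf.Ticker("GOOG")
info = google.info  # plain dict with ~123 fields

# Pick out a few of the fields shown above
print(info['ask'])                   # current ask price
print(info['volume'])                # today's traded volume
print(info['twoHundredDayAverage'])  # 200-day moving average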

There is more info available through several methods:
dividends , splits , balance_sheet or earnings , among others.
Most of these methods return the data in a pandas DataFrame
object, so we’ll need to play with it a bit to get whatever we
want. For now I just need the stock price over time; the history
method is the best one for that purpose. We can select either the
period or the start and end dates, as well as the frequency of the
data down to one minute — note that intraday information is only
available if the period is less than 60 days, and that only 7 days’
worth of 1m-granularity data can be fetched per request. The data
for the last day with a 1m interval looks as follows:
df = google.history(period='1d', interval="1m")
print(df.head())

Dataframe of Google historical stock price (Image by Author)

We can see how it’s indexed by the datetime and every entry
has seven features: four fixed points of the stock price during
that minute (open, high, low and close) plus the volume,
dividends and stock splits. I’m going to use just the low, so let’s
keep that data:

df = google.history(period='1d', interval="1m")
df = df[['Low']]
df.head()

(Image by Author)
Finally, since we’re going to use the data just for the last day,
let’s reindex the dataframe to remove the date and timezone
components and keep just the time one:

import pandas as pd

df['date'] = pd.to_datetime(df.index).time
df.set_index('date', inplace=True)
df.head()

(Image by Author)

Looking good! We already know how to fetch the latest info
from yfinance — we’ll later feed our algorithm with this. But
for that, we need an algorithm to feed: let’s go for the next part.

Adding the AI
I said it before but I’ll say this again: don’t try this at home.
What I’m going to do here is fitting a VERY simple ARIMA
model to forecast the next value of the stock price; think of it as
a dummy model. If you want to use this for real trading, I’d
recommend looking for better and stronger models, but be
aware: if it were easy, everyone would do it.
First let’s split the dataframe into train and test, so we can use
the test set to validate the results of the dummy model — I’m
going to keep the last 10% of the data as the test set:

X = df.index.values
y = df['Low'].values

# The split point is at 10% of the dataframe length
offset = int(0.10*len(df))

X_train = X[:-offset]
y_train = y[:-offset]
X_test = X[-offset:]
y_test = y[-offset:]

If we plot it, we get:

import matplotlib.pyplot as plt

plt.plot(range(0, len(y_train)), y_train, label='Train')
plt.plot(range(len(y_train), len(y)), y_test, label='Test')
plt.legend()
plt.show()
(Image by Author)

Now let’s fit the model with the training data and get the
forecast. Note that the hyperparameters of the model are fixed,
whereas in the real world you should use cross-validation to get
the optimal ones — check out this awesome tutorial about How
To Grid Search ARIMA Hyperparameters With Python (a quick
sketch of such a search follows right after the fit below). I’m
using a (5, 0, 1) configuration and getting the forecast for the
moment immediately after the training data ends:

from statsmodels.tsa.arima.model import ARIMA

model = ARIMA(y_train, order=(5,0,1)).fit()
forecast = model.forecast(steps=1)[0]
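As promised, here is a minimal sketch of what such a hyperparameter search could look like, reusing the train/test split from before. The candidate grid and the mean-absolute-error criterion are arbitrary assumptions, not the tutorial’s exact recipe:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

best_order, best_mae = None, float('inf')

# Small, arbitrary candidate grid: expand as needed
for p in (1, 2, 5):
    for d in (0, 1):
        for q in (0, 1):
            try:
                fitted = ARIMA(y_train, order=(p, d, q)).fit()
                preds = fitted.forecast(steps=len(y_test))
                mae = np.mean(np.abs(preds - y_test))
                if mae < best_mae:
                    best_order, best_mae = (p, d, q), mae
            except Exception:
                continue  # some configurations fail to converge

print(f'Best order: {best_order} (MAE={best_mae:.4f})')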

Let’s see how well our dummy model performed:

print(f'Real data for time 0: {y_train[-1]}')
print(f'Real data for time 1: {y_test[0]}')
print(f'Pred data for time 1: {forecast}')

---

Real data for time 0: 1776.3199462890625
Real data for time 1: 1776.4000244140625
Pred data for time 1: 1776.392609828666

That’s not bad — we can work with it. With this info we can
define a set of rules based on whatever we want to do, like
holding if it’s going up or selling if it’s going down. I’m not
going to elaborate on this part because I don’t want y’all to sue
me saying you lost all your money, so please go ahead and
define your own set of rules :) A minimal illustrative sketch
follows below; after that, I’m going to explain the next part:
connecting to the broker.
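Purely as an illustration (not advice), a rule could compare the forecast with the last observed price and only act when the expected move crosses some threshold; the 0.1% threshold below is an arbitrary assumption:

def decide(last_price, forecast, threshold=0.001):
    # Toy decision rule: act only if the predicted move exceeds
    # `threshold`, expressed as a fraction of the last price.
    expected_change = (forecast - last_price) / last_price
    if expected_change > threshold:
        return 'buy'
    if expected_change < -threshold:
        return 'sell'
    return 'hold'

action = decide(y_train[-1], forecast)
print(action)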

Connecting to the broker
As you probably have guessed, this part highly depends on the
broker you’re using. I’m covering two brokers here, RobinHood
and Alpaca; the reason is that both of them:

Have a public API (official or not) available.

Do not charge commissions for trading.

Depending on the type of your account you might have some
limits: for instance, RobinHood allows just 3 trades over a 5-day
period if your account balance is below $25,000; Alpaca allows
far more requests but still has a limit of 200 requests per minute
per API key.

RobinHood
There are several libraries that wrap the RobinHood API, but
sadly, as far as I know, none of them is official. Sanko’s library
was the biggest one, with 1.5k stars on GitHub, but it has been
discontinued; LichAmnesia’s has continued Sanko’s path, but
has just 99 stars so far. I’m going to use the robin_stocks library,
which has a little over 670 stars at the moment of writing this.
Let’s install it:
$ pip install robin_stocks

Not all actions require login, but most of them do, so it’s useful
to log in before doing anything else. RobinHood requires MFA,
so it’s necessary to set it up: go to your account, turn on
two-factor authentication and select “other” when asked about the
app you want to use. You will be presented with an
alphanumeric code, which you will use in the code below:

import pyotp
import robin_stocks as rh

RH_USER_EMAIL = <<<YOUR EMAIL GOES HERE>>>
RH_PASSWORD = <<<YOUR PASSWORD GOES HERE>>>
RH_MFA_CODE = <<<THE ALPHANUMERIC CODE GOES HERE>>>

# Generate the current time-based one-time password from the MFA code
totp = pyotp.TOTP(RH_MFA_CODE).now()
login = rh.login(RH_USER_EMAIL, RH_PASSWORD, mfa_code=totp)

Buying or selling is pretty easy:

# Buying 5 shares of Google
rh.order_buy_market('GOOG', 5)

# Selling 5 shares of Google
rh.order_sell_market('GOOG', 5)

Check the docs for advanced usage and examples.


Alpaca
For Alpaca we are going to use the alpaca-trade-api library,
which has over 700 stars on GitHub. To install:

$ pip install alpaca-trade-api

After signing in to your account you’ll get an API key ID and a
secret key; both are needed to log in:

import alpaca_trade_api as alpaca

ALPACA_KEY_ID = <<<YOUR KEY ID GOES HERE>>>
ALPACA_SECRET_KEY = <<<YOUR SECRET KEY GOES HERE>>>

# Change to https://api.alpaca.markets for live
BASE_URL = 'https://paper-api.alpaca.markets'

api = alpaca.REST(
    ALPACA_KEY_ID, ALPACA_SECRET_KEY, base_url=BASE_URL)

Submitting orders is slightly more complex than with
RobinHood:

# Buying 5 shares of Google
api.submit_order(
    symbol='GOOG',
    qty='5',
    side='buy',
    type='market',
    time_in_force='day'
)

# Selling 5 shares of Google
api.submit_order(
    symbol='GOOG',
    qty='5',
    side='sell',
    type='market',
    time_in_force='day'
)
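Both brokers end up exposing the same primitive (a market order for N shares), so it can be handy to hide the difference behind a tiny wrapper and keep the rest of the system broker-agnostic. A sketch, assuming the rh login or the api object has already been created exactly as above:

def place_order(side, symbol, qty, broker='alpaca'):
    # Route a market order to the chosen broker.
    # `side` is 'buy' or 'sell'; assumes rh/api are already set up.
    if broker == 'robinhood':
        if side == 'buy':
            return rh.order_buy_market(symbol, qty)
        return rh.order_sell_market(symbol, qty)
    # Default: Alpaca
    return api.submit_order(
        symbol=symbol,
        qty=str(qty),
        side=side,
        type='market',
        time_in_force='day'
    )

# e.g. place_order('buy', 'GOOG', 5)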

That’s it! Note that leaving your credentials in plain text is a
very, VERY bad thing to do — don’t worry though, in the next step
we’ll switch to environment variables, which are far safer (the
basic pattern is sketched below). Then let’s deploy everything to
the cloud and monitor it.

7"65.8'*%)'9.%$#.1$%&
We are going to deploy everything to AWS Lambda. This
wouldn’t be the best option for a production system, obviously,
since Lambda does not have persistent storage and we would want
to store the trained model somewhere, for instance in S3.
However, this will do for now — we’ll schedule the Lambda to
run daily, training the model every time with the data from the
current day. For monitoring purposes we’ll set up a Telegram
bot that will send a message with the action to be taken and its
outcome. Note that AWS Lambda is free up to a certain limit,
but be aware of the quotas in case you want to send lots of
messages.
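If you did want to persist the trained model between runs, a common approach is to pickle it and push it to S3 with boto3. A rough sketch; the bucket name is a hypothetical placeholder you would have to create yourself:

import pickle

import boto3

s3 = boto3.client('s3')
BUCKET = 'my-trading-models'  # hypothetical bucket, create it first

def save_model(model, key='arima.pkl'):
    # Serialize the fitted model and upload it to S3
    s3.put_object(Bucket=BUCKET, Key=key, Body=pickle.dumps(model))

def load_model(key='arima.pkl'):
    # Download and deserialize a previously stored model
    body = s3.get_object(Bucket=BUCKET, Key=key)['Body'].read()
    return pickle.loads(body)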

The first thing on the to-do list is creating a bot. I followed the
official instructions from Telegram:

Search for the user @BotFather in Telegram.

Use the command /newbot and choose a name and
username for your bot.

Get the token and store it somewhere safe, you’re going to
need it shortly.

Next step: deployment. There are several ways of deploying to
Lambda. I’m going to use the serverless framework, so let’s
install it and create a template:

$ npm install serverless --global
$ serverless create --template aws-python3 --path ai_trading_system

That will create an ai_trading_system folder with three files:
.gitignore , serverless.yml , and handler.py . The serverless
file defines the deployment: what, when, and how it is going to
be run. The handler file will contain the code to run:

import os

import telegram

CHAT_ID = XXXXXXXX
TOKEN = os.environ['TELEGRAM_TOKEN']

# The global variables should follow the structure:
# VARIABLE = os.environ['VARIABLE']
# for instance:
# RH_USER_EMAIL = os.environ['RH_USER_EMAIL']


def do_everything():
    # The previous code to get the data, train the model
    # and send the order to the broker goes here.
    return 'The action performed'


def send_message(event, context):
    bot = telegram.Bot(token=TOKEN)
    action_performed = do_everything()
    bot.sendMessage(chat_id=CHAT_ID, text=action_performed)

You need to change CHAT_ID to the ID of the group, the
channel, or the conversation you want the bot to interact with.
Here you can find how to get the ID from a channel and here is
how to get the ID from a group; a quick way to look it up
programmatically is sketched below.
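If you are not sure what the chat ID is, one low-tech way to find it (assuming the same synchronous python-telegram-bot API used in the handler above) is to send any message to the bot, or add it to the group, and then print the IDs of the recent updates:

import telegram

bot = telegram.Bot(token='<<<YOUR TOKEN GOES HERE>>>')

# Send a message to the bot (or add it to the group) first,
# then inspect the pending updates to read the chat IDs
for update in bot.get_updates():
    if update.message is not None:
        print(update.message.chat_id, update.message.chat.title)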

Now, we’re going to define how to run the code. Open
serverless.yml and write:

org: your-organization-name
app: your-app-name
service: ai_trading_system

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
name: aws
runtime: python3.6
environment:
TELEGRAM_TOKEN: ${env:TELEGRAM_TOKEN}
# If using RobinHood
RH_USER_EMAIL: ${env:RH_USER_EMAIL}
RH_PASSWORD: ${env:RH_PASSWORD}
RH_MFA_CODE: ${env:RH_MFA_CODE}
# If using Alpaca
ALPACA_KEY_ID: ${env:ALPACA_KEY_ID}
ALPACA_SECRET_KEY: ${env:ALPACA_SECRET_KEY}

functions:
cron:
handler: handler.send_message
events:
# Invoke Lambda function at 21:00 UTC every day
- schedule: cron(00 21 * * ? *)

This code tells AWS the kind of runtime we want and
propagates the Telegram token (and broker credentials) from our
own environment so we don’t have to hard-code them. Afterwards,
we’re defining the cron to run the function daily at 21:00 UTC.

The only thing left is to get the AWS credentials and set them,
along with the token and the rest of the variables, as environment
variables before deploying. Getting the credentials is fairly
easy:

From your AWS console:

Go to My Security Credentials — Users — Add user.

Choose a username and select Programmatic access.

Next page: select Attach existing policies directly —
AdministratorAccess.

Copy the Access Key ID and the Secret Access Key and store
them.

That’s it. Now, let’s export the AWS credentials and the
Telegram token. Open a terminal and write:

$ export AWS_ACCESS_KEY_ID=[your key goes here]
$ export AWS_SECRET_ACCESS_KEY=[your key goes here]
$ export TELEGRAM_TOKEN=[your token goes here]

# If using RobinHood
$ export RH_USER_EMAIL=[your mail goes here]
$ export RH_PASSWORD=[your password goes here]
$ export RH_MFA_CODE=[your mfa code goes here]

# If using Alpaca
$ export ALPACA_KEY_ID=[your key goes here]
$ export ALPACA_SECRET_KEY=[your key goes here]
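The install step below relies on a requirements.txt sitting in the project folder. The real file lives in the repository; as a rough assumption, it would list something along these lines (keep only the broker library you actually use):

yfinance
pandas
statsmodels
pyotp
robin_stocks
alpaca-trade-api
python-telegram-bot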

Install the necessary packages locally and, finally, deploy
everything to AWS:

$ pip3 install -r requirements.txt -t . --system
$ serverless deploy

We’re done! The bot will trade for us every day at 21:00 UTC
and will message us with the action performed. Not bad
for a proof of concept — now I can tell my friend he can stop
frantically checking his phone to trade :)
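If you don’t want to wait until 21:00 UTC to see it in action, the serverless CLI can trigger the deployed function on demand (cron being the function name from serverless.yml above):

$ serverless invoke --function cron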

Note that all the resources we’ve used through this tutorial
have their own documentation: I encourage y’all to go deeper
on whatever you think is interesting — remember that this is
just a toy system! However, as a toy system, I believe it is a good
starting point for a richer, more complex product. Happy
coding!

You can check the code on GitHub.


Note from Towards Data Science’s editors: While we allow
independent authors to publish articles in accordance with our
rules and guidelines, we do not endorse each author’s
contribution. You should not rely on an author’s works without
seeking professional advice. See our Reader Terms for details.

3":"1"%/";<
[1] P. Collins, Best Stock APIs and Industry Landscape in 2020
(2020), Medium

[2] R. Aroussi, Reliably download historical market data from
Yahoo! Finance with Python (2019), aroussi.com

[3] J. Brownlee, How to Grid Search ARIMA Model
Hyperparameters with Python (2017), Machine Learning Mastery

[4] J. Brownlee, How to Make Out-of-Sample Forecasts with
ARIMA in Python (2017), Machine Learning Mastery

[5] Serverless team, AWS Python Scheduled Cron Example,
GitHub