If you’ve been following along with the short Excel course, you’ve created a spreadsheet
using SPY daily data, and added daily TLT data to the sheet in another column. I received a
few questions about why we needed to match the two series with a VLOOKUP instead of
simply pasting the data into another column. Answering that question leads us to
consider some other issues with market data, so this is a good time to take a day or two
to think about them.
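For readers who think more easily in code than in spreadsheet formulas, here is a minimal Python sketch of what the VLOOKUP is doing for us, and what goes wrong if we simply paste. The function name `align_by_date` and the prices are made up for illustration; the only point is that the second series is missing a date the first one has.

```python
# Illustrative data only: these are not real SPY/TLT closes.
spy = [("2019-04-01", 285.83), ("2019-04-02", 285.97), ("2019-04-03", 286.42)]
tlt = [("2019-04-01", 124.02), ("2019-04-03", 123.11)]  # 2019-04-02 is missing


def align_by_date(primary, secondary):
    """Look up each primary date in the secondary series (like VLOOKUP).

    Returns (date, primary_value, secondary_value) rows; the secondary
    value is None when that date does not exist in the secondary series.
    """
    lookup = dict(secondary)
    return [(date, value, lookup.get(date)) for date, value in primary]


# "Pasting into the next column" pairs rows by position, not by date.
naive = [(date, value, other[1]) for (date, value), other in zip(spy, tlt)]

aligned = align_by_date(spy, tlt)

for row in aligned:
    print(row)
```

In the naive version, the SPY row for 2019-04-02 ends up sitting next to the TLT value from 2019-04-03, and nothing on the sheet tells you so. The lookup version leaves a visible gap instead, which is what you want.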
You might think this should be simple, and perhaps you’re right–markets trade, prices are
published, and we should be able to just look at those past prices and call it a day. In reality,
it’s a lot more complicated, and it’s not unusual to find, for instance, academic researchers
spending the vast majority of their time dealing with data issues. (Some researchers
estimate that 85% of their time is spent on mundane data tasks.) Let’s look at some of those
issues so we can better understand what can go wrong when working with market data.
These issues fall into two categories. One we could call “problems” rather than errors
because no mistake has been made in the recording or transmission of the data. The other
category is actual errors, and we should look at how to catch some of the more
common ones as well. The “problems” are instructive, so let’s start there.
This is also a serious problem with intraday data, and particularly when mixing data from
many sources. Some bars are timestamped at the beginning of the bar, some at the end.
Which are yours? How about if you are comparing intraday data from an active stock and
from one that does not trade for many minutes? The active stock will show a price change,
while the less active stock will display the same price. Again, there’s a spread relationship
here that is only an illusion, as the bid-ask spread for the less active stock will likely have
moved (and here’s your solution to that particular problem.)
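The timestamp question can be made concrete with a small sketch. It assumes you already know (from your vendor's documentation) that one source stamps bars at the start of the bar and that you want everything stamped at the end; the code cannot detect the convention for you. The function name `to_bar_end` and the sample bars are hypothetical.

```python
from datetime import datetime, timedelta


def to_bar_end(bars, bar_minutes):
    """Shift bar-start timestamps forward by one bar length.

    `bars` is a list of (timestamp, close) tuples stamped at the START of
    each bar; the result carries the same closes stamped at the END.
    """
    shift = timedelta(minutes=bar_minutes)
    return [(ts + shift, close) for ts, close in bars]


# Made-up 5 minute bars from a source that stamps the start of the bar.
start_stamped = [
    (datetime(2019, 4, 1, 9, 30), 100.0),
    (datetime(2019, 4, 1, 9, 35), 100.5),
]

end_stamped = to_bar_end(start_stamped, 5)

for ts, close in end_stamped:
    print(ts.isoformat(), close)
```

If you merge two sources without doing this, every bar from one source is compared against the bar five minutes later from the other, which is a very easy way to manufacture a relationship that does not exist.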
Let’s not even get into the issues of back-adjusting data for futures rolls or stock corporate
actions. There are established methodologies for doing these things, and they work–but
they have to be done properly. (Also, too many naive technical analysts have no idea what
these issues are or what their charts actually show. The more I learned about these issues,
the less I trusted “levels” and many traditional chart formations.)
Missing entries
Entries that display the same price (e.g., seeing a string of 5 minute bars in the ES futures
that show the same prices.)
H = L = C in an active market (not likely)
H < L. Some errors are obvious, but you have to look for them. This little gem is
not so easy to find when you have 40,000 datapoints.
Missing or incorrectly placed decimals. The way to check for this is to look for very large
price changes. For instance, seeing a change of +/- 1000% in 1 minute bars would be,
shall we say, unlikely, and probably points to an error.
Volume or open interest data with errors.
Missing data points. (Example: a high price missing for a random bar. It happens.)
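Several of the checks in this list are easy to automate. The following is a rough Python sketch, not a complete validator: the bar layout (dicts with `open`, `high`, `low`, `close`), the function name `check_bars`, and the 50% threshold for a "very large" change are all assumptions chosen for illustration, and the sample bars are made up. It does not look for missing bars or volume problems.

```python
def check_bars(bars, max_change=0.5):
    """Return (index, message) pairs for bars that look suspicious.

    max_change is the largest close-to-close change (as a fraction) that
    we accept without comment; 0.5 is an arbitrary illustrative value.
    """
    issues = []
    prev_close = None
    for i, bar in enumerate(bars):
        fields = [bar.get(key) for key in ("open", "high", "low", "close")]
        if any(value is None for value in fields):
            issues.append((i, "missing field"))
            continue  # cannot run the price checks on an incomplete bar
        _open, high, low, close = fields
        if high < low:
            issues.append((i, "high below low"))
        elif high == low == close:
            issues.append((i, "flat bar"))
        # A misplaced decimal usually shows up as an absurd price change.
        if prev_close and abs(close / prev_close - 1) > max_change:
            issues.append((i, "large change"))
        prev_close = close
    return issues


bars = [
    {"open": 100.0, "high": 101.0, "low": 99.5, "close": 100.5},     # fine
    {"open": 100.5, "high": 100.0, "low": 101.0, "close": 100.8},    # H < L
    {"open": 100.8, "high": None, "low": 100.2, "close": 100.6},     # no high
    {"open": 100.8, "high": 100.8, "low": 100.8, "close": 100.8},    # H = L = C
    {"open": 100.8, "high": 1012.0, "low": 100.7, "close": 1009.0},  # decimal
]

issues = check_bars(bars)
for index, message in issues:
    print(index, message)
```

A flagged bar is not automatically wrong (a flat bar can be legitimate in a quiet market), so treat the output as a list of things to look at by hand rather than a list of things to delete.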
The list goes on, but you need to be aware of these issues and you need to have some way
to check for them. I’m not going to focus too much on data issues in this short series, but
they are serious, because they can create completely misleading tests and destroy a lot of
your time and work, so I thought they deserved our respect and some attention.
Tomorrow we will get into the fun stuff and start to do calculations with the data that we
now have in our spreadsheet.
zaqimon, 3 years ago:
There is another issue existing in daily data, which is the Close price might be in fact the
calculated daily Settlement price, not the actual Close price.
Very true... must always understand what, exactly, your data is.