Algorithm and Portfolio Stats: 05/20/2024 - 05/24/2024
Our algorithm broke about even this week. As you can see from its performance over the last month, that isn’t enough to pull it out of its current tailspin. This week, however, I have good news: we believe we have found something that will pull it out of this slump.
I’ve spent the last few weeks running back tests and manually auditing the discrepancies between the trades made by the back tests and the trades we’ve seen in our live results. At this point, I’ve been able to trace every discrepancy to two sources.
The first is the early stop-out issue. I gave a more in-depth explanation of this glitch during our last weekly review, but the new information is that the glitch has been much more pervasive, and much more impactful, than anticipated. Looking at our trades made since March 19th, I would estimate that ~10% of them have exhibited this glitch to some extent. Notably, this glitch has never helped us. In every instance I’ve been able to find, either no change to P&L occurred (this happened when the trade was stopped out early, but would have been stopped out at the same price even if things had worked as intended), or our P&L on the trade was reduced. This might not sound like much, but when the signal already has a roughly 32% win rate, and 10% of trades are being made artificially worse, that can easily be the difference between profit and loss.
This glitch has been able to sneak under our radar for one reason: our historic auditing strategy wasn’t sufficient. We’ve been periodically auditing the algorithm’s trades since we rolled it out, but only visually. I’ve sat down with Fred (one of our staff members, and a strong Ichimoku Trader - you probably know him from our daily trading calls) and gone through stock charts on an individual basis, to evaluate whether any trades were made in error. This was mostly technical analysis on our part, evaluating whether the TK-cross was really strong enough to justify entry, among other things. The only way to notice this glitch is to check, not the stock charts, but the individual signal values in the trades, and to compare them by hand to the signal values from back tested trades.
The second factor is a mismatch between the tickers used in back tests and the tickers used in the live implementation. During development of this new algorithm, I compiled a list of about 1200 tickers, with minimum values for dollar volume, number of transactions, rates of non-flat candles, etc: factors meant to select for liquid and non-choppy tickers. After running back tests on these, we get two sets of stats: one for all tickers on the list, and one for only the tickers from this list that are S&P 500 constituents. When running our system live, however, we use a list of all S&P 500 constituents.
Eagle-eyed readers will have noticed what I’m getting at here: what about tickers that are in the S&P 500, but did not make it onto that list of 1200? Those tickers have been used by our live implementation, but were excluded from all back tests due to low liquidity metrics. There were roughly 100 such tickers, equating to nearly 20% of our live universe. When we adjust our back tests to include these tickers, our Sharpe ratio falls from the low 2s to under 1 - a massive drop, and an explanation for the discrepancy.
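For anyone who wants to run the same sanity check on their own system, here is a minimal Python sketch of the universe comparison described above. It is an illustration, not our production code: the function name `universe_gap` and the placeholder tickers are assumptions made up for this example, and the real inputs would be the screened back test list and the live S&P 500 list.

```python
def universe_gap(backtest_tickers, live_tickers):
    """Return the live tickers never covered by back tests,
    plus their share of the live universe."""
    # Normalize symbols so " abc " and "ABC" count as the same ticker.
    backtest = {t.strip().upper() for t in backtest_tickers}
    live = {t.strip().upper() for t in live_tickers}

    # Tickers traded live that the back tests never saw.
    missing = sorted(live - backtest)
    share = len(missing) / len(live) if live else 0.0
    return missing, share


# Placeholder symbols only (hypothetical, not real tickers or real lists).
screened_list = ["AAA", "BBB", "CCC", "ZZZ"]
live_list = ["AAA", "BBB", "CCC", "DDD", "EEE"]

missing, share = universe_gap(screened_list, live_list)
print(missing, f"{share:.0%}")
```

Running a check like this whenever either list changes would flag an untested slice of the live universe before it reaches live trading.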
So to summarize, our algorithm hasn’t under-performed because it was overfit, or because of poor market conditions. It’s been under-performing because it’s stopping itself out more easily than it’s supposed to, and because it’s being run on tickers with less liquidity than the ones it was developed on. This raises the question: what comes next?
First, this secondary layer of trade-auditing will become a part of our routine operations. I’m very much not excited about this, since the process is highly labor intensive, and mostly involves scrolling through numerical output and making sure everything lines up. But if this is what we have to do in order to keep things working, it’s plenty worth it.
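Part of this audit could eventually be scripted. The sketch below shows one way it might look in Python, under some loud assumptions: trades are plain dictionaries, a live trade is matched to its back tested counterpart by ticker and entry date, and the field names (`entry_price`, `stop_price`, `exit_price`) are hypothetical stand-ins for whatever signal values the system actually logs.

```python
import math


def audit_signals(live_trades, backtest_trades,
                  fields=("entry_price", "stop_price", "exit_price"),
                  tol=1e-6):
    """Compare live trades to back tested trades, field by field.

    Returns a list of (key, field, live_value, backtest_value) tuples,
    one per mismatch. Field names here are hypothetical.
    """
    # Index back tested trades by (ticker, entry_date) for direct lookup.
    backtest_by_key = {(t["ticker"], t["entry_date"]): t
                       for t in backtest_trades}
    mismatches = []
    for trade in live_trades:
        key = (trade["ticker"], trade["entry_date"])
        expected = backtest_by_key.get(key)
        if expected is None:
            # The live system took a trade the back test never made.
            mismatches.append((key, "missing_in_backtest", None, None))
            continue
        for field in fields:
            if not math.isclose(trade[field], expected[field],
                                rel_tol=0.0, abs_tol=tol):
                mismatches.append((key, field, trade[field], expected[field]))
    return mismatches


# Toy data: the live trade exits earlier (at a different price) than the back test.
live = [{"ticker": "AAA", "entry_date": "2024-05-20",
         "entry_price": 100.0, "stop_price": 95.0, "exit_price": 98.5}]
backtest = [{"ticker": "AAA", "entry_date": "2024-05-20",
             "entry_price": 100.0, "stop_price": 95.0, "exit_price": 97.0}]

report = audit_signals(live, backtest)
for row in report:
    print(row)
```

A script like this would not replace the manual review, but it could narrow the scrolling down to the trades where the live and back tested values actually disagree.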
Secondly, we’ll be sticking with our plan to re-introduce our algorithm at the beginning of June. We’ll be putting most of our experimental systems online, in a private trial, this Tuesday, to verify that the early-stop bug has been fixed successfully. Our machine learning systems will follow later in the week, as they will need to be retrained.
Lastly, this increases the odds that our final system is a machine-learning-based algorithm. Since part of the problem is low-liquidity tickers, we want some kind of rule in place that determines when a ticker is “liquid” enough to be traded on our system. We’ve attempted this before, but have never come up with a set of rules we were really satisfied with. This is a problem that we feel machine learning will be better equipped to tackle.
Now then, let’s examine our portfolio.
We did great this week, beating SPY by more than 0.9 percentage points (up 0.89%, versus SPY being down 0.02%). Our result this week was decided by one factor: NVDA. It made up 7.8% of our portfolio’s allocations, and crushed its earnings this week. Your results were likely decided by the same factor. If you held NVDA, you’re probably celebrating right now. If you didn’t, you probably feel like you missed out. If not for NVDA, we would have under-performed SPY this week. On one hand, that’s close to a tautology - removing our best ticker will always make things worse in hindsight. On the other hand, even for me, this level of dependency on one ticker is a bit much.
This week, our plan is largely the same. Tech is still the plurality of our portfolio, and our greatest over-exposure. We’re still long the market, though our exposures are more centered on tech this week than they were last week. Given our momentum strategy and the sector’s performance over the last week, this is fairly natural for us.
As always, our portfolio for the week is below. Allocations <0.5% are excluded, for brevity.
That’s all I have for you tonight. As always, thank you for reading and happy trading!