Show HN: How I Used Machine Learning to Optimize My Trading Algorithm

tokenadult · on March 26, 2013

"Past performance does not guarantee future results" is still the operative principle here. Data-mining discovers patterns, but it doesn't lead to deep insight into causes, and markets are perturbed by many events that you don't put into your training algorithm. "The market can remain irrational longer than you can remain solvent" is still important investment advice.

ciferkey · on March 27, 2013

The Keynes quote reminds me of LTCM (just finished reading When Genius Failed).

tosseraccount · on March 26, 2013

If someone really came up with some fool proof method of beating the market, wouldn't they keep it secret?

Meanwhile, most folks should stick to asset class allocation and indexed funds and ETFs.

minimax · on March 27, 2013

Traders try to keep their successful strategies a secret, but most strategies "work until they don't" meaning that the algos making money today are not necessarily the same ones that were making money in 2012. Furthermore, as with technology companies in general, traders and developers move around between trading firms, and the ideas move with them.

yarou · on March 27, 2013

That's the whole point of Quantopian; to open-source a closed-source and highly secretive profession. See: http://blog.quantopian.com/open-source-and-trading-algorithm...

nilkn · on March 27, 2013

Judging by the success of a select few firms, like Renaissance Technologies (known for snatching up Putnam Fellows and top university faculty with outrageous compensation packages), it seems possible that this has already happened.

(I highly doubt they have a "fool proof" method, but clearly their method is better than others, or at least has been so far.)

Hansi · on March 26, 2013

Yes, a heap of firms have found these methods, they don't post them on HN :)

One would hope that most people would do that, unfortunately too few people pay attention to saving at all, let alone allocations that fit their needs/profile.

ecopoesis · on March 26, 2013

Great job. From 2007-12-01 to 2008-12-31 you only managed to lose 215.81%. The benchmark only lost 37.04%.

wildwood · on March 27, 2013

Where are you seeing that earlier data? I can only get the graphs to cover 2012 and 2013.

This is a good example of how deceptive a percentage-change-only chart can be, without the absolute value of the portfolio also figuring in. If the system has a max drawdown of 98%, then getting 200% returns after hitting that low isn't going to do much good.

fawce · on March 27, 2013

You can clone and run the algo yourself over any time period since 2002.

olh · on March 27, 2013

Reverse the sell/buy and you could get something.

zaptheimpaler · on March 26, 2013

What data did you use to train this? Because it looks like it might just be overfitting the training data.

gknoy · on March 26, 2013

How can one tell the difference?

rodrigtw · on March 26, 2013

Typically, you would train the algorithm on one set of data and then test it on another. So you might train it on data from FY 2010 and then test it out in a simulation of FY 2011.

The fear is that if you train it on FY 2010 and then it does well in a simulation of FY 2010, it might only be because it has stored some representation of a record of FY 2010 which is extremely predictive of FY 2010 but doesn't generalize well to any other year. Testing the algorithm against a simulation of FY 2011 would reveal this flaw.

em70 · on March 27, 2013

That code is a recipe for getting hurt. Here are a few points:

- Return is not everything. More informative performance metrics are Sharpe ratio (a sort of reward/risk measure) and information ratio. Both of the above have ridiculously low values in this case.

- Another thing that matters is the distribution of returns. If you have plotted this and still see nothing wrong, you are really better off doing something else. With numbers like these, chances are will be out of cash much sooner than you will hit a good month. And even after you have hit a good month, what happens when you hit a bad one?

- Beta: essentially, when the algo does well, it is mostly because of significant overexposure to the market. At this point, I would much rather lever up and buy SPY than trade using this thing.

- Predictability and risk management: ok, so you have tested this on historical data. What are the cases in which this would misbehave? After all, it is optimization, so there may be inputs for which this gives very undesirable results. How would you notice? (hint: you have no risk management in your code!)

The bottom line is that, if you ever want to put some money where your mouth is, you would have way better chances at doing well if you learned some basic finance rather than treating the markets as a black box (no matter how creative you can be). At least, you will be able to evaluate appropriately whether you are doing well or not.

steven2012 · on March 26, 2013

The site is pretty slick, but the backtest I ran when I cloned the algo is pretty slow. It's been running for about 10+ mins now and it's only 40% done.

As well, in the logs, when I see stuff like:

2012-05-31handle_data:35INFO -63.520880 shares of Security(6109) sold.

it doesn't really inspire a lot of confidence. What does it mean that -63.520880 shares were sold? Does that mean they were bought? And the fact that you are purchasing fractional shares also doesn't inspire a lot of confidence.

jbredeche · on March 26, 2013

(disclaimer: I work for Quantopian)

I can't speak for the author of the algo, but from the algo, it looks like the relevant lines for your question are 34 (order(stock,indicator * context.bet_amount)) and 35 (log.info("%f shares of %s sold." %(context.bet_amount * indicator,stock)).

Our backtester (Zipline) will only order whole number of shares, obviously. If you pass it a fractional number, we take the floor: https://github.com/quantopian/zipline/blob/master/zipline/ge...

The log line you're seeing should probably flip the sign of the number of shares before logging. order(-63, sid(6109)) means sell 63 shares of security 6109. The log line is simply logging the negative value instead of the positive one. Users can log anything they want in their backtest.

As for the slow performance, apologies - being on HN has resulted in a lot of people running this algo and while we're scaling up new servers, it's taking a bit of time to distribute load.

thanks for using Quantopian!

[Edit - added source link to Zipline's order method]

eob · on March 26, 2013

From the business standpoint, what are the risks/hurdles that need to be overcome for Quantopian to offer a "Operationalize Algorithm" button that starts running on real money?

Surely there could be some structure in which Quantopian gets institutional trader status (or whatever the "trade for (virtually) free" status is), and then passes off the low-cost trading to its users, for a fee.

dunster · on March 26, 2013

There aren't a ton of hurdles left before we start offering "live trading" on Quantopian. We have all the pieces, we just need to stitch them together. A couple more months, I think.

In the beginning, at least, it will be leveraged through your existing brokerage account. You're going to integrate Quantopian with your brokerage, and Quantopian will place orders for you with your brokerage.

If we're as successful as we hope to be that will mean we're driving a lot of trading volume. If you start driving enough trading volume, the exchanges start to pay you rather than the other way around. It would be a pretty sweet day if we can offer trading for free to our members and fund the company on the exchange fees.

minimax · on March 26, 2013

the exchanges start to pay you rather than the other way around

Most US equities exchanges only pay if you post resting orders (adding liquidity / market making). You pay a fee for removing liquidity (market orders). I think you are actually talking about internal matching at the broker here?

nolite · on March 26, 2013

Which brokerages do you see yourself allowing? I'm thinking of opening one

dunster · on March 26, 2013

Interactive Brokers will be our first integration.

spitfire · on March 26, 2013

How about risk and money management?

dunster · on March 26, 2013

Cash management is built already. We track how much you have, dividend payments, all that stuff. We've built many risk measurements, too: alpha, beta, Sortino, Information Ratio, etc.

Risk management is far more complex. Risk management is more a part of the algorithm itself than a feature that we can build. That said, we can add more risk tools. We're very open to suggestions, if you have some in mind.

spitfire · on March 26, 2013

I see cash and risk management both as separate domains in their own right equal to if not more important than the actual algorithm. I don't think they can be folded into the algorithm itself - there is a reason banks have separate risk departments.

Eg: For risk management I might not allow any trading whatsoever when the VIX is over 40, and the 5 day stddev of the S&P is above some threshold.

Similarly, I might scale my capital usage based on my risk metrics. Or scale the capital available to a particular algorithm based on its individual risk profile.

Recreating risk management in each algorithm seems like a bad idea. But even worse is pushing off risk to the user to do in an ad-hoc way.

meson2k · on March 27, 2013

Good try. Some observations: - The sharp ratio is very bad. Focus on improving. Instead of looking for spectacular gains, focus on solid growth. - Looking at the daily tick backtest, at many points, the alpha is so negative that the losses your algo occurs would make you delinquent. Again, focusing on better sharp ratio should help here. (RETURNS -72.64%) - Practical consideration: You look at every stock in data for each tick and its historical prices. This could work perfectly well at a low frequency trades e.g. daily, but not at per-second tick because computing time > transaction time i.e. you would be acting on stale inference. - Transaction costs?

PS: Pet peeve. Gradient descent is a heuristic at best and not true machine learning :)

aneth4 · on March 26, 2013

Has anyone consistently beat the market with a home grown trading algorithm?

I suspect it's possible but probably needs more signal than just stock price.

KMag · on March 26, 2013

Given the number of people trying to beat the market with their personal algos, at least a few are certain to beat the market for many years, even if their strategies were no better than random.

Edit: I don't mean to suggest that it can't be done. I'm just suggesting that the existence of people beating the market may not be a good indicator of your ability to beat the market.

dunster · on March 26, 2013

Aneth. . . . you're reading our mind. Come back to Quantopian on Thursday. I think you'll like what you see.

Yes, there are people who trade today and make money using algorithms. They are few and far between, mostly because the toolset is so hard to build. Data, backtester, trading platform, etc. all take a long time to build. We're trying to make it much easier by providing all the tools. You need an idea; we'll make the rest work for you.

yummyfajitas · on March 26, 2013

Not that I know of. Typically after they start beating the market for a while and get a few $M in the bank, they hire other people and build infrastructure. Then the algorithm ceases to be "home grown".

nolite · on March 26, 2013

sticking around for 97% drawdown? Sorry.. doing this in real life would be more stupid than anything

unreal37 · on March 26, 2013

One point in the graph shows a -144% return. How do you even get a -144% return? Debt?

dunster · on March 26, 2013

Yeah. This algo is highly leveraged - like 15X. It's possible to really lose your shirt if you trade this algo exactly.

Taibo's algo is interesting as a starting point. It's not one that that you just take off the shelf and start trading with. But, you can take it and learn from it and develop an alternative strategy. Presumably one with less risk!

meson2k · on March 27, 2013

Dunster, any clue if Quantopian supports option trading?

KMag · on March 26, 2013

Clearly the simulation doesn't take into account forced de-leveraging at significant losses due to margin calls.

loganfrederick · on March 26, 2013

Shorting is also a possibility.

xntrk · on March 26, 2013

yeah the algo looked like it was just highly leveraged version of the benchmark. The benchmark, which you can get a better look at if you remove the algos chart looks a lot like SPX or some other US index (at least when you eyeball it). I would like to see this run against 2008s market and see what would happen.

dunster · on March 26, 2013

The benchmark on Quantopian is indeed modeled after the S&P 500.

How would the algo do in 2008? It's trivial for you to check it yourself. Click the "clone algo" button, change the time range of the test, and click "Run Backtest." Question answered!

Domenic_S · on March 27, 2013

> I would like to see this run against 2008s market and see what would happen.

Disaster. I'm only in Nov and at -200% return, 88% drawdown. Ouch.

Edit: spoke too soon: -453% vs -45.3% bench in nov.

Hansi · on March 26, 2013

Agreed, this is just silly.

dcnstrct · on March 27, 2013

Is it possible to bring in your own data sets? It seems the more interesting algorithms would be those that merge the pure financial volume data with outside sources like news, social data, weather patterns, or other predictors relevant to individual stock prices.

I do data mining work (not in finance) and often the best signal is the one missing. Often more prediction value is gained from additional feature construction and the layering of more interesting data sources onto the problem than simply a better algorithm on the data at hand.

nsxwolf · on March 26, 2013

So, I run this, and it makes me a bunch of free money?

Hansi · on March 26, 2013

No look at the drawdowns.

karamazov · on March 26, 2013

Quantopian is a great tool. I've played around with it, and all I can say is I wish it had existed when I was day-trading.

yarou · on March 27, 2013

I actually checked out your algorithm today during work (I work at a big bank, aka dead end for a "technologist"). It's an intriguing concept, but I still feel as though classifiers are a poor technique for P&L optimization for a given portfolio. Out of curiosity, what type of data set are you using for backtesting, and what time frame? It's not entirely clear that asset (i.e. stock) prices follow Brownian motion/Weiner process, rather they may be discontinuous.