Lessons learned building an ML trading system

(tradientblog.com)

914 points · traK6Dcm · 6y ago · 151 comments

98 comments · 22 top-level

nickreese6y ago· 30 in thread

After having spent an insane amount of time in late 2017/2018 building an HFT bot for Binance I can say this is a pretty solid article.

In our case we were doing triangle trading between BTC/ETH/USDT pairs and had our buy/sell delay down to 3-7ms. At one point we were moving 0.3-0.7% of Binance’s daily volume.
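
For readers unfamiliar with triangle trading, here is a minimal sketch of the check involved: start with USDT, buy BTC, swap BTC for ETH, sell ETH back to USDT, and see whether more comes out than went in. The quotes, the 0.1% fee, and the function name are illustrative assumptions, not the setup described in this comment.

```python
def triangle_return(btc_usdt_ask, eth_btc_ask, eth_usdt_bid, fee=0.001):
    """Multiple of starting capital after cycling USDT -> BTC -> ETH -> USDT.

    Each leg crosses the spread (buy at the ask, sell at the bid) and pays
    a proportional taker fee. All inputs are hypothetical quotes.
    """
    keep = 1.0 - fee
    btc = (1.0 / btc_usdt_ask) * keep   # buy BTC with 1 USDT
    eth = (btc / eth_btc_ask) * keep    # buy ETH with that BTC
    usdt = (eth * eth_usdt_bid) * keep  # sell the ETH for USDT
    return usdt


# Made-up quotes in which ETH is slightly rich in USDT terms relative to BTC.
multiple = triangle_return(btc_usdt_ask=10_000.0, eth_btc_ask=0.02, eth_usdt_bid=201.0)
profitable = multiple > 1.0
```

With three fees per cycle, the mispricing has to exceed roughly three times the fee before the loop pays, which is part of why speed and fee tier matter so much here.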

Few notes:

* Finding an objective point of truth for value when all of the currencies are floating is hard but vital to success. This was the hardest problem we encountered. We tried taking the realtime average of BTC and ETH across all exchanges, we tried tying it to the shortest route to USD, and several other routes... but ultimately this is where we ended up “losing” most of our alpha.

* Order books are seemingly simple but the devil is in the details. This especially matters for paper trading.

* Efficiently using API limits at exchanges is an optimization problem in and of itself.

* Our model was relatively simple but we focused on speed and edge cases. For instance Binance would rotate IPs on their load balancers and we’d constantly check the latency between each open SSL connection and use the fastest. Further, we wouldn’t decode the buy response to plaintext; we’d just read the raw stream.
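
The "use the fastest connection" idea from the last note could be sketched roughly as follows. This is a toy under stated assumptions: the class name, the rolling window, and the choice of the median are all hypothetical, and the round-trip times are passed in by hand rather than measured on real sockets.

```python
from collections import deque
from statistics import median


class LatencyTracker:
    """Keeps a rolling window of round-trip times per connection."""

    def __init__(self, window=20):
        self.window = window
        self.samples = {}  # connection id -> deque of recent RTTs in ms

    def record(self, conn_id, rtt_ms):
        self.samples.setdefault(conn_id, deque(maxlen=self.window)).append(rtt_ms)

    def fastest(self):
        """Return the connection id with the lowest median RTT."""
        return min(self.samples, key=lambda c: median(self.samples[c]))


tracker = LatencyTracker(window=5)
for rtt in (4.1, 3.9, 30.0):  # one outlier should not dominate the median
    tracker.record("conn-a", rtt)
for rtt in (6.5, 6.8, 6.4):
    tracker.record("conn-b", rtt)
best = tracker.fastest()
```

The median is used here so that a single slow round trip does not immediately demote an otherwise fast connection; a real system might prefer a different statistic.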

After several epic months our entire project fell apart after a cryptic phone call about “institutional access” that didn’t follow the 1s websocket update. The access was quite expensive and we said no to it, and shortly after all of our strategies went to crap.

Best we could tell someone was front running us due to an artificial delay for our account (delay between trades went to ~20ms up from our prior steady speed of 3-7ms) and/or a bunch of the trades in the orderbook were bogus.

Frustrated we tried our strategy on another account and the delay dropped again to our normal range and was profitable again (the orderbooks were slightly different between bots!).

It was in that moment we realized playing in unregulated markets is not fun or something we wanted to continue to do. Intermediary risk was something we didn’t account for.

Further we realized that there will always be a better resourced or more dedicated team willing to fight you for your alpha.

After months of effort and a ton of fun we decided it was best we went back and focused on a problem where we could build a long term competitive advantage.

Edit: typos and formatting

milesvp6y ago

Even in regulated markets there are problems. I have a friend who built an HFT algorithm that he was using for trading stocks. He had some really good results with early testing. But at some point he was confused why many of his buy/sell orders weren't being executed despite being open for several minutes (hours?). His algo would make bids far away from the current spread, anticipating movement. He finally concluded that some institutional traders must have access to sub-penny ordering, despite it being against regulation.

I didn't believe him at first, since, the more likely problem was elsewhere, but then a few months later sub-penny trading in dark pools was all over the news. This was like 5 or 6 years ago. He's since moved on to other things having come to similar conclusions, that trying to play such a rigged game was futile.

nickles6y ago

> But at some point he was confused why many of his buy/sell orders weren't being executed despite being open for several minutes (hours?) his algo would make bids far away from the current spread, anticipating movement

* Did his levels actually get crossed? It could be that he never had the opportunity to get filled at the price he hoped for.

* What was his queue position? Stock exchanges tend to be price-time ordered. If others had submitted orders at the same price before he had, they would have priority when getting filled.

* Apropos the prior point, was his broker actually submitting orders when he sent them? Some brokers may avoid showing deep out of the money orders, which could have affected his queue position.
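
To make the queue-position point concrete, here is a toy model of price-time priority for resting bids. The class and method names are hypothetical, prices are integer cents to avoid float keys, and real matching engines are considerably more involved.

```python
from collections import deque


class PriceLevelQueue:
    """Resting buy orders with price-time priority (toy model)."""

    def __init__(self):
        self.levels = {}  # price -> deque of (order_id, quantity), oldest first

    def add_bid(self, order_id, price, qty):
        self.levels.setdefault(price, deque()).append((order_id, qty))

    def match_sell(self, limit_price, qty):
        """Fill an incoming sell against bids at or above limit_price."""
        fills = []
        for price in sorted(self.levels, reverse=True):  # highest bid first
            if price < limit_price or qty == 0:
                break
            queue = self.levels[price]
            while queue and qty > 0:
                order_id, resting = queue.popleft()
                traded = min(resting, qty)
                fills.append((order_id, price, traded))
                qty -= traded
                if resting > traded:  # partial fill keeps its place in line
                    queue.appendleft((order_id, resting - traded))
            if not queue:
                del self.levels[price]
        return fills


book = PriceLevelQueue()
book.add_bid("early", 99, 100)   # arrived first at 99
book.add_bid("friend", 99, 100)  # same price, later arrival
book.add_bid("better", 100, 50)  # better price jumps the whole 99 queue
fills = book.match_sell(limit_price=99, qty=120)
```

In this run the price of 99 does trade, yet the order labeled "friend" gets nothing because an earlier order at the same price absorbs the remaining quantity.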

> He finally concluded that some institutional traders must have access to sub-penny ordering, despite it being against regulation.

There's nothing wrong with this. Reg NMS rule 612 permits sub-penny price improvement [0].

[0] https://www.sec.gov/divisions/marketreg/subpenny612faq.htm#q...

tardis_thad6y ago

Really interesting, thanks! Could you share a bit more about the 'institutional access' call? Does it mean that a select few have access to real-time book updates and all the rest are delayed by ~100ms? So much for a level playing field :) Is it common for Asian crypto exchanges?

carlsborg6y ago

> the orderbooks were slightly different between bots

That sounds like a big deal. If this is repeatable, you should document it better. Unregulated doesn't mean a license to do blatantly illegal things. Crypto exchanges certainly get taken to court.

Sure it wasn't your book building algo and a snapshot retrieval race?
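
For context, the race in question arises when a local book is built from a one-off snapshot plus a stream of incremental updates: apply a stale update, or miss one, and two bots end up with different books. Below is a minimal sketch assuming a sequence-numbered feed; the field names (`seq`, `first_seq`, `last_seq`) are hypothetical and not any exchange's actual schema.

```python
def build_book(snapshot, buffered_diffs):
    """Merge a snapshot with diff events buffered from a stream.

    snapshot: {"seq": int, "bids": {price: qty}}
    diff:     {"first_seq": int, "last_seq": int, "bids": {price: qty}}
    A qty of 0 removes the level. Raises if the stream has a gap.
    """
    book = dict(snapshot["bids"])
    seq = snapshot["seq"]
    for diff in sorted(buffered_diffs, key=lambda d: d["first_seq"]):
        if diff["last_seq"] <= seq:
            continue  # already reflected in the snapshot
        if diff["first_seq"] > seq + 1:
            raise ValueError(f"gap after seq {seq}; refetch the snapshot")
        for price, qty in diff["bids"].items():
            if qty == 0:
                book.pop(price, None)
            else:
                book[price] = qty
        seq = diff["last_seq"]
    return book, seq


snapshot = {"seq": 10, "bids": {100: 5, 99: 3}}
diffs = [
    {"first_seq": 9, "last_seq": 10, "bids": {100: 5}},          # stale, skipped
    {"first_seq": 11, "last_seq": 12, "bids": {100: 0, 98: 7}},  # applied
]
book, seq = build_book(snapshot, diffs)
```

If the gap check is omitted, a dropped message goes unnoticed and the book silently diverges, which is one way two correct-looking bots could disagree.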

daniel-levin6y ago

There's a much less nefarious explanation. I'm not saying they didn't manipulate the order book feeds. I work in the regulated trading space. I've written market data decoders. One is in production inside of a bigger trading system I wrote at a customer as I type this.

The less nefarious explanation is simple. You decide how likely it is: there is no mechanism making sure that IP packets sent to all data feed subscribers arrive at the same time. Exchanges of consequence distribute market data via UDP multicast, over physical links that are as identical as possible (think identical lengths of fibre).

Now if you're receiving JSON via Websocket and parsing it using an allocating parser and your NIC driver is in kernel space and you use a GC'd language and if the exchange loops through a list of TCP connections to send a message to them one at a time and there is jitter in packet delivery time in upstream hosts (and other internet weather) and and and ... you simply cannot expect identical order books at the submillisecond timescale.

A fortiori, throw half of that crap away and suppose they were using userspace NICs and no-alloc single threaded C++ that chills entirely in L1 cache. Still consuming TCP over the public internet.

nickreese6y ago

After 2 weeks of digging through our code I’m 100% confident it wasn’t our code. Was able to hook up to a total of 3 additional accounts via api, all 3 books matched, and were started at different times.

We wrote support and they told us they would investigate. Never heard back.

pjc506y ago

Binance is in Malta, a somewhat corrupt tax haven; are you sure it's illegal there?

megaframe6y ago

Ran an ML model years ago. Had a number of great months, then out of nowhere no trade that I or the ML made would work. Looked like someone was front running my orders and messing with my trades. Weird delays, trades would take too long to go through, and all sorts of odd events on Level 2. Ended up shutting it down; it took a good 2 months before my manual trades started going through at a normal rate again.

This is why the whole $0 trading fee and Robinhood concern me. I'm paying for the trades and someone is still messing with me.

smeeth6y ago

I think, though I'm not sure, that RH makes their money by investing in treasury bills with the cash balances of people's portfolios. You're (most likely) not getting secretly screwed.

criddell6y ago

What's the value of HFT? If exchanges were required to add a random delay to every trade to work against high frequency traders, would anything of value be lost?

jstanley6y ago

If exchanges added a random delay, that would mean traders have to plan for a longer time horizon, which means they need to offer more conservative prices. That means when you're selling, you get less money, and when you're buying, you get less stuff.

Some people have this notion that if their trade got matched against a HFT that they have somehow lost out. The exact opposite is true. If the party on the other side of your trade was a HFT, then that implies that all other parties were offering worse prices than the HFT was. If the HFT had not been there, you would have got a worse price (whether on the buy side or the sell side). The presence of high-frequency traders (or any traders, for that matter) reduces spreads and increases liquidity.

The only people who get worse deals as a result of high-frequency trading are people who would sell you the same stuff for more money.

traK6DcmOP6y ago

Many HFT systems provide liquidity. If there is no liquidity, retail investors like you or me cannot buy or sell.

Just imagine you want to exchange a currency because you go traveling and the exchange tells you "Sorry, nothing available right now, gotta come back in a few weeks". That's what would happen if there is no liquidity.

1e-96y ago

Yes, adding delay generally lowers the economic value of markets. In addition to improving liquidity, as mentioned by others here, HFT also improves the relative pricing of related instruments. There is an enormous number of financial instruments traded for the purpose of prioritizing resources, transforming assets, and offsetting risk. Equities and cryptocurrencies are a small part of this connected financial universe. There are many trading entities with unique information and/or understanding about different parts of this constantly changing universe of prices. None of them has a complete view, but by communicating with each other through trading, their expertise is pooled. HFT speeds this communication network, which improves the dynamic consensus.

downerending6y ago

As sibling said, it's liquidity. Worked at an HFT for a while, and they once got in a snit with an exchange over some policy change. They turned off their trading there one morning, and within 20 minutes, the exchange was on the phone begging them to come back.

rb8086y ago

The liquidity from bots has replaced the human brokers. Sell-side stock trader is a job that mostly no longer exists, e.g. https://www.bloomberg.com/news/articles/2018-04-30/goldman-t...

musicale6y ago

Not really. The real-world value of Google (or any other company) doesn't change meaningfully on a millisecond basis.

munk-a6y ago

Liquidity, which honestly has very little value to a traditional stock market. HFT and day trading both just exist as rent seeking behaviors on company valuation growth.

rixrax6y ago

When you do this kind of (HFT) trading, what do one's tax returns look like? E.g. I assume you still have to report every trade with its gain/loss to a tax authority?

TuringNYC6y ago

IANAA, but you file a gigantic "Schedule-D". I have never done it with crypto, but there is standard software to do it for non crypto stuff.

I used GainsKeeper: http://www.wolterskluwerfs.com/tax-reporting/solutions/gains...

I see they do crypto, but I've never used it for that.

For companies, they have entire departments that consolidate trades and construct massive Schedule Ds.

jotakami6y ago

Depends on the nature of the trades. If you are hedging and maintaining a delta-neutral position (very common situation for market makers and arbitrageurs) then technically your trades are part of a straddle, which is reported separately from capital gains. (in the U.S., at least)

I learned this because I had a six-figure capital loss on my spot crypto trades, and a slightly larger gain on unregulated futures which do not fall under capital gains. Capital losses can only offset up to $3k in other income, so I was terrified that I’d have to pay taxes on the massive “gain” without being able to count the losses against it. Fortunately, the straddle is the correct way to report this, allowing gains and losses to be matched across instruments that may otherwise fall into different income categories.

pjc506y ago

> It was in that moment we realized playing in unregulated markets is not fun or something we wanted to continue to do. Intermediary risk was something we didn’t account for.

Indeed. This is the really difficult thing about the crypto space: winnings, if you can keep them. And you can't if the house is just going to front-run all your orders.

alexcnwy6y ago

Fascinating story, thanks for sharing!

"we realized that there will always be a better resourced or more dedicated team willing to fight you for your alpha" - that's part of what makes financial markets such a fun and interesting challenge, but agreed, intermediary risk at the timescales you were operating at in this kind of unregulated market is real and not fun.

I'd love to hear what problem you moved to that you believe you can build a long term competitive advantage on if you can talk about it?

nickreese6y ago

My partner and I had built broadbandnow.com and were looking for a new interesting problem to solve to get us fired up again about fun tech problems. This just scratched the itch. The team was running day to day of the other business while we worked on the HFT bot.

Shortly after we realized we either should exit or hire an exec team to run the business. After a few months of executive search we found an offer we liked and took it.

These days I’m interested in using some heuristics based on public data to help consumers make informed decisions about nursing homes and in home health care. Still early on this project but lots of data and not many people looking to do good in that market. Seems like a great place I can add value.

bhl6y ago

> Best we could tell someone was front running us due to an artificial delay for our account (delay between trades went to ~20ms up from our prior steady speed of 3-7ms) and/or a bunch of the trades in the orderbook were bogus.

I wouldn't be surprised, given that traditional HFT companies are building cryptotrading desks and they have a lot more capital to play with too.

jstanley6y ago

You can't simply front-run someone else by having deep pockets.

The only way that is possible is if you gain privileged information about that person's orders before they actually hit the book. The most practical way to do this is to be the exchange.

throwaway_e4WNi6y ago

> the orderbooks were slightly different between bots

As somebody who still runs a profitable bot on Binance I find this hard to believe.

Also all order book related endpoints/streams are public, so queries/subscriptions are not tied to a specific account.

nickreese6y ago

In late 2017/early 2018 when their websocket was new I believe you had to be authenticated for either the initial order book query or the websocket subscription. Either way since we always locked our api keys to our IPs it wouldn't have mattered.

kami88456y ago

Thanks for sharing. Do you think building the strategy on another exchange such as Coinbase Pro or pursuing a strategy that wasn't as latency-sensitive might've yielded more success?

andruby6y ago

I ran a triangle trading arbitrage bot on Coinbase that was profitable at the end of 2017. But a few weeks later, it was losing money on hard to diagnose slippage. The number of “opportunities” also collapsed, which indicates that other bots were doing the same or that Coinbase was doing this internally.

I pulled the plug. Tried to run it on Binance but the websocket only updated once a second, so there was way too much risk.

keyle6y ago

I have had a similar experience. The house always wins.

cco6y ago· 11 in thread

And in the end, what value was created?

"Liquidity in the BTC market"?

traK6DcmOP6y ago

Author here. Actually, the system is mostly taking liquidity from the market so it's not even doing that :) Perhaps, "opportunity for others to create more liquidity"

Jokes aside, it's actually something I am thinking about a lot. Such systems don't create value, but they are extremely intellectually interesting and I've learned a lot. You can say the same for many other projects, for example academic research in many fields. Most of it is just noise to promote the author and does not create value in the world. But it's intellectually interesting, so people work on it.

Other people write compilers for fun to learn something new without creating value. I don't think this is fundamentally different.

ScottBurson6y ago

I think of it as like competitive sport. The value you're providing is the opportunity for others to test themselves against you.

mrpopo6y ago

> But it's intellectually interesting, so people work on it.

No. It's a financially sound way to use time, so people work on it. Anything can be intellectually interesting.

What kind of academic research does not create value? And if so, maybe it deserves to be criticized the same way.

account734666y ago

So you enter with market orders and exit with trailing limit orders?

mrchicity6y ago

I've thought a lot about this too (I used to work in HFT). Here's what I think:

- The only part I didn't like in your article was how you described creating indicators as exploitation. The limit order book is public by design so all traders can look at it. People have the free choice to trade on a centralized exchange or not. This is a trade-off between revealing information and being able to trade quickly without calling all your friends asking if they want to buy some Bitcoin.

- I'm guessing you used data from other exchanges outside the one you were trading as indicators too. That's unquestionably good since your trading helped information propagate faster or more accurately than it would have otherwise.

- Markets are only zero-sum in isolation. Most participants derive utility from things outside short-term profit and loss. Maybe they trade to manage risk, to hedge, to gamble, have a longer time horizon than you, whatever. They just want to trade and get back to their lives. They don't want to waste time squeezing the last fraction of a basis point out of their fills. It's hard to believe, but they actually enjoy getting picked off, run over, paying too much spread, whatever things make you feel bad or indifferent about the service you provide.

I used to get filled making markets on Nasdaq (which pays resting orders a rebate, and charges crossers) when BX (which pays crossers a rebate) was at the same price, and could lay off the trade for an instant profit. The people who traded with me paid for the luxury of saying "fuck it, send it to good ol' Nasdaq." I used to think it was stupid of them, and from the perspective of a prop trader, it was mind numbingly stupid, but they probably had more productive things to do than read every exchange fee schedule or hook up to every small exchange.

- Providing liquidity has nothing to do with resting limit orders vs. crossing the spread. Providing liquidity is about taking risk off the hands of people that don't want it, and moving it across time to someone else who does. If you're market neutral, trade many round trips every day, and end relatively flat, you've played that intermediary role as a liquidity provider regardless of what order types you use.

- Crossing against mispriced orders is doing the world a favor. You're not the bad guy picking them off. If anything, they're the bad guy for holding the market at an incorrect price.

So maybe think of yourself as more of a service provider. Not only will you feel better, but viewing trading through that lens tends to make you a better trader. Strategies truly built around an exploitation mindset are fundamentally unsustainable, since you run out of people to exploit. Providing a service works forever.

FWIW, the rest of what you wrote is almost exactly how the pros do things. If you built this system yourself, you could make far more than 200k at a prop firm. If you're interested, reply with a throwaway and I can refer you to a friend who's still in the business.

drawkbox6y ago

Depends on what the gains are used for.

Ultimately, for all investors, from the hedge funds down to the long sucker, it is all skimming for returns. HFT just skims from other skimmers' returns. Yes, those investments may have created value, but ultimately people invest because they want returns.

The money made will be used for more skimming but also some for investment in other possibly beneficial projects.

Extracting financial value via investment (skimming) may allow someone to start a company, support a community or family and have valuable time to make those things better. Just as an investment in a company/idea creates value in that company, the value gained from trading where the first step really is skimming, can be used to create value in the real world or even just time which may lead to more real actual value created.

arthurcolle6y ago

Don’t forget price discovery.

anigbrowl6y ago

But you're not discovering anything valuable because there may not be a person on the other end of a trade. You're just learning about the chaotic boundaries of the trading algorithm.

wpietri6y ago

Exactly. Regular speculators at least have the (hazy) claim that they're reducing the spread for people trading markets with actual utility. But given BTC's lack of economic meaning, that doesn't work here.

If anything, providing BTC liquidity might just be increasing net societal harm.

jrockway6y ago

I feel like commercial activity finds value and removes it from the system.

To create value, you need to give someone else money. If I wanted to build a better society, I'd make sure everyone had the best possible education. Paying for this would bankrupt me, maybe even bankrupt the entire country. But with every single person in the country walking around with a deep understanding of music, art, mathematics, engineering, and science... as a society, I'm sure we'd do great things. Value would be created in the very long term, but not for me, the potential investor.

Then on the other hand, we have things like automated trading. That boils down to asking a bunch of people "will you pay $5 for this $4 bitcoin?" Anyone that says "sure!" just gave you a dollar. Do that millions of times per second, and you remove as much value from the system as possible.

jjoonathan6y ago

The reason why you feel this way is because there is a large difference between value† as defined by markets (which I'm writing with a dagger †) and value as defined by everyone else (which I'm writing without a dagger). Value† is weighted by wealth. Value is not. Feeding a starving African village generates value, but feeding a starving African village does not generate value†, because the village has no money. Figuring out how to monopolize the telecom industry generates value†, because it benefits heavily-weighted interests while only penalizing lightly-weighted interests, but it does not generate value.

Markets promote value† creation, but they don't promote value creation, except insofar as the two concepts coincide. It is instructive to think about the circumstances under which the two concepts coincide, the extent to which those circumstances hold, and the extent to which we are moving towards or away from those circumstances.

onlyrealcuzzo6y ago· 7 in thread

Is this the author?

I would love to know how this fared recently in the large sell-off.

What he says about some markets possibly being predictable rings true to me. But the article was far from convincing that the BTC market is actually predictable.

The natural assumption should be that the author was in the right place at the right time. Although he went to great lengths, I'm not convinced this is anything other than luck.

traK6DcmOP6y ago

Author here. Let's assume you cannot go short (which you can't in most crypto markets) and can only make money when the market goes up. The best you can do is avoid losing money when the market goes down.

But there are no pure market downturns. On a daily scale the market may be down, but that does not mean that on a millisecond or second-scale you will only see downward movement. There is just an overall downtrend, but there is still almost the same upward movement to make money. For HFT systems, it really doesn't matter if the market, on a daily scale, goes up or down. There is no difference.

In fact, on many markets the system does better in downward trends. Probably because there is more liquidity on that side of the book, a bias that may come from certain market participants.

smabie6y ago

How do you maintain a market neutral portfolio without the ability to go short? Also I would be interested in hearing about your Sharpe ratio. Thanks! Great article!

Akababa6y ago

Through hypothesis testing you can estimate that the probability this was due to luck is very low. Assuming that a monkey would have a 50% chance of profiting on a day, the chance of going a month without a losing day is less than 1 in a billion.

symplee6y ago

How many monkey bots are flipping the coin daily? Selection bias just needs one.

I very much look forward to the author's follow up post at the end of 2020 to see if another 5k turns into 200.
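
Both sides of this exchange are easy to put numbers on. The sketch below assumes a 30-day month, independent days, and independent bots, none of which is claimed in the thread; the bot count of one billion is purely illustrative.

```python
def prob_all_winning_days(days, p_win=0.5):
    """Chance that one coin-flipping bot has no losing day in `days` days."""
    return p_win ** days


def prob_at_least_one_lucky(days, n_bots, p_win=0.5):
    """Chance that at least one of n independent bots has a perfect streak."""
    p = prob_all_winning_days(days, p_win)
    return 1.0 - (1.0 - p) ** n_bots


single = prob_all_winning_days(30)           # 0.5 ** 30, about 9.3e-10
crowd = prob_at_least_one_lucky(30, 10**9)   # roughly 0.6 with a billion bots
```

So a single monkey is indeed a sub-one-in-a-billion event, but the selection-bias objection only bites if the number of independent attempts is on a comparable scale.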

onlyrealcuzzo6y ago

Wouldn't it be better to look at how much this deviates from average market returns during the time frame?

dtjohnnyb6y ago

IANAquant, but he said in the intro that he used a market neutral strategy, so he _should_ make money both when the broader market is going up or down. It would be interesting to know though!

semiotagonal6y ago

I wish he'd keep it running, then write on lessons learned turning $200K into something much less, should such a loss be manifest.

pinouchon6y ago· 5 in thread

I spent the last year working fulltime on a system similar to the one described here. I trade the top ~20 cryptos on Binance. I use deep learning models (combination of temporal, causal convnets and RNNs) with heavy data augmentation. I built my own tooling for data collection, training, backtesting and live deployment. Having a data engineer background coming into this was hugely helpful: most of my time was spent manipulating data in some way (and not playing around with the models). One of the most demanding parts was estimating spread/slippage costs and including them in the loss function.
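
As a rough illustration of what folding costs into a loss can look like, here is a minimal sketch. It is not the model described above: the position encoding, the flat per-unit turnover cost, and the function name are all assumptions made for the example.

```python
def cost_aware_loss(positions, returns, cost_per_turnover):
    """Negative mean net PnL per step.

    positions[t] is the position held over step t (between -1 and 1) and
    returns[t] is the asset's fractional return over that step. Changing
    position is charged cost_per_turnover per unit traded, standing in
    for spread plus slippage.
    """
    if len(positions) != len(returns):
        raise ValueError("positions and returns must have equal length")
    pnl = 0.0
    previous = 0.0  # start flat
    for position, ret in zip(positions, returns):
        pnl += position * ret
        pnl -= cost_per_turnover * abs(position - previous)
        previous = position
    return -pnl / len(positions)


returns = [0.001, -0.001, 0.001, -0.001]
flipping = [1, -1, 1, -1]  # perfectly timed, but flips every step
gross = cost_aware_loss(flipping, returns, cost_per_turnover=0.0)
net = cost_aware_loss(flipping, returns, cost_per_turnover=0.001)
```

Here the same perfectly timed positions look good before costs (negative loss) and bad after them (positive loss), which is the gap a backtest without cost modeling hides.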

Most of what the author talked about, I learned the hard way.

I'm now at the point where I ran some tests (trading small amounts) live on Binance and the results are positive: I do manage to make small profits, but more importantly, the recorded live trades reflect very closely the backtest trades (for a given period). I'm currently scaling up my model and adding better monitoring / reporting / CI.

I'd be happy to chat with anyone having done similar projects or willing to exchange ideas.

alexcnwy6y ago

I'd love to hear more about what kind of data augmentation you're doing. A friend of mine recently got a GAN to work for timeseries which is really interesting.

I've done a lot of work in the space and would love to chat - just emailed you :)

pinouchon6y ago

I use a supervised learning setup (although with a custom loss function).

The kind of data augmentation I do is adding different candle sizes. I validate with 5m candles, but I train with 2,3,4,5,6,7m ones. I also sample more recent data more frequently. I train jointly with ~22 symbols, but in each X with those symbols, I randomly set some to 0, for some I invert the price, and some I invert time-wise. This helps generalization for some reason. I tried many kinds of noise, but what I described above is what I found to work best in my case.
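
A minimal sketch of the per-symbol part of this augmentation might look like the following. The probabilities, the symbol names, and the choice to operate on per-step returns (so that "inverting the price" becomes a sign flip) are assumptions for illustration, not details given here.

```python
import random


def augment(sample, rng, p_zero=0.1, p_invert=0.1, p_reverse=0.1):
    """Return an augmented copy of {symbol: [per-step returns]}.

    Each symbol is independently zeroed, sign-flipped, or reversed in
    time, and otherwise left unchanged.
    """
    out = {}
    for symbol, series in sample.items():
        series = list(series)  # copy so the input is never mutated
        roll = rng.random()
        if roll < p_zero:
            series = [0.0] * len(series)
        elif roll < p_zero + p_invert:
            series = [-r for r in series]
        elif roll < p_zero + p_invert + p_reverse:
            series.reverse()
        out[symbol] = series
    return out


rng = random.Random(0)  # seeded so runs are repeatable
sample = {"BTCUSDT": [0.01, -0.02, 0.005], "ETHUSDT": [0.0, 0.01, -0.01]}
augmented = augment(sample, rng)
```

Passing the random generator in explicitly keeps the augmentation reproducible, which helps when comparing training runs.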

I have a more ambitious idea to generate synthetic data using self play: have a bunch of agents trading one against another. This creates new price data I can train the agents with, and repeat (this self-play training scheme would be similar to what DeepMind did with AlphaGo/AlphaZero). The issue with it is the need to tune the parameters exactly so that the resulting synthetic data is realistic enough that I can transfer the agents to real data.

For example, during self-play, should you have only trading agents or should you add "retail traders" that buy during bubbles, "normal buyers" that buy only below and sell only above certain prices, and institutional buyers that randomly move the price a lot in a given direction? This is a lot of parameters to get right, and it's an optimization problem on its own. You could treat this as a two-fold optimization problem such as in this paper: https://arxiv.org/pdf/1810.02513.pdf, but it gets tricky very fast.

ghgr6y ago

Indeed GANs are showing very promising results. There's a series of blog posts from Fernando de Meer which discuss this topic in a very approachable way. https://quantdare.com/generating-financial-series-with-gener...

Traster6y ago

Are you concerned by other commenters who mention that once they were successful the exchange moved in to either stop them or extract a fee?

pinouchon6y ago

Yes I am. Without more detail about the case it's hard to say what was the issue. Binance has what they call "machine learning limits": https://www.binance.com/en/support/articles/360004492232.

What lowers my concerns is that such issues might come once the bot is profitable, and by then I might have found other ways to raise capital (I'm doing a trading bot as a way to raise capital to create an AI research lab). Being able to set up a profitable crypto trader is a good thing to add on a resume or for personal branding even if I shut it down at some point. I'll be in a very different position by this point so it's a bit premature to be concerned about that now, although it's still somewhat of a concern.

On the positive side, I hope this scares the competition away.

thundergolfer6y ago· 5 in thread

This post is an exemplar of the crucial relationship between domain-specific knowledge and ML competency in the ML space. The bulk of the post is detailing the tricky ins and outs of trading, and overall the author gives the impression that they're broadly knowledgeable about stock markets.

Contrast this post with those you see with ML hobbyists who delve into medicine or fake-news and produce useless results testament to their lack of domain-specific competency.

nsainsbury6y ago

Bingo. I've made this comment several times on Hacker News in the past as well, and in my opinion it's the number one reason I've seen ML projects fail to have impact at companies where they're deployed: the operators (typically higher level Math/CS types) simply don't understand the domains well enough and so frequently end up making absurd recommendations/suggestions (often to the detriment of other business areas).

The successful application of ML requires a deep understanding of the domain it's being applied in.

jacquesm6y ago

Domain knowledge is essential to almost any project that aims for eventual commercial success; it is quite rare that an outsider will come into a field, apply some ML and make a killing.

deepnotderp6y ago

RenTec

(Yes, I know they do much more than ML, but still)

alexcnwy6y ago

Totally agree.

Funnily enough I think the ML hobbyist problem is most pervasive in the "predict the stock market" domain. There was a post on HN a few days ago [1] that was overfitting the validation set and hand-waving away fees and spreads. The author concluded that "there was no subtle underlying pattern" because they failed to find one.

[1] https://news.ycombinator.com/item?id=21624907

thundergolfer6y ago

I agree. It mixes the hype of ML with the evergreen appeal of the 'Get Rich Quick' scheme.

gricardo996y ago· 4 in thread

Great post! Very refreshing to hear about a) the honest level of effort involved in this type of endeavor, b) the amount of nonsense trading advice out there.

Maybe in a future post you could discuss the security and banking side of this in more detail? In the 6ish years I’ve played around with crypto trading (and I really mean play, nothing close to your level), I’ve had 2 exchanges hacked and lose all customer funds, another 2 had major security breaches causing days of downtime but recovered, and one site seized by the FBI.

Then there are the horror stories of banks freezing your account when you move funds in and out of exchanges. Luckily that hasn’t happened to me.

I bet you have some good stories and perspective on that side of it, I would love to hear it.

traK6DcmOP6y ago

Author here. Honestly, I don't have a good answer. I spread my capital across enough exchanges so that if one runs away with it or gets hacked it doesn't ruin me. It's just a risk I'm taking.

I'm also not trading much capital. Because the system is more on the HFT side, the actively traded capital isn't that high, and I don't care about losing it. Any profit I try to get out of the exchanges regularly. I wouldn't feel comfortable leaving large sums on those exchanges.

account734666y ago

Apart from Binance, who else do you trust? (Of course, overall no crypto exchange can really be trusted.) I assume you trade alts vs (BTC or USDT).

Also, when you said "market neutral", did you mean you also short? (Only a few pairs have margin on Binance, and it appeared recently.)

nov2720196y ago

The banking side is becoming more mature, I think, as many exchanges like Coinbase provide custodial cold-storage options for institutional clients.

Counter-party risk always exists.

> Then there are the horror stories of banks freezing your account when you move funds in and out of exchanges.

Depends on the country. What happened to me is that the bank did not freeze my account. Instead, they simply reported it to the government and asked AML questions regarding the transfer. The government, on the other hand, wanted me to provide bookkeeping records. Otherwise, they were going to assume that every transfer coming back from a cryptocurrency exchange was pure profit.

Basically, I was not raided and my accounts were not frozen, but the government knows my wallet addresses (and I had to pay back 4 years' worth of cryptocurrency trading profits with interest applied, which also made me realize how little profit I had made in the end).

carlsborg6y ago

Why did you have to pay back trading profits with interest? Do you mean taxes on trading profits, with interest?

Which country?

d--b6y ago· 4 in thread

Is anyone else suspicious about the results?

Claiming a 4000% return while staying market neutral seems a little too good to be true.

First: those levels are insanely high, so the algo must either be taking some absurd risks and have the worst Sharpe ratio, or be getting pretty close to 100% accurate.

Second: if you can scale this across markets, and assuming the same return, that investment will turn into 12 billions in 4 years. I doubt that you'd write a blog post about it if you had found such a gold mine.
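A back-of-the-envelope check of the 12 billion figure. This assumes it comes from compounding a $5k starting stake (the amount mentioned elsewhere in the thread) at a 40x yearly multiplier, which is one reading of "4000%". Both assumptions are mine and are not confirmed by the commenter.

```python
def compound(start: float, yearly_multiplier: float, years: int) -> float:
    """Value of `start` after `years` of compounding at `yearly_multiplier` per year."""
    return start * yearly_multiplier ** years


# Assumed inputs: $5k stake, 40x per year, 4 years.
print(f"{compound(5_000, 40, 4):,.0f}")  # 12,800,000,000
```

Read strictly, a 4000% gain is a 41x multiplier, which would give about 14.1 billion instead. Either way the result only holds if the return stays constant as capital grows, which is exactly what the replies dispute.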

crazypyro6y ago

Naively applying linear scaling to financial models provides zero guidance as to how the model would actually perform... Scaling financial models is an extremely hard problem.

See RenTech limiting the size of their Medallion fund because it was getting too large to scale....

kungito6y ago

Isn't it the case that many of these strategies don't scale well? When you are trading low volume you are collecting all the best trades, but as soon as you go 10x you are affecting the market way too much.
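One way to see the scaling problem is to walk a market order through the ask side of an order book and compare the average fill price at two sizes. This is only a sketch: the book below is invented, and `average_fill_price` is a hypothetical helper, not something from the article.

```python
from typing import List, Tuple


def average_fill_price(asks: List[Tuple[float, float]], quantity: float) -> float:
    """Average price paid when a market buy of `quantity` walks the ask side.

    `asks` is a list of (price, size) levels sorted by ascending price.
    """
    remaining = quantity
    cost = 0.0
    for price, size in asks:
        take = min(remaining, size)  # fill as much as this level offers
        cost += take * price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("not enough liquidity in the book")
    return cost / quantity


# Hypothetical ask side: thin at the best price, deeper further away.
asks = [(100.0, 1.0), (100.5, 2.0), (102.0, 10.0)]
print(average_fill_price(asks, 1.0))   # 100.0
print(average_fill_price(asks, 10.0))  # 101.5
```

With this made-up book, the 10x order pays 1.5% more per unit than the small one, which can easily exceed the edge the strategy was built on.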

jotakami6y ago

Back in 2017/2018, not at all surprising. I don’t see that kind of opportunity in the crypto markets anymore though, they’ve gotten a lot more efficient. Used to be able to scalp 1% on big price moves in altcoin futures at least once a week, now the prices move in lockstep with spot.

throwawaymath6y ago

> Second: if you can scale this across markets, and assuming the same return, that investment will turn into 12 billions in 4 years.

Scalability and profitability are orthogonal. If it could scale indefinitely, you'd be right. But no trading strategy can scale indefinitely.

That doesn't say anything about whether or not it "works", and it's not a reason to be suspicious of the results in and of itself. All successful trading strategies are capacity constrained.

echelon6y ago· 4 in thread

Can this same strategy be leveraged on zero-fee stock exchanges? Why is crypto the target here?

traK6DcmOP6y ago

Author here. Perhaps if you already have existing HFT infrastructure and connections to efficiently trade on such exchanges. But such infra costs millions. If you don't have this, you're probably at too large of a disadvantage to find any alpha.

At least that's my understanding based on conversations I've had; I've never traded equities.

__d6y ago

In a little more detail ...

To be competitive in US equities HFT, you need an FPGA with 40GbE ports hosted in a server (which needs to power and cool the FPGA, and deal with the less latency-sensitive bits of your system). You'll need some storage as well.

That server needs to be co-located with your target exchange(s) matching engines, and connected via 40GbE. You might additionally want remote market data via mm-wave microwave.

You can probably put together a basic but competitive hardware setup for $70k or so, if you ignore redundancy, and you only need to trade a single market. More realistically, you'll need at least two, plus shared storage, and probably more depending on what markets you intend to trade on.

Then you have monthly costs: colocation for the server(s) ($5k-ish+), port fees for the order entry ($500-ish), port fees for market data ($500-ish), physical connectivity fees ($20k-ish), cross connect fees for the connectivity ($500-ish), wireless connectivity fees, you might need roof access (more fees), market data fees (per exchange), memberships, and trading costs.

I haven't done this for a while, but it easily adds up to $100k per month or more.
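For a sense of scale, here is a quick tally of only the monthly items that were given a number above, taking the "-ish" figures at face value. The dictionary keys are my own labels, and the unquantified items (wireless, roof access, market data, memberships, trading costs) are deliberately left out.

```python
# Monthly figures quoted above, in USD ("-ish" values taken at face value).
quoted_monthly_costs_usd = {
    "colocation": 5_000,
    "order_entry_ports": 500,
    "market_data_ports": 500,
    "physical_connectivity": 20_000,
    "cross_connects": 500,
}

quoted_total = sum(quoted_monthly_costs_usd.values())
claimed_total = 100_000  # the "$100k per month or more" estimate
unquantified_remainder = claimed_total - quoted_total

print(quoted_total)            # 26500
print(unquantified_remainder)  # 73500
```

So roughly three quarters of the $100k estimate would have to come from the items without a quoted price, plus running more than one server or market.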

So you need to be making quite a bit to pay off your infra, before you start thinking about profit. And your model will age pretty fast, so you'll want to be working on a few possible replacements concurrently.

It's a tough business.

smabie6y ago

Crypto is a relatively inefficient market. Equity markets are too efficient: there are too many smart players and alpha is very hard to find. Crypto is a little easier, though that is changing fast.

proverbialbunny6y ago

Low trading fees are what originally caused me to switch gears from programmatic ETF trading to programmatic BTC trading in the early days. I imagine it is the same for most.

fny6y ago· 3 in thread

Can someone comment on how taxes are handled when automated trades are made like this? It's something that seems wholly absent from the cost calculations.

traK6DcmOP6y ago

Author here. Personally, I just don't. I tried doing it but it was too complicated. So I end up just hiring a tax accountant specializing in crypto, send them all the data I have, and pay a few $k. In case something goes wrong, it's their fault and they take the risk.

ACow_Adonis6y ago

Not sure what country you're in, but that's not the way tax accountants work in mine. You're still liable for mistakes/problems they make here :(

lowracle6y ago

Why not establish a trading firm in an offshore country with no capital gains tax? The bookkeeping of crypto txs is hell.

mellosouls6y ago· 1 in thread

I enjoyed reading this but here is a cautionary review of a project in the same field:

https://towardsdatascience.com/what-happened-when-i-tried-ma...

Discussed here: https://news.ycombinator.com/item?id=21624907

alexcnwy6y ago

The only caution I took away from that post is that it's very easy to make mistakes applying ML to financial markets if you don't know what you're doing.

There looks to be a lot of overfitting to the validation set going on in that post.

It's also a mistake to conclude that "there was no subtle underlying pattern" just because the author couldn't find one.

Throwing XGBoost at a bunch of technical indicators isn't gonna cut it but I have had some solid real-world success (as have several people I know) applying ensembles of deep learning models (with regime switching based on model residuals) to profit from "subtle underlying patterns".

mtm76y ago· 1 in thread

I’m impressed with this system, but I’m even more impressed with the author’s writing style. I’d love to see more technical posts written with this level of clarity.

traK6DcmOP6y ago

Thank you very much :) I'd love to write more, I just need to figure out a good next topic.

lorepieri6y ago· 1 in thread

It is a big waste of your time, and nobody will give that time back to you. I suggest you spend your time on non-zero-sum games, something that can create value for you and society. Now that you have some savings you can definitely afford it. The next best thing to not doing it at all is to quit doing it now.

Disclaimer: I built a similar system in the past, took some gains and then realised the above. I then quit to build a company.

lowracle6y ago

I have been working on such a system and it is NOT a huge loss of your time. I've learned so many things in the past year: market microstructure, networking (infrastructure, protocols), cloud computing, cloud management (Docker Swarm, Kubernetes), Linux kernel bypass, distributed systems, databases, and I've read hundreds of papers on neural networks, Gaussian processes, etc. If you are wondering whether you should get into this, it is one of the best learning experiences you will ever have.

dnautics6y ago

> For example, instead of defining a tick as 1 second, we could define it as 1.0 BTC traded...

Interestingly, Benoit Mandelbrot talks about this in "The (Mis)Behaviour of Markets" and explicitly calls it "market time".
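A minimal sketch of what the quoted idea could look like in code: grouping raw trades into bars that close after a fixed traded volume instead of a fixed amount of wall-clock time. The `Bar` type, the `volume_bars` function and the sample trades are all hypothetical; the article does not show its implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float


def volume_bars(trades: Iterable[Tuple[float, float]], threshold: float = 1.0) -> Iterator[Bar]:
    """Group (price, size) trades into bars of at least `threshold` traded volume.

    Simplifications: a trade that overshoots the threshold is not split across
    bars, and a trailing partial bar is dropped.
    """
    prices: List[float] = []
    volume = 0.0
    for price, size in trades:
        prices.append(price)
        volume += size
        if volume >= threshold:
            yield Bar(prices[0], max(prices), min(prices), prices[-1], volume)
            prices, volume = [], 0.0


# Hypothetical trades as (price, size in BTC).
trades = [(100.0, 0.4), (101.0, 0.7), (99.0, 1.0), (100.5, 0.2)]
bars = list(volume_bars(trades, threshold=1.0))
print(len(bars))  # 2
```

Busy periods then produce many bars and quiet periods few, so each bar represents a similar amount of market activity.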

jackschultz6y ago

> The biggest edge probably comes from the effort put into building the infrastructure.

I feel like this should be in bold, but either way, I love reading that in these posts. At every stage, from the research that confirms your models are correct to being able to trust real-time trades, you need a solid architecture. Remember that this isn't only true of trading; the same holds for tons of solutions to other problems. If comment readers have other examples, I'd love to hear them in responses.

latchkey6y ago

I've played with writing bots before and this post hits on so many of the edge cases I personally ran into. I have never heard it this well explained before. Brilliant.

mthoms6y ago

Fascinating post. There's just so much to digest here. Well, there goes the rest of my day!

adamiscool86y ago

It's interesting they suggest the higher the timeframe, the noisier the time series, when to my understanding the opposite is typically found -- the lower timeframes exhibit a more random walk and the higher timeframes exhibit trending behavior.

jugg1es6y ago

It must be said that it is a lot easier to make money in a stock market that has had low volatility and no significant, prolonged dip in the last 5+ years. My own long-term investments have earned a 20% return over the last 5 years with zero trading. I realize that this article is specifically about crypto, but the trend in all markets is generally up across the board.

tatoalo6y ago

As someone who just finished a BSc in computer science and started an MSc in Financial Technology and Computing, this post is really interesting to me, keep ‘em coming :D!

KloudTrader6y ago

This is a really good post, thanks for sharing it. Algorithmic trading systems vary a lot and every shop has its own way of doing things.

m3kw96y ago

The only way to beat it is to go long, super long, the longer the better. Machines don’t go long.

known6y ago

https://archive.vn/iI8H1
