In our case we were doing triangle trading between BTC/ETH/USDT pairs and had our buy/sell delay down to 3-7ms. At one point we were moving 0.3-0.7% of Binance’s daily volume.
Few notes:
* Finding an objective point of truth for value when all of the currencies are floating is hard but vital to success. This was the hardest problem we encountered. We tried taking the realtime average of BTC and ETH across all exchanges, we tried tying it to the shortest route to USD, and several other routes... but ultimately this is where we ended up “losing” most of our alpha.
* Order books are seemingly simple but the devil is in the details. This especially matters for paper trading.
* Efficiently using API limits at exchanges is an optimization problem in and of itself.
* Our model was relatively simple but we focused on speed and edge cases. For instance, Binance would rotate IPs on their load balancers, so we’d constantly check the latency of each open SSL connection and use the fastest. Further, we wouldn’t decode the buy response to plaintext; we’d just read the raw stream.
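To make the triangle concrete, here is a minimal sketch of the round-trip check. It is not the model described above: the function name, the flat per-leg taker fee, and the prices are all assumptions for illustration.

```python
def triangle_return(ask_btc_usdt, ask_eth_btc, bid_eth_usdt, fee=0.001):
    """Multiplier on 1 USDT after going USDT -> BTC -> ETH -> USDT.

    Each leg crosses the spread (buy at the ask, sell at the bid).
    `fee` is an assumed flat taker fee per leg, not a real fee schedule.
    """
    usdt = 1.0
    btc = usdt / ask_btc_usdt * (1 - fee)   # buy BTC with USDT at the ask
    eth = btc / ask_eth_btc * (1 - fee)     # buy ETH with BTC at the ask
    return eth * bid_eth_usdt * (1 - fee)   # sell ETH for USDT at the bid


# Hypothetical top-of-book prices, chosen only to show the mechanics.
multiplier = triangle_return(ask_btc_usdt=10_000.0, ask_eth_btc=0.02, bid_eth_usdt=201.0)
print(f"cycle multiplier: {multiplier:.6f}")
```

A multiplier above 1.0 means the loop is profitable after fees at those quotes. The hard part noted in the first bullet is that all three prices float, so deciding which leg is the mispriced one is a separate problem this sketch does not touch.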
After several epic months our entire project fell apart after a cryptic phone call about “institutional access” that didn’t follow the 1s websocket update. The access was quite expensive and we said no to it, and shortly after all of our strategies went to crap.
Best we could tell someone was front running us due to an artificial delay for our account (delay between trades went to ~20ms up from our prior steady speed of 3-7ms) and/or a bunch of the trades in the orderbook were bogus.
Frustrated we tried our strategy on another account and the delay dropped again to our normal range and was profitable again (the orderbooks were slightly different between bots!).
It was in that moment we realized playing in unregulated markets is not fun or something we wanted to continue to do. Intermediary risk was something we didn’t account for.
Further, we realized that there will always be a better resourced or more dedicated team willing to fight you for your alpha.
After months of effort and a ton of fun we decided it was best we went back and focused on a problem where we could build a long term competitive advantage.
Edit: typos and formatting
I didn't believe him at first, since the more likely problem was elsewhere, but then a few months later sub-penny trading in dark pools was all over the news. This was like 5 or 6 years ago. He's since moved on to other things, having come to a similar conclusion: that trying to play such a rigged game was futile.
* Did his levels actually get crossed? It could be that he never had the opportunity to get filled at the price he hoped for.
* What was his queue position? Stock exchanges tend to be price time ordered. If others had submitted orders at the same price before he had, they would have priority when getting filled.
* Apropos the prior point, was his broker actually submitting orders when he sent them? Some brokers may avoid showing deep out of the money orders, which could have affected his queue position.
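The queue position point can be sketched as a toy price-time priority queue at a single price level. The class, method names, and order sizes are hypothetical and only illustrate the first-in, first-out rule, not any real exchange's matching engine.

```python
from collections import deque


class PriceLevel:
    """Resting orders at one price, filled in arrival order (price-time priority)."""

    def __init__(self):
        self.queue = deque()  # [order_id, remaining_qty] in arrival order

    def add(self, order_id, qty):
        self.queue.append([order_id, qty])

    def shares_ahead(self, order_id):
        """Total quantity that must trade before `order_id` gets any fill."""
        ahead = 0
        for oid, qty in self.queue:
            if oid == order_id:
                return ahead
            ahead += qty
        raise KeyError(order_id)

    def fill(self, qty):
        """Match an incoming order of size `qty`; return [(order_id, filled_qty), ...]."""
        fills = []
        while qty > 0 and self.queue:
            head = self.queue[0]
            take = min(qty, head[1])
            fills.append((head[0], take))
            head[1] -= take
            qty -= take
            if head[1] == 0:
                self.queue.popleft()
        return fills


level = PriceLevel()
level.add("early_bird", 500)
level.add("another", 300)
level.add("our_trader", 100)   # same price, but arrived last

ahead = level.shares_ahead("our_trader")
fills = level.fill(600)        # the level trades, yet our_trader gets nothing
print(ahead, fills)
```

In this toy case the price level is touched and 600 shares trade, but the last order in the queue is still unfilled, which can look like being skipped when it is just queue position.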
> He finally concluded that some institutional traders must have access to sub-penny ordering, despite it being against regulation.
There's nothing wrong with this. Reg NMS rule 612 permits sub-penny price improvement [0].
[0] https://www.sec.gov/divisions/marketreg/subpenny612faq.htm#q...
That sounds like a big deal. If this is repeatable, you should document it better. Unregulated doesn't mean a license to do blatantly illegal things. Crypto exchanges certainly get taken to court.
Sure it wasn't your book building algo and a snapshot retrieval race?
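For anyone wondering what that race looks like: if you fetch a REST snapshot while diff updates are streaming in, you have to drop diffs the snapshot already includes and detect gaps, or two bots can end up with slightly different books. A rough sketch follows, with made-up field names (`last_update_id`, `first_id`, `last_id`) that are not any exchange's actual schema.

```python
def build_book(snapshot, buffered_diffs):
    """Apply buffered diff events on top of a snapshot, guarding against the race.

    snapshot: {"last_update_id": int, "bids": {price: qty}, "asks": {price: qty}}
    diffs:    [{"first_id": int, "last_id": int, "bids": {...}, "asks": {...}}, ...]
    A qty of 0 in a diff means "remove this price level". All names are hypothetical.
    """
    book = {"bids": dict(snapshot["bids"]), "asks": dict(snapshot["asks"])}
    last_id = snapshot["last_update_id"]
    for diff in sorted(buffered_diffs, key=lambda d: d["first_id"]):
        if diff["last_id"] <= last_id:
            continue  # already reflected in the snapshot, applying it would double count
        if diff["first_id"] > last_id + 1:
            raise RuntimeError("gap in updates: refetch the snapshot")
        for side in ("bids", "asks"):
            for price, qty in diff[side].items():
                if qty == 0:
                    book[side].pop(price, None)
                else:
                    book[side][price] = qty
        last_id = diff["last_id"]
    return book, last_id


snapshot = {"last_update_id": 100, "bids": {"100.0": 1.0}, "asks": {"101.0": 2.0}}
diffs = [
    {"first_id": 95, "last_id": 99, "bids": {"100.0": 9.0}, "asks": {}},    # stale
    {"first_id": 100, "last_id": 102, "bids": {"100.0": 0, "99.5": 3.0}, "asks": {}},
    {"first_id": 103, "last_id": 104, "bids": {}, "asks": {"101.0": 1.5}},
]
book, last_id = build_book(snapshot, diffs)
print(book, last_id)
```

If the stale diff were applied instead of skipped, the book would show a bid that never existed at that moment, which is one boring way to get "bogus" orders.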
The less nefarious explanation is simple. You decide how likely it is: there is no mechanism making sure that IP packets sent to all data feed subscribers arrive at the same time. Exchanges of consequence distribute market data via UDP multicast, over physical links that are as identical as possible (think identical lengths of fibre).
Now if you're receiving JSON via Websocket and parsing it using an allocating parser and your NIC driver is in kernel space and you use a GC'd language and if the exchange loops through a list of TCP connections to send a message to them one at a time and there is jitter in packet delivery time in upstream hosts (and other internet weather) and and and ... you simply cannot expect identical order books at the submillisecond timescale.
A fortiori, throw half of that crap away and suppose they were using userspace NICs and no-alloc single threaded C++ that chills entirely in L1 cache. Still consuming TCP over the public internet.
We wrote support and they told us they would investigate. Never heard back.
This is why the whole $0 trading fee thing and Robinhood concern me. I'm paying for the trades and someone is still messing with me.
Some people have this notion that if their trade got matched against a HFT that they have somehow lost out. The exact opposite is true. If the party on the other side of your trade was a HFT, then that implies that all other parties were offering worse prices than the HFT was. If the HFT had not been there, you would have got a worse price (whether on the buy side or the sell side). The presence of high-frequency traders (or any traders, for that matter) reduces spreads and increases liquidity.
The only people who get worse deals as a result of high-frequency trading are people who would sell you the same stuff for more money.
Just imagine you want to exchange a currency because you go traveling and the exchange tells you "Sorry, nothing available right now, gotta come back in a few weeks". That's what would happen if there is no liquidity.
I used GainsKeeper: http://www.wolterskluwerfs.com/tax-reporting/solutions/gains...
I see they do crypto, but I've never used it for that.
For companies, they have entire departments that consolidate trades and construct massive Schedule Ds
I learned this because I had a six-figure capital loss on my spot crypto trades, and a slightly larger gain on unregulated futures which do not fall under capital gains. Capital losses can only offset up to $3k in other income, so I was terrified that I’d have to pay taxes on the massive “gain” without being able to count the losses against it. Fortunately, the straddle is the correct way to report this, allowing gains and losses to be matched across instruments that may otherwise fall into different income categories.
Indeed. This is the really difficult thing about the crypto space: winnings, if you can keep them. And you can't if the house is just going to front-run all your orders.
"we realized that there will always been a better resourced or more dedicated team willing to fight you for your alpha" - that's part of what makes financial markets such a fun and interesting challenge but agreed, intermediary risk at the timescales you were operating at in this kind of unregulated market is real and not fun
I'd love to hear what problem you moved to that you believe you can build a long term competitive advantage on if you can talk about it?
Shortly after we realized we either should exit or hire an exec team to run the business. After a few months of executive search we found an offer we liked and took it.
These days I’m interested in using some heuristics based on public data to help consumers make informed decisions about nursing homes and in home health care. Still early on this project but lots of data and not many people looking to do good in that market. Seems like a great place I can add value.
I wouldn't be surprised, given that traditional HFT companies are building cryptotrading desks and they have a lot more capital to play with too.
The only way that is possible is if you gain privileged information about that person's orders before they actually hit the book. The most practical way to do this is to be the exchange.
As somebody who still runs a profitable bot on Binance I find this hard to believe.
Also all order book related endpoints/streams are public, so queries/subscriptions are not tied to a specific account.
I pulled the plug. Tried to run it on Binance but the websocket only updated once a second, so there was way too much risk.
"Liquidity in the BTC market"?
Jokes aside, it's actually something I am thinking about a lot. Such systems don't create value, but they are extremely intellectually interesting and I've learned a lot. You can say the same for many other projects, for example academic research in many fields. Most of it is just noise to promote the author and does not create value in the world. But it's intellectually interesting, so people work on it.
Other people write compilers for fun to learn something new without creating value. I don't think this is fundamentally different.
No. It's a financially sound way to use time, so people work on it. Anything can be intellectually interesting.
What kind of academic research does not create value? And if so, maybe it deserves to be criticized the same way.
- The only part I didn't like in your article was how you described creating indicators as exploitation. The limit order book is public by design so all traders can look at it. People have the free choice to trade on a centralized exchange or not. This is a trade-off between revealing information and being able to trade quickly without calling all your friends asking if they want to buy some Bitcoin.
- I'm guessing you used data from other exchanges outside the one you were trading as indicators too. That's unquestionably good since your trading helped information propagate faster or more accurately than it would have otherwise.
- Markets are only zero-sum in isolation. Most participants derive utility from things outside short-term profit and loss. Maybe they trade to manage risk, to hedge, to gamble, have a longer time horizon than you, whatever. They just want to trade and get back to their lives. They don't want to waste time squeezing the last fraction of a basis point out of their fills. It's hard to believe, but they actually enjoy getting picked off, run over, paying too much spread, whatever things make you feel bad or indifferent about the service you provide.
I used to get filled making markets on Nasdaq (which pays resting orders a rebate, and charges crossers) when BX (which pays crossers a rebate) was at the same price, and could lay off the trade for an instant profit. The people who traded with me paid for the luxury of saying "fuck it, send it to good ol' Nasdaq." I used to think it was stupid of them, and from the perspective of a prop trader, it was mind numbingly stupid, but they probably had more productive things to do than read every exchange fee schedule or hook up to every small exchange.
- Providing liquidity has nothing to do with resting limit orders vs. crossing the spread. Providing liquidity is about taking risk off the hands of people that don't want it, and moving it across time to someone else who does. If you're market neutral, trade many round trips every day, and end relatively flat, you've played that intermediary role as a liquidity provider regardless of what order types you use.
- Crossing against mispriced orders is doing the world a favor. You're not the bad guy picking them off. If anything, they're the bad guy for holding the market at an incorrect price.
So maybe think of yourself as more of a service provider. Not only will you feel better, but viewing trading through that lens tends to make you a better trader. Strategies truly built around an exploitation mindset are fundamentally unsustainable, since you run out of people to exploit. Providing a service works forever.
FWIW, the rest of what you wrote is almost exactly how the pros do things. If you built this system yourself, you could make far more than 200k at a prop firm. If you're interested, reply with a throwaway and I can refer you to a friend who's still in the business.
Ultimately, for all investors from the hedge funds down to the long sucker, it is all skimming for returns. HFT just skims from other skimmers' returns. Yes, those investments may have created value, but ultimately people invest because they want returns.
The money made will be used for more skimming but also some for investment in other possibly beneficial projects.
Extracting financial value via investment (skimming) may allow someone to start a company, support a community or family and have valuable time to make those things better. Just as an investment in a company/idea creates value in that company, the value gained from trading where the first step really is skimming, can be used to create value in the real world or even just time which may lead to more real actual value created.
If anything, providing BTC liquidity might just be increasing net societal harm.
To create value, you need to give someone else money. If I wanted to build a better society, I'd make sure everyone had the best possible education. Paying for this would bankrupt me, maybe even bankrupt the entire country. But with every single person in the country walking around with a deep understanding of music, art, mathematics, engineering, and science... as a society, I'm sure we'd do great things. Value would be created in the very long term, but not for me, the potential investor.
Then on the other hand, we have things like automated trading. That boils down to asking a bunch of people "will you pay $5 for this $4 bitcoin?" Anyone that says "sure!" just gave you a dollar. Do that millions of times per second, and you remove as much value from the system as possible.
Markets promote value† creation, but they don't promote value creation, except insofar as the two concepts coincide. It is instructive to think about the circumstances under which the two concepts coincide, the extent to which those circumstances hold, and the extent to which we are moving towards or away from those circumstances.
I would love to know how this fared recently in the large sell-off.
What he says about some markets possibly being predictable rings true to me. But the article was far from convincing that the BTC market is actually predictable.
The natural assumption should be that the author was in the right place at the right time. Although he went to great lengths, I'm not convinced this is anything other than luck.
But there are no pure market downturns. On a daily scale the market may be down, but that does not mean that on a millisecond or second-scale you will only see downward movement. There is just an overall downtrend, but there is still almost the same upward movement to make money. For HFT systems, it really doesn't matter if the market, on a daily scale, goes up or down. There is no difference.
In fact, on many markets the system does better in downward trends. Probably because there is more liquidity on that side of the book, a bias that may come from certain market participants.
I very much look forward to the author's follow up post at the end of 2020 to see if another 5k turns into 200.
Most of what the author talked about, I learned the hard way.
I'm now at the point where I ran some tests (trading small amounts) live on Binance and the results are positive: I do manage to make small profits, but more importantly, the recorded live trades very closely reflect the backtest trades (for a given period). I'm currently scaling up my model and adding better monitoring / reporting / CI.
I'd be happy to chat with anyone having done similar projects or willing to exchange ideas.
I've done a lot of work in the space and would love to chat - just emailed you :)
The kind of data augmentation I do is adding different candle sizes. I validate with 5m candles, but I train with 2,3,4,5,6,7m ones. I also sample more recent data more frequently. I train jointly with ~22 symbols, but in each X with those symbols, I randomly set some to 0, some I invert their price, some I invert time-wise. This helps generalization for some reason. I tried many kinds of noise, but what I described above is what I found to work best in my case.
I have a more ambitious idea to generate synthetic data using self-play: have a bunch of agents trading against one another. This creates new price data I can train the agents with, and repeat (this self-play training scheme would be similar to what DeepMind did with AlphaGo/AlphaZero). The issue with it is the need to tune the parameters exactly so that the resulting synthetic data is realistic enough that I can transfer the agents to real data.
For example, during self-play, should you have only trading agents, or should you add "retail traders" that buy during bubbles, "normal buyers" that buy only below and sell only above certain prices, and institutional buyers that randomly move the price a lot in a given direction? This is a lot of parameters to get right, and it's an optimization problem on its own. You could treat this as a two-fold optimization problem such as in this paper: https://arxiv.org/pdf/1810.02513.pdf, but it gets tricky very fast.
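A rough sketch of the per-symbol augmentation described above, in plain Python. The function name, the probabilities, and the reading of "invert their price" as flipping the sign of returns are all assumptions for illustration, not the actual pipeline.

```python
import random


def augment(window, rng, p_zero=0.1, p_invert_price=0.1, p_reverse_time=0.1):
    """Return an augmented copy of one training window.

    window: {symbol: [return_t0, return_t1, ...]} (per-candle returns).
    Each symbol independently gets at most one transform:
      - zeroed out,
      - price-inverted (assumed here to mean negating the returns),
      - reversed in time,
    or is left unchanged. Probabilities are hypothetical.
    """
    out = {}
    for symbol, series in window.items():
        series = list(series)
        roll = rng.random()
        if roll < p_zero:
            series = [0.0] * len(series)
        elif roll < p_zero + p_invert_price:
            series = [-r for r in series]
        elif roll < p_zero + p_invert_price + p_reverse_time:
            series = series[::-1]
        out[symbol] = series
    return out


rng = random.Random(0)  # seeded so runs are repeatable
window = {"BTCUSDT": [0.01, -0.02, 0.005], "ETHUSDT": [0.0, 0.03, -0.01]}
augmented = augment(window, rng)
print(augmented)
```

Applying the transform per symbol rather than per window means the model sees mixed windows (some symbols intact, some perturbed), which may be part of why it acts as a regularizer, though that is a guess.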
What lowers my concerns is that such issues might come once the bot is profitable, and by then I might have found other ways to raise capital (I'm doing a trading bot as a way to raise capital to create an AI research lab). Being able to setup a profitable crypto trader is a good thing to add on a resume or for personal branding even if I shut it down at some point. I'll be in a very different position by this point so it's a bit premature to be concerned about that now, although it's still somewhat of a concern.
On the positive side, I hope this scares the competition away.
Contrast this post with those you see from ML hobbyists who delve into medicine or fake news and produce useless results that are a testament to their lack of domain-specific competency.
The successful application of ML requires a deep understanding of the domain it's being applied in.
Funnily enough I think the ML hobbyist problem is most pervasive in the "predict the stock market" domain. There was a post on HN a few days ago [1] that was overfitting the validation set and hand-waving away fees and spreads. The author concluded that "there was no subtle underlying pattern" because they failed to find one.
Maybe in a future post you could discuss the security and banking side of this in more detail? In the 6ish years I’ve played around with crypto trading (and I really mean play, nothing close to your level), I’ve had 2 exchanges hacked and lose all customer funds, another 2 had major security breaches causing days of downtime but recovered, and one site seized by the FBI.
Then there are the horror stories of banks freezing your account when you move funds in and out of exchanges. Luckily that hasn’t happened to me.
I bet you have some good stories and perspective on that side of it, I would love to hear it.
I'm also not trading much capital. Because the system is more on the HFT side, the actively traded capital isn't that high, and I don't care about losing it. Any profit I try to get out of the exchanges regularly. I wouldn't feel comfortable leaving large sums on those exchanges.
Also, when you said "market neutral", did you mean you also short (only a few pairs have margin on Binance, and it only appeared recently)?
Counter-party risk always exists.
> Then there are the horror stories of banks freezing your account when you move funds in and out of exchanges.
Depends on the country. What happened to me is that a bank did not freeze my account. Instead, they simply reported it to the government, and asked AML questions regarding the transfer. The government, on the other hand, wanted me to provide bookkeeping records. Otherwise, they were going to assume that every transfer coming back from cryptocurrency exchange was pure profit.
Basically, I was not raided, my accounts were not frozen, but the government knows my wallet addresses (and I had to pay back 4 years' worth of cryptocurrency trading profits with interest applied, which also made me realize how little profit I had made in the end).
Which country?
Claiming a 4000% return while staying market neutral seems a little too good to be true.
First: those levels are insanely high, so the algo must be taking some absurd risks and have the worst Sharpe ratio, or be getting pretty close to 100% accurate.
Second: if you can scale this across markets, and assuming the same return, that investment will turn into 12 billion in 4 years. I doubt that you'd write a blog post about it if you had found such a gold mine.
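The arithmetic behind that figure, as a quick check. It assumes the 5k starting capital and a 40x yearly multiple (5k turning into 200k, roughly the 4000% mentioned), compounded for 4 years with no capacity limit, which is exactly the assumption being questioned.

```python
initial = 5_000        # starting capital discussed in the thread
yearly_multiple = 40   # assumption: 5k -> 200k in one year
years = 4

final = initial * yearly_multiple ** years
print(f"{final:,}")    # 12,800,000,000
```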
See RenTech limiting the size of their Medallion fund because it was getting too large to scale....
Scalability and profitability are orthogonal. If it could scale indefinitely, you'd be right. But no trading strategy can scale indefinitely.
That doesn't say anything about whether or not it "works", and it's not a reason to be suspicious of the results, in and of itself. All successful trading strategies are capacity constrained.
At least that's my understanding based on conversations I've had, I've never traded equities.
To be competitive in US equities HFT, you need an FPGA with 40GbE ports hosted in a server (which needs to power and cool the FPGA, and deal with the less latency-sensitive bits of your system). You'll need some storage as well.
That server needs to be co-located with your target exchange(s) matching engines, and connected via 40GbE. You might additionally want remote market data via mm-wave microwave.
You can probably put together a basic but competitive hardware setup for $70k or so, if you ignore redundancy, and you only need to trade a single market. More realistically, you'll need at least two, plus shared storage, and probably more depending on what markets you intend to trade on.
Then you have monthly costs: colocation for the server(s) ($5k-ish+), port fees for the order entry ($500-ish), port fees for market data ($500-ish), physical connectivity fees ($20k-ish), cross connect fees for the connectivity ($500-ish), wireless connectivity fees, possibly roof access (more fees), market data fees (per exchange), memberships, and trading costs.
I haven't done this for a while, but it easily adds up to $100k per month or more.
So you need to be making quite a bit to pay off your infra, before you start thinking about profit. And your model will age pretty fast, so you'll want to be working on a few possible replacements concurrently.
It's a tough business.
https://towardsdatascience.com/what-happened-when-i-tried-ma...
Discussed here: https://news.ycombinator.com/item?id=21624907
There looks like a lot of overfitting the validation set going on in that post.
It's also a mistake to conclude that "there was no subtle underlying pattern" just because the author couldn't find one.
Throwing XGBoost at a bunch of technical indicators isn't gonna cut it but I have had some solid real-world success (as have several people I know) applying ensembles of deep learning models (with regime switching based on model residuals) to profit from "subtle underlying patterns".
Disclaimer: I built a similar system in the past, took some gains and then realised the above. I then quit to build a company.
Interestingly Benoit Mandelbrot talks about this in "the (mis)behaviour of markets" and explicitly calls it "market time"
I feel like this should be in bold, but either way, I love reading that in these posts. In every way, from research confirming your models are correct to being able to trust real-time trades, you need a solid architecture. Remember, this isn't only true for trading; the same holds for tons of solutions to other problems. If comment readers have other examples, I'd love to hear them in responses.