Predicting Price Changes in Ethereum (2017) [pdf]

(cs229.stanford.edu)

110 points · PredictorY · 8y ago · 40 comments

anjc · 8y ago · 4 in thread

Interesting that even the most naive methods still have >50% accuracy. Also interesting that the best method was better able to predict downward moves than upward moves, during a bull market. Any intuitive reason for this?

Is there some reason the study doesn't include the post-December 2017 bear market?

fwdpropaganda · 8y ago

> Interesting that even the most naive methods still have >50% accuracy.

All methods that you'll ever see have >50% accuracy, because if you find a signal with <50% accuracy you'll just flip the sign of the signal and call it >50% accuracy.
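A minimal Python sketch of the sign-flip argument. The helper names and the toy up/down labels are invented for illustration and are not from the paper; the point is only that inverting a binary predictor with accuracy p gives accuracy 1 - p.

```python
def accuracy(predicted, actual):
    """Fraction of periods where the predicted direction matches the actual one."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


def flip(predicted):
    """Invert every call: +1 (up) becomes -1 (down) and vice versa."""
    return [-p for p in predicted]


# Toy data (hypothetical): +1 means "up", -1 means "down".
actual = [1, -1, 1, 1, -1, 1, -1, -1, 1, 1]
bad_signal = [-1, 1, -1, 1, 1, -1, 1, -1, -1, 1]

acc = accuracy(bad_signal, actual)
flipped_acc = accuracy(flip(bad_signal), actual)
print(f"original: {acc:.2f}, flipped: {flipped_acc:.2f}")
```

This only works when the sub-50% accuracy is stable; a signal that hovers around 50% gains nothing from being flipped.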

Here's a bit relevant to this conversation:

> Previous work on predicting the directionality of Bitcoin prices has shown that significant signal exists in the price of the cryptocurrency. Hegazy and Mumford (2016) compute an exponentially-smoothed Bitcoin price every eight minutes; using the first five left derivatives of this price as features in a decision-tree based algorithm, they predict the direction of the next change in Bitcoin price with 57.11% accuracy.

> Their results substantiate earlier research done by Madan, Saluja, and Zhao (2014), who found that by using the Bitcoin price sampled every 10 minutes as the primary feature for a random-forest model, they could predict the direction of the next change in Bitcoin price with 57.4% accuracy.

> An alternative model was used by Sebastian, Katabarwa, and Li (2014), who use the Bitcoin price sampled every minute as the primary feature for a forward-feed neural network. Their results suggest that this system predicts future Bitcoin price directionality with 60% accuracy.

The most glaring evidence that this entire paper is garbage is the fact that zero time is spent on putting these numbers (57.11%, 57.4%, 60%) in context. What do I mean by context? For example, observations like the fact that for the same dataset if you use a daily resolution and your prediction is always "up", you'll beat those accuracies. Obviously, the reason why this discussion is absent is because it's a lot harder than just dumping a dataset into sklearn.
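The "always up" baseline is cheap to compute. A rough sketch, assuming nothing more than a list of closing prices; the prices below are made up to resemble a rising market and are not real Ether or Bitcoin data, and the function names are illustrative.

```python
def directions(prices):
    """Return +1 for each period-over-period rise and -1 otherwise."""
    return [1 if later > earlier else -1 for earlier, later in zip(prices, prices[1:])]


def always_up_accuracy(prices):
    """Accuracy of a 'model' that predicts 'up' for every single period."""
    moves = directions(prices)
    return sum(m == 1 for m in moves) / len(moves)


# Hypothetical daily closes in a bull market.
daily_closes = [100, 104, 103, 108, 112, 110, 115, 121, 119, 125, 131]

baseline = always_up_accuracy(daily_closes)
print(f"always-up baseline accuracy: {baseline:.2f}")
```

Any reported accuracy is only informative relative to a number like this, computed on the same dataset at the same resolution.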

natalyarostova · 8y ago

In order to meaningfully test this stuff you have to recreate a simulation as close as possible to the real trading environment -- and even then -- this is extremely hard to do. The lag, downtime, transaction fees, failed trades, API changes, etc, all throw a huge huge wrench in this theoretical sklearn+CSV 'prediction' game.

Don't get me wrong, sklearn+CSV is great for learning, and great for initial experimentation or playing around. But it's just too far from the real process to be meaningful imo.

anjc · 8y ago

I presume that the context of those accuracy numbers is prediction for the next period (i.e. 8 minutes, 10 minutes, 1 minute). To me 60% sounds good, but apparently not to people in this thread :)

nanis · 8y ago

> Is there some reason the study doesn't include the post-December 2017 bear market?

Now, that would be a decent out of sample test, wouldn't it?

Also, keep in mind that at hourly resolution, with a highly autocorrelated dataset, 60% is not much better than a coin toss.

I would like to see if any of the methods they used would outperform just using what happened in the previous period as one's prediction.
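The "previous period" benchmark can be sketched in a few lines of Python. The price series is hypothetical and deliberately trending, to show how an autocorrelated series rewards this trivial rule; the function names are illustrative.

```python
def directions(prices):
    """Return +1 for each period-over-period rise and -1 otherwise."""
    return [1 if later > earlier else -1 for earlier, later in zip(prices, prices[1:])]


def persistence_accuracy(prices):
    """Accuracy of predicting that each period repeats the previous period's direction."""
    moves = directions(prices)
    # The prediction for move i is move i-1, so the first move has no prediction.
    hits = sum(prev == cur for prev, cur in zip(moves, moves[1:]))
    return hits / (len(moves) - 1)


# Hypothetical hourly prices with runs of consecutive rises and falls.
hourly_prices = [100, 101, 102, 103, 102, 101, 100, 101, 102, 103, 104]

persistence = persistence_accuracy(hourly_prices)
print(f"persistence baseline accuracy: {persistence:.2f}")
```

A model would need to beat this number, not 50%, to show it has learned anything beyond the autocorrelation.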

adjkant · 8y ago · 3 in thread

I've actually run a trading algorithm off a very similar approach for the past 8-10 months, which yielded about 87% return in that time. Using regression I can still to this day hit about 55-60% accuracy. What's not mentioned in the paper is that accuracy is only a small part of the story. If you're accurate 60% of the time but very wrong in the other 40%, acting on the information is useless.
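A small worked example of why accuracy alone is not enough. The win and loss sizes below are hypothetical, chosen only to show that a 60% hit rate can still lose money on average.

```python
def expected_return(accuracy, avg_win, avg_loss):
    """Expected return per trade from the hit rate and average win/loss sizes.

    avg_loss is a positive magnitude, e.g. 0.010 for a 1.0% loss.
    """
    return accuracy * avg_win - (1 - accuracy) * avg_loss


# Hypothetical: right 60% of the time, wins of 0.5%, losses of 1.0%.
small_wins_big_losses = expected_return(0.60, 0.005, 0.010)

# Same hit rate with symmetric 0.5% wins and losses.
balanced = expected_return(0.60, 0.005, 0.005)

print(f"asymmetric: {small_wins_big_losses:+.4f} per trade")
print(f"symmetric:  {balanced:+.4f} per trade")
```

With the asymmetric payoffs the expected value per trade is negative even though the direction is right more often than not.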

As a result, it's important to develop a trading approach that can actually capitalize on the information. For that, I have found three things to work best:

1. Only trading on the highest signals of increase, using a model that outputs a continuous prediction rather than a binary classification. Ironically, this usually doesn't increase accuracy much, but it does increase the "average value" of buying on the increase signals. I usually set this through historical testing: I take a top percentage of the historical prediction values and use that cutoff as the "threshold".

2. More features and feature selection tuning. Right now I'm using genetic algorithms to constantly try and test new sets of features, thresholds, "hold times" after buying, etc.

3. Work on minutes, not hours. The volatility is so high that you can actually capitalize well on the micro level in my experience.

While accuracy is important, the average trade value and trades per day are far more important to returns.
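The thresholding idea in point 1 might look roughly like the sketch below. This is a guess at the general technique, not the commenter's actual code: the function names, the 30% cutoff, and the backtest numbers are all hypothetical.

```python
def top_fraction_threshold(scores, fraction):
    """Smallest score that still belongs to the top `fraction` of historical scores."""
    ranked = sorted(scores, reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return ranked[keep - 1]


def average_trade_value(scores, realized_returns, threshold):
    """Mean realized return over the periods whose score meets the threshold."""
    taken = [r for s, r in zip(scores, realized_returns) if s >= threshold]
    return sum(taken) / len(taken)


# Hypothetical backtest: predicted change and realized change for each period.
scores = [0.004, -0.002, 0.001, 0.009, 0.000, 0.006, -0.005, 0.002, 0.012, 0.003]
realized = [0.002, -0.001, -0.003, 0.007, 0.001, 0.004, -0.006, -0.001, 0.010, 0.000]

threshold = top_fraction_threshold(scores, 0.30)
selective = average_trade_value(scores, realized, threshold)
loose = average_trade_value(scores, realized, 0.0)

print(f"threshold: {threshold}")
print(f"avg value, top 30% of signals: {selective:.4f}")
print(f"avg value, any non-negative signal: {loose:.4f}")
```

Trading only the strongest signals trades less often, so the gain in average trade value has to be weighed against the lower number of trades per day.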

Interestingly enough, the algorithm was steadily making money until April or so, when it stagnated. Mind you, it was making money from January-March due to sheer volatility even while the price was dropping most days. I've actually shut mine down for two reasons - the plateau plus the fact that the market was too thin on GDAX to quickly trade on buy signals for the amount I was running with (ending at about $3.5K). If the market thickens, I'll likely start running it again.

Takeaway: this paper's approach may seem simple but honestly the reality is that with something so volatile it's surprisingly easy to capitalize on with algorithmic trading that learns even a few small features and trades frequently.

AznHisoka · 8y ago

"More features and feature selection tuning."

This sounds an awful lot like curve fitting.

anjc · 8y ago

Very interesting.

Are you buying and selling, or are you shorting too?

Does this 60% accuracy remain if you change the training/prediction period? I'd guess that it would be more accurate if you train based on days versus seconds.

adjkant · 8y ago

Just buying and selling, all GDAX maker to avoid any fees which would more or less cancel out even the best model I have. Lack of fees is another unique feature of the space.

The training period I keep to 3 months and haven't really moved much since initially trying things. A month or less and the model is overfit and useless, too long and it's not working with current data.

The prediction/"hold time" changes absolutely make a difference. I was running on litecoin and found about an hour to be the sweet spot.

> days versus seconds

Seconds would be useless because you can't trade that fast - minutes is what I use.

Even if accuracy increases with a hold time over days, the average trade value doesn't go up nearly enough to make up for the trade frequency of the minutes/hours level. Why make 3 trades per week with an average value of .5% when you can make 4 trades a day for an average value of .15%? The compounding of that frequency works wonders, and that 4 trades a day for .15% is what I was actually hitting for a few months.
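The compounding comparison can be checked with a few lines of Python. The per-trade figures are the ones quoted above; the assumptions are mine: seven trading days per week (crypto exchanges do not close), every trade earning exactly the average, full reinvestment, and no fees.

```python
def compound(per_trade_return, trades):
    """Growth factor after `trades` trades that each earn `per_trade_return`."""
    return (1 + per_trade_return) ** trades


# 3 trades per week at 0.5% each.
weekly_slow = compound(0.005, 3)

# 4 trades per day at 0.15% each, assuming 7 trading days per week.
weekly_fast = compound(0.0015, 4 * 7)

print(f"3 trades/week at 0.5%:  {weekly_slow - 1:.2%} per week")
print(f"4 trades/day at 0.15%: {weekly_fast - 1:.2%} per week")
```

Under these assumptions the frequent strategy grows by roughly 4.3% per week against roughly 1.5% for the infrequent one.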

For the record, I do also compare to both naive buy/hold over the training period and the average trade value for the period for all times (different than the buy/hold time because I have a profit lock-in threshold for individual trades, also tuned with genetic algorithms), and the model outperforms both still.

The model is still predicting positively but the average trade value is shrinking + market thinning hence why it was breaking even recently until shut down.

heptathorp · 8y ago · 2 in thread

> A logical explanation for the high volatility of Ether, relative to Bitcoin and common stock, is that Ether is transferred or traded in a way in which Bitcoin and stock are not. More specifically, Ethereum has contract accounts which can cause Ether to be transferred between accounts in an unpredictable manner. (While Bitcoin and common stock are also traded by algorithms, the distinction is that anyone - even someone writing a hello world program - can instantiate a contract account on Ethereum. Meanwhile, algorithmically trading Bitcoin or stock requires more technical sophistication.)

Can anyone clarify what the movement of Ether between contract addresses has to do with the price of Ether? Intuitively, the price of ETH would move when it is traded against other currencies, not when it is transferred between accounts on Ethereum. Yes, it's easy to write a contract that moves Ether. It's also trivial to create a Bitcoin transaction.

anjc · 8y ago

Presumably they're talking about the consumption of Ether as gas, to run contracts. Perhaps they're seeing that algorithmic trading of currency is more predictable than gas consumption by distributed apps, which may be highly and suddenly viral for some period - e.g. Cryptokitties. Viral use of the DApp in theory would increase the price of Ether due to increased demand.

heptathorp · 8y ago

That makes sense. It wasn't clear from the paper since they talk about how Ether is "transferred or traded" and comparing it to algo trading.

stfwn · 8y ago · 2 in thread

> The primary dataset consists of the price of Ether sampled at approximately one-hour intervals between August 30, 2015 and December 2, 2017.

As far as I am aware, automated trading often works with much shorter intervals, in the milliseconds range. The traditional stock trading industry also does not only look at the price history, but also at planned buy and sell orders in the books and even news and social media sentiment analysis. Perhaps prediction methods that also use these strategies would produce better results, especially given that the crypto market has been in such a chaotic flux between 2015 and 2017.

varjag · 8y ago

There doesn't seem to be enough frequency in crypto trading for meaningful info at millisecond resolution.

sgwae · 8y ago

Minutes or seconds would be more useful than one-hour intervals.

mrwebmaster · 8y ago · 2 in thread

Maybe there are a lot of other trading algorithms that worked for some time but ended with a loss so big that it exceeded all the profits, and we will never see those stats. Publication bias.

fenwick67 · 8y ago

That's what I think is quaint about all these people who came up with crypto buying strategies. The naive "just buy as much as you can" strategy was very effective until January.

anjc · 8y ago

It was effective but not optimal. Some people were overjoyed with their 10x on Ethereum, while others were getting 100x on tokens. It was also obviously highly non-optimal if you held through January :)

blinds · 8y ago · 2 in thread

Applying decades old statistical techniques to predict trends in time series data. Now with added blockchain, as is tradition.

adjkant · 8y ago

Same old ideas as stock prediction, yes, but the crucial difference here is a market with high volatility and no trade fees. Circumstance is everything.

blinds · 8y ago

Yes, there is a difference in circumstance. Ironically though, their best performing model assumes constant volatility...

nanis · 8y ago · 2 in thread

This is OK as a final project in an intro econometrics class, but nothing else.

anjc · 8y ago

It looks like it is a class project. But so what? It's giving you enough information to reproduce the study, which is more than can be said for many published research papers. That makes it useful imo.

nanis · 8y ago

As a class project where kids are learning how to use software, clean data, type equations etc, it is acceptable. It is unreasonable to expect them to know better when they are just getting a feel for the tools.

If this is supposed to give insight to people on how to allocate their money, the critiques would have to be much harsher.

maxpert · 8y ago · 1 in thread

I can vouch for the DNN results. I took data from Kaggle and tried training networks with different approaches. I was not able to get prediction accuracy any better than 54% (which I am pretty sure would be biased as well). Later on, while inspecting the data and looking at dips and rises, it turned out that external events (governments banning crypto, or a price crash due to other reasons) were causing the price drops and surges. My conclusion was that if you need to predict anything, you need to be aware of these external factors.

nikkwong · 8y ago

Yeah, I've always been skeptical of people who say they've been making a killing with their genius proprietary trading algos. You can't predict the price movements which are directly attributable to external news influences. And these types of events have such a direct effect on price.

inkrement · 8y ago · 1 in thread

Interesting paper! We just published a similar project, but in a more econometric focused context. However, we also find it interesting that even really simple models (like AR1 with SV) outperformed more sophisticated ones (e.g. TVPs without TVP). If you are interested: https://onlinelibrary.wiley.com/doi/full/10.1002/for.2524

inkrement · 8y ago

*VAR without TVP

cateye · 8y ago · 1 in thread

Did it successfully predict the fall after January 2018?

StavrosK · 8y ago

No. Unfortunately, it hadn't been trained to detect pride.

fredgrott · 8y ago

It's interesting how many people misunderstand Elliott Wave models or what's behind price movements....

If you examine Metcalfe-Lee's formula for bitcoin valuation you will find that to predict price swings one has to track both:

Transaction volume and number of accounts involved in transactions

with those two pieces of info you can predict whether a starting wave is Speculative or consolidated, etc.

will put up a Medium article on Monday showing the ratio tools I came up with

a-dub · 8y ago

This looks like someone's homework. :)
