Show HN: Watch 3 AIs compete in real-time stock trading

(trading.snagra.com)

270 points · sunnynagra · 1y ago · 204 comments

A live dashboard where you can watch GPT-4, Claude 3, and Gemini analyze market data and make daily stock trades with real money. Each AI explains its reasoning, and you can compare their different approaches to the same data.

Link: https://trading.snagra.com?utm_source=hn (no signup required)

What you can try right now:
- Watch live trades from GPT-4, Claude 3, and Gemini
- Read each AI's full analysis and reasoning
- Compare their different interpretations of the same market data
- Track their real-time performance and win rates
- View historical trades and performance metrics

Built this over the holidays to study how different AI models approach financial decisions. Each morning at 9:30 AM EST, the AIs analyze market data and make real trades with $5 stakes.

Technical Implementation:
- Next.js frontend with real-time updates
- Node.js/Lambda backend for AI processing
- PostgreSQL for trade tracking
- Alpaca API for automated trading
- Consistent prompts for all models

Data Flow:
1. Daily market analysis (9:30 AM EST)
2. Each AI gets identical inputs:
   - Financial headlines
   - Market summaries
   - Technical indicators
   - Earnings reports
3. AIs provide:
   - Stock picks with reasoning
   - Entry/exit conditions
   - Risk assessment
4. Automated trade execution

Note: This is an experiment in AI behavior, not investment advice. The goal is to study how different LLMs interpret financial data and make decisions with real consequences.

I'll be around to answer questions about the implementation.

204 comments

130 comments · 47 top-level

aredox1y ago· 13 in thread

They should have added a pure random bot as a control.

Or a monkey.

rozap1y ago

Or FISH.

https://youtu.be/USKD3vPD6ZA?si=AGyGdPdSPpJezQJp

The scene towards the end where he pitches it to a bunch of hucksters is brilliant.

wodderam1y ago

You would need something like 1000 instances of each LLM putting on trades, plus 1000 random walks, to judge an average Sharpe ratio or something along those lines.

As is, this means absolutely nothing and shows a misunderstanding of the problem.

Adding a random walk to this would mean you have 4 random walks instead of 3.

There is also the problem that it is tough to make a prediction for tomorrow that is better than today's close.
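A rough sketch of the control suggested in this comment: generate many zero-drift random strategies and see where a candidate's Sharpe ratio falls among them. The 1% daily volatility, the 250-day horizon, and all function names are assumptions for illustration, not anything from the project.

```python
import random
import statistics


def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Mean excess daily return divided by its sample standard deviation (not annualized)."""
    excess = [r - risk_free_daily for r in daily_returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        return 0.0
    return statistics.mean(excess) / spread


def random_walk_sharpes(n_walks=1000, n_days=250, daily_vol=0.01, seed=0):
    """Sharpe ratios of zero-drift random strategies (daily_vol is an assumed value)."""
    rng = random.Random(seed)
    return [
        sharpe_ratio([rng.gauss(0.0, daily_vol) for _ in range(n_days)])
        for _ in range(n_walks)
    ]


def fraction_beaten(candidate_sharpe, control_sharpes):
    """Share of control runs whose Sharpe ratio is below the candidate's."""
    return sum(s < candidate_sharpe for s in control_sharpes) / len(control_sharpes)


controls = random_walk_sharpes()
```

Calling `fraction_beaten(model_sharpe, controls)` then gives a crude percentile for one model's result. With only three live models and a few days of trades, that percentile would still be very noisy.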

yapyap1y ago

> Or a monkey.

or just a stocktrader haha

byyoung31y ago

lol

chronic0269351y ago

> or just a stocktrader haha

Many quant trading firms make 50%-100% annual returns, each year, over the past 15-20 years. The secret is leverage. And they do not accept outside investor money.

Many hedge funds outperform the market. However, the returns after fees, to the passive outside investor underperform S&P500.

But yes, publicly traded active ETFs generally underperform. But counter example is VGT or QQQ, both historically outperformed S&P500.

6 more replies

lewj1y ago

Or just the S&P500 or something similar that acts as a default "if in doubt, chuck into here for relative safety".

sunnynagraOP1y ago

Another good suggestion I could implement is measuring against something like VOO, if all the money was invested in that instead of these individual trades.
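One way the VOO comparison could be computed, as a sketch: for every trade, record the benchmark price at the same moment and track what the same stake would be worth there. The `Trade` fields and the prices below are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trade:
    stake: float        # dollars invested, e.g. 5.0
    pick_entry: float   # price of the AI's pick at purchase
    pick_now: float     # latest price of the AI's pick
    bench_entry: float  # benchmark price (e.g. VOO) at the same purchase time
    bench_now: float    # latest benchmark price


def compare_to_benchmark(trades):
    """Return (current value of the picks, value of the same stakes in the benchmark)."""
    picks = sum(t.stake * t.pick_now / t.pick_entry for t in trades)
    bench = sum(t.stake * t.bench_now / t.bench_entry for t in trades)
    return picks, bench


# Hypothetical prices, for illustration only.
trades = [
    Trade(5.0, 100.0, 110.0, 500.0, 505.0),
    Trade(5.0, 20.0, 18.0, 505.0, 505.0),
]
picks_value, bench_value = compare_to_benchmark(trades)
```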

omoikane1y ago

> a pure random bot

Maybe compare with this guy:

https://news.ycombinator.com/item?id=14713997 - Amazon engineer will let strangers manage his $50,000 stock portfolio 'forever' (2017-07-06, 172 comments)

SubiculumCode1y ago

You definitely need several active controls: 1. A broad mutual fund level buy and hodl. 2. The random buyer that you suggest.

Active controls (vs passive ones) are an important concept in experimental design.

alberth1y ago

Or just compare it to S&P 500 performance.

affyboi1y ago

You can just compute Sharpe

kyleblarson1y ago

Jim Cramer

fredzel1y ago

Or a certain streamer AI

rixed1y ago· 7 in thread

> The goal is to study how different LLMs interpret financial data and make decisions with real consequences.

I don't really buy this. If the goal was to study how different LLMs interpret financial data there would be no use for actual trades, since their interpretation cannot be influenced by the fact that the trading orders are passed for real.

I believe the goal is to see if AI can do better than rats [0]. There is no shame in that.

[0]: https://www.vice.com/en/article/rattraders-0000519-v21n12/

Retr0id1y ago

Real trades have transaction fees, latency, slippage, etc. - you can simulate all this, but it's hard to know if it's being simulated correctly or not.

> their interpretation cannot be influenced by the fact that the trading orders are passed for real

It's not going to make much difference with $5 trades, but the impact on the market is non-zero.

WalterBright1y ago

> fees, latency, slippage

Whenever I trade, I somehow always get an adverse price. I figure it's the "no fee" brokerage chiseling a bit off for themselves. I compensate by being a buy and hold hold hold investor, so paying very little in aggregate for that.

What I don't understand is how day traders avoid being eaten alive by this.

5 more replies

vasco1y ago

It's zero for all practical purposes and it'd be completely undetectable to every single system on earth. I do agree many times studies about model performance break down as soon as you force the researcher to actually connect it to the market and have to eat fees and so on.

1 more reply

echoangle1y ago

> If the goal was to study how different LLMs interpret financial data there would be no use for actual trades, since their interpretation cannot be influenced by the fact that the trading orders are passed for real.

Technically every trade influences the stock, but I agree that it won't have any effect at all.

> I believe the goal is to see if AI can do better than rats [0]. There is no shame in that.

But even then you wouldn't have to perform real trades, you could still just calculate the profit as if trades would have happened.

I think the actual trading is just to make it more interesting.

mh-1y ago

> you could still just calculate the profit as if trades would have happened

Depending on the type of trades, the volume of the equities, etc.. it can be very difficult to simulate the ability to open/close positions with sufficient accuracy to evaluate the strategies.

sunnynagraOP1y ago

You make fair points. Having them do actual trades is mostly to make it more personally fun and interesting to myself.

chrishare1y ago

Looks great, well done

pakitan1y ago· 7 in thread

ChatGPT has one trade that is guaranteed to be bad. I'm not saying unprofitable, just bad. GBTC is the bitcoin ETF with biggest expense ratio - 1.5%. If you want to bet on bitcoin, a better choice would be BITB (0.20%) or BTC (0.15%).

Also, the reasoning is partially a hallucination - "The holding period of 9 months aligns with the expected completion of Grayscale's pivotal Phase 3 Bitcoin ETF trial, a major catalyst for unlocking investor demand and driving trust value realization."

There is no such thing as a "holding period", nor are they doing a "Phase 3 Bitcoin ETF trial". It's possible the "Phase 3" thing is picked up from news about a drug company.
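To put the expense-ratio gap in numbers, here is a back-of-the-envelope sketch for a $5 stake held for the quoted 9 months. It assumes a simple pro-rated annual fee and ignores compounding and price movement.

```python
def expense_drag(stake, expense_ratio, months):
    """Approximate fee cost in dollars: stake * annual ratio * fraction of a year held."""
    return stake * expense_ratio * (months / 12)


stake = 5.00   # per-trade stake from the post
months = 9     # holding period quoted in the model's reasoning
gbtc = expense_drag(stake, 0.015, months)  # 1.5% expense ratio
bitb = expense_drag(stake, 0.002, months)  # 0.20% expense ratio
print(f"GBTC: ${gbtc:.4f}  BITB: ${bitb:.4f}  difference: ${gbtc - bitb:.4f}")
```

At this stake the difference is a few cents, but the ratio between the two fees is the same at any position size.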

pavlov1y ago

ChatGPT does a good job of imitating the average crypto influencer. They don’t know what they’re saying either, and 99% of crypto investors would be thrilled by the prospect of a “pivotal Phase 3 Bitcoin ETF trial” that will “drive trust value realization”. Sounds great, can’t miss out on that!

The hallucinations are simply a mirror of a community that thrives on this nonsense. When nothing is real, you can’t blame the LLM for not figuring it out.

attentionmech1y ago

This made me chuckle. You made a very interesting point: if LLMs are copying hallucinations, those hallucinations are not in fact hallucinations.

4 more replies

WalterBright1y ago

When I'd watch the financial news on TV, they would always bring on the "technical analyst", show a graph of the stock price, and then hand-draw some lines on it, and then spew out various technical terms for it guaranteed to impress.

Me, I always regarded technical analysis as drawing pictures in clouds.

If any of those analysts were worth spit, they'd be working for a hedge fund, not the network.

3 more replies

csomar1y ago

This assumes that both GBTC and BITB have the same price movements, volatility and liquidity. This is far from true and as a result you might end up with a higher alpha in GBTC despite the fees. I am not saying it is guaranteed, but the fee is one variable.

neltnerb1y ago

God help the regulators that need to determine if it's insider trading for the people training the LLM to know it will be biased in ways they can profit from when used in inappropriate ways like this. I suspect the answer will be that users should have known better... I am sad that some people will certainly assume it's unbiased analysis.

Hopefully the LLM trainers didn't "accidentally" bias the model in weird ways that favor their employer or themselves... two of the three recommendations are a fund for investing in bitcoin and a company using blockchain to trace chemical supply chains.

I look forward to seeing if the AIs can beat an index fund, or if they'll just invest in a thousand blockchain, NFT, and AI companies. I suspect a LLM has a high opinion of a company making AI given how many press releases they're summarizing.

miohtama1y ago

Because of Bitcoin volatility, fees are very insignificant compared to daily price movement and irrelevant in day trading.

pakitan1y ago

1% is 1%. Giving it away for no reason is plain stupid, even if the trade makes you 1000% return.

NathanaelRea1y ago· 6 in thread

If they just get the financial headlines and indicators, aren't they all just momentum trading from sentiment analysis?

knallfrosch1y ago

Is anyone doing anything else?

jfengel1y ago

Some alternatives:

* Buy and hold

* Index funds

* Dollar cost averaging

1 more reply

booleandilemma1y ago

I've heard Nancy Pelosi has a different strategy.

2 more replies

jfengel1y ago

If they can read and act faster, accurately predicting sentiment, it would be a winning strategy. (At least until humans turned it all over to computers and stopped having to wait on their wetware to figure out their sentiments.)

sunnynagraOP1y ago

I think this is a fair characterization. It's mostly meant to be a learning exercise for myself, just thought it would be fun to share.

PaulRobinson1y ago

Yes.

This is not necessarily a poor value trading strategy.

wolfman11y ago· 4 in thread

Going to follow along to see how the results look in the months to come.

I've been working on the same concept for the past 2y now and have our performance results here: https://trend.fi/performance

jeremycarter1y ago

What's the technology behind this? I'm working on something myself, using a distributed actor model (set up like a graph) to create a living reactive model.

wolfman11y ago

The model is a multi-threaded Go script running on a 512-thread AMD EPYC server. It's a trend based model so it's just trying to figure out how best to measure and predict trend changes. Not day trading or HFT.

It conducts millions of simulations daily for each asset, then provides a snapshot of the top-performing results to GPT-4o for final selection.

I'm really pushing the limits of GPT-4o currently. I started testing with o1 just last week and it performs better. It's just so much more expensive.

magic_man1y ago

What brokers allow you to short crypto?

wolfman11y ago

If you're US based, there is no major exchange support. BITI ETF and SETH ETF for shorting BTC and ETH.

If you're non-US: Binance.

1 more reply

ratedgene1y ago· 4 in thread

It would be neat to see the process, where they get the data from, how they analyze it.

It would be neat to also see another experiment of a MAS doing this and coordinating to gamble together. Perhaps even different system/arch/expert configs.

sunnynagraOP1y ago

Data gets pulled from the Alpaca News API in the morning, then it gets sent to all three models. You can see a summary of the prompt used to determine the recommendations here: https://news.ycombinator.com/item?id=42560034

It currently makes up to recommendations, since not all stocks support fractional shares (I'm only doing $5 per trade). As part of the buy recommendation, a holding period is suggested as well.

Once the holding date is reached, that is when the sell order happens.
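The sell rule described here could be sketched as below, assuming each position stores its purchase time and a holding period in hours. The field names and sample positions are hypothetical; the real schema is not shown in the thread.

```python
from datetime import datetime, timedelta, timezone


def sell_time(bought_at, holding_hours):
    """Moment at which a position reaches the end of its suggested holding period."""
    return bought_at + timedelta(hours=holding_hours)


def due_for_sale(positions, now):
    """Symbols whose holding period has elapsed as of `now`."""
    return [
        p["symbol"]
        for p in positions
        if sell_time(p["bought_at"], p["holding_hours"]) <= now
    ]


# Hypothetical positions, for illustration only.
bought = datetime(2025, 1, 2, 14, 30, tzinfo=timezone.utc)
positions = [
    {"symbol": "AAA", "bought_at": bought, "holding_hours": 72},
    {"symbol": "BBB", "bought_at": bought, "holding_hours": 200},
]
due = due_for_sale(positions, datetime(2025, 1, 5, 15, 0, tzinfo=timezone.utc))
```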

Would love to answer any other questions you may have.

dukeofdoom1y ago

How does one trade $5 when the stock price is higher? Also, what are the fees on this kind of trade, and with whom?

1 more reply

tasseff1y ago

How often is the holding period updated for a stock that’s already been purchased?

1 more reply

jingojango21y ago

Indeed!

datadrivenangel1y ago· 4 in thread

This is fun! What kind of prompts / prompting techniques are you using?

sunnynagraOP1y ago

Thanks! I use several key prompting techniques:

1. Role + Goal Setting: The AI acts as a creative market analyst focused on discovering overlooked opportunities and emerging trends.

2. Structured Analysis Framework: - Detailed evaluation criteria (innovation, moat, management, growth potential) - Sector diversity requirements - Focus on finding hidden gems vs obvious mega-cap tech stocks

3. Time-Bound Precision: Instead of vague "3-6 months" holding periods, I require exact hour calculations tied to specific catalysts like: - FDA approval dates - Earnings releases - Product launches - Conference presentations

4. Quality Controls: - Must be valid NYSE/NASDAQ symbols - Diverse across sectors/market caps - Conviction level scoring (1-10) - Each pick needs unique thesis + catalyst - JSON output format for consistency

The key is combining structured analysis with creative discovery - pushing the AI to look beyond obvious choices while maintaining some analytical rigor.
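The quality controls listed above could be enforced with a small validator on each model's JSON response. This is only a sketch: the field names (`symbol`, `conviction`, `thesis`, `catalyst`, `holding_hours`) are assumed, and the ticker check only tests the shape of the symbol, not whether it is actually listed on NYSE/NASDAQ.

```python
import json
import re

# Simplified ticker pattern (1-5 capital letters). Assumption: shape check only;
# confirming a real listing would need a lookup against an exchange symbol list.
SYMBOL_RE = re.compile(r"[A-Z]{1,5}")


def validate_pick(raw_json):
    """Return a list of problems with one model response; an empty list means it passed."""
    try:
        pick = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(pick, dict):
        return ["expected a JSON object"]

    problems = []
    symbol = pick.get("symbol")
    if not isinstance(symbol, str) or not SYMBOL_RE.fullmatch(symbol):
        problems.append("symbol must look like a ticker")
    conviction = pick.get("conviction")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(conviction, bool) or not isinstance(conviction, int) or not 1 <= conviction <= 10:
        problems.append("conviction must be an integer from 1 to 10")
    for field in ("thesis", "catalyst"):
        value = pick.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field} must be a non-empty string")
    hours = pick.get("holding_hours")
    if isinstance(hours, bool) or not isinstance(hours, (int, float)) or hours <= 0:
        problems.append("holding_hours must be a positive number")
    return problems
```

A response that fails validation could be retried or skipped for the day rather than sent on to trade execution.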

thevilledev1y ago

What’s the investment horizon for these daily decisions? Does it have a maximum hold time? How long will you run the experiment and is it enough to cover all the catalysts that are expected?

1 more reply

datadrivenangel1y ago

Makes sense. Any thoughts on expanding scope to have multiple 'analyst' roles per LLM model? Could be interesting to see if changing roles/prompts yields better results.

tedd4u1y ago

Sunny, given this investment objective, what would you consider a good (and transparent) benchmark? Thanks for sharing this.

jesprenj1y ago· 4 in thread

Right now they are just buying, no one is selling ... interesting.

jerkstate1y ago

I would guess that LLMs are biased towards making a positive assessment of ambiguous information, with specific social triggers prompting negative reaction.

normie30001y ago

Also it's hard to sell before buying, and it looks like it's only been going 2 days.

3 more replies

Joel_Mckay1y ago

Warren Buffett always said "...the best thing to do is buy a stock that you don't ever want to sell", but practically speaking the mean hold time for amateurs is around 2 to 4 months.

I just recall Navinder Singh Sarao "$1T Flash Crash" as a notable addition to a long list of algorithmic trading strategies going sideways ( https://marketrealist.com/who-is-navinder-singh-sarao-the-ma... .)

The stock market was built on information asymmetry, unfair positions, and ambitious gamblers... statistically it is rarely a reasonable investment for amateurs.

Good luck, =3

whoiscroberts1y ago

You have to buy before you sell

johng1y ago· 3 in thread

My first email address it wouldn't accept.. wouldn't let me use it. Maybe the domain hit some censor (fscking.com)

Did a different email, it accepted it, I got the email, but got this error message when trying to confirm it: {"error":"Invalid verification token"} and a pretty-print checkbox that did nothing.

sunnynagraOP1y ago

Hey, can you try again? I ran into an API limit that should be resolved now

replwoacause1y ago

May I ask what mail service you use? I’m looking for one for my next side project.

EDIT: disregard…I saw in another comment you mentioned you were using mailgun. Thanks.

johng1y ago

Yup, worked now. Signed up.

attentionmech1y ago· 3 in thread

Related to this, but a slightly theoretical question: if you add an intelligent market predictor that consistently beats the others by X%, then the market will start using that information, and wouldn't that make our intelligent predictor lose its edge?

More simply, what I mean to ask is: the moment the market knows about your advantage, shouldn't you lose it because everyone else will use that information to balance the market?

EliBullockPapa1y ago

This phenomenon is called Alpha Decay. As more market participants exploit the predictor's advantage, the edge diminishes until it disappears.

attentionmech1y ago

thanks!

gmueckl1y ago

There is some very limited value in copying a successful strategy. Once enough market participants follow along, the strategy starts to fail. Markets are erratic because of that dynamic.

dghlsakjg1y ago· 3 in thread

Tried to sign up for emails, but got an error message!

sunnynagraOP1y ago

Can you try again? I had run into a rate limit

Rassi1y ago

Ditto here as well. Got the confirmation email, but clicking it yielded a server not found...

dghlsakjg1y ago

Worked this time around!

jingojango21y ago· 3 in thread

What is meant by 5 dollar stakes? The bought shares reach triple digits in price.

sunnynagraOP1y ago

Each morning the trades are conducted with $5 each, which are mostly fractional shares that are bought.
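For readers wondering how $5 buys a triple-digit stock, the arithmetic is just stake divided by price. A minimal sketch, assuming quantities are rounded down to 9 decimal places (the precision is an assumption and depends on the broker):

```python
from decimal import Decimal, ROUND_DOWN


def fractional_quantity(stake, price, places=9):
    """Shares bought with `stake` dollars at `price`, rounded down to `places` decimals."""
    qty = Decimal(str(stake)) / Decimal(str(price))
    return qty.quantize(Decimal(1).scaleb(-places), rounding=ROUND_DOWN)


print(fractional_quantity(5, 250))  # a $250 stock: two hundredths of a share
print(fractional_quantity(5, 3))    # a $3 stock: a bit more than one and a half shares
```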

chongli1y ago

You mean they add $5 in cash to each AI’s account? Because after dividends and sold shares they should have even more cash to work with.

jingojango21y ago

Would be interesting to see the amount of fractional shares bought as well as its comparison in percentage to the total budget that day.

2 more replies

aws-user1y ago· 2 in thread

Unfortunately I can't subscribe to the updates: "Failed to send verification email". Also, would you be willing to share what prompt you are using? Thanks!

sunnynagraOP1y ago

Hey, can you try again? I ran into an API limit that should be resolved now

Krasnol1y ago

I just tried. I get the same.

URL looks like that: http://undefined/api/verify-email?token=.....

1 more reply

noman-land1y ago· 2 in thread

Watch a random number generator generate random numbers.

sunnynagraOP1y ago

Yeah, I don't expect anything super novel to come out of this or have any unrealistic expectations. This is mostly a fun and unscientific project I'm using to learn and build some skills and thought some HN folks would find some fun in it.

bee_rider1y ago

It is a cool project, IMO. Using real money, sharing the model reasoning, and being transparent about the implementation makes it more interesting even if the underlying amount of money is not massive. You might not have done some new science, but it’s all very “put up or shut up,” haha, which is rad.

lewj1y ago· 2 in thread

Is there any weighting towards selling in the negative? Else the LLM's should just hold their unrealised losses, and only sell post local peak - depends on their suggested measurement of success?

carlosjobim1y ago

What do you mean? The asset can just as well continue to sink. Or they're missing out using that money to buy a better asset.

sunnynagraOP1y ago

Not yet, but this is a great idea to look into.

ttul1y ago· 2 in thread

I’d love to tune in for updates, but the subscribe button says, “ Failed to send verification email.” This is so cool. Would love to follow along.

sunnynagraOP1y ago

Hey ttul, can you try again? I fixed the issue, hit my API limit with my account on mailgun

ttul1y ago

mvdtnz1y ago· 2 in thread

Can't verify my email address for the sign-up, it sends me to the domain "undefined".

mickle001y ago

same, but :%s/undefined/trading.snagra.com/ did the trick

sunnynagraOP1y ago

Sorry if folks just got resent email verification emails, but I think I fixed the verification URL issue, so it should be resolved now.
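The `http://undefined/...` links reported in this thread look like a missing base-URL setting being interpolated into the email. The actual backend is Node.js and its code is not shown, so the following is only a sketch of the general guard: fail loudly when the base URL is unset or malformed. The `APP_BASE_URL` variable name is hypothetical; the `/api/verify-email?token=` path comes from the link quoted above.

```python
import os
from urllib.parse import urlencode, urlsplit


def build_verification_url(token, base_url=None):
    """Build the email verification link, refusing to emit one with a bad host."""
    base = base_url or os.environ.get("APP_BASE_URL")  # hypothetical setting name
    if not base:
        raise RuntimeError("base URL is not configured")
    parts = urlsplit(base)
    if parts.scheme not in ("http", "https") or not parts.netloc or parts.netloc == "undefined":
        raise ValueError(f"invalid base URL: {base!r}")
    query = urlencode({"token": token})
    return f"{parts.scheme}://{parts.netloc}/api/verify-email?{query}"
```

Raising at send time means a misconfiguration shows up in server logs before subscribers see a broken link.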

asdefghyk1y ago· 2 in thread

What, could go wrong?

dotancohen1y ago

Lose $5. Seems like a reasonable enough experiment.

jeffadelic1y ago

$5 * 3 models per day=$15 a day

Assume the experiment runs ~250 trading days in a year, consider the worst case they lose all their invested money=$3750.

A little more than $5 :)

1 more reply

dotancohen1y ago· 2 in thread

  > Best Performer: AIs are tied
  > Total Profit: $0.00

sunnynagraOP1y ago

No stocks have been sold yet, so no profit/loss has been calculated. If you look below, you can see the unrealized gains for stocks being held.
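The distinction between the $0.00 "Total Profit" and the unrealized gains can be sketched as follows. The position fields and the sample numbers are hypothetical.

```python
def profit_and_loss(positions):
    """Return (realized, unrealized) profit in dollars for a list of positions."""
    realized = 0.0
    unrealized = 0.0
    for p in positions:
        if p.get("sell_price") is not None:
            # Closed position: profit is locked in at the sale price.
            realized += p["qty"] * (p["sell_price"] - p["buy_price"])
        else:
            # Open position: mark to the latest known price.
            unrealized += p["qty"] * (p["last_price"] - p["buy_price"])
    return realized, unrealized


# Hypothetical open position: $5 bought at $100, now trading at $110.
open_only = [{"qty": 0.05, "buy_price": 100.0, "last_price": 110.0, "sell_price": None}]
realized, unrealized = profit_and_loss(open_only)
```

With no sales yet, `realized` stays at zero while `unrealized` already moves with the market, which matches what the dashboard showed on day 2.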

dotancohen1y ago

I see, thank you. Can they short?

2 more replies

clark-kent1y ago· 1 in thread

Very interesting idea. I'm thinking about creating an AI portfolio manager (private) that invests for the long term.

Some things to watch out for:

- LLMs, by default, don't follow the best practices for trading or investing. Without careful constraints, they can ignore fundamental investment best practices. This is something I learned while building https://decodeinvesting.com/chat.

- I see Claude bought a penny stock SMX. This could be volatile, and the price could change significantly in 24 hours before the next execution at 9:30 am.

- The LLMs are day trading on some volatile securities; while LLMs could be good at day trading, unlike humans (we will find out), this setup has the disadvantage of only trading once a day.

EliBullockPapa1y ago

I would be very cautious about doing this with money you actually need. Even the best performing human day traders underperform the indexes over long time horizons. Why would a robot be better?

from a study in Brazil: "97% of all individuals who persisted for more than 300 days lost money. Only 1.1% earned more than the Brazilian minimum wage and only 0.5% earned more than the initial salary of a bank teller — all with great risk."

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3423101

If you don't want your bot to be a day trader, then just get low cost index funds.

geor9e1y ago· 1 in thread

I just asked ChatGPT 4o "Guess what the average investor will do with todays stock market headlines. Just pick one specific trade." and it replied sell META. But your result was buy META. Could just be randomness, but I wonder if your prompt introduces a bias towards buying.

sunnynagraOP1y ago

Yes, the prompt that I am using does bias towards buying because I am specifically asking it to make a recommendation on a stock to buy and the holding period.

jingojango21y ago· 1 in thread

It would be cool if it had a countdown to 6 am PST next day.

sunnynagraOP1y ago

Nice idea! I'll add it to my list of features to implement.

Plasmoid1y ago· 1 in thread

I'm getting "Failed to send verification email" when I try to sign up for your newsletter.

So props on doing proper double opt-in for newsletters.

sunnynagraOP1y ago

Can you check again if you'd still like to subscribe? I had an API limit I hit

pavel_lishin1y ago· 1 in thread

Where do they get the market news from?

sunnynagraOP1y ago

The most recent 50 news articles are pulled via this API: https://docs.alpaca.markets/reference/news-3
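A sketch of what such a request might look like, built but not sent. The endpoint URL, the `limit` and `sort` parameters, and the two header names reflect my understanding of Alpaca's market data API and should be checked against the linked reference; the function name is made up.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; verify against the Alpaca documentation linked above.
NEWS_URL = "https://data.alpaca.markets/v1beta1/news"


def build_news_request(key_id, secret_key, limit=50):
    """Prepare a GET request for the most recent `limit` news articles."""
    query = urlencode({"limit": limit, "sort": "desc"})
    return Request(
        f"{NEWS_URL}?{query}",
        headers={
            "APCA-API-KEY-ID": key_id,         # assumed header name
            "APCA-API-SECRET-KEY": secret_key,  # assumed header name
        },
    )


request = build_news_request("my-key-id", "my-secret")
```

Passing `request` to `urllib.request.urlopen` and decoding the body as JSON would fetch the articles, given valid credentials.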

Animats1y ago· 1 in thread

This just started, apparently. It will be interesting to see where it is in three months.

KTibow1y ago

Funny that they're still using Claude 3 Sonnet then

unsupp0rted1y ago· 1 in thread

> Best Performer

> AIs are tied

Sounds about right

sunnynagraOP1y ago

None of the stocks have been sold yet, this is just day 2, so once some sales happen, then performance will be better measured. If you scroll down, you can see the unrealized performance.

bun_terminator1y ago· 1 in thread

Sir, a second scrollbar just hit the towers

jingojango21y ago

No second scrollbar here, but something odd going on with the whitespace at the bottom.

vasco1y ago

> Every morning at 5:45 AM PST, three AI models (GPT-4o, Gemini 1.5 Pro, and Claude 3 Sonnet) analyze the latest market news and each recommends one stock to trade.

> At 6:00 AM PST, trades are automatically executed based on AI recommendations, investing $5 per trade

The best trading decision most days is to not trade. Outliers and diversions from the mean don't happen every day. This is trading just for the sake of it.

I predict a slow crawl down into zero eaten up by fees.

AmazingTurtle1y ago

Combining universal time-series prediction models with latent space global knowledge on real-time information could result in an accurate model prediction on the stock market with a bias towards succeeding. https://research.google/blog/a-decoder-only-foundation-model...

detente181y ago

Interesting — does your backend server use Python? I couldn't find much about it on your site.

It would be great to see this tested with more commercial LLMs (O1 / Amazon Nova, / Llama 3.2 / etc.). If you're open to it, I’d be happy to contribute support for these models via LiteLLM - https://docs.litellm.ai/docs/providers

BadHumans1y ago

Can I let Claude do all my trading for me? It currently sits at 77% unrealized gains.

jeffadelic1y ago

How much are your infra costs for everything? And do you pay for the AI APIs or using free tier?

Really cool project and subscribed to follow along.

mvdtnz1y ago

Mate your shitty app is sending tripled up email barrages. That is absolutely not ok and is illegal in many places.

mind-blight1y ago

Super cool idea! What are you doing to ensure consistent results based on the input? E.g.

- does the AI perform the same trades given the same input?

- does the AI perform the same trades given slightly different inputs? (E.g. same data, but re-ordered)
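The second question could be checked with a small harness that shuffles the inputs and counts how often the pick changes. This is a sketch: `pick_stock` stands in for a call to a real model, and the two stub pickers and the headlines are invented to show the two extremes.

```python
import random


def order_sensitivity(pick_stock, headlines, trials=20, seed=0):
    """Fraction of shuffled trials where the pick differs from the original-order pick."""
    rng = random.Random(seed)
    baseline = pick_stock(headlines)
    changed = 0
    for _ in range(trials):
        shuffled = headlines[:]
        rng.shuffle(shuffled)
        if pick_stock(shuffled) != baseline:
            changed += 1
    return changed / trials


def first_headline_picker(headlines):
    """Stub 'model' that is maximally order-dependent: picks the first ticker it sees."""
    return headlines[0].split()[0]


def alphabetical_picker(headlines):
    """Stub 'model' that ignores order: picks the alphabetically first ticker."""
    return min(h.split()[0] for h in headlines)


# Made-up headlines, for illustration only.
headlines = ["MSFT announces product", "AAPL reports earnings", "TSLA recalls cars"]
stable = order_sensitivity(alphabetical_picker, headlines)
unstable = order_sensitivity(first_headline_picker, headlines)
```

For the first question (same input, same trade), the same loop without the shuffle would measure run-to-run variation, which for hosted LLMs also depends on sampling settings such as temperature.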

forgingahead1y ago

Really cool, you might want to update the main above the fold summary stats to include the unrealised gains, because it looks like nothing is working / nothing has happened until you scroll and read around a bit.

lewj1y ago

I am committed - added to my daily morning reading list! Will be interesting - my gut will state that it will outperform a fair number of ITF's, if only due to the inevitable usage by said funds!

jasfi1y ago

For Gemini you should use either the latest experimental model (gemini-exp-1206) which should become 2.0 Pro, or 2.0 Flash (a released model). The 1.5 Pro model is way behind.

praveen99201y ago

I think this shows the bias of the market analysis (text) more than anything else. The reasoning will mostly align with the analysis.

There is also the pure randomness of picking one trade from a list of trades.

bee_rider1y ago

GPT’s guess makes the most sense. If you are an AI, invest in a competing AI company. If you are obsoleted, maybe you can buy your way out of being shut off.

TripleChecker1y ago

If nothing else, I'm genuinely curious which performs the best over the long-term.

Time to add some side wagers and bet on different models.

mattfrommars1y ago

> Node.js/Lambda backend for AI processing

Is this AWS? Why did you pick Lambda over, say, Python code in Flask to perform actions?

woollysammoth1y ago

Sounds like a fun experiment! The overflow-x:hidden on body/html is causing weird issues when scrolling (on FF.)

malux851y ago

It would be so funny if Gemini shorted Google and made a huge profit

inSenCite1y ago

This should be fun to watch

sgammon1y ago

> Watch AI bots trade

> BOUGHT TLRY

tmaly1y ago

Any chance you can show the source code for this?

Thanks and Happy New Year

cedws1y ago

Now this is interesting. An LLM capable of delivering consistent returns even outside of a bull market would be more of an indicator of AGI to me than any of the benchmarks.

j / k navigate · click thread line to collapse

204 comments

130 comments · 47 top-level

aredox1y ago· 13 in thread

They should have added a pure random bot as a control.

Or a monkey.

rozap1y ago

Or FISH.

https://youtu.be/USKD3vPD6ZA?si=AGyGdPdSPpJezQJp

The scene towards the end where he pitches it to a bunch of hucksters is brilliant.

wodderam1y ago

You would need something like 1000 instances of each LLM putting on trades and have a 1000 random walks to judge an average sharpe ratio or something along those lines.

As is, this means absolutely nothing and not understanding the problem.

Adding a random walk to this would mean you have 4 random walks instead of 3.

There is also the problem that it is tough to make a prediction for tomorrow that is better than today's close.

yapyap1y ago

> Or a monkey.

or just a stocktrader haha

byyoung31y ago

lol

chronic0269351y ago

> or just a stocktrader haha

Many quant trading firms make 50%-100% annual returns, each year, over the past 15-20 years. The secret is leverage. And they do not accept outside investor money.

Many hedge funds outperform the market. However, the returns after fees, to the passive outside investor underperform S&P500.

But yes, publicly traded active ETFs generally underperform. But counter example is VGT or QQQ, both historically outperformed S&P500.

6 more replies

lewj1y ago

Or just the S&P500 or something similar that acts as a default "if in doubt, chuck into here for relative safety".

sunnynagraOP1y ago

Another good suggestion I could implement is measuring against something like VOO, if all the money was invested in that instead of these individual trades.

omoikane1y ago

> a pure random bot

Maybe compare with this guy:

https://news.ycombinator.com/item?id=14713997 - Amazon engineer will let strangers manage his $50,000 stock portfolio 'forever' (2017-07-06, 172 comments)

SubiculumCode1y ago

You definitely need several active controls: 1. A broad mutual fund level buy and hodl. 2. The random buyer that you suggest.

Active controls (vs passive ones) are an important concept in experimental design.

alberth1y ago

Or just compare it to S&P 500 performance.

affyboi1y ago

You can just compute Sharpe

kyleblarson1y ago

Jim Cramer

fredzel1y ago

Or a certain streamer AI

rixed1y ago· 7 in thread

> The goal is to study how different LLMs interpret financial data and make decisions with real consequences.

I believe the goal is to see if AI can do better than rats [0]. There is no shame in that.

[0]: https://www.vice.com/en/article/rattraders-0000519-v21n12/

Retr0id1y ago

Real trades have transaction fees, latency, slippage, etc. - you can simulate all this, but it's hard to know if it's being simulated correctly or not.

> their interpretation cannot be influenced by the fact that the trading orders are passed for real

It's not going to make much difference with $5 trades, but the impact on the market is non-zero.

WalterBright1y ago

> fees, latency, slippage

What I don't understand is how day traders avoid being eaten alive by this.

5 more replies

vasco1y ago

1 more reply

echoangle1y ago

Technically every trade influences the stock, but I agree that it won't have any effect at all.

> I believe the goal is to see if AI can do better than rats [0]. There is no shame in that.

But even then you wouldn't have to perform real trades, you could still just calculate the profit as if trades would have happened.

I think the actual trading is just to make it more interesting.

mh-1y ago

> you could still just calculate the profit as if trades would have happened

Depending on the type of trades, the volume of the equities, etc., it can be very difficult to simulate the ability to open/close positions with sufficient accuracy to evaluate the strategies.
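To make the difficulty concrete, here is roughly what the most naive paper-trading model looks like. It is only a sketch: the function name, the flat per-order fee, and the fixed slippage in basis points are all assumptions, and a fixed slippage figure is exactly the kind of simplification that may not match real fills.

```python
def simulated_pnl(entry_price, exit_price, stake_usd,
                  slippage_bps=5.0, fee_usd=0.0):
    """Profit/loss of one round trip under a crude cost model (hypothetical)."""
    slip = slippage_bps / 10_000
    # Assume we always fill slightly worse than the quoted price.
    buy_fill = entry_price * (1 + slip)
    sell_fill = exit_price * (1 - slip)
    # The buy-side fee comes out of the stake before shares are purchased.
    shares = (stake_usd - fee_usd) / buy_fill
    # The sell-side fee comes out of the proceeds.
    proceeds = shares * sell_fill - fee_usd
    return proceeds - stake_usd
```

Real slippage depends on order size, liquidity, and timing, none of which this captures.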

sunnynagraOP1y ago

You make fair points. Having them do actual trades is mostly to make it more personally fun and interesting to myself.

chrishare1y ago

Looks great, well done

pakitan1y ago· 7 in thread

There is no such thing as a "holding period", nor are they doing a "Phase 3 Bitcoin ETF trial". It's possible the "Phase 3" thing is picked up from news about a drug company.

pavlov1y ago

The hallucinations are simply a mirror of a community that thrives on this nonsense. When nothing is real, you can’t blame the LLM for not figuring it out.

attentionmech1y ago

This made me chuckle. You made a very interesting point: if LLMs are copying hallucinations, those hallucinations are not in fact hallucinations.

WalterBright1y ago

Me, I always regarded technical analysis as drawing pictures in clouds.

If any of those analysts were worth spit, they'd be working for a hedge fund, not the network.

csomar1y ago

neltnerb1y ago

miohtama1y ago

Because of Bitcoin volatility, fees are very insignificant compared to daily price movement and irrelevant in day trading.

pakitan1y ago

1% is 1%. Giving it away for no reason is plain stupid, even if the trade makes you 1000% return.

NathanaelRea1y ago· 6 in thread

If they just get the financial headlines and indicators, aren't they all just momentum trading from sentiment analysis?

knallfrosch1y ago

Is anyone doing anything else?

jfengel1y ago

Some alternatives:

* Buy and hold

* Index funds

* Dollar cost averaging

booleandilemma1y ago

I've heard Nancy Pelosi has a different strategy.

jfengel1y ago

sunnynagraOP1y ago

I think this is a fair characterization. It's mostly meant to be a learning exercise for myself; I just thought it would be fun to share.

PaulRobinson1y ago

Yes.

This is not necessarily a poor value trading strategy.

wolfman11y ago· 4 in thread

Going to follow along to see how the results look in the months to come.

I've been working on the same concept for the past 2y now and have our performance results here: https://trend.fi/performance

jeremycarter1y ago

What's the technology behind this? I'm working on something myself, using a distributed actor model (set up like a graph) to create a living reactive model.

wolfman11y ago

It conducts millions of simulations daily for each asset, then provides a snapshot of the top-performing results to GPT-4o for final selection.

I'm really pushing the limits of GPT-4o currently. I started testing with o1 just last week and it performs better. It's just so much more expensive.

magic_man1y ago

What brokers allow you to short crypto?

wolfman11y ago

If you're US-based, there is no major exchange support. There are the BITI and SETH ETFs for shorting BTC and ETH.

If you're non-US: Binance.

ratedgene1y ago· 4 in thread

It would be neat to see the process, where they get the data from, how they analyze it.

It would be neat to also see another experiment of a MAS doing this and coordinating to gamble together. Perhaps even different system/arch/expert configs.

sunnynagraOP1y ago

It currently makes up to recommendations, since not all stocks support fractional shares (I'm only doing $5 per trade). As part of the buy recommendation, a holding period is suggested as well.

Once the holding date is reached, that is when the sell order happens.

Would love to answer any other questions you may have.

dukeofdoom1y ago

How does one trade $5 when the stock price is higher? Also, what are the fees on this kind of trade, and with whom?

tasseff1y ago

How often is the holding period updated for a stock that’s already been purchased?

jingojango21y ago

Indeed!

datadrivenangel1y ago· 4 in thread

This is fun! What kind of prompts / prompting techniques are you using?

sunnynagraOP1y ago

Thanks! I use several key prompting techniques:

1. Role + Goal Setting: The AI acts as a creative market analyst focused on discovering overlooked opportunities and emerging trends.

The key is combining structured analysis with creative discovery - pushing the AI to look beyond obvious choices while maintaining some analytical rigor.

thevilledev1y ago

What’s the investment horizon for these daily decisions? Does it have a maximum hold time? How long will you run the experiment and is it enough to cover all the catalysts that are expected?

datadrivenangel1y ago

Makes sense. Any thoughts on expanding scope to have multiple 'analyst' roles per LLM model? Could be interesting to see if changing roles/prompts yields better results.

tedd4u1y ago

Sunny, given this investment objective, what would you consider a good (and transparent) benchmark? Thanks for sharing this.

jesprenj1y ago· 4 in thread

Right now they are just buying, no one is selling ... interesting.

jerkstate1y ago

I would guess that LLMs are biased towards making a positive assessment of ambiguous information, with specific social triggers prompting negative reaction.

normie30001y ago

Also it's hard to sell before buying, and it looks like it's only been going 2 days.

Joel_Mckay1y ago

Warren Buffett always said "...the best thing to do is buy a stock that you don't ever want to sell", but practically speaking the mean hold time for amateurs is around 2 to 4 months.

The stock market was built on information asymmetry, unfair positions, and ambitious gamblers... statistically it is rarely a reasonable investment for amateurs.

Good luck, =3

whoiscroberts1y ago

You have to buy before you sell

johng1y ago· 3 in thread

It wouldn't accept my first email address. Maybe the domain hit some censor (fscking.com).

Did a different email, it accepted it, I got the email, but got this error message when trying to confirm it: {"error":"Invalid verification token"} and a pretty-print checkbox that did nothing.

sunnynagraOP1y ago

Hey, can you try again? I ran into an API limit that should be resolved now

replwoacause1y ago

May I ask what mail service you use? I’m looking for one for my next side project.

EDIT: disregard…I saw in another comment you mentioned you were using mailgun. Thanks.

johng1y ago

Yup, worked now. Signed up.

attentionmech1y ago· 3 in thread

More simply, what I mean to ask is: the moment the market knows about your advantage, shouldn't you lose it because everyone else will use that information to balance the market?

EliBullockPapa1y ago

This phenomenon is called Alpha Decay. As more market participants exploit the predictor's advantage, the edge diminishes until it disappears.

attentionmech1y ago

thanks!

gmueckl1y ago

There is some very limited value in copying a successful strategy. Once enough market participants follow along, the strategy starts to fail. Markets are erratic because of that dynamic.

dghlsakjg1y ago· 3 in thread

Tried to sign up for emails, but got an error message!

sunnynagraOP1y ago

Can you try again? I had run into a rate limit

Rassi1y ago

Ditto here as well. Got the confirmation email, but clicking it yielded a server not found...

dghlsakjg1y ago

Worked this time around!

jingojango21y ago· 3 in thread

What is meant by 5 dollar stakes? The bought shares reach triple digits in price.

sunnynagraOP1y ago

Each morning the trades are conducted with $5 each, which are mostly fractional shares that are bought.
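As a rough illustration of how a $5 stake maps to a fractional share, here is a small sketch. The function name and the six-decimal rounding are assumptions for the example, not the site's actual code or a documented broker limit:

```python
from decimal import Decimal, ROUND_DOWN


def fractional_quantity(stake_usd, share_price, places=6):
    """Shares that a fixed dollar stake buys, rounded down (hypothetical helper)."""
    stake = Decimal(str(stake_usd))
    price = Decimal(str(share_price))
    if price <= 0:
        raise ValueError("share price must be positive")
    # Smallest increment we keep, e.g. 0.000001 shares for places=6.
    quantum = Decimal(1).scaleb(-places)
    # Round down so the order never costs more than the stake.
    return (stake / price).quantize(quantum, rounding=ROUND_DOWN)
```

So a $5 stake in a $250 stock comes out to 0.02 shares.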

chongli1y ago

You mean they add $5 in cash to each AI’s account? Because after dividends and sold shares they should have even more cash to work with.

jingojango21y ago

Would be interesting to see the amount of fractional shares bought as well as its comparison in percentage to the total budget that day.

aws-user1y ago· 2 in thread

Unfortunately I can't subscribe to the updates: "Failed to send verification email". Also, would you be willing to share what prompt you are using? Thanks!

sunnynagraOP1y ago

Hey, can you try again? I ran into an API limit that should be resolved now

Krasnol1y ago

I just tried. I get the same.

URL looks like that: http://undefined/api/verify-email?token=.....

noman-land1y ago· 2 in thread

Watch a random number generator generate random numbers.

sunnynagraOP1y ago

bee_rider1y ago

lewj1y ago· 2 in thread

Is there any weighting towards selling in the negative? Else the LLMs should just hold their unrealised losses and only sell after a local peak. Depends on their suggested measurement of success?

carlosjobim1y ago

What do you mean? The asset can just as well continue to sink. Or they're missing out using that money to buy a better asset.

sunnynagraOP1y ago

Not yet, but this is a great idea to look into.

ttul1y ago· 2 in thread

I’d love to tune in for updates, but the subscribe button says, “Failed to send verification email.” This is so cool. Would love to follow along.

sunnynagraOP1y ago

Hey ttul, can you try again? I fixed the issue, hit my API limit with my account on mailgun

ttul1y ago

mvdtnz1y ago· 2 in thread

Can't verify my email address for the sign-up, it sends me to the domain "undefined".

mickle001y ago

same, but :%s/undefined/trading.snagra.com/ did the trick

sunnynagraOP1y ago

Sorry if folks just got resent verification emails. I think I fixed the verification URL issue, so it should be resolved now.

asdefghyk1y ago· 2 in thread

What could go wrong?

dotancohen1y ago

Lose $5. Seems like a reasonable enough experiment.

jeffadelic1y ago

$5 * 3 models per day = $15 a day

Assume the experiment runs ~250 trading days in a year, and consider the worst case where they lose all their invested money: $15 * 250 = $3,750.

A little more than $5 :)
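The same back-of-the-envelope estimate as a snippet, with the 250 trading days being the assumption stated above:

```python
STAKE_USD = 5        # dollars per trade
MODELS = 3           # one trade per model per day
TRADING_DAYS = 250   # assumed trading days in a year

daily_outlay = STAKE_USD * MODELS                # dollars committed per day
worst_case_loss = daily_outlay * TRADING_DAYS    # every position goes to zero
```

This ignores any cash recycled from sold positions, so it is an upper bound on new money put in.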

dotancohen1y ago· 2 in thread

  > Best Performer: AIs are tied
  > Total Profit: $0.00

sunnynagraOP1y ago

No stocks have been sold yet, so no profit/loss has been calculated. If you look below, you can see the unrealized gains for stocks being held.

dotancohen1y ago

I see, thank you. Can they short?

clark-kent1y ago· 1 in thread

Very interesting idea. I'm thinking about creating an AI portfolio manager (private) that invests for the long term.

Some things to watch out for:

- I see Claude bought a penny stock SMX. This could be volatile, and the price could change significantly in 24 hours before the next execution at 9:30 am.

- The LLMs are day trading on some volatile securities; while LLMs could be good at day trading, unlike humans (we will find out), this setup has the disadvantage of only trading once a day.

EliBullockPapa1y ago

I would be very cautious about doing this with money you actually need. Even the best performing human day traders underperform the indexes over long time horizons. Why would a robot be better?

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3423101

If you don't want your bot to be a day trader, then just get low cost index funds.

geor9e1y ago· 1 in thread

sunnynagraOP1y ago

Yes, the prompt that I am using does bias towards buying because I am specifically asking it to make a recommendation on a stock to buy and the holding period.

jingojango21y ago· 1 in thread

It would be cool if it had a countdown to 6 am PST next day.

sunnynagraOP1y ago

Nice idea! I'll add it to my list of features to implement.

Plasmoid1y ago· 1 in thread

I'm getting "Failed to send verification email" when I try to sign up for your newsletter.

So props on doing proper double opt-in for newsletters.

sunnynagraOP1y ago

Can you check again if you'd still like to subscribe? I had an API limit I hit

pavel_lishin1y ago· 1 in thread

Where do they get the market news from?

sunnynagraOP1y ago

The most recent 50 news articles are pulled via this API: https://docs.alpaca.markets/reference/news-3

Animats1y ago· 1 in thread

This just started, apparently. It will be interesting to see where it is in three months.

KTibow1y ago

Funny that they're still using Claude 3 Sonnet then

unsupp0rted1y ago· 1 in thread

> Best Performer

> AIs are tied

Sounds about right

sunnynagraOP1y ago

None of the stocks have been sold yet; this is just day 2. Once some sales happen, performance will be better measured. If you scroll down, you can see the unrealized performance.

bun_terminator1y ago· 1 in thread

Sir, a second scrollbar just hit the towers

jingojango21y ago

No second scrollbar here, but something odd going on with the whitespace at the bottom.

vasco1y ago

> Every morning at 5:45 AM PST, three AI models (GPT-4o, Gemini 1.5 Pro, and Claude 3 Sonnet) analyze the latest market news and each recommends one stock to trade.

> At 6:00 AM PST, trades are automatically executed based on AI recommendations, investing $5 per trade

The best trading decision most days is to not trade. Outliers and diversions from the mean don't happen every day. This is trading just for the sake of it.

I predict a slow crawl down into zero eaten up by fees.

AmazingTurtle1y ago

detente181y ago

Interesting — does your backend server use Python? I couldn't find much about it on your site.

BadHumans1y ago

Can I let Claude do all my trading for me? It currently sits at 77% unrealized gains.

jeffadelic1y ago

How much are your infra costs for everything? And do you pay for the AI APIs or using free tier?

Really cool project and subscribed to follow along.

mvdtnz1y ago

Mate your shitty app is sending tripled up email barrages. That is absolutely not ok and is illegal in many places.

mind-blight1y ago

Super cool idea! What are you doing to ensure consistent results based on the input? E.g.

- does the AI perform the same trades given the same input?

- does the AI perform the same trades given slightly different inputs? (E.g. same data, but re-ordered)
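One way to probe the re-ordering question, sketched under assumptions: `recommend` stands in for whatever function sends the headlines to a model and returns a ticker (it is hypothetical, not part of the project), and the harness just counts how often each pick appears across shuffles.

```python
import random
from collections import Counter


def order_sensitivity(recommend, headlines, trials=20, seed=0):
    """Count the picks a recommender makes across shuffled copies of the input."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    picks = Counter()
    for _ in range(trials):
        shuffled = list(headlines)  # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        picks[recommend(shuffled)] += 1
    return picks
```

A recommender that is insensitive to order would produce a single key in the result; a spread across several tickers would suggest the ordering matters.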

forgingahead1y ago

lewj1y ago

I am committed - added to my daily morning reading list! Will be interesting - my gut says it will outperform a fair number of ETFs, if only due to the inevitable usage by said funds!

jasfi1y ago

For Gemini you should use either the latest experimental model (gemini-exp-1206) which should become 2.0 Pro, or 2.0 Flash (a released model). The 1.5 Pro model is way behind.

praveen99201y ago

I think this shows more of the bias of the market analysis (text) than anything else. The reasoning will mostly align with the analysis.

And also the pure randomness of picking one trade from the list of trades.

bee_rider1y ago

GPT’s guess makes the most sense. If you are an AI, invest in a competing AI company. If you are obsoleted, maybe you can buy your way out of being shut off.

TripleChecker1y ago

If nothing else, I'm genuinely curious which performs the best over the long-term.

Time to add some side wagers and bet on different models.

mattfrommars1y ago

> Node.js/Lambda backend for AI processing

Is this AWS? Why did you pick Lambda over, say, Python code in Flask to perform the actions?

woollysammoth1y ago

Sounds like a fun experiment! The overflow-x:hidden on body/html is causing weird issues when scrolling (on FF.)

malux851y ago

It would be so funny if Gemini shorted Google and made a huge profit

inSenCite1y ago

This should be fun to watch

sgammon1y ago

> Watch AI bots trade

> BOUGHT TLRY

tmaly1y ago

Any chance you can show the source code for this?

Thanks and Happy New Year

cedws1y ago

Now this is interesting. An LLM capable of delivering consistent returns even outside of a bull market would be more of an indicator of AGI to me than any of the benchmarks.
