Great question! We explored local LLMs (including llamafile-type solutions) in our early development, but found that the reasoning capabilities and consistency weren't quite there yet for our specific needs.
That's why we currently optimize for cloud AI models while implementing intelligent plan caching to keep API costs down. This approach gives you the best of both worlds: high-quality execution plans at low cost, plus much faster responses for similar actions.
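Roughly, the idea behind plan caching looks like this (a minimal illustrative sketch with hypothetical names, not our actual implementation; real systems might also use semantic similarity rather than exact key matching):

```python
import hashlib
import json

class PlanCache:
    """Caches LLM-generated execution plans keyed by a normalized action."""

    def __init__(self, generate_plan):
        self._generate_plan = generate_plan  # costly call to a cloud LLM
        self._cache = {}

    def _key(self, action, params):
        # Normalize the request so equivalent actions hash identically.
        payload = json.dumps(
            {"action": action, "params": sorted(params)}, sort_keys=True
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    def get_plan(self, action, params):
        key = self._key(action, params)
        if key not in self._cache:
            # Cache miss: pay the API cost once for this action shape.
            self._cache[key] = self._generate_plan(action, params)
        # Cache hit: no API call, so repeat actions are much cheaper/faster.
        return self._cache[key]
```

With this shape, the second request for the same action returns the cached plan without touching the LLM at all.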
You might find our documentation on plan caching interesting - it explains how we maximize efficiency: https://github.com/orra-dev/orra/blob/main/docs/plan-caching...
We're always evaluating new LLM options though, so I'd be curious to hear about your specific use case.