I feel like there could be 2 major advantages: costs at scale and privacy.
1. Cost. GPT-4o-mini is inexpensive, and if we continue on that path, the cost of inference will soon become negligible. Unless your company makes heavy use of the model (or uses huge contexts), like those running thousands of autonomous agents, investing in the hardware does not seem like the best alternative.
2. Privacy. I would say this is more relevant for industries that work with highly sensitive data. However, I can see big companies simply signing private cloud contracts with Azure or other cloud providers, which give them that peace of mind and scalability and, depending on the contract, some guarantees.
So my big question is: do you know of use cases or companies deploying LLMs in their own data centers, or looking to do so, or is this just for hobbyists?
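To put rough numbers on the cost argument in point 1, here is a minimal break-even sketch. Every figure in it (hardware price, amortization period, running cost, API price per million tokens) is a placeholder I made up for illustration, not a real quote from any provider, and the function name is hypothetical. It only shows the shape of the calculation: how many tokens per month you would need to process before owning the hardware costs the same as paying per token.

```python
def break_even_tokens_per_month(
    hardware_cost,
    amortization_months,
    monthly_running_cost,
    api_price_per_million_tokens,
):
    """Monthly token volume at which self-hosting costs the same as a pay-per-token API."""
    # Spread the hardware purchase over its assumed lifetime, then add power, hosting, etc.
    monthly_self_hosted = hardware_cost / amortization_months + monthly_running_cost
    # Tokens you could buy from the API for that same monthly amount.
    return monthly_self_hosted / api_price_per_million_tokens * 1_000_000


# Placeholder inputs only: replace them with your own quotes before drawing conclusions.
tokens = break_even_tokens_per_month(
    hardware_cost=36_000.0,
    amortization_months=36,
    monthly_running_cost=500.0,
    api_price_per_million_tokens=0.50,
)
print(f"{tokens:,.0f} tokens per month")  # prints "3,000,000,000 tokens per month"
```

With these made-up inputs the break-even point is three billion tokens a month. The sketch ignores staff time, utilization, and model quality differences, so treat it as a lower bound on the volume you would need.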
```
401 GET / Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com
```
I think it is just genius. It is a demonstration of effective marketing and knowing where your audience is.
I am currently considering moving a framework I built from Python to Rust to make it faster and take advantage of Rust's safety features. However, one of my requirements is that users can still write Python code, so I was thinking about using RustPython for that. I have been doing basic experiments, but I would like to ask whether anyone has done this before and what limitations you ran into along the way. I have read somewhere that RustPython now seems to support pip packages, but I am not sure what the limitations of that support are either.
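For context, the kind of basic experiment I mean looks roughly like the sketch below: a small script that reports which interpreter is running it and tries to import a list of modules. The module list is a placeholder and `probe_modules` is just a helper name I picked. The idea is to run the same file once under CPython and once under the RustPython binary and compare the output, without assuming in advance what either will report.

```python
import importlib
import sys


def probe_modules(names):
    """Try to import each module and record whether the import worked."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = "ok"
        except Exception as exc:
            # ImportError is the usual failure, but keep probing on anything else too.
            results[name] = f"failed: {type(exc).__name__}"
    return results


if __name__ == "__main__":
    # Placeholder list: swap in whatever users of the framework actually import.
    candidates = ["json", "re", "sqlite3", "numpy"]
    print("interpreter:", sys.implementation.name)
    for name, status in probe_modules(candidates).items():
        print(f"{name}: {status}")
```

This only checks that a module can be imported, not that it behaves correctly, so it is a first filter rather than a real compatibility test.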
Thanks in advance