https://chatgpt.com/share/6770c547-2f90-8004-ba41-21bfa4d3a7...
Curious about your local LLM usage -- do you have that documented, or can you recommend sources on how to get started in that domain? I self-host most of my infrastructure, but not LLMs so far. Do you need special hardware? How do you interact with the LLMs? How do you keep them updated? Do you fine-tune or do any training, or just use off-the-shelf Llama? Do you need to know a bunch about quantization? How fast are the responses? Can you use them in your IDE as a coding assistant? How is resource utilization?