What LM Studio is today is an IDE / explorer for local LLMs, with a focus on format universality (e.g. GGUF) and data portability (you can go to file explorer and edit everything). The main aim is to give you an accessible way to work with LLMs and make them useful for your purposes.
Folks point out that the product is not open source. However, I think we facilitate distribution and usage of openly available AI and empower many people to partake in it, while protecting (in my mind) the business viability of the company. LM Studio is free for personal experimentation and we ask businesses to get in touch to buy a business license.
At the end of the day LM Studio is intended to be an easy yet powerful tool for doing things with AI without giving up personal sovereignty over your data. Our computers are super capable machines, and everything that can happen locally w/o the internet, should. The app has no telemetry whatsoever (you’re welcome to monitor network connections yourself) and it can operate offline after you download or sideload some models.
0.3.0 is a huge release for us. We added (naïve) RAG, internationalization, UI themes, and set up foundations for major releases to come. Everything underneath the UI layer is now built using our SDK which is open source (Apache 2.0): https://github.com/lmstudio-ai/lmstudio.js. Check out specifics under packages/.
Cheers!
-Yagil
Has anyone found the same thing, or was that a fluke and I should try LM Studio again?
By default LM Studio doesn't fully use your GPU. I have no idea why. Under the settings pane on the right, turn the slider under "GPU Offload" all the way to 100%.
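The slider roughly controls how many model layers get offloaded to the GPU (the same knob as llama.cpp's `n_gpu_layers`). As a back-of-envelope illustration of why 100% is usually safe when the model fits in VRAM, here's a sketch; all sizes (model size, layer count, overhead) are illustrative assumptions, not measured values:

```python
def offloadable_layers(vram_gb: float, n_layers: int = 32,
                       model_size_gb: float = 4.7,
                       overhead_gb: float = 1.0) -> int:
    """Estimate how many of n_layers fit in VRAM after reserving some
    headroom for the KV cache, activations, and the runtime itself."""
    per_layer_gb = model_size_gb / n_layers  # assume layers are roughly equal
    usable_gb = max(vram_gb - overhead_gb, 0.0)
    return min(n_layers, int(usable_gb / per_layer_gb))

# An ~4.7 GB 8B Q4_0 model (32 layers) fits entirely on a 24 GB GPU,
# so the slider can safely sit at 100%; on a small GPU you'd only
# offload a fraction of the layers.
print(offloadable_layers(24.0))  # 32 -> offload everything
print(offloadable_layers(4.0))   # partial offload
```

If the estimate comes back below the layer count, dialing the slider below 100% avoids spilling into shared memory, which is where things get slow.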
The model is Dolphin 2.9.1 Llama 3 8B Q4_0.
I set it to 100% and wrote this: "hi, which model are you?"
The reply was a slow output of these characters, a mouse cursor that barely moved, and I couldn't click on the trackpad: "G06-5(D&?=4>,.))G?7E-5)GAG+2;BEB,%F=#+="6;?";/H/01#2%4F1"!F#E<6C9+#"5E-<!CGE;>;E(74F=')FE2=HC7#B87!#/C?!?,?-%-09."92G+!>E';'GAF?08<F5<:&%<831578',%9>.='"0&=6225A?.8,#8<H?.'%?)-<0&+,+D+<?0>3/;HG%-=D,+G4.C8#FE<%=4))22'*"EG-0&68</"G%(2("
Help?
“Some of us are well versed in the nitty gritty of LLM load and inference parameters. But many of us, understandably, can't be bothered. LM Studio 0.3.0 auto-configures everything based on the hardware you are running it on.”
So parent should expect it to work.
I find the same issue: using a MBP with 96GB (M2 Max with 38‑core GPU), it seems to tune by default for a base machine.
Between images, LLMs, and ML in general, this all feels like the DOS days of config.sys, autoexec.bat, and QEMM.
[1] https://discord.gg/aPQfnNkxGC [2] https://lmstudio.ai/blog
How is it possible that there's still no way to search through your conversations?
Why they won't enable search for their main web user crowd is beyond me.
Perhaps they are just afraid of scale. Even with all their resources, it's possible they can't estimate the scale and complexity of the queries they'd receive.
I think it might be in their interest for you to just ask the LLM again: old answers might not be up to their current standards, and they don't gain feedback from you looking at old answers.
Does anyone have a recommendation?
For context: I have almost ten years of experience with deep learning, but I want something easy to set up on my home M2 Mac; Google Colab would also be OK.
https://github.com/danny-avila/LibreChat
Jan's probably the closest thing to an open-source LLM chat interface that's relatively easy to get started with.
I personally prefer LibreChat (which supports integration with image generation), but it does have to spin up some Docker stuff, and that can make it a bit more complicated.
You can already do this? https://i.imgur.com/BpF3K9t.png
It allows both local and cloud models.
* Not associated with them in any way. Am a happy user.
Interacting with a local LLM develops one's intuitions about how LLMs work, what they're good for (appropriately scaled to model size) and how they break, and gives you ideas about how to use them as a tool in bigger applications without getting bogged down in API billing etc.
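As a concrete illustration of the "no API billing" point: LM Studio (like several other local runners) exposes an OpenAI-compatible HTTP server, so you can poke at a local model with a few lines of stdlib Python. This is a minimal sketch; the URL assumes LM Studio's default port 1234, and the model name is a placeholder (the server uses whatever model you have loaded):

```python
import json
import urllib.request

def build_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-style chat completions payload."""
    return {
        "model": model,  # placeholder; the locally loaded model is used
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def chat(prompt: str,
         url: str = "http://localhost:1234/v1/chat/completions") -> str:
    """Send one chat turn to a locally served model. No API key, no billing."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (with a model loaded and the local server running):
#   print(chat("hi, which model are you?"))
```

Because the endpoint mimics the OpenAI API shape, code you prototype against it tends to port straight over to hosted providers later.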
There's just a lot of great stuff you're missing out on if you're waiting on products while ignoring the very accessible, freely available tools they're built on top of — and are often reductions of.
I'm not against overlays like Ollama and LM Studio, but I'm confused about why they exist when there's no additional barrier to going on Hugging Face or using kcpp, ooba, etc.
I just assume it's an awareness issue, but I'm probably wrong.
Doing so will, at the very least, not help us with our interviews. It will also restrict our mindset of how one can make use of LLMs, through the distraction of sleek, heavily abstracted interfaces. That makes it harder, if not impossible, for us to come up with bright new ideas that undermine models in various novel ways — ideas which are almost always derived from a deep understanding of how things actually work under the hood.
CPU inference is incredibly slow versus my RTX 3090, but technically it will work.