modeless · 2y ago

I'm using Llama2-chat-13B via mlc-llm at 4-bit quantization + whisper-streaming + coqui TTS, all running simultaneously on one 4090 in real time.

It didn't take long to prototype. Polishing and shipping it to non-expert users would take much longer than I've spent on it so far. I'd have to test for and solve a ton of installation problems, find better workarounds for whisper-streaming's hallucination issues, improve the heuristics for controlling when to start and stop talking, tweak the prompts to make the LLM responses better suited to speech, fix up the LLM context when the LLM's speech is interrupted, probably port the whole thing to Windows for broader reach in the installed base of 4090s, possibly introduce a low-memory mode that can support the much more common 12GB GPUs, document the requirements and installation process, and figure out hosting for the ginormous download it would be. I'd estimate at least 10x the effort I've spent so far on the prototype before I'd really be satisfied with the result.
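To make the "fix up the LLM context when the LLM's speech is interrupted" item concrete, here is a minimal sketch of one way it could work. It is not the prototype's actual code. It assumes the playback layer can report roughly how many characters of the reply were spoken before the user cut in; the names `Message`, `fix_up_interrupted_reply`, `spoken_chars`, and the `" [interrupted]"` marker are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def fix_up_interrupted_reply(history: List[Message], spoken_chars: int,
                             marker: str = " [interrupted]") -> List[Message]:
    """Return a new history whose last assistant message holds only what was spoken.

    spoken_chars is an assumed input: the number of characters of the reply
    that were actually played back before the interruption.
    """
    if not history or history[-1].role != "assistant":
        return list(history)

    text = history[-1].content
    cut = max(0, min(spoken_chars, len(text)))
    if cut == len(text):
        return list(history)  # the reply was spoken in full

    # If the cut lands mid-word, back up to the start of that word.
    if not text[cut].isspace():
        while cut > 0 and not text[cut - 1].isspace():
            cut -= 1
    spoken = text[:cut].rstrip()

    if not spoken:
        return list(history[:-1])  # nothing audible was said, so drop the reply
    return list(history[:-1]) + [replace(history[-1], content=spoken + marker)]


history = [
    Message("user", "What is the capital of France?"),
    Message("assistant", "The capital of France is Paris, a city on the Seine."),
]
fixed = fix_up_interrupted_reply(history, spoken_chars=20)
print(fixed[-1].content)
```

The point of the truncation is that the next LLM turn sees only what the user could have heard, so the model does not refer back to text that was never spoken.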

I'd honestly love to do all that work. I've been prioritizing other projects because I judged that it was so obvious as a next step that someone else was probably working on the same thing with a lot more resources and would release before I could finish as a solo dev. But maybe I'm wrong...

4 comments · 2 top-level

pmarreck · 2y ago

> It didn't take long to prototype. Polishing and shipping it to non-expert users would take much longer than I've spent on it so far. I'd have to test for and solve a ton of installation problems

I've found some success at this by using Nix... but Nix is a whole 'nother ball of yarn to learn. It WILL get you to declarative/deterministic installs of any piece of the toolchain it covers, though, and it does a hell of a lot better job managing dependencies than anything in Python's ecosystem ever will (in fact, I am pretty sure that Python's being terrible at this is actually driving Nix adoption).

As an example of the power Nix might enable, check out https://nixified.ai/ (which is a project that hasn't been updated in some months and I hope is not dead... It does have some forks on GitHub, though). Assuming you already have Nix installed, you can get an entire ML toolchain up, including a web frontend, with a single command. I have dozens of projects on my work laptop, all with their own flake.nix files, all using their own versions of dependencies (which automatically get put on the PATH thanks to direnv). Nothing collides with anything else, and everything is independently updateable. I'm actually the director of engineering at a small startup, and having our team's dev environments all controlled via Nix has been a godsend already (as in, a massive timesaver).

I do think that you could walk a live demo of this into, say, McDonald's corporate, and walk out with a very large check and a contract to hire a team to look into building it out into a product, though. (If you're going to look at chains, I'd suggest Wawa first though, as they seem to embrace new ordering tech earlier than other chains.)

modeless (OP) · 2y ago

I'm not the guy working on ordering; that's this guy: https://news.ycombinator.com/user?id=TheEzEzz

Nix sounds good for duplicating my setup on other machines I control. But I'd like a way to install it on user machines, for users who probably don't want to install Nix just for my thing. Nix probably doesn't have a way to make self-contained packages, right?

pmarreck · 2y ago

> But I'd like a way to install it on user machines, users who probably don't want to install Nix just for my thing. Nix probably doesn't have a way to make self contained packages, right?

I mean... That's the heart of the problem right there. You can either have all statically compiled binaries (which don't need Nix to run) which have no outside dependencies but result in a ton of wasted disk space with duplicate dependency data everywhere, or you can share dependencies via a scheme, of which the only one that makes real sense (because it creates real isolation between projects but also lets you share equal dependencies with zero conflicts) is Nix's (all of the others have flaws and nondeterminism).

joshspankit · 2y ago

I wish Docker could be used more easily with graphics cards and other hardware peripherals (speakers/mic in this case). It would solve a lot of these issues.
