Here are a few examples:
https://morioh.com/p/55296932dd8b
https://www.youtube.com/watch?v=iQ3Lhy-eD1s
https://news.ycombinator.com/item?id=35430432
Side note: you need bonkers hardware to run it efficiently. I'm currently using a 16-core CPU, 128GB RAM, a PCIe 4.0 NVMe drive and an RTX 3090. There are ways to run it on less powerful hardware, like 8 cores, 64GB RAM, a plain SSD and an RTX 3080 or 3070, but I happen to have a large corpus of data to process so I went all in.
I have similar hardware at home, so I wonder how reliably you can process simple queries using domain knowledge + logic that work on mlc-llm, something like: "If you can choose the word food, or the word laptop, or the word deodorant, which one do you choose for describing 'macbook air'? Answer precisely with just the word you choose."
If it works, can you upload the weights somewhere? IIRC, Vicuna is open source.
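For what it's worth, the kind of check I have in mind could be scripted roughly like this. It's only a sketch: `build_prompt`, `parse_answer` and `reliability` are names I made up, and `generate` is a placeholder for whatever function sends a prompt to your local model and returns its text reply (I'm not assuming any particular mlc-llm API here). It just counts how often the reply is exactly the expected word.

```python
from typing import Callable, Optional, Sequence


def build_prompt(item: str, labels: Sequence[str]) -> str:
    """Build the forced-choice question for one item (hypothetical helper)."""
    options = ", or ".join(f"the word {label}" for label in labels)
    return (
        f"If you can choose {options}, which one do you choose for describing "
        f'"{item}"? Answer precisely with just the word you choose.'
    )


def parse_answer(reply: str, labels: Sequence[str]) -> Optional[str]:
    """Return the chosen label, or None if the reply is not exactly one label."""
    cleaned = reply.strip().strip(".\"'").lower()
    allowed = {label.lower() for label in labels}
    return cleaned if cleaned in allowed else None


def reliability(
    generate: Callable[[str], str],  # placeholder: prompt in, model reply out
    item: str,
    labels: Sequence[str],
    expected: str,
    runs: int = 20,
) -> float:
    """Fraction of runs in which the model answered with the expected label."""
    prompt = build_prompt(item, labels)
    hits = sum(
        parse_answer(generate(prompt), labels) == expected.lower()
        for _ in range(runs)
    )
    return hits / runs
```

You would call it as `reliability(my_model_fn, "macbook air", ["food", "laptop", "deodorant"], "laptop")`, where `my_model_fn` is your own wrapper around the model. Anything chattier than a single word counts as a miss, which is the point of the "answer precisely" part.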