And it has a headphone jack, OK? I just hate Bluetooth earbuds. And yeah, it is a problem, but I digress.
When I run a ~2.5B-parameter model, I get respectable output. It takes a minute or two to process the context, then output begins at somewhere on the order of 4 to 10 tokens per second.
So I just make a query, give it a few minutes, and I have my response.
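For a rough sense of the wait, those numbers turn into a simple back-of-the-envelope estimate (a minimal sketch; the 90-second context figure and 300-token answer length are assumptions I picked inside the ranges above, not measurements):

```python
def estimate_response_seconds(context_secs, tokens_out, tokens_per_sec):
    """Rough wall-clock time for one query: a fixed context-processing
    cost up front, plus generation at a steady token rate."""
    return context_secs + tokens_out / tokens_per_sec

# Assumed numbers: ~90 s to chew through the context, a 300-token
# answer, at the fast (10 tok/s) and slow (4 tok/s) ends of what I see.
fast = estimate_response_seconds(90, 300, 10)  # 90 + 30 = 120 s
slow = estimate_response_seconds(90, 300, 4)   # 90 + 75 = 165 s
print(f"{fast:.0f}s to {slow:.0f}s")           # prints "120s to 165s"
```

Two or three minutes per answer, in other words, which squares with "give it a few minutes."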
Here is how I see it:
That little model, which is Gemma 2 2B, sorry, knows a lot of stuff. It has knowledge I don't, and it gives it to me in a reasonable, though predictable, way. Answers always have a certain teacher-reminding-a-student-how-it-all-goes quality to them.
I don't care. Better is nice, but if I were stuck somewhere with no network, being able to query that model is amazing!
First aid, how to make fires, materials and their uses, fixing stuff, theories of operation, what things mean, and more are all in that thing, ready for me to take advantage of.
I consider what I have fast. And it will get one or two orders of magnitude faster over the next few years, too.
I did it on a lark (ask the model what that means) and was surprised to see I gained a nice tool.