There were several links:
- Blog for details: https://homebrew.ltd/blog/llama-learns-to-talk
- Code: https://github.com/homebrewltd/ichigo
- Run locally: https://github.com/homebrewltd/ichigo-demo/tree/docker
- Demo on a single 3090: https://ichigo.homebrew.ltd/
A quick intro: we're a company building local AI tools and training open-source models.
Ichigo is our training method that enables LLMs to understand human speech and talk back with low latency, thanks to FishSpeech integration. It is open data and open weights, weight-initialized from Llama 3.1 to extend that model's reasoning ability to speech.
Plus, we are the creators and lead maintainers of https://jan.ai/ (Jan, a local AI assistant and alternative to ChatGPT) and https://cortex.so/ (Cortex, a local AI toolkit; soft launch coming soon).
Everything we build and train is done out in the open - we share our progress on:
- https://x.com/homebrewltd
- https://discord.gg/hTmEwgyrEg
You can check out all our products on our simple website: https://homebrew.ltd/
I think Matrix is not publicly indexable unless the channel is unencrypted and set to public.
If I remember correctly, "ichigo" means strawberry in Japanese. You are welcome.
Can you help me wrap my brain around this? Does it mean six? I'm struggling to understand how a word can mean two numbers and how this would actually be used in a conversation.
Thanks. I'm curious, but searching for this just returns anime results.
Ban-kai 卍解
I'm trying to use ChatGPT for AI translation, but the other big problem I run into is TTS and STT for non-top-40 languages (e.g. Lao). Facebook has a TTS library, but unfortunately it isn't open for commercial use.
Bringing AI into this space enhances user experience while respecting their autonomy over data. It feels like a promising step toward a future where we can leverage the power of AI without compromising on privacy or control. Really looking forward to seeing how this evolves!
To clarify, while you can enable transcription to see what Ichigo says, Ichigo's design skips directly from audio to speech representations without creating a text transcription of the user’s input. This makes interactions faster but does mean that the user's spoken input isn't transcribed to text.
The flow we use is Speech → Encoder → Speech Representations → LLM → Text → TTS. By skipping the input transcription step, we're able to speed things up and focus on the verbal experience.
Hope this clears things up!
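For anyone who finds the flow easier to read as code, here's a minimal sketch of the pipeline shape described above. Every function here is a placeholder with an invented name (this is not Ichigo's actual API); the point is only to show that the user's audio goes straight from the encoder's discrete representations into the LLM, and text appears only on the reply side, just before TTS.

```python
def encode_speech(audio_samples):
    """Stand-in encoder: quantize raw audio into discrete speech tokens.
    (The real system uses a learned quantizer; this toy version just
    buckets sample magnitudes into a 512-entry codebook.)"""
    return [int(abs(s) * 100) % 512 for s in audio_samples]

def llm_generate(speech_tokens):
    """Stand-in LLM: consumes speech tokens directly -- note there is
    no transcript of the user's input anywhere -- and emits reply text."""
    return f"Reply to {len(speech_tokens)} speech tokens"

def tts(text):
    """Stand-in TTS: turn the reply text back into audio
    (here, a dummy silent waveform of matching length)."""
    return [0.0] * len(text)

def respond(audio_samples):
    """Speech -> Encoder -> Speech Representations -> LLM -> Text -> TTS.
    Only the model's *reply* passes through a text stage; that text is
    what you see if you enable transcription in the demo."""
    tokens = encode_speech(audio_samples)
    reply_text = llm_generate(tokens)
    return reply_text, tts(reply_text)
```

Under this sketch, `respond([...])` returns both the reply text (for optional display) and the synthesized audio, which matches the behavior described: the assistant's words can be shown, but the user's speech is never transcribed.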
The documentation isn't very detailed yet, but we're planning to improve it and add support for various hardware.