Either LFM2.5-1.6B-4bit or Qwen3.5-2B-8bit or Qwen3.5-4B-4bit
I installed it and it's none of that. It is a mere wrapper around small local LLM models. And, it's not even multi-modal! Anyone could've one-shotted this in Claude in an hour (I'm not exaggerating).
What's the target audience here? Your average person doesn't care about the privacy value proposition (at least not at the cost of severely sacrificing the chat model's quality). And users who do want that control can already install LMStudio/Llama.cpp (which is dead simple to set up).
The actual release product should've been what's described in the "What's next" section.
> Instead of general chat, we shape Ensu to have a more specialized interface, say like a single, never-ending note you keep writing on, while the LLM offers suggestions, critiques, reminders, context, alternatives, viewpoints, quotes. A second brain, if you will.
> A more utilitarian take, say like an Android Launcher, where the LLM is an implementation detail behind an existing interaction that people are already used to.
> Your agent, running on your phone. No setup, no management, no manual backups. An LLM that grows with you, remembers you, your choices, manages your tasks, and has long-term memory and personality.
I think they did. If you start the download and then open the sidebar and/or background the app, the download progress bar disappears and is replaced by the download button. If you press the download button again, the progress bar reappears at the correct point.
I find that Claude often makes little statefulness mistakes like that. Human developers do too, but the slower and more iterative nature of human development makes it more likely that that would get caught.
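To make the kind of bug concrete, here is a minimal sketch of one way to avoid it: keep the download state in a long-lived object and have the UI derive what it shows from that object every time it is rebuilt. This is not Ente's code; `DownloadStore`, `Phase`, and `render` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Phase(Enum):
    IDLE = auto()
    DOWNLOADING = auto()
    DONE = auto()


@dataclass
class DownloadState:
    phase: Phase = Phase.IDLE
    bytes_done: int = 0
    bytes_total: int = 0


class DownloadStore:
    """Hypothetical long-lived owner of download state; outlives any screen."""

    def __init__(self) -> None:
        self.state = DownloadState()

    def start(self, bytes_total: int) -> None:
        if bytes_total <= 0:
            raise ValueError("bytes_total must be positive")
        # Pressing the button again must not reset a download in flight.
        if self.state.phase is Phase.IDLE:
            self.state = DownloadState(Phase.DOWNLOADING, 0, bytes_total)

    def advance(self, n: int) -> None:
        s = self.state
        if s.phase is not Phase.DOWNLOADING:
            return
        s.bytes_done = min(s.bytes_done + n, s.bytes_total)
        if s.bytes_done == s.bytes_total:
            s.phase = Phase.DONE


def render(store: DownloadStore) -> str:
    """Rebuild the widget purely from the store, never from view-local flags."""
    s = store.state
    if s.phase is Phase.IDLE:
        return "[Download]"
    if s.phase is Phase.DOWNLOADING:
        return f"[Progress {100 * s.bytes_done // s.bytes_total}%]"
    return "[Ready]"
```

Because `render` reads only from the store, opening the sidebar or backgrounding the app (anything that rebuilds the view) shows the progress bar again instead of falling back to the button.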
This probably could have been one-shotted with Sonnet, not even Opus. Given how over-indexed they are on LLM coding, Haiku might even be able to do it.
This is actually an interesting coding model benchmark task now that I think about it.
There is truly nothing original here and the product doesn't have a chance in hell of earning money. Local LLMs on-device will be dominated by the device vendors, whose control of the hardware stack combined with their ability to subsidize billions of dollars of machine learning research gives them an unfair advantage. Apple knows what the next generation of silicon will deliver, and their ML engineers are already hard at work building models that will be highly optimized for that silicon a year or two ahead of time. Open source models are really great and are backed by well-funded labs; however, delivering these models on-device in a way that pleases users will never be easier than it is for the vendors of the devices.
Plus, device vendors have ways of making money from local LLMs that third-party app providers do not. They can make their local LLM free and earn money on the hardware play, without skipping a beat on the billions of dollars of ongoing R&D. I don't see how third-party app vendors make money here when they will be competing with the decent, totally free alternative that Apple and Google (and Samsung etc.) will load on in the next year or two.
We have not seen a tidal wave of untechnical people vibe coding up their own software solutions.
Ideally if you "participate" in the network, you would get "credits" to use it proportionally to how much GPU power you have provided to the network. Or if you can't, then buy credits (payment would be distributed as credits to other participants).
That way we could build huge LLMs that are really open and are not owned by any network.
I would LOVE to participate in building that as well.
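A rough sketch of how that credit accounting could work, under loudly stated assumptions: one credit per GPU-second contributed, and a purchase split among the other contributors in proportion to what they have provided. `CreditLedger` and all of its methods are hypothetical; no existing network is being described here.

```python
from collections import defaultdict


class CreditLedger:
    """Hypothetical ledger: earn credits by contributing GPU time, or buy them."""

    def __init__(self) -> None:
        self.balances: dict[str, float] = defaultdict(float)
        self.contributed: dict[str, float] = defaultdict(float)

    def record_contribution(self, user: str, gpu_seconds: float) -> None:
        if gpu_seconds <= 0:
            raise ValueError("gpu_seconds must be positive")
        self.contributed[user] += gpu_seconds
        # Assumption: 1 credit per GPU-second provided to the network.
        self.balances[user] += gpu_seconds

    def buy(self, buyer: str, credits: float) -> dict[str, float]:
        """Credit the buyer and return each other contributor's share of the payment."""
        others = {u: c for u, c in self.contributed.items() if u != buyer}
        total = sum(others.values())
        if total <= 0:
            raise ValueError("no other contributors to pay")
        self.balances[buyer] += credits
        # Assumption: the payment is split pro rata by contributed GPU time.
        return {u: credits * c / total for u, c in others.items()}

    def spend(self, user: str, credits: float) -> None:
        if self.balances[user] < credits:
            raise ValueError("insufficient credits")
        self.balances[user] -= credits
```

For example, if `alice` has contributed 300 GPU-seconds and `bob` 100, a 40-credit purchase by `carol` would pay out 30 to `alice` and 10 to `bob`. The hard parts (verifying that the contributed work was actually done, and pricing) are left out entirely.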
This was posted the other day, but only briefly made the front page - seems kinda like what you’re talking about
Have a comparison chart to Ollama, LMStudio, LocalAI, Exo, Jan.AI, GPT4ALL, PocketPal, etc.
Going to give this a try...
Does this seem sound?
What I'm missing is a way to create and use Passkeys across devices. My use case does not support creating a new Passkey on every device, I need to sync them via servers I control. The system that supports that will be the system that I migrate to.
Expressly harvesting creds through a 2FA app seems a little more direct.
When the comments here say "there's no value because anyone could've compiled llama.cpp", you can see how detached from reality these people are.
Even jumping through the hoops to get an app on Play Store and Apple Store — an app that I can tell my friends to look up and download — is worth a lot.
An app that is also available on Mac and PC, mind you.
I'm an ex-Google/Meta/Microsoft/Roblox software engineer, and I couldn't be bothered to do any of that.
Neither could the rest of HN. But I'm not the one complaining about lack of novelty or value in this proposition.
However, it’s a bit confusing because, for example, a larger model was downloaded to my smartphone than to my computer. It would probably make the most sense if the app simply categorized devices into five performance tiers, downloaded the appropriate model for the tier a device falls into, and told the user which tier that is. Over time, it would then be possible to periodically replace the LLM for each tier with a better one, or to redefine the tiers based on hardware advancements.
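A minimal sketch of that tiering idea, assuming RAM is the only signal used. The thresholds and model names below are placeholders I made up, not anything Ente ships; a real app would probably also look at the GPU/NPU and free storage.

```python
from bisect import bisect_right

# Hypothetical lower RAM bounds (in GB) for tiers 1..5.
TIER_MIN_RAM_GB = [0, 4, 6, 8, 12]

# Placeholder model names; swapping the model for a tier is a one-line change.
MODEL_FOR_TIER = {
    1: "tiny-model-4bit",
    2: "small-model-4bit",
    3: "small-model-8bit",
    4: "medium-model-4bit",
    5: "medium-model-8bit",
}


def tier_for(ram_gb: float) -> int:
    """Return the highest tier whose lower RAM bound the device meets."""
    if ram_gb < 0:
        raise ValueError("ram_gb must be non-negative")
    return bisect_right(TIER_MIN_RAM_GB, ram_gb)


def select_model(ram_gb: float) -> tuple[int, str]:
    """Pick the tier and its model so the app can show both to the user."""
    tier = tier_for(ram_gb)
    return tier, MODEL_FOR_TIER[tier]
```

With these made-up thresholds, a 6 GB phone lands in tier 3 and a 16 GB laptop in tier 5, so a computer would never get a smaller model than a phone with less memory.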
A little bit of cleanup on their site to break out "Ente, our original photo sharing app" from the rest of their apps would do wonders, because I had to search around on the announcement to find the download for this app, which feels about like trying to find the popular Ente Auth app on their website.
EDIT: and there are long-standing bugs like this one, unaddressed: https://github.com/ente-io/ente/issues/3087
Helping non-technical people get off of ChatGPT.com and using increasingly better local models seems worth celebrating and continued iteration.
https://github.com/ente-io/ente/blob/f254af939ff6950b63edf5f... Here is the system prompt, kinda embarrassing
This does the same for language models.
https://github.com/Arthur-Ficial/apfel
Apple AI on the command line
> Ente is becoming like Proton: too many products and a lack of focus, leading to lower quality and not delivering what customers want
https://github.com/ente-io/ente/discussions/552#discussionco...
If you have any follow up questions, please do ask.
I'd love to know a few more local LLM apps that are available on Android and iOS and Mac/PC under the same branding that I can point my non-technical friends to as a ChatGPT alternative that works offline (but still has sync across the devices).
Could you recommend a few?
I've found https://github.com/alichherawalla/off-grid-mobile-ai but haven't tried anything in this space yet.
Absolutely no one called them crazy.
hundreds of local llm apps exist
the what's next section acts novel but all of them have been achieved or created in some format already: you can run a local llm on a phone and connect it to an agent
Then I moved to PocketPal for local LLMs.
How does it compare to Jan AI for example? or LM Studio? or ????
They also have a TOTP auth app?
If their photos app stopped crashing and they pursued basic feature parity between their iOS and desktop apps (IMO table stakes for a photo sync service) I'd have no issue recommending them. Instead, it seems like every so often they just branch off into a new direction, leaving the existing products unfinished. It's like Mozilla-level lack of focus.
We'd like to fix the crashes.
Sorry for the troubles.
I have a phone in a drawer I could install termux and ollama on over tailscale and then I'd have an always on llm for super light tasks.
I do really long for a private chat bot but I simply don't have access to the hardware required. Sadly I think it's going to be years to get there.
If Ente is reading this: please add the requirements to run it (how much RAM, etc.)
Come onnnnnn. I would rather read a one line "Check out our offline llm" rather than a whole press release of slop.
This looks very neat. I'm not familiar with the nitty-gritty of AI so I really don't understand how it can reply so quickly running on an iPhone 16. But I'm not even going to bother searching for details because I don't want to read slop.
It requires a Firefox add-on to act as a bridge: https://addons.mozilla.org/en-US/firefox/addon/ai-s-that-hel...
There is honestly not much to test just yet, but feel free to check it out here and provide feedback on the idea: https://codeberg.org/Helpalot/ais-that-helpalot
The essence works: I was able to have it produce a simple summary of CMS content. So next is making it do something useful, and making it clear how other plugins could use it.
Also: "Your AI agent can now create, edit, and manage content on WordPress.com" https://wordpress.com/blog/2026/03/20/ai-agent-manage-conten...
For when WordPress doesn't have enough exploits and bugs as it is. Also, why bother with WordPress in the first place if you're already having an LLM spit out content for you?