Tell HN: Elevated "Internal server error" rates on Claude Opus again
This is more of a "it's not just you" post for those affected since Claude's status page is useless ("All Systems Operational"!)
My workflow started with me living in developer portals - Claude Workbench, OpenAI Platform, Vertex (the horrors of GCP's Console UI!). I would spend hours in these web UIs crafting prompts, iterating on system instructions and maintaining a carefully curated library of sessions. But the browser really wasn't optimised for this kind of interaction: editing was clunky (muscle-memory <C-W> would close the tab and wipe my work), and the back-and-forth between the LLM and me felt off. So I had an idea: what if I could turn a Markdown document into an LLM chat interface?
At first I thought that would be enough - just do what I was already doing in the browser... but in Neovim. And sure enough, having been accustomed to my own setup for the last decade, I immediately felt a productivity boost. Writing functional requirement documents, statements of work, deep research across different systems - all of it felt better in "my" editor.
I then had a taste of Aider... and Claude Code... and all the other tools that were coming out. And Flemma felt lacking. So I started building: tool support, better conversation organisation, a proper UI to tame the noise that tool calls introduce.
Today, Flemma is a fully evolved AI workspace. It runs autonomous agent loops, interacts with multiple LLMs (Anthropic, OpenAI, Vertex, Moonshot) and lets you switch providers mid-conversation - something I do occasionally during research, asking two or three different models for their take on a problem, then combining findings into a final document.
Under the hood, .chat files are just Markdown with role markers (@You:, @Assistant:), but Flemma treats them as a proper filetype with its own parser, AST, LSP server, template engine and sandboxed tool execution. The buffer *is* the state - no hidden database, no JSON history, no server process. Your conversations are portable, greppable and version-controllable (I back mine up in Git). You can close Neovim, reopen the file a week later and pick up exactly where you left off.
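For a rough picture, a minimal .chat file might look like this - the @You:/@Assistant: role markers are real, but the conversation content is an invented example:

```
@You: What's the difference between `vim.schedule` and `vim.defer_fn`?

@Assistant: `vim.schedule` queues a callback to run on the main loop as soon
as it's safe to call the API; `vim.defer_fn` does the same after a delay in
milliseconds.

@You: Good. Now draft a short doc comment for each.
```

Since it's plain Markdown, editing any turn is just editing text.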
It's got prompt caching, extended thinking, 7 built-in tools, a layered config system that gets out of your way and a UI that keeps getting refined to bring the noise down and make long agent sessions pleasant to work in (not quite there yet).
Flemma is for anyone who'd rather stay in Neovim. If that's you, I'd love to hear what you think.
Repo: https://github.com/Flemma-Dev/flemma.nvim
Demo: http://flemma.dev/flemma.nvim/blob/develop/README.md#-flemma
The core idea: a .chat file IS the conversation. No SQLite, no JSON logs, no shadow state. What you see in the buffer is exactly what the model receives. Edit an assistant reply to fix a hallucination, delete a tangent, fork by duplicating the file - it all works because there's nothing to fall out of sync.
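Because a conversation is a plain file, ordinary tooling applies directly - a quick sketch (the filenames are made up for illustration):

```shell
# Create a toy conversation, then treat it like any other file.
printf '@You: What is 2+2?\n\n@Assistant: 4\n' > design.chat
cp design.chat design-fork.chat     # fork by duplicating the file
grep -l '@Assistant:' ./*.chat      # greppable: matches both copies
```

The same goes for `git add design.chat`, diffing two forks, or piping a transcript through any text tool you like.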
What's new since October:
- Tool calling. Models can run shell commands, read/edit/write files (same as Pi, just 4 tools). Results go straight into the buffer. There's an approval flow (Ctrl-] cycles: preview -> execute -> send) so nothing runs without your say-so. Parallel tool use also works.
- Prompt caching for Anthropic, OpenAI and Vertex AI. Flemma places cache breakpoints automatically. Long conversations are now significantly cheaper (this was a major pain point for me).
- Extended thinking / reasoning support for all 3 providers.
- Per-buffer overrides via frontmatter. `flemma.opt` lets you pick which tools a buffer can use, set provider parameters, switch models - all scoped to that one file.
- Open registration APIs for both providers and tools. Custom tools can resolve definitions asynchronously from CLI subprocesses or remote APIs. I plan on adding mcporter support at some point.
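As a sketch of the per-buffer overrides, a frontmatter block might look like the following - note the keys under `flemma.opt` here are my illustration, not the plugin's documented schema:

```
---
flemma.opt:
  provider: anthropic      # switch provider for this file (key name assumed)
  tools: [bash, read]      # limit which tools this buffer may use (names illustrative)
---

@You: Audit the shell script below and suggest fixes.
```

Everything inside the frontmatter is scoped to that one file, so a research buffer and an agent buffer can run with different models and tool sets side by side.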
Flemma works with Anthropic, OpenAI and Vertex AI. You get cost tracking, presets, Lua template expressions, file attachments and a lualine.nvim component.
One thing I want to be upfront about: nearly every line of code in Flemma was written by AI (lately Claude Code; Amp and Aider before that). It says so in the README. Every change was personally architected, reviewed and tested by me - I decide what gets built and I vet every diff. I think this is where a lot of software development is heading, and I'd rather be honest about it than pretend otherwise.
I'm @StanAngeloff on GitHub - long-time Neovim user and open source enthusiast. Happy to answer questions.