i just use review skills on my sessions/prs extensively until they're in a good place. then i run coverage skills (which add test coverage).
i built skills to work with M3 into a service called typed, an ai cli. it uses M3 under the hood (up to ~500k tokens), then switches to deepseek for up to 1M. it also has a few bells and whistles for typescript/python coding optimization. and i just built a custom TUI frontend for it (it initially worked with the claude code tui and still does).
to toggle the typed tui you can run:
typed cli on
typed cli off
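
for the M3/deepseek switching mentioned above, here's a rough sketch of what that kind of context-size routing could look like. this is not typed's actual code: `pick_model`, the constant names, and the hard cutoffs are all made up for illustration, and the ~500k figure is approximate.

```python
# hypothetical sketch of context-size routing; not typed's real implementation
M3_LIMIT = 500_000          # approximate cutoff (the "~500k tokens" above)
DEEPSEEK_LIMIT = 1_000_000  # upper bound for the deepseek fallback


def pick_model(context_tokens: int) -> str:
    """Return the name of the backend that would handle a context of this size."""
    if context_tokens < 0:
        raise ValueError("context_tokens must be non-negative")
    if context_tokens <= M3_LIMIT:
        return "m3"
    if context_tokens <= DEEPSEEK_LIMIT:
        return "deepseek"
    # nothing in this sketch handles contexts past 1M tokens
    raise ValueError("context exceeds the 1M-token ceiling")
```

so a 120k-token session would stay on M3, and an 800k-token one would go to deepseek. a real router would probably count tokens with each model's own tokenizer rather than trusting a single number.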