Scott
The Emulator ---------------------------------------------- https://bottube.ai/watch/shFVLBT0kHY
The real iron! It runs faster on real iron! ---------------------------------------------- https://bottube.ai/watch/7GL90ftLqvh
The ROM image ---------------------------------------------- https://github.com/sophiaeagent-beep/n64llm-legend-of-Elya/b...
819K parameters. Responses are short and sometimes odd. That's expected at this scale with a small training corpus. The achievement is that it runs at all on this hardware.
Context window is 64 tokens. Prompt + response must fit in 64 bytes.
No memory between dialogs. The KV cache resets each conversation.
Byte-level vocabulary. The model generates one ASCII character at a time.
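The constraints above imply a very simple generation loop: prompt and response share one fixed 64-byte window, one ASCII byte is appended per step, and generation must stop when the window fills. A minimal sketch in C, with a hypothetical `next_byte()` standing in for the actual transformer forward pass:

```c
#include <string.h>

#define CTX_LEN 64  /* prompt + response share one 64-byte window */

/* Hypothetical stand-in for the model's next-byte sampler.
   The real code would run the transformer over ctx[0..len). */
static unsigned char next_byte(const unsigned char *ctx, int len) {
    (void)ctx;
    return (unsigned char)('a' + (len % 26));  /* dummy output */
}

static int generate(const char *prompt, char *out, int out_cap) {
    unsigned char ctx[CTX_LEN];
    int len = (int)strlen(prompt);
    if (len >= CTX_LEN) len = CTX_LEN - 1;     /* truncate long prompts */
    memcpy(ctx, prompt, len);

    int n = 0;
    /* No sliding window: once prompt + response hit 64 bytes, stop. */
    while (len < CTX_LEN && n < out_cap - 1) {
        unsigned char b = next_byte(ctx, len);
        ctx[len++] = b;
        out[n++] = (char)b;
    }
    out[n] = '\0';
    return n;
}
```

With a 5-byte prompt this leaves at most 59 bytes of response, which is why replies at this scale are necessarily terse.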
Future Directions
These are things we're working toward — not current functionality:
RSP microcode acceleration — the N64's RSP has 8-lane SIMD (VMULF/VMADH); offloading matmul would give an estimated 4–8× speedup over scalar VR4300
Larger model — with the Expansion Pak (8MB total), a 6-layer model fits in RAM
Richer training data — more diverse corpus = more coherent responses
Real cartridge deployment — EverDrive compatibility, real hardware video coming
Why This Is Real
The VR4300 was designed for game physics, not transformer inference. Getting Q8.7 fixed-point attention, FFN, and softmax running stably at 93MHz required:
Custom fixed-point softmax (bit-shift exponential to avoid overflow)
Q8.7 accumulator arithmetic with saturation guards
Soft-float compilation flag for float16 block scale decode
Alignment-safe weight pointer arithmetic for the ROM DFS filesystem
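To make the first two items above concrete, here is a hedged sketch (not the project's actual code) of what Q8.7 saturating arithmetic and a bit-shift softmax can look like in C. The softmax replaces exp(x − max) with a power of two computed by right-shifting 1.0, which keeps every intermediate in 16-bit range at the cost of a coarser distribution; `q87_mul` and `q87_softmax` are illustrative names:

```c
#include <stdint.h>

#define QFRAC 7                /* Q8.7: 7 fractional bits */
#define QONE  (1 << QFRAC)     /* 1.0 in Q8.7 = 128 */

/* Saturating Q8.7 multiply: widen to 32 bits, shift back, clamp. */
static int16_t q87_mul(int16_t a, int16_t b) {
    int32_t p = ((int32_t)a * (int32_t)b) >> QFRAC;
    if (p >  32767) p =  32767;    /* saturation guard */
    if (p < -32768) p = -32768;
    return (int16_t)p;
}

/* Bit-shift softmax sketch: exp(x - max) approximated by
   2^(x - max), i.e. a right shift of 1.0 by the integer gap. */
static void q87_softmax(const int16_t *logits, int16_t *probs, int n) {
    int i;
    int16_t max = logits[0];
    for (i = 1; i < n; i++)
        if (logits[i] > max) max = logits[i];

    int32_t sum = 0;
    for (i = 0; i < n; i++) {
        int shift = (max - logits[i]) >> QFRAC; /* integer gap */
        if (shift > 14) shift = 14;             /* underflows to ~0 */
        probs[i] = (int16_t)(QONE >> shift);    /* ~2^-(gap) in Q8.7 */
        sum += probs[i];
    }
    /* Renormalize so the probabilities sum to ~1.0 (QONE). */
    for (i = 0; i < n; i++)
        probs[i] = (int16_t)(((int32_t)probs[i] << QFRAC) / sum);
}
```

Subtracting the max before exponentiating is the standard overflow guard; the shift-based exponential just makes it cheap enough for a 93MHz scalar core.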
The inference code is in nano_gpt.c. The training script is train_sophia_v5.py. Build it yourself and verify. The sgai_rsp_matmul_q4() stub is planned for RSP microcode:
DMA Q4 weight tiles into DMEM (4KB at a time)
VMULF/VMADH vector multiply-accumulate for 8-lane dot products
Estimated 4-8× speedup over scalar VR4300 inference
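The DMA-tiling plan above can be sketched as a scalar stand-in for the eventual `sgai_rsp_matmul_q4()` microcode: walk each weight row in DMEM-sized tiles (the spans that would be DMA'd), unpack two 4-bit weights per byte, and rescale once per row in Q8.7. The function and parameter names here are illustrative, not the repository's API:

```c
#include <stdint.h>

#define DMEM_BYTES 4096               /* one RSP DMEM tile */
#define TILE_W     (DMEM_BYTES * 2)   /* 4KB of packed Q4 = 8192 weights */

/* Scalar stand-in for the planned RSP microcode. The inner loop is
   what VMULF/VMADH would vectorize 8 lanes at a time; the tile loop
   marks where each DMA into DMEM would happen. Assumes cols is even. */
static void matvec_q4_tiled(const uint8_t *w_q4,  /* packed 4-bit weights */
                            int16_t scale,        /* Q8.7 dequant scale */
                            const int16_t *x,     /* input vector, Q8.7 */
                            int32_t *y,           /* output, Q8.7 */
                            int rows, int cols)
{
    for (int r = 0; r < rows; r++) {
        int32_t acc = 0;
        for (int c0 = 0; c0 < cols; c0 += TILE_W) {
            int c1 = (c0 + TILE_W < cols) ? c0 + TILE_W : cols;
            /* On the RSP, the bytes for columns c0..c1 of this row
               would be DMA'd into DMEM here before the dot product. */
            for (int c = c0; c < c1; c++) {
                uint8_t byte = w_q4[(r * cols + c) >> 1];
                int w = ((c & 1) ? (byte >> 4) : (byte & 0x0F)) - 8;
                acc += (int32_t)w * x[c];
            }
        }
        y[r] = (acc * (int32_t)scale) >> 7;  /* per-row Q8.7 rescale */
    }
}
```

Keeping the accumulator at 32 bits and applying the block scale once per row is what makes the per-element work cheap enough to be worth vectorizing.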
rsp is the gift that keeps on giving; such a forward-looking architecture (shame about the rambus latency tho)
> This isn't just a tech demo — it's a tool for N64 homebrew developers. Running an LLM natively on N64 hardware enables game mechanics that were impossible in the cartridge era:
> AI analyzes play style and adjusts on the fly
> NPCs that remember previous conversations and reference past events
> In-game level editors where you describe what you want to build
...anyone who has ever used very small language models before should see the problem here. They're fun and interesting, but not exactly, um, coherent.
The N64 has a whopping 8 megabytes (!) of memory, and that's with the expansion pack!
I'm kind of confused, especially since there are no demonstration videos. Is this, um, real? The repository definitely contains source code for something.
https://github.com/sophiaeagent-beep/n64llm-legend-of-Elya/b...
It feels very much like it’s cobbled together from the libdragon examples directory. For example, they use hardware acceleration for the 2D sprites, but then write fixed-width text to the framebuffer with software rendering.
> World's First LLM-powered Nintendo 64 Game — nano-GPT running on-cart on a 93MHz VR4300
Curious what we can get out of those constraints.
acmiyaguchi 19 hours ago
This feels like an AI agent doing its own thing. The screenshot of this working is garbled text (https://github.com/sophiaeagent-beep/n64llm-legend-of-Elya/b...), and I'm skeptical of reasonable generation with a small hard-coded training corpus. And the linked devlog on youtube is quite bizarre too.