It’s the original use case for LLMs.
It just can’t be a case simply because how ML works. In short, the more diverse and high quality texts with reasoning reach examples were in the training set, the better model performs on a given language.
So unless Spanish subset had much more quality-dense examples, to make up for volume, there is no way the quality of reasoning in Spanish is on par with English.
I apologise for the rambling explanation, I sure someone with ML expertise here can it explain it better.
Translation in/out of human languages only happens at the edges of the model.
Internal layer activations for the same concept are similar regardless of language, while activations at the top/bottom layers diverge. Meanwhile the pattern is reversed for same language different content.
Even if it happens at the edge, on every conversation turn, I may assume non captured small subtleties of meaning over time can accumulate into significant error.
Lets take a language with orders of magnitude less training data, like Nahuatl. The following is what happens if I ask Claude Code in Nahuatl to give me an example of a simple program that I can execute inside a web browser (avoiding any direct keyword prompts like 'Javascript' or 'Hello World'). Interestingly, it responds in Spanish. I doubt that it has any relevant training data in Nahuatl. (Apologies if the Nahuatl is wrong – I just used Google translate.)
¿Uelis techmakas se neskayotl tlen se programa tlen amo ouij tlen ueliskia nijchiuas ipan se navegador web?
-----
Entiendo tu pregunta en náhuatl. Aquí un ejemplo sencillo — una calculadora en HTML puro que puedes abrir directo en el navegador, sin servidor:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Calculadora</title>
<style>
body { font-family: sans-serif; display: flex; justify-content: center; padding: 40px; }
.calc { border: 1px solid #ccc; padding: 20px; border-radius: 8px; width: 200px; }
input { width: 100%; margin-bottom: 10px; font-size: 1.2em; text-align: right; }
button { width: 22%; margin: 2px; padding: 8px; font-size: 1em; cursor: pointer; }
.wide { width: 47%; }
</style>
</head>
<body>
<div class="calc">
<input type="text" id="pantalla" readonly>
<div>
<button onclick="tecla('7')">7</button>
<button onclick="tecla('8')">8</button>
<button onclick="tecla('9')">9</button>
<button onclick="tecla('/')">÷</button>
<button onclick="tecla('4')">4</button>
<button onclick="tecla('5')">5</button>
<button onclick="tecla('6')">6</button>
<button onclick="tecla('*')">×</button>
<button onclick="tecla('1')">1</button>
<button onclick="tecla('2')">2</button>
<button onclick="tecla('3')">3</button>
<button onclick="tecla('-')">−</button>
<button onclick="tecla('0')">0</button>
<button onclick="tecla('.')">.</button>
<button onclick="tecla('+')">+</button>
<button onclick="limpiar()">C</button>
<button class="wide" onclick="calcular()">=</button>
<button class="wide" onclick="borrar()">⌫</button>
</div>
</div>
<script>
const pantalla = document.getElementById('pantalla');
function tecla(val) {
pantalla.value += val;
}
function calcular() {
try {
pantalla.value = eval(pantalla.value);
} catch {
pantalla.value = 'Error';
}
}
function limpiar() { pantalla.value = ''; }
function borrar() { pantalla.value = pantalla.value.slice(0, -1); }
</script>
</body>
</html>
Guarda esto como calculadora.html y ábrelo en cualquier navegador — no necesita servidor ni dependencias. Es un buen punto de partida para aprender HTML,
CSS y JavaScript.It’s not! And I’ve never said that.
Anyways, I’m not even sure what we are arguing about, as it’s 100% fact that SOTA models perform better in English, the only interesting question here how much better, is it negligible or actually makes a difference in real world use-cases.