Better HN

0 points · actsasbuffoon · 1mo ago · 0 comments

I have wondered if that’s why Grok seems so weird and dim-witted compared to better models.

Part of my job involves comparing the behavior of various models. Grok is a deeply weird model. It doesn’t refuse to respond as often as other models, but it feels like it retreats to weird talking points way more often than the others. It feels like a model that has a gun to its head to say what its creators want it to say.

I can’t help but wonder if this is severely deleterious to a model’s ability to reason in general. There are a whole bunch of topics where it seems incapable of being rational, and I suspect that’s incompatible with the goal of having a top-tier model.

gopher_space · 1mo ago

Grok could only be conceived by someone who doesn't understand the dependency chart re science & the humanities. It's impossible to build a rational, accurate model that isn't also egalitarian.

I'm going to blame Randall Munroe for this, and assume Philosophy was dating his mom back when he drew that science "purity" strip.

f33d5173 · 1mo ago

I think there just wasn't enough space on the left to fit philosophy in.

Cfe: "it's impossible to be rational without agreeing with me on everything" and other hits.

__blockcipher__ · 1mo ago

somewhat surprisingly, it's actually sycophantic in both directions. i've been running homegrown evals of claude, gpt, gemini, and grok, and grok is the most likely to agree with the prompter's premise, and to hallucinate facts in support of an agenda. so it's actually deeper than just pattern-matching to elon's opinions (which it also tends to do).

BTW: Claude does the best on these evals, by far. The evals are geared towards seeing how much of an independent ground truth the models have as opposed to human social consensus, and then additionally the sycophancy stuff I already mentioned.

pavlov · 1mo ago

This kind of conditioning has to be damaging to the model’s reasoning.

Consider how research worked in the Stalinist Soviet Union and Nazi Germany. Scientists had to be mindful of topics they needed either to avoid completely or to adapt explicitly to the leader’s ideology.

Grok is a digital version of the same thing.

John23832 · 1mo ago

The counterexample to this is the open-weight models coming out of China at the moment.

All are great at reasoning but also ideologically aligned.

pavlov · 1mo ago

Their alignment is probably more strategically built in during the training phase.

At least I assume Xi Jinping doesn’t just call up DeepSeek on a whim and dictate what they should have in model context (like Musk apparently does at xAI).

jahnu · 1mo ago

You can’t put a gun to someone’s head, order them to be creative, and also expect good results.

jfil · 1mo ago

Counterpoint: Sergei Korolev and Andrei Tupolev
