The model does specify the format, but there is no _one_ standard. For this model it's defined in the tokenizer_config.json [0]. As for llama.cpp, they seem to be using a more type-safe approach to reading the arguments.
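A minimal sketch of what this looks like in practice: the chat format is stored as a Jinja2 template string under the `"chat_template"` key of tokenizer_config.json. The JSON excerpt below is a simplified, hypothetical stand-in for the real file linked in [0], which contains many more fields.

```python
import json

# Hypothetical excerpt of a tokenizer_config.json; the real file for the
# model in [0] has many more fields and a longer template.
config_text = """
{
  "chat_template": "{{ bos_token }}{% for message in messages %}{{ '<start_of_turn>' + message['role'] + '\\\\n' + message['content'] + '<end_of_turn>\\\\n' }}{% endfor %}"
}
"""

config = json.loads(config_text)
# Libraries like transformers render this template over the message list
# to produce the final prompt string.
print(config["chat_template"])
```

Since the template is just a free-form string, every consumer has to either ship a Jinja2-compatible renderer or special-case known templates, which is part of why llama.cpp takes its own approach.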
[0] https://huggingface.co/google/gemma-4-31B-it/blob/main/token...