
0 points · wolttam · 3d ago · 0 comments

They’re valid things to be concerned about IMO.

I think you’re looking for an answer you’re not going to get, unfortunately. I think there actually is a higher-than-average risk of data leakage with the insane optimizations that go into model serving. GLM5.1 had an issue where it devolved into gibberish when their infra was under high load, and it turned out to be a cross-request KV cache contamination issue.[1]

Personally, I’ve been using only local models lately, and it’s gone pretty well!

[1]: https://z.ai/blog/scaling-pain


1 comments · 1 top-level

jcgrillo · 2d ago

Thanks for the link, that’s an interesting writeup!
