One of the reasons I'm not a huge fan of PyTorch.
It isn't like a multi-gigabyte game, for example, where finding out whether there is any malicious code could easily be a multi-month reverse-engineering project, only to arrive at 'probably not, but we don't have time to check every byte with a fine-tooth comb'.
In practice, who's going to bother checking the language model? All the code I've seen that runs Stable Diffusion or other Hugging Face models just downloads the model dynamically, then uses it without asking questions. That's a pretty low-hanging supply chain attack waiting to happen, I believe.
Some solutions for checking: https://huggingface.co/docs/hub/security-pickle
or run them in an isolated env.
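For a rough idea of what such a check involves, here is a minimal sketch that walks a pickle stream with the standard library's `pickletools.genops` and lists the opcodes that can import or call objects, without ever loading the data. The function name `suspicious_opcodes`, the opcode set, and the `Payload` class are my own for illustration, not taken from the Hugging Face scanner.

```python
import pickle
import pickletools
from typing import List

# Opcodes that import a global or call/instantiate something during unpickling.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}

def suspicious_opcodes(data: bytes) -> List[str]:
    """List import/call opcodes in a pickle stream without unpickling it."""
    return [
        op.name
        for op, _arg, _pos in pickletools.genops(data)
        if op.name in SUSPICIOUS_OPCODES
    ]

# Plain containers and numbers need no imports or calls.
plain = pickle.dumps({"weights": [0.1, 0.2]})
print(suspicious_opcodes(plain))

class Payload:
    # On load, this would make the unpickler call print("hello").
    # Pickling it only records the call; nothing runs here.
    def __reduce__(self):
        return (print, ("hello",))

risky = pickle.dumps(Payload())
print(suspicious_opcodes(risky))
```

Keep in mind that legitimate PyTorch checkpoints also contain these opcodes, since tensors are rebuilt through imported functions. Their presence alone proves nothing; a usable scanner has to look at which names get imported and compare them against an allowlist.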
But seriously, why not something more human readable and text-based if it's just weights?
The usage boils down to
import safer_unpickle
safer_unpickle.patch_torch_load()
This overrides the default torch unpickler with a relatively safe one. Hope this helps.
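For anyone wondering what a "relatively safe" unpickler looks like in general, the usual approach is the one described in the Python `pickle` docs: subclass `pickle.Unpickler` and override `find_class` so that only allowlisted globals can be imported. The sketch below is not the actual code of the library above; `RestrictedUnpickler`, `restricted_loads`, and the one-entry `ALLOWED_GLOBALS` are hypothetical names chosen for illustration.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist for illustration; a real one for torch
# checkpoints would need more entries (tensor rebuild helpers, etc.).
ALLOWED_GLOBALS = {("collections", "OrderedDict")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses any global not in ALLOWED_GLOBALS."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# An allowlisted type loads normally.
state = restricted_loads(pickle.dumps(OrderedDict(a=1)))

# Anything else is rejected before it can be called.
try:
    restricted_loads(pickle.dumps(print))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The allowlist is the whole security boundary here: anything on it can still be invoked by a crafted file, so it should only contain callables that are harmless with arbitrary arguments.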
$ fickling --check-safety consolidated.00.pth
  File "/usr/lib/python3.10/pickletools.py", line 359, in read_stringnl
    data = codecs.escape_decode(data)[0].decode("ascii")
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 63: ordinal not in range(128)