Be sure to read the warning in their repo:
https://github.com/openlm-research/open_llama#loading-the-we...

> Please note that it is advised to avoid using the Hugging Face fast tokenizer for now, as we’ve observed that the auto-converted fast tokenizer sometimes gives incorrect tokenization.
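
In practice, one way to follow that advice is to ask `transformers` for the slow tokenizer explicitly. This is a minimal sketch, not the repo's own snippet: it assumes the `transformers` and `sentencepiece` packages are installed, and the checkpoint name `openlm-research/open_llama_7b` and the helper `load_slow_tokenizer` are illustrative choices of mine.

```python
# Sketch: force the slow (SentencePiece-based) tokenizer instead of the
# auto-converted fast one. MODEL_ID is an assumed example checkpoint name;
# swap in whichever OpenLLaMA checkpoint you actually use.
MODEL_ID = "openlm-research/open_llama_7b"

# use_fast=False tells AutoTokenizer not to return the fast tokenizer.
TOKENIZER_KWARGS = {"use_fast": False}


def load_slow_tokenizer(model_id: str = MODEL_ID):
    """Hypothetical helper: load the tokenizer with the fast path disabled."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id, **TOKENIZER_KWARGS)
```

Calling `load_slow_tokenizer()` downloads the tokenizer files on first use; afterwards `tokenizer.is_fast` should report `False` if the slow implementation was picked up.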