A few other examples include LLaVA[0], IDEFICS[1][2], and CogVLM[3]. MiniGPT-4[4] might be another one to look at. I'm pretty sure all of these have more permissive licenses than Fuyu. Fuyu's architecture does sound really interesting, but the license on the pre-trained model is a non-starter for almost anything.
[0]: https://github.com/haotian-liu/LLaVA
[1]: https://huggingface.co/blog/idefics
[2]: https://huggingface.co/HuggingFaceM4/idefics-80b-instruct