1. Multimodal large language model
2. Large multimodal language model
3. Large language multimodal model
4. Large language model (multimodal)
I prefer 1, because this is a multimodal variant of an existing technique already referred to as an LLM. If I were king, I’d go with Omnimodal Linguistic Minds, but no one asks me such things, thank god.