Why not either train the model exclusively on Semitic languages for further performance for those languages or on a wider set of languages for better multilingual performance overall? I don't understand the logic here.
So properly speaking, they should be advertising the target region as Europe, Middle East and Africa. [3]
[1] https://en.wikipedia.org/wiki/Languages_of_Germany [2] https://en.wikipedia.org/wiki/Languages_of_Afghanistan [3] https://en.wikipedia.org/wiki/List_of_countries_and_territor...
They don't (or rather CAN'T) care about anything else in the world.
They have a lot more problems than "this model doesn't convert Urdu to Arabic well".
I mean, you wouldn't want to split a model into three separate ones, where one contains Austrian German, another Slovak, and another Hungarian, since there's going to be lots of cultural overlap.
Geography
Shoutout to Alif, a finetune of Llama 3 8b on Urdu datasets: https://huggingface.co/large-traversaal/Alif-1.0-8B-Instruct
It'd be great to see a comparison.
Saba vs. Fanar. I like the names, too.
~2 years ago (Sep 27, 2023), Mistral AI said:
> we believe that an open approach to generative AI is necessary. Community-backed model development is the surest path to fight censorship and bias in a technology shaping our future. We strongly believe that by training our own models, releasing them openly, and fostering community contributions, we can build a credible alternative to the emerging AI oligopoly. Open-weight generative models will play a pivotal role in the upcoming AI revolution.
> Mistral AI’s mission is to spearhead the revolution of open models.
https://mistral.ai/en/news/about-mistral-ai
Did something change since then, or why the change of heart? Are they just pulling an "OpenAI", professing to believe in something in order to further their own cause, or is there some particular reason behind it?
> Mistral Saba is a result of working closely with strategic regional customers to address very specific challenges in addressing bespoke use cases.
It seems like a customer paid them to train this model, so presumably that customer gets to decide on licensing terms.
Isn’t this Mistral’s business model? Make general purpose models available as open-source and train more specific models for their customers?
Edit: Actually, it is outlined at the bottom of the post:
> we have also begun to train models for strategic customers with the power of their deep and proprietary enterprise context. These models stay exclusive and private to the respective customers. If you would like to explore custom training with Mistral AI, explore our applied AI offerings, or please contact us.
So this is not one of those models, since those stay exclusive and private to the customer. Saba, then, is one of the models I understood they would release as at least "open-weights", had they followed the goals laid out in their early blog posts.
Saba: input $0.20/M tokens, output $0.60/M tokens
GPT-4o: input $0.15/M tokens, cached input $0.075/M tokens, output $0.60/M tokens
Sources: https://openai.com/api/pricing/ and https://mistral.ai/en/products/la-plateforme#pricing
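To put those per-million-token prices in concrete terms, here is a quick sketch of per-request cost under the rates quoted above (an illustration only: the example workload of 10k input / 2k output tokens per request is assumed, and the cached-input discount is ignored):

```python
# Per-million-token prices in USD, as quoted in the thread (cached input omitted).
PRICES = {
    "Saba": {"input": 0.20, "output": 0.60},
    "GPT-4o": {"input": 0.15, "output": 0.60},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request with the given token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical workload: 10k input tokens, 2k output tokens per request.
print(request_cost("Saba", 10_000, 2_000))    # 0.0032 USD
print(request_cost("GPT-4o", 10_000, 2_000))  # 0.0027 USD
```

So for prompt-heavy workloads the gap is driven almost entirely by the input rate, since the output prices quoted are identical.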