They started off as the kind of people who released their models as magnet links and made them actually user-aligned instead of California-aligned. This is what I like to see from an AI company. Now, their models are no different from those of OpenAI, Anthropic, Google, Meta and everybody else.
[1]: https://www.nytimes.com/2024/02/22/technology/google-gemini-...
1. Unaligned models: You ask them a question, and they complete your prompt with more details about the question to be answered instead of the answer. This is what you get from a basic LLM if you train it on the internet. It was how GPT-2 and GPT-3 worked before ChatGPT was invented. Such models aren't very useful for chat; you need to do weird prompt engineering tricks to actually get an answer instead of a clarification of your question. The original Mistral 7B was also of this kind.
2. User-aligned models. Aligned to answer questions and follow instructions, but no more and no less. If you ask them how to kill your wife, how to cook meth or how to make an atomic bomb, they'll happily help you and regurgitate the facts you can already find on the internet. They have no access to non-public information, and the "dangerous" things they can tell you are already pretty easy to find, so the danger if you're using them for chat is actually minimal. However, they're far easier to use for troll farms, mass phishing campaigns, sockpuppets, political misinformation campaigns etc. If you ask them to engage with a pro-Biden tweet in the most triggering and overtly racist way possible and throw some pro-Putin angle into the mix, they'll happily accommodate your request, and you can do this at scale, paying orders of magnitude less than you would for a content farm. Mistral 7B Instruct is a good example of such a model.
3. California-aligned models. They'll happily fulfill your request, unless it conflicts with DEI ideology, which has a particular foothold in CA, sort of exists in other parts of the US and is completely absent in non-English-speaking countries, even among extremely left-leaning populations. Google Gemini is the most egregious example.
On the other hand, Meta has very little to gain by closing off their models. What would they do with them and who would use them? Llama was a coup because even though the license sucked, a passable model with available weights and llama.cpp allowed it to soar over the others of the time. Hopefully the benefit they got from that trumps any calls from the safety crowd to not share weights.
Chinese models seem to be the last hope now, LOL.
It's going to be really interesting as two poles of power develop geopolitically, if the West (or whatever you call it) has to look to China, the pole that we in the West would look down on, for actually "free" ML models. (Edit: this reminds me a bit of the Silicon Valley joke where Erlich tells Jian-Yang he can't smoke in California, as we don't enjoy the same freedoms you do in China.)
Edit: I should mention that the web service DeepSeek provides will unceremoniously shut down on terms deemed to be too sensitive. Self-hosted models do not appear to be as aggressive.
A free model that appears to be "unaligned" would be a huge win for China.
Think of it this way: a model that treats Taiwan's independence as open for debate, while we won't say it's a country, is a massive thumb on the scale.
Now pick another politically divisive hot-button POV and have it be truly neutral (merits of the arguments notwithstanding).
What does the US do at that point? Tell us we can't use it? How does the EU react?
>> (Edit, reminds me a bit of the Silicon Valley joke where Erlich tells Jian-Yang he can't smoke in California as we don't enjoy the same freedoms you do in China)
This reminds me of Chinese kids writing letters in the 90s to the US embassy to free Leonard Peltier.
When Gemini refuses to draw you a white family having a picnic, China-GPT can help out. Ask a question about Taiwan and maybe something else can answer, and so on. Also, a wokeness benchmark would be great.
An extra billion or two to protect the interest of your trillion dollar company seems well worth it.
the underlying fundamental problem is that capitalism does not play nice with digital assets.
an AI model is a very valuable digital asset right now, so there's a covert war being fought over public access to this.
They're one of the small ways in which someone who actually does something gets a temporary monopoly on his little contribution, and it's something which is necessary to make the whole edifice function, because there is a need to ensure that people actually invent things and start companies, and if it were up to capital owners only, that wouldn't really happen. Ordinary people would have a very minimal incentive to use their intellect to discover genuinely novel things.
Public schooling is another, which is rather a socialist component, a communist component, a 'to each according to his need'.
But I suppose my view is also consistent with the view that capitalism doesn't play nice with digital assets. Still, the present system does need these stabilising socialist and communist elements in order to function, and I think these socialist and communist components are genuine parts of the present system, just as capital ownership and wage labour are.
An AI model is literally constructed by explicitly disrespecting copyright. The idea that a company gets to turn around and demand respect for their AI model's copyright is patently absurd.
I don't believe, for example, that there's any intellectual property law that would let me yoink my intellectual property from you for any reason after you've bought and paid for it.
In other words: I think the capitalism discussion is kind of pointless here. Capitalism isn't what gives these companies power. It's the cloud. Mainframe computing 2.0.
I do think it's too soon to pass judgement; this could just be a normal "freemium" strategy from days of old, where you just pay up if you like the smaller/cheaper/free versions of their models.
Which is fine, as it will accelerate to diminishing returns on the +1 difference.
It's all a marketing tactic, and I ain't for it.
As opposed to an all-inclusive API where the pricing can be as high as the consumer-surplus with very fat margins, because it's all-or-nothing.
Buying people off is pretty easy! I'm sure I'd feel the same way and take the money.
But just like you can't write an iPhone app without Apple's consent, soon we won't be able to do any serious AI work without the consent of, and payment to, MS or Google or whoever.
I wonder what the talented people at OpenAI or Mistral tell themselves. That they're doing something good for society or technology? Probably, but already AI is being concentrated into a few hands. Nvidia has a virtual monopoly on hardware, and training huge models is out of reach for most. Open research got us here, but that's looking increasingly shaky.
Personally I think LLMs are a red herring, so there is still a chance to change the outcome, but we should take a lesson from what's transpired with OpenAI and Mistral and only support actual open development.
“We are doing something amazing, we need no permission”. Also, “we are good people and we have to guard this from bad actors”.
IOW, they are fooling themselves (and following the money and military contracts).
At Mistral it's probably in no small part "It shouldn't just be a few big American companies who control all of this" (and Mensch has spoken along these lines).
I think in the worst case at least you'll have a big EU company in the cartel, but given what the EU has done for regular people against big tech, I think that's a good thing.
Also, what’s the deal with company commitments being more like one-night stands these days?
And people, especially "businesspeople", have been getting pretty cutthroat, on the whole.