I thought Mixtral's release was weird when they just pasted a magnet link [0] into Twitter with no information, but at least people could download and analyze it so we got some reasonable third-party commentary in between that and the official announcement. With this one there's nothing at all to go on besides the name and the black box.
(I know that Mistral does a lot more stuff in the open than other companies, I just couldn't resist the parallel between this and the black-box limitations of LLMs in general)
To be fair, this is not a release. This was the previous release: https://mistral.ai/news/mixtral-of-experts/
It looks more like not trying very hard to hide things until release, rather than being a black box.
Click/tap "Direct Chat" in the top tab navigation and you can select "mistral-next" as the model.
There's got to be a better name for such a cool product. Maybe MistralX? MistMix?
Also, do coding LLMs use treesitter to "understand" code?
Best models currently: CodeLlama or DeepSeek Coder, at 6.7B or 1B depending on how much latency you can tolerate.
Treesitter: from looking at the logs of the chat completion requests for the Continue or Twinny extensions for VS Code, they both appear to just send a chunk of the document along with a special placeholder to indicate where the cursor currently is.
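As a rough illustration, that kind of request often boils down to a fill-in-the-middle (FIM) prompt. Here's a minimal sketch assuming a CodeLlama-style template with `<PRE>`/`<SUF>`/`<MID>` sentinel tokens; the exact tokens and formatting vary by model, and this isn't necessarily what Continue or Twinny send verbatim.

```python
def build_fim_prompt(document: str, cursor: int) -> str:
    """Split the document at the cursor position and wrap the two
    halves in FIM sentinel tokens so the model fills in the gap."""
    prefix = document[:cursor]
    suffix = document[cursor:]
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

# The editor extension would send everything before the cursor as the
# prefix and everything after as the suffix.
doc = "def add(a, b):\n    return \n"
cursor = doc.index("return ") + len("return ")
prompt = build_fim_prompt(doc, cursor)
```

The model then generates the "middle" token stream, which the extension splices back in at the cursor.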
If you want something to be your hands so you don't have to type, open-source LLMs and IDE integrations are not reliably there yet. Follow the Aider Discord to stay up on the latest in this area.
It's up to the app to put that into the context. Generally, coding LLMs do well if you provide them the source tree, graph, search results, notable files, etc. in the context. This is how Sourcegraph's Cody product works, for example.
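A toy sketch of that context-assembly step, under loose assumptions: the scoring (naive query-term overlap), the file labels, and the character budget here are all illustrative, not how Cody or any particular tool actually ranks context.

```python
def build_context(query: str, files: dict[str, str], max_chars: int = 2000) -> str:
    """Rank files by naive overlap with the query terms, then
    concatenate the best matches into a context block that stays
    under a rough size budget."""
    terms = set(query.lower().split())

    def score(text: str) -> int:
        return len(terms & set(text.lower().split()))

    ranked = sorted(files.items(), key=lambda kv: score(kv[1]), reverse=True)
    parts: list[str] = []
    used = 0
    for path, text in ranked:
        snippet = f"# File: {path}\n{text}\n"
        if used + len(snippet) > max_chars:
            break
        parts.append(snippet)
        used += len(snippet)
    return "".join(parts)

files = {
    "math.py": "def add(a, b):\n    return a + b\n",
    "io.py": "def read_file(path):\n    return open(path).read()\n",
}
context = build_context("how does add work", files)
```

Real tools replace the naive scoring with embeddings, code search, or a dependency graph, but the shape is the same: select relevant sources, fit them in the context window, then append the user's question.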
I've been quite disappointed by the French LLMs on Hugging Face when I tried them a month ago.
Very exciting nevertheless, here's hoping they bless the OS community once again!
Refusals are a bit "I am just a language model"-y, which GPT-4 has gotten away from. It's also more refuse-y if I broach something rudely (again, something I've found GPT-4 has become much better about).
Way better at everything than whichever Gemini I've been trying recently (can't tell for sure what I'm using when I use it.) But that one isn't even in contention for any use at all IME.
Overall it felt like I need to try it in daily use to work out if it's a contender with GPT-4 as a daily driver.
It's a preview of their newest prototype model.
To use it, click the "Direct Chat" tab and choose "mistral-next".