Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat

loudmax2y ago

I like that they say the model was trained for 1.3 hours on 4 nodes of 8 x H100s. By my rough calculation, that should have cost a bit under $100 (at roughly $2 per GPU-hour x 8 GPUs x 4 nodes x 1.3 hours, so about $83). Not free, but pretty cheap in the scheme of things. At least, once you know what you're doing.
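For anyone who wants to redo that back-of-the-envelope estimate, here is a minimal sketch. The $2 per GPU-hour price is the rough rental rate assumed in the comment above, not a quoted figure, and the variable names are just illustrative.

```python
# Rough training-cost estimate from the figures in the comment above.
hours = 1.3               # wall-clock training time
nodes = 4                 # number of nodes
gpus_per_node = 8         # H100s per node
usd_per_gpu_hour = 2.0    # ASSUMPTION: ballpark rental price per GPU-hour

gpu_hours = hours * nodes * gpus_per_node
cost_usd = gpu_hours * usd_per_gpu_hour

print(f"{gpu_hours:.1f} GPU-hours, about ${cost_usd:.0f}")
```

Changing `usd_per_gpu_hour` to whatever your provider actually charges is the only knob that matters here; the run itself is about 41.6 GPU-hours.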

dloss2y ago

I wanted to write that the TGI inference engine is not open source anymore, but they have reverted the license back to Apache 2.0 for the new version, TGI v2.0: https://github.com/huggingface/text-generation-inference/rel...

Good news!

leblancfg2y ago

What does ORPO stand for? Can't seem to find related links.

cateye2y ago

Odds Ratio Preference Optimization (ORPO): https://arxiv.org/abs/2403.07691

mjewkes2y ago· 3 in thread

My current favorite “LLM breaker” is below. GPT-4, Claude, and this model all fail.

---

Apples are better than bananas. Cherries are worse than apples. Are cherries better than bananas?

cowanon442y ago

FWIW, Claude Opus (the paid model) gets the correct answer, and explains it well:

Based on the information provided, we cannot definitively determine whether cherries are better than bananas. The statements given only provide relative comparisons between apples and bananas, and apples and cherries, but do not directly compare cherries and bananas.

Here's what we know: 1. Apples are better than bananas. 2. Cherries are worse than apples.

However, these statements do not necessarily imply a transitive relation. In other words, just because A is better than B, and C is worse than A, it doesn't always mean that C is worse than B.

For example, if we assign numerical values to represent the quality of each fruit (higher numbers indicate better quality), we could have a scenario like this:

- Apples: 8 - Bananas: 6 - Cherries: 7

In this case, apples (8) are better than bananas (6), and cherries (7) are worse than apples (8), but cherries (7) are still better than bananas (6).

Therefore, more information would be needed to determine the relative quality of cherries compared to bananas.
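The "not enough information" answer can be checked by brute force. The sketch below enumerates every strict ranking of the three fruits, keeps the ones consistent with the two statements, and records whether cherries beat bananas in each. It assumes "better than" is a strict total order with no ties, which is an assumption of this sketch and not something the prompt states.

```python
from itertools import permutations

fruits = ("apples", "bananas", "cherries")

# For each ranking (best first) that satisfies both statements,
# record whether cherries rank above bananas.
outcomes = set()
consistent = []
for ranking in permutations(fruits):
    pos = {fruit: i for i, fruit in enumerate(ranking)}  # lower index = better
    apples_beat_bananas = pos["apples"] < pos["bananas"]
    cherries_below_apples = pos["apples"] < pos["cherries"]
    if apples_beat_bananas and cherries_below_apples:
        consistent.append(ranking)
        outcomes.add(pos["cherries"] < pos["bananas"])

print(consistent)
print(outcomes)
```

Two rankings survive, one with cherries above bananas and one with cherries below, so the premises leave the question open.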

loudmax2y ago

To be fair, a lot of humans fail that. Including people that should know better.

mjewkes2y ago

For sure. It's not a fair prompt at all. I'm super bullish on LLMs and am using GPT-4 in production right now. This stuff is magic.

It's actually hard to find short, simple, "plain English" failure cases like the above.

The "chain of reasoning" that the modern models deploy before they fail is funny too. This is GPT-4:

---

To determine the relationship between cherries and bananas based on your statements, let's break it down:

  1. Apples are better than bananas.
  2. Cherries are worse than apples.

From statement 1, we know apples rank higher than bananas. Statement 2 tells us cherries rank lower than apples. By this logic, since cherries are lower than apples, which are higher than bananas, it follows that cherries are also lower than bananas.

Therefore, based on these comparisons, cherries are not better than bananas.


adt2y ago

Added, thanks.

https://lifearchitect.ai/models-table/


osansevieroOP2y ago· 4 in thread

Zephyr 141B is a Mixtral 8x22B fine-tune. Here are some interesting details:

- Base model: Mixtral 8x22B, 8 experts, 141B total params, 35B activated params

- Fine-tuned with ORPO, a new alignment algorithm with no SFT step (hence much faster than DPO/PPO)

- Trained with 7K open data instances -> high-quality, synthetic, multi-turn

- Apache 2
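For a rough idea of why ORPO needs no separate SFT stage: as I read the ORPO paper (arXiv 2403.07691), the objective adds an odds-ratio penalty directly to the usual negative log-likelihood on the preferred answer, so one training pass does both jobs. The sketch below is a toy, per-example version in plain Python, not the code used to train Zephyr. The function names, the weight `lam=0.1`, and the use of length-averaged log-probabilities as inputs are assumptions made for illustration.

```python
import math

def log_odds(avg_logp: float) -> float:
    """log(p / (1 - p)) given log p, where avg_logp < 0 so that p < 1."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(avg_logp_chosen: float, avg_logp_rejected: float, lam: float = 0.1) -> float:
    """Toy ORPO-style loss for a single (chosen, rejected) pair.

    avg_logp_* are length-averaged log-probabilities of each response.
    lam is a hypothetical weight on the odds-ratio term.
    """
    # Supervised part: negative log-likelihood of the chosen response.
    nll = -avg_logp_chosen
    # Log odds ratio between the chosen and rejected responses.
    log_or = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    or_term = math.log1p(math.exp(-log_or))
    return nll + lam * or_term

good = orpo_loss(math.log(0.8), math.log(0.2))  # model already prefers the chosen answer
bad = orpo_loss(math.log(0.2), math.log(0.8))   # model prefers the rejected answer
print(good < bad)
```

The loss is lower when the model assigns more probability to the chosen response than to the rejected one, which is the behaviour the penalty term is meant to encourage.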

Everything is open:

- Final Model: https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v...

- Base Model: https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1

- Fine-tune data: https://huggingface.co/datasets/argilla/distilabel-capybara-...

- Recipe/code to train the model: https://huggingface.co/datasets/argilla/distilabel-capybara-...

- Open-source inference engine: https://github.com/huggingface/text-generation-inference

- Open-source UI code: https://github.com/huggingface/chat-ui

Have fun!
