OpenELM offers models with 270M to 3B parameters, pre-trained and instruction-tuned, with strong results across various benchmarks.
My Feedback:
First Phi 3, now OpenELM. It's great to see these small models improving. I know they're not ready for production in all cases, but they're really great for specific tasks.
I see small open-source models as the future because they offer better speed, require less compute, and use fewer resources, making them more accessible and practical for a wider range of applications.
What do you think about this? Would you consider using small open-source models? If so, what are you thinking of building?
I am going to use it on my smartphone.
However, the model is proprietary. I'm tired of the open washing.
> 270M, 450M, 1.1B and 3B parameters
Which roughly translates to about 3GB for the highest-end one (assuming 8-bit weights), plus whatever the context needs:
8 bits = 1 byte
3 billion parameters * 1 byte = 3 gigabytes
+ some memory for the context of the LLM
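To make the arithmetic concrete, here is a rough Python sketch of the weight-only footprint at a few common precisions. The parameter counts are the nominal ones from the announcement (the actual counts differ slightly), and context/KV-cache memory is extra and not included:

  # Back-of-envelope memory estimate for the raw weights alone.
  PARAM_COUNTS = {
      "270M": 270e6,
      "450M": 450e6,
      "1.1B": 1.1e9,
      "3B": 3.0e9,
  }

  BYTES_PER_PARAM = {
      "fp32": 4,        # full precision
      "fp16/bf16": 2,   # how checkpoints are usually shipped
      "int8": 1,        # the 8-bit case from the comment above
      "int4": 0.5,      # common quantization for on-device use
  }

  for name, params in PARAM_COUNTS.items():
      sizes = ", ".join(
          f"{dtype}: {params * b / 1e9:.2f} GB"
          for dtype, b in BYTES_PER_PARAM.items()
      )
      print(f"{name:>5} -> {sizes}")

  # The KV cache for the context adds more on top of this; its size depends
  # on context length, layer count, and hidden size, which are not modeled here.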
The 3B-instruct model has a total file size of 4.94GB + 1.13GB = 6.07GB, which can be seen here:
https://huggingface.co/apple/OpenELM-3B-Instruct/tree/main
A bit of overhead will always be there, as some metadata is stored next to the raw weights, but the bigger factor is that the published checkpoint appears to be stored at 16-bit precision: 2 bytes per parameter is roughly 6GB for ~3B parameters, whereas the 3GB figure above assumes 8-bit weights.
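If you want to reproduce the 6.07GB figure without downloading anything, a small sketch using the huggingface_hub client (its model_info call with files_metadata=True returns per-file sizes; this assumes the repo stays publicly readable):

  # List every file in the linked repo and sum the sizes.
  from huggingface_hub import HfApi

  api = HfApi()
  info = api.model_info("apple/OpenELM-3B-Instruct", files_metadata=True)

  total_bytes = 0
  for f in info.siblings:
      size = f.size or 0  # size is only populated when files_metadata=True
      total_bytes += size
      print(f"{f.rfilename:<40} {size / 1e9:.2f} GB")

  print(f"total: {total_bytes / 1e9:.2f} GB")

The total should come out close to the 6.07GB mentioned above, dominated by the 16-bit weight shards.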