AuraFlow v0.1: an open source alternative to Stable Diffusion 3

(blog.fal.ai)

164 points · treesciencebot · 1y ago · 35 comments

25 comments · 7 top-level

viraptor · 1y ago · 8 in thread

Passes the "woman on grass" test ; - )

Seriously though, there are some minor hand issues and a rare missing body part. "Correct anatomy, no missing body parts." seems to fix it mostly. Still pretty good for an early 0.1 announcement.

Following full sentences is pretty good. Although this: "A photo of a table. On the table there's a green box on the right, a red ball on the left. There's a yellow cone on the box." keeps putting the cone on the table.

Not trained on naked bodies though - generates blob monsters instead.

konata390 · 1y ago

Can you give me your prompt that generates passable humans? Even stuff that worked on SD3 generates flesh demons for me in the playground linked in the post.

viraptor · 1y ago

You're right. Turns out I've just been lucky and got 5/5 good results. Every time I try now, I get blob demons as well. The joys of random generation...

kyriakos · 1y ago

Is there any model that can actually generate a realistic naked human body? I thought they were all deliberately avoiding the subject in order to steer away from compliance issues.

RobotToaster · 1y ago

https://civitai.com/models has plenty of fine tuned models.

GaggiX · 1y ago

Go to CivitAI. Many models there are able to generate naked bodies; they are fine-tuned SD models, and you can download them to run locally.

Der_Einzige · 1y ago

Well over 50% of all SD models in existence have a focus on NSFW. It's more like "Which models DON'T generate realistic hardcore porn?"

viraptor · 1y ago

Sure, there's lots of people really into that. Discords for apps like DrawThings have NSFW sections where people share models/processes/results.

1 more reply

crngefest · 1y ago

[flagged]

executesorder66 · 1y ago · 5 in thread

AIs are still not able to understand negations.

Try "ramen without egg" or "ramen with no egg" and it will show ramen WITH egg.

Or "man without striped shirt" will give "man WITH striped shirt"

viraptor · 1y ago

It's not trained for it, because that use case is handled differently. It would be mostly a waste of time to train the concept compared to other things you want to achieve. Instead you put things you don't want in the negative prompt. This example doesn't expose the option, but you can try it here for a different model: https://huggingface.co/spaces/gokaygokay/Kolors

Set the seed to 0 and the prompt to "man in a loud shirt" - you get flowers. Set the negative prompt to "floral shirt" - no flowers.

Sentence processors can definitely understand negation (any non-trivial LLM can), but it would be a waste of time to train that into the image generators versus making other ideas better.
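
For readers who want the arithmetic behind a negative prompt: below is a minimal sketch, assuming the classifier-free guidance scheme commonly used in Stable Diffusion-style samplers, where the negative prompt's prediction takes the place of the "empty prompt" prediction. Whether Kolors or AuraFlow does exactly this is an assumption; `guided_noise` and the toy vectors are hypothetical stand-ins, not real model output.

```python
def guided_noise(eps_positive, eps_negative, guidance_scale):
    """One classifier-free guidance step.

    Starts from the prediction conditioned on the negative prompt and
    moves along the direction pointing toward the positive-prompt prediction.
    """
    return [
        n + guidance_scale * (p - n)
        for p, n in zip(eps_positive, eps_negative)
    ]


# Toy two-dimensional "noise predictions" (made-up values for illustration).
eps_pos = [1.0, 0.0]  # denoiser output conditioned on "man in a loud shirt"
eps_neg = [0.0, 1.0]  # denoiser output conditioned on "floral shirt"

result = guided_noise(eps_pos, eps_neg, guidance_scale=7.5)
neutral = guided_noise(eps_pos, eps_neg, guidance_scale=1.0)
```

With a scale of 1.0 the negative prompt has no effect and the result is just the positive prediction. Larger scales push the sample further away from whatever the negative prompt describes, which is why the negation lives in a separate input rather than in the prompt text.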

gowld · 1y ago

Why can't an image generator run a tiny little automatic text-to-text rewrite first, to apply these special linguistic rules?
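
A toy sketch of the kind of rewrite being asked about, assuming a purely rule-based pass that only recognises "X without Y" and "X with no Y". The helper name `split_negation` and the pattern are hypothetical; a real system would more likely use an LLM for this step.

```python
import re

# Matches "<keep> without <drop>" or "<keep> with no <drop>".
# Anything else is treated as having no negation at all.
_NEGATION = re.compile(
    r"^(?P<keep>.+?)\s+(?:without|with no)\s+(?P<drop>.+)$",
    re.IGNORECASE,
)


def split_negation(prompt):
    """Return (positive_prompt, negative_prompt) for a single-clause prompt."""
    text = prompt.strip()
    match = _NEGATION.match(text)
    if match is None:
        return text, ""
    return match.group("keep"), match.group("drop")


examples = [
    split_negation("ramen without egg"),
    split_negation("ramen with no egg"),
    split_negation("man without striped shirt"),
    split_negation("man in a loud shirt"),
]
```

The two returned strings could then be passed as the prompt and the negative prompt of a generator that exposes both. The pattern is deliberately narrow: it does not handle phrasings such as "no egg in the ramen" or prompts with several clauses.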

2 more replies

pyinstallwoes · 1y ago

That’s what the negative prompt is for. Stable Diffusion also isn’t like LLMs. LLMs certainly understand negation.

executesorder66 · 1y ago

I did mean AIs in general, so I have edited my original post.

> That’s what negative prompt is for.

This is what I mean by it "not understanding negations". You need a whole separate prompt just to say you want e.g. "ramen without egg", instead of just saying it in a single prompt that it understands.

2 more replies

GaggiX · 1y ago

> AIs are still not able to understand negations.

AIs are able to understand negations; just ask an LLM a question. Text-to-image models are the ones that struggle the most with this, as they usually do not have a very nuanced understanding of text.

gorkemyurt · 1y ago · 3 in thread

https://x.com/ostrisai/status/1811620901420429441

notachatbot1234 · 1y ago

> The prompt comprehension is incredible! #auraflow

> "a cat that is half orange tabby and half black, split down the middle. Holding a martini glass with a ball of yarn in it. He has a monocle on his left eye, and a blue top hat, art nouveau style "

Plus an image that somewhat resembles that prompt. The cat has a human-like hand with a chopped-off thumb and 6 fingers in total, differently colored eyes, and a branch in front of its face, and the ball of yarn is somehow floating in mid-air.

viraptor · 1y ago

These are somewhat valid issues. But given the currently available open models, this is a massive improvement. The human-like hand and the different styles on the two sides of the head aren't even bad - those are valid artistic choices you'd see in similar illustrations - they're just badly executed here.

Kiro · 1y ago

Somewhat resembles? Come on.

smusamashah · 1y ago · 2 in thread

Prompt adherence is great. I copied a few prompts from ideogram (which also adheres to prompts) and the results were good until they involved female bodies. This one, for example, https://ideogram.ai/g/ENMWd7PrQ32dIWSF91uMJQ/2 comes out in a way that exposes that the training data didn't have enough naked bodies. Prompt adherence is very, very good otherwise. You can try the top images of the day/hour from ideogram to test.

Mashimo · 1y ago

Just FYI, the ideogram.ai link is behind a login.

stavros · 1y ago

Not only that, but the only ways to sign up are either Google or Apple. What dystopia is this?

1 more reply

halr9000 · 1y ago

In case you missed it, the authors were pretty smart to include that folded section in the middle, "Prompt for prompt-enhancement". I slapped that into gpt (https://chatgpt.com/share/2e53403e-4bd7-4138-ac34-55378e2ed3...) and made a few prompts. Ran those on their online demo. Initial impressions:

  - prompt adherence is really good
  - it's somewhere between SD15 and SDXL at creating pictures of text 
  - aesthetic quality is good, but leaves something to be desired

Gonna play more with it in ComfyUI.

skybrian · 1y ago

Fails on “piano keyboard” (shows a full piano) and “close up of piano keyboard” (bizarre duplicate keyboard monstrosity).

It’s a difficult prompt. Nobody gets the grouping of black keys right. Maybe someday?

stale2002 · 1y ago

So, now that this is released, are we no longer going to have pedantic people complaining that this "isn't real open source"?

Here is your model, complainers.

I'm not really sure why you'd be so insistent on that, as opposed to just fine tuning the "totally not open source, but instead just open weights" models.

But go ahead, I guess.

Now we can get back to talking about capabilities, usage, and results, as opposed to arguing about the definition of words.
