FLUX1.1 [pro] – New SotA text-to-image model from Black Forest Labs

(replicate.com)

228 points · fagerhult · 1y ago · 142 comments

85 comments · 22 top-level

vessenes1y ago· 16 in thread

Flux is so frustrating to me. Really good prompt adherence, strong ability to keep track of multiple parts of a scene, it's technically very impressive. However it seems to have had no training on art-art. I can't get it to generate even something that looks like Degas, for instance. And, I can't even fine tune a painterly art style of any sort into Flux dev. I get that there was working, living artist backlash at SD and I can therefore imagine that the BFL team has decided not to train on art, but, it's a real loss. Both in terms of human knowledge of, say composition, emotion, and so on, but also for style diversity.

For goodness sake, the MET in New York has a massive trove of open CC0 type licensed art. Dear BFL, please ease up a bit on this, and add some art-art to your models, they will be better as a result.

crystal_revenge1y ago

I've had a similar experience, incredible at generating a very specific style of image, but not great at generating anything with a specific style.

I suspect we'll see the answer to this is LoRAs. Two examples that stick out are:

- Flux Tarot v1 [0]

- Flux Amateur Photography [1]

Both of these do a great job of combining all the benefits of Flux with custom styles that seem to work quite well.

[0] https://huggingface.co/multimodalart/flux-tarot-v1

[1] https://civitai.com/models/652699?modelVersionId=756149

vessenes1y ago

I like those, and there's an electroshock LoRA out there that's just awesome. That said, Tarot and others like it are "illustrator" type styles with extra juice. I have not successfully trained a LoRA for any painting style; Flux does not seem to know about painting.

2 more replies

whywhywhywhy1y ago

>However it seems to have had no training on art-art. I can't get it to generate even something that looks like Degas, for instance

It feels like they just removed names from the datasets to make it worse at recreating famous people and artists.

vessenes1y ago

No, they absolutely did not just do that in this case, although that was the SD plan. If you prompt for "painterly, oil painting, thick brush strokes, impressionistic oil painting style" to flux, you will get ... anime-ish renderings.

1 more reply

throwup2381y ago

I’ve had the same problem with photography styles, even though the photographer I’m going for is Prokudin-Gorskii who used emulsion plates in the 1910s and the entire Library of Congress collection is in the public domain. I’m curious how they even managed to remove them from the training data since the entire LoC is such an easy dataset to access.

vessenes1y ago

Yes, exactly. I think they purposely did not train on stuff like this. I'd bet that you could do a LoRA of Prokudin-Gorskii though; there's a lot of photographic content in flux's training set.

throwaway3141551y ago

i'm fairly confident they did a broad FirstName LastName removal.
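For illustration only: nothing in the thread confirms how BFL filtered its captions. If a "broad FirstName LastName removal" were done with a simple pattern, it might look like the sketch below. The regex, the `scrub_names` helper and the placeholder word are all hypothetical; the point is just that such a filter would strip artist names like Degas and would also hit unrelated capitalized pairs.

```python
import re

# Hypothetical pattern: two adjacent capitalized words, e.g. "Edgar Degas".
# This is an assumption for illustration, not BFL's actual preprocessing.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


def scrub_names(caption: str, placeholder: str = "someone") -> str:
    """Replace anything that looks like 'FirstName LastName' with a placeholder."""
    return NAME_PATTERN.sub(placeholder, caption)


if __name__ == "__main__":
    # An artist name is removed from the caption...
    print(scrub_names("ballet dancers, pastel, by Edgar Degas"))
    # ...but so is a place name, which shows how blunt the filter is.
    print(scrub_names("a rainy street in New York at night"))
```

A caption with no capitalized pair, such as "oil painting, thick brush strokes", passes through unchanged, which would be consistent with style words surviving while the names that anchor them are lost.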

gs171y ago

And I can't imagine there's a real copyright (or ethical) issue with including artwork in the public domain because the artist died over a century ago.

thomastjeffery1y ago

I think that's part of what makes FLUX.1 so good: the content it's trained on is very similar.

Diversity is a double-edged sword. It's a desirable feature where you want it, and an undesirable feature everywhere else. If you want an impressionist painting, then it's good to have Monet and Degas in the training corpus. On the other hand, if you want a photograph of water lilies, then it's good to keep Monet out of the training data.

doctorpangloss1y ago

DALL-E3 doesn't struggle with this. It's just opinions. There's no technical limitation. They chose to weaken the model in this regard.

1 more reply

weebull1y ago

I wonder if part of the reason it's good is because it's been trained for a more specific task. I can only imagine that if your concept of a "house" ranges from a stately home to "a pineapple under the sea" you're going to end up with a very generalised concept. It then takes specific prompting to remove the influences you're not interested in.

I suspect the same goes for art styles. There's such huge variety that really they'd be better served by separate models.

DeathArrow1y ago

There are people who undistilled Flux so it can be further finetuned, so adding art training won't be an issue.

https://huggingface.co/nyanko7/flux-dev-de-distill

pdntspa1y ago

I wonder if you can use Flux to generate the base image then img2img on SD1.4 to impart artistic style?

vunderba1y ago

That's what a refiner is for in auto1111: taking an image the last 10% of the way and touching it up with an alternative model.

I actually use flux to generate an image for purposes of adherence, then pull it in as a canny/depth controlnet with more established models like realvis, unstableXL, etc.
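The "last 10%" hand-off can be pictured as splitting the sampling steps between two models. A minimal sketch of that arithmetic, assuming a fixed step count; `split_steps` and `switch_at` are illustrative names, not the auto1111 API.

```python
def split_steps(total_steps: int, switch_at: float = 0.9) -> tuple[range, range]:
    """Return (base_steps, refiner_steps) for a two-model sampling run.

    switch_at is the fraction of steps handled by the base model; the
    remaining steps go to the second ("refiner") model.
    """
    if not 0.0 <= switch_at <= 1.0:
        raise ValueError("switch_at must be between 0 and 1")
    cut = round(total_steps * switch_at)
    return range(0, cut), range(cut, total_steps)


if __name__ == "__main__":
    base, refiner = split_steps(30, switch_at=0.9)
    print(len(base), len(refiner))  # 27 3
```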

2 more replies

skort1y ago

>but, it's a real loss. Both in terms of human knowledge of, say composition, emotion, and so on, but also for style diversity

But that real art still exists, and can still be found, so what exactly is the loss here?

vessenes1y ago

We may differ on our take about the usefulness of diffusion models, but I'd say it's a loss in that many of the visuals humans will see in the next ten years are going to be generated by these models, and I for one wish they weren't just trained on weeb shit.

2 more replies

nirav721y ago· 11 in thread

Are there any projects that allow for easy setup and hosting Flux locally? Similar to SD projects like InvokeAI or a1111

vunderba1y ago

The answer is it really depends on your hardware, but the nice thing is that you can split out the text encoder when using ComfyUI. On a 24gb VRAM card I can run the Q8_0 GGUF version of flux-dev with the T5 FP16 text encoder. The Q8_0 gguf version in particular has very little visual difference from the original fp16 models. A 1024x1024 image takes about 15 seconds to generate.

nickthegreek1y ago

Forge

https://github.com/lllyasviel/stable-diffusion-webui-forge

https://www.reddit.com/r/StableDiffusion/comments/1esxkk8/ho...

doctorpangloss1y ago

https://huggingface.co/docs/diffusers/main/en/api/pipelines/...

It's about 6 lines of Python.

sophrocyne1y ago

Invoke is model agnostic, and supports Flux, including quantized versions.

minimaxir1y ago

Flux is more weird than old SD projects since Flux is extremely resource dependent and won't run on most hardware.

waffletower1y ago

Doesn't take a lot of effort to get Flux dev/schnell to run on 3090s unquantized, but I agree that 24gb is the consumer GPU memory limit and there are many with less than that. Flux runs great on modern Mac hardware as well, if you have at least 32gb of unified memory.

2 more replies

ziddoap1y ago

People have Flux running on pretty much everything at this point, assuming you are comfortable waiting 3+ minutes for a 512x512 image.

I managed to get it running on an old computer with a 2060 Super, taking ~1.5 minutes per image gen. People are generating on a 1080.

Filligree1y ago

The GGUF quantisations do run on most recent hardware, albeit at increasingly concerning quality tradeoffs.

1 more reply

leumon1y ago

Using comfyui with the official flux workflow is easy and works nicely. comfy can also be used via API.

Mashimo1y ago

I use InvokeAI to run flux.dev and flux.schnell.

pdntspa1y ago

DrawThings on Mac

ilaksh1y ago· 10 in thread

Pretty smart model. Here's one I made: https://replicate.com/p/6ez0x8xqvsrga0cjadg8m7bah0

jug1y ago

One thing that makes FLUX so special is the prompt understanding. I now gave FLUX 1.1 a prompt "Closeup of a doll house built to resemble a famous room in the TV show Friends" and it gave me one with the sign "Central Perk". I never prompted for the text "Central Perk". A Redditor also discovered that it has an associative understanding of emotions. For example "Rose of passion" and it may draw a flower that is burning, because passion is fiery.

This is miles ahead of most other image generation models available today.

drdaeman1y ago

Yet, it doesn't seem to know what a Tektronix 4010 actually looks like... ;)

I had similar issues trying to paint a "I cast non-magic missile" meme with a fantasy wizard using a missile launcher. No model out there (I've tried SD, SDXL, FLUX.1dev and now this FLUX1.1pro) knows what a missile launcher looks like (neither as a generic term, nor any specific system), and none has a clue how it's held, so they all draw really weird contraptions.

morbicer1y ago

Isn't it because the shoulder-launched weapon is usually called a rocket launcher, RPG or bazooka? Never heard it referred to as a missile launcher.

1 more reply

nikcub1y ago

I've gone from counting fingers on a hand to keys on a keyboard

PcChip1y ago

agreed - pretty impressive! https://replicate.com/p/ajfrva4p4hrge0cjaf3bncfwn4

loufe1y ago

That is astoundingly good adherence to the description. I already liked and was impressed by Flux1 but that is perhaps the most impressive image generation I've ever seen.

miohtama1y ago

Is it going to be able to go head-to-head against Midjourney?

1 more reply

loxias1y ago

It's quite good at following a detailed, paragraph-long description of a scene, which is a double-edged sword. A lot of the fun for me with early text to image models was underspecifying an image and then enjoying how the model "invents" it. "Steampunk spaceship", "communist bear", "glass city".

flux is amazing, but I find it requires a very literal description, which pushes the "creative work" back to the text itself. Which can certainly be a good thing, just a bit less gratifying to non visual types like myself. :)

I wonder, only somewhat jokingly, if one could make text generators which "imagine" detailed fantastical scenes, suitable for feeding to a text to image model.

vunderba1y ago

That's what Fooocus is - it allows you to specify a "text expander" LLM that sits in between the input prompt and the diffusion model.

https://github.com/lllyasviel/Fooocus

ilaksh1y ago

Prompt enhancement is now a standard feature in many image generation tools.
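The "text expander" idea is just a function sitting between the user's prompt and the image model. The sketch below shows only that shape: the rule-based `toy_expander` stands in for an LLM, and `render` is a hypothetical placeholder rather than any real tool's API.

```python
from typing import Callable

Expander = Callable[[str], str]


def toy_expander(prompt: str) -> str:
    """Stand-in for an LLM: append fixed scene details to a terse prompt."""
    details = "wide shot, dramatic lighting, intricate background detail"
    return f"{prompt}, {details}"


def render(prompt: str) -> str:
    """Hypothetical image-model call; returns a description instead of pixels."""
    return f"<image for: {prompt}>"


def generate(prompt: str, expander: Expander = toy_expander) -> str:
    """Expand the prompt first, then hand the longer text to the image model."""
    return render(expander(prompt))


if __name__ == "__main__":
    print(generate("steampunk spaceship"))
```

Keeping the expander as a swappable argument means the underspecified prompt stays the user's input, and the "inventing" is delegated to whatever text model is plugged in.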

sharkjacobs1y ago· 8 in thread

"state of the art" has become such tired marketing jargon.

"our most advanced and efficient model yet"

"a significant step forward in our mission to empower creators"

I get it, you can't sell things if you don't market them, and you can't make a living making things if you don't sell them, but it's exhausting.

johnfn1y ago

Flux is state of the art. You can see an ELO-scored leaderboard here:

https://huggingface.co/spaces/ArtificialAnalysis/Text-to-Ima...
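For readers unfamiliar with Elo-style rankings: each pairwise vote nudges two ratings toward the observed outcome. A minimal sketch of the standard Elo update follows; the K-factor of 32 and the starting ratings are assumptions, and the linked leaderboard may compute its scores differently.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return new (rating_a, rating_b) after one head-to-head vote."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two models start level; model A wins one image comparison.
    print(update(1500.0, 1500.0, a_won=True))  # (1516.0, 1484.0)
```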

bemmu1y ago

Flux genuinely is the best model I’ve tried though. If there is a better one I’d love to know.

GaggiX1y ago

Have you tried Ideogram v2?

1 more reply

halJordan1y ago

It is state of the art. And it's not like the art has stagnated.

vunderba1y ago

Agreed, but the flux dev model is easily the best model out there in terms of overall prompt adherence that can also be run locally.

Some comparisons against DALL-E 3.

https://mordenstar.com/blog/flux-comparisons

arizen1y ago

- How do copywriters greet each other in the morning?

- Take your morning to the next level!

minimaxir1y ago

The official blog post justifies the marketing copy a bit more with metrics.

sharkjacobs1y ago

The point is that the metrics say the thing; this stuff doesn't actually say anything.

What does "state of the art" mean? That it's using the latest "cutting edge" model technology?

When Apple releases a new iPhone Pro Max, it's "state of the art". When they release a new iPhone SE, there's an argument to be made that it's not because it uses 2 year old chips. But what would it even mean for BFL to release a model which wasn't "state of the art"

> our most advanced and efficient model yet

Yes, likewise, this is how technology companies work. They release something and then the next thing they release is more advanced.

> a significant step forward in our mission to empower creators

Going from 12 seconds to 4 seconds is a significant speed boost, but does it move the needle on their mission to empower creators? These are their words, not mine, it's a technical achievement and impressive incremental progress, but are there users out there who are more empowered by this? significantly more empowered!?

1 more reply

doctorpangloss1y ago· 8 in thread

I'm worried about what happens when more people find out about Ideogram.

There are a lot of things that don't appear in ELO scores. For one, they will not reflect that you cannot prompt women's faces in Flux. We can only speculate why.

liuliu1y ago

What do you mean? FLUX.1 prompts women or women faces just fine? Do you mean the skin texture is unrealistic or some other artifacts?

jjcm1y ago

Flux tends to gravitate towards a single face archetype for both sexes. For women it's a narrow face with a very slightly cleft chin. Men almost always appear with a very short cut beard or stubble. r/stablediffusion calls it the "flux face", and there are several LoRAs that aim to steer the model away from them.

1 more reply

doctorpangloss1y ago

Flux will not adhere to your detailed description of a woman's face nearly as well as it does for a man, and it doesn't adhere to text descriptions of faces well in general. This is not a technical limitation, this was a choice in the captioning of the model's dataset and maybe other more sophisticated decisions like loss. It exhibits similar flaws with its representation of male versus female celebrities; it also exhibits this flaw when you use language that describes male celebrities versus female celebrities appearances.

1 more reply

throwaway3141551y ago

what they really mean is that it's not useful for generating lewd imagery of women. It was likely nerfed in this regard on purpose because BFL didn't want to be associated with that (however legal it may be).

1 more reply

giancarlostoro1y ago

How locked down is it? My problem with a lot of these is I like to make really ridiculous meme type images, but I run into walls for dumb reasons. Like if I want to make something that's "copyrighted", like a mix of certain characters from one franchise or whatever, I sometimes get told that the model cannot generate copyrighted content, even though courts ruled that AI generated stuff cannot be copyrighted either way...

I feel like AI should just be treated as fair use as long as it's not 100% blatantly a literal clone of the original work.

doctorpangloss1y ago

> How locked down is it? ... I get told that the model cannot generate copyrighted... AI should just be treated as fair use

Ideogram and Flux both have their own broad set of limitations that are non-technical and unpublished. IMO they are not really motivated by legal concerns, other than the lack of transparency itself.

So maybe the issue is that transparency, and that the hazy legal climate means no transparency. You can't go anywhere and see the detailed list of dataset collection and captioning opinions for proprietary models. Open Model Initiative, trying to make a model, did publish their opinions, and they're not getting sued anytime soon. However, their opinions are an endless source of conflict.

sdenton41y ago

It's perfectly happy to make an imperial storm trooper riding a dragon, for what it's worth

jjordan1y ago

I've been using Venice.ai which offers afaik the most uncensored service currently available, outside of running your own instances. No problem with prompts that include copyrighted terms.

whitehexagon1y ago· 5 in thread

I'm running Asahi Linux on a 32GB M1 Pro. Any chance of being able to run text-to-image models locally? I've had some success with LLMs, but only the smaller models. No idea where to start with images, everything seems geared towards msft+nvda.

loxias1y ago

Try https://github.com/leejet/stable-diffusion.cpp

LeoPanthera1y ago

"Draw Things" is a native Mac app for text to image. It's a lot more advanced than DiffusionBee, it will download the models for you, and it's free. It's also available for iOS. (!)

smcleod1y ago

Draw things is neat but it's so damn slow compared to other tools (e.g. invokeai), I'm not sure why it takes so long to generate images with any model?

2 more replies

collinvandyck761y ago

DiffusionBee will let you do this quite easily.

edit: nevermind, it's a macos app

lagniappe1y ago

Is DiffusionBee still in development? I had stopped using it because it seemed like the dev interest had stalled.

1 more reply

skybrian1y ago· 2 in thread

It doesn’t get piano keyboards right, but it’s the first image generator I’ve tried that sometimes gets “someone playing accordion” mostly right.

When I ask for a man playing accordion, it’s usually a somewhat flawed piano accordion, but if I ask for a woman playing accordion, it’s usually a button accordion. I’ve also seen a few that are half-button, half-piano monstrosities.

Also, if I ask for “someone playing accordion”, it’s always a woman.

vunderba1y ago

Periodic data is always hard for generative image systems - particularly if that "cycle" window is relatively large (as would be the case for octaves of a piano).

skybrian1y ago

Yeah, it's my informal test to see if a new model has made any progress on that.
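The periodic structure in question is easy to state in code, which is part of what makes it a handy informal test. A small sketch that lays out the white/black pattern of a standard 88-key piano (lowest key A0); the function and constant names are illustrative.

```python
# Pitch classes relative to C that are black keys: C#, D#, F#, G#, A#.
BLACK_PITCH_CLASSES = {1, 3, 6, 8, 10}
LOWEST_KEY_PITCH_CLASS = 9  # a standard 88-key piano starts on A


def key_colors(num_keys: int = 88) -> str:
    """Return a string of 'W'/'B' for each key, left to right."""
    colors = []
    for i in range(num_keys):
        pitch_class = (LOWEST_KEY_PITCH_CLASS + i) % 12
        colors.append("B" if pitch_class in BLACK_PITCH_CLASSES else "W")
    return "".join(colors)


if __name__ == "__main__":
    layout = key_colors()
    print(layout[:24])  # two full 12-key cycles
    print(layout.count("W"), layout.count("B"))  # 52 36
```

The pattern repeats every 12 keys, and two black keys never sit side by side; a generated keyboard can be checked against either property.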

nubinetwork1y ago· 2 in thread

I tried using schnell, it won't fit in a 16gb GPU, and I couldn't get it to run on CPU.

washadjeffmad1y ago

Try an fp8: https://huggingface.co/Kijai/flux-fp8/

TobTobXX1y ago

I've successfully run schnell and dev on a 12G GPU. They do take 40s/60s respectively, but it works. I used ComfyUI and didn't have to tweak anything.
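The numbers in this sub-thread follow from simple weight-size arithmetic. A back-of-the-envelope sketch, assuming roughly 12 billion parameters for the Flux transformer (a figure not stated in the thread) and ignoring the text encoders, VAE and activations, which the hypothetical `overhead_gib` only crudely stands in for:

```python
PARAMS = 12e9  # assumption: ~12B parameters in the Flux transformer

# Approximate storage cost per weight for a few formats.
BYTES_PER_WEIGHT = {
    "fp16/bf16": 2.0,
    "fp8": 1.0,
    "gguf_q8_0": 34 / 32,  # blocks of 32 int8 weights plus one fp16 scale
    "gguf_q4_0": 18 / 32,  # blocks of 32 4-bit weights plus one fp16 scale
}


def weight_gib(params: float, bytes_per_weight: float) -> float:
    """Size of the weights alone, in GiB."""
    return params * bytes_per_weight / 2**30


def fits(params: float, bytes_per_weight: float, vram_gib: float,
         overhead_gib: float = 2.0) -> bool:
    """Crude check: weights plus a fixed (hypothetical) overhead vs. VRAM."""
    return weight_gib(params, bytes_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    for name, bpw in BYTES_PER_WEIGHT.items():
        size = weight_gib(PARAMS, bpw)
        print(f"{name:10s} {size:5.1f} GiB  fits in 16 GiB: {fits(PARAMS, bpw, 16)}")
```

By this estimate the unquantized weights alone exceed 16 GiB, while fp8 and Q8_0 land near 11 to 12 GiB. Tools that offload parts of the model to system RAM can run in less than this suggests, at the cost of speed.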

byteknight1y ago· 1 in thread

I won't pay for a model, but that cake image looks dang good.

mainframed1y ago

Although culinarily incorrect :)

in3d1y ago

Better link https://blackforestlabs.ai/announcing-flux-1-1-pro-and-the-b...

ChrisArchitect1y ago

Announcement post: https://blackforestlabs.ai/announcing-flux-1-1-pro-and-the-b...

(https://news.ycombinator.com/item?id=41730626)

Der_Einzige1y ago

Far more interesting will be when pony diffusion V7 launches.

No one in the image space wants to admit it, but well over half of your user base wants to generate hardcore NSFW with your models and they mostly don’t care about any other capabilities.

Jackson__1y ago

Ah, that was one short gravy train even by modern tech company standards. Really wish the space was more competitive and open so it wouldn't just be one company at the top locking their models behind APIs.

evrim1891111y ago

I think Flux is better than SDXL and Dall e. I tried the models from here https://apps.apple.com/us/app/art-x-a-i-art-generator-aiart/...

fortran771y ago

I've been playing with Flux.Dev and it's such a big step forward from Stable Diffusion and all the other Generative AIs that could run on consumer GPUs.

I just tried this Flux1.1 pro page (prompt: "A sad Macintosh user who is upset because his computer can't play games") and was very impressed by the detail and "understanding" this model has.

jeffbee1y ago

I asked for a simple scene and it drew in the exact same AI girl that every text-to-image model wants to draw, same face, same hair, so generic that a Google reverse image search pulls up thousands of the exact same AI girl. No variety of output at all.

ks20481y ago

Is there a good site that compares text-to-image models - showing a bunch of examples of text w/ output on each model?

kindkang20241y ago

I really enjoy its service. It's promising for UI design. My advocacy website pages' UI design was bootstrapped using it. It is quite good for developers without much design ability.

Ironically, I am afraid to type the website out and will keep it unknown here. My account could be suspended because of this. It had already reached -1 karma. It's better to keep my account alive.

Mashimo1y ago

Oh neat. I wonder if they also improve .schnell and .dev soon. That would be nice :)

jchw1y ago

The generated images look impressive of course but I can't help but be mildly amused by the fact that the prompt for the second example image insists strongly that the image should say 1.1:

> ... photo with the text "FLUX 1.1 [Pro]", ..., must say "1.1", ...

...And of course, it does not.

1 more reply

ionwake1y ago

Sorry to be a noob, but how does this relate to fastflux.ai which seems to work great and creates an image in less than a second? Is this a new model on a slower host?

melvinmelih1y ago

In case you want to try it out without hassling with the API, I've set up a free tool for it so you can try it out on WhatsApp: https://instatools.ai/products/fluxprovisions
