Comparing Adobe Firefly, Dalle-2, and OpenJourney

(blog.usmanity.com)

231 points · muhammadusman · 3y ago · 133 comments

kouteiheika3y ago· 15 in thread

For reference, here's what you can get with a properly tweaked Stable Diffusion, all running locally on my PC. Can be set up on almost any PC with a mid range GPU in a few minutes if you know what you're doing. I didn't do any cherry picking; this is the first thing it generated. 4 images per prompt.

1st prompt: https://i.postimg.cc/T3nZ9bQy/1st.png

2nd prompt: https://i.postimg.cc/XNFm3dSs/2nd.png

3rd prompt: https://i.postimg.cc/c1bCyqWR/3rd.png

ewjt3y ago

Can you elaborate on “properly tweaked”? When I use one of the Stable Diffusion and AUTOMATIC1111 templates on runpod.io, the results are absolutely worthless.

This is using some of the popular prompts you can find on sites like prompthero that show amazing examples.

It’s been serious expectation vs. reality disappointment for me and so I just pay the MidJourney or DALL-E fees.

kouteiheika3y ago

> Can you elaborate on “properly tweaked”?

In a nutshell:

1. Use a good checkpoint. Vanilla stable diffusion is relatively bad. There are plenty of good ones on civitai. Here's mine: https://civitai.com/models/94176

2. Use a good negative prompt with good textual inversions. (e.g. "ng_deepnegative_v1_75t", "verybadimagenegative_v1.3", etc.; you can download those from civitai too) Even if you have a good checkpoint this is essential to get good results.

3. Use a better sampling method instead of the default one. (e.g. I like to use "DPM++ SDE Karras")

There are more tricks to get even better output (e.g. controlnet is amazing), but these are the basics.
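A minimal sketch of how the three basics above might be expressed as a request to the AUTOMATIC1111 web UI's `/sdapi/v1/txt2img` endpoint, assuming the UI was started with `--api`, the checkpoint has already been selected in the UI, and the two embedding files are installed. The helper name, prompt text, step count, and CFG scale are illustrative assumptions, not settings taken from this thread.

```python
import json

# Textual inversion embeddings named in the comment above. In the
# AUTOMATIC1111 web UI they are triggered by writing their names in the prompt.
NEGATIVE_EMBEDDINGS = ["ng_deepnegative_v1_75t", "verybadimagenegative_v1.3"]


def build_txt2img_payload(prompt, extra_negative="", batch_size=4):
    """Build a JSON body for POST /sdapi/v1/txt2img (hypothetical helper)."""
    negative_parts = list(NEGATIVE_EMBEDDINGS)
    if extra_negative:
        negative_parts.append(extra_negative)
    return {
        "prompt": prompt,
        "negative_prompt": ", ".join(negative_parts),
        "sampler_name": "DPM++ SDE Karras",  # sampler suggested in the comment
        "steps": 30,                          # assumed value, tune to taste
        "cfg_scale": 7,                       # assumed value, tune to taste
        "batch_size": batch_size,             # 4 images per prompt, as above
    }


if __name__ == "__main__":
    # Placeholder prompt, not one of the blog post's prompts.
    payload = build_txt2img_payload("a haunted house at dusk")
    print(json.dumps(payload, indent=2))
```

The payload only covers the negative prompt and the sampler; the choice of checkpoint, which the comment lists first, still has to be made separately.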

2 more replies

orbital-decay3y ago

Are you using txt2img with the vanilla model? SD's actual value is in the large array of higher-order input methods and tooling; as a tradeoff, it requires more knowledge. Similarly to 3D CGI, it's a highly technical area. You don't just enter the prompt with it.

You can finetune it on your own material, or choose one of the hundreds of public finetuned models. You can guide it in a precise manner with a sketch or by extracting a pose from a photo using controlnets or any other method. You can influence the colors. You can explicitly separate prompt parts so the tokens don't leak into each other. You can use it as a photobashing tool with a plugin to popular image editing software. Things like ComfyUI enable extremely complicated pipelines as well. etc etc etc

2 more replies

famouswaffles3y ago

You're not going to get even close to Midjourney or even Bing quality on SD without finetuning. It's that simple. When you do finetune, it will be restricted to that aesthetic and you won't get the same prompt understanding or adherence.

For all the promise of control and customization SD boasts, Midjourney beats it hands down in sheer quality. There's a reason like 99% of ai art comic creators stick to Midjourney despite the control handicap.

4 more replies

capybara_20203y ago

First off are you using a custom model or the default SD model? The default model is not the greatest. Have you tried controlnet?

But yes SD can be a bit of a pain to use. Think of it like this. SD = Linux, Midjourney = Windows/MacOS. SD is more powerful and user controllable but that also means it has a steeper learning curve.

senko3y ago

I am sure you're right, but "if you know what you're doing" does a lot of heavy lifting here.

We could just as easily say "hosting your own email can be set up in a few minutes if you know what you're doing". I could do that, but I couldn't get local SD to generate comparable images if my life depended on it.

caseyf3y ago

If you have an Apple device, there is a free GUI for Stable Diffusion called "Draw Things". It is nice and it just works. https://apps.apple.com/us/app/6444050820

screenshot of the options interface: https://stash.cass.xyz/drawthings-1687292611.png

1 more reply

jfdi3y ago

Nice! Would you mind sharing which stable diff you used / where you obtained from?

kouteiheika3y ago

I'm using my own custom trained model.

Here, I've uploaded it to civitai: https://civitai.com/models/94176

There are plenty of other good models too though.

1 more reply

bluetidepro3y ago

Do you have any good tutorial links to set up Stable Diffusion locally?

muhammadusmanOP3y ago

thanks for doing this, I would like to include these into the blog post as well. Can I use these and credit you for them? (let me know what you'd like linked)

kouteiheika3y ago

Sure. No need to credit me.

1 more reply

tambourine_man3y ago

Those are amazing, please consider writing a blog post of the steps you did to install and tweak Stable Diffusion to achieve these results. I'm sure many of us would love to read it.

troupo3y ago

"Just" use a "properly tweaked" something.

bibanez3y ago

You got incorporated into the article! Nice.

cainxinth3y ago· 13 in thread

Amazing how quickly Dalle-2 went from among the best image transformers to among the worst.

gwern3y ago

The stagnation has been very curious. They are part of a large & generally competent org, which otherwise has remained far ahead of the competition, like GPT-4. Except... for DALL-E 2, where it did not just stagnate for over a year (on top of its bizarre blindspots like garbage anime generation), but actually seemed to get worse. They have an experimental model of some sort that some people have access to, but even there, it's nothing to write home about compared to the best models like Parti or eDiff-I etc.

TeMPOraL3y ago

I suspect that they consider txt2img to be more of a curiosity now. Sure, it's transformative; it's going to upend whole markets (and make some people a lot of money in the process) - however, it's just producing images. Contrast with LLMs, which have already proven to be generally applicable in great many domains, and that if you squint, are probably capturing the basic mechanisms of thinking. OpenAI lost the lead in txt2img, but GPT-4 is still way ahead of every other LLM. It makes sense for them to focus pretty much 100% on that.

1 more reply

sebzim45003y ago

I think they just don't care very much about DALL-E.

Which is fair enough, when you are a (relatively) small company competing with the likes of Google and Meta you really need to focus.

famouswaffles3y ago

Nobody is able to use Parti or eDiff. Compared to models you can use, the experimental Dall-e or Bing Image Creator is second only to midjourney in my experience.

2 more replies

Applejinx3y ago

I don't know, what I saw in there (particularly with the haunted house) was a far broader POTENTIAL RANGE of outputs. I get that they were cheesier outputs, but it seems to me that those outputs were just as capable of coming from the other 'AIs'… if you let them.

It's like each of these has a hidden giant pile of negative prompts, or additional positive prompts, that greatly narrow down the range of output. There are contexts where the Dall-E 'spoopy haunted house ooooo!' imagery would be exactly right… like 'show me halloweeny stock art'.

That haunted house prompt didn't explicitly SAY 'oh, also make it look like it's a photo out of a movie and make it look fantastic'. But something in the more 'competitive' AIs knew to go for that. So if you wanted to go for the spoopy cheesey 'collective unconscious' imagery, would you have to force the more sophisticated AIs to go against their hidden requirements?

Mind you if you added 'halloween postcard from out of a cheesey old store' and suddenly the other ones were doing that vibe six times better, I'd immediately concede they were in fact that much smarter. I've seen that before, too, in different Stable Diffusion models. I'm just saying that the consistency of output in the 'smarter' ones can also represent a thumb on the scale.

They've got to compete by looking sophisticated, so the 'Greg Rutkowskification' effect will kick in: you show off by picking a flashy style to depict rather than going for something equally valid, but less commercial.

jsnell3y ago

It's not just about the haunted house. Just look at the DALLE-2 living room pictures closely. None of it makes any sense. And we're not even talking of subtle details, all of the first three pictures have a central object that the eye should be drawn to that's just a total mess. (The table that's being subsumed by a bunch of melting brown chairs in the first one, the i-don't-even-know-what that seems to be the second picture, and the whatever-this-is on the blue carpet.)

mrtksn3y ago

OpenAI screwed up that one by trying to control it. StableDiffusion, on the other hand, gives me hope that AI can be high quality and open (not only in name).

Can't wait to have something like StableDiffusion but for LLMs.

famouswaffles3y ago

Dall-e experimental is very good (Bing Image creator). I only prefer midjourney to it.

capybara_20203y ago

It might be a case of them seeing way more potential with LLMs compared to image generation.

whywhywhywhy3y ago

It’s more that their moat got obliterated on image gen.

If Stable Diffusion hadn’t launched, Dall-e 2 would still have been valuable.

hathym3y ago

chatgpt next...

ralusek3y ago

Dall-E 2 was almost immediately displaced by MidJourney. Nothing comes close to even GPT 3.5 at the moment.

1 more reply

denverllc3y ago

Why innovate when you can regulate?

1 more reply

poniko3y ago· 8 in thread

Midjourney is still so far ahead it's no competition. Did a lot of testing today and Firefly generated so many errors with fingers and stuff; I haven't seen that since the original Stability release. Anyone know if the web Firefly and the Photoshop version are the same model?

jsheard3y ago

It's worth noting the difference in how the training material is sourced, though: Midjourney is using indiscriminate web scrapes while Firefly is taking the conservative approach of only using images that Adobe holds a license for. Midjourney has the Sword of Damocles hanging over its head: depending on how legal precedent shakes out, its output might end up being too tainted for commercial purposes. Adobe is betting on being the safe alternative during the period of uncertainty, and afterwards if the hammer does come down on web-scraping models.

rafark3y ago

Would mid-journey be liable though? I mean you can create copyrighted material using photoshop too. (Even paint!).

If I create a Mickey Mouse using photoshop would adobe be liable for it?

1 more reply

jrm43y ago

I'm presuming you're not including Stable Diffusion when you say this; the fact that SD and its variants are de facto extremely "free and open source" presently puts them way ahead of anything else, and is likely to do so for some time.

ramraj073y ago

As far as I can tell anyone who’s creating images is using Midjourney. This is likely the same “Linux is open so it’s way better” argument; tell that to the trillion dollar companies that bet against that.

3 more replies

quitit3y ago

I share the same opinion, but also dislike these tests because each system benefits from a different approach to prompting. What I use to get a good result in MidJourney won't work in StableDiffusion for example. Instead when making these comparisons one needs to set an objective and have people who are familiar with each system to produce their nicest images - since this is a better reflection of the real world usage. For example, ask each participant to read a chapter/page from a book with a lot of specific imagery and then use AI to create what they think that looks like.

Regarding image generation in Photoshop I can confirm two things:

- It is excellent for in and out painting with a few exceptions*

- It remains poor for generating a brand new image

*Photoshop's generative fill is very good at extending landscapes, it will match lighting and according to the release video can be smart enough to observe what a reflection should contain even if that is not specifically included in the image (in their launch demo they showed how a reflection pool captured the underside of a vehicle.)

Where generative fill falls apart: Inserting new objects that are not well defined produces problems. Choosing something like a VW Beetle will produce a good result as it is well defined; choosing something like "boat", "dragon", or even "pirate's chest" will produce a range of images that do not necessarily fit the scene. This is likely because source imagery for such objects is vague and prone to different representations.

1st note about Firefly: Anything that is likely to produce a spherical looking shape tends to be blocked - likely because it resembles certain human anatomy. This is problematic when doing small touch ups such as fixing fingers.

A special note about Photoshop versus other systems: Photoshop has the added problem of needing to match the resolution of the source material. Currently it achieves this by combining upscaling with resizing. This means that if one is extending an area with high detail, that detail cannot be maintained and the result is softer/blurrier than the original sections. It also means that if one extends directly from the border of an image, a feathered edge becomes visible which must be corrected by hand.

I currently test the following AI generators, feel free to ask me about any of these: StableDiffusion (Automatic and InvokeAI), OpenAI's Dall-E 2, MidJourney, Stability AI's DreamStudio, and Adobe Firefly.

mettamage3y ago

Not with typography though, haha. It can't spell. I had to draw the letters myself

jamilton3y ago

None of these can do text well. There's a model that does do text and composition well, but the name escapes me. And the general quality is much lower overall, so it's a pretty heavy tradeoff.

1 more reply

ignite3y ago

If midjourney could count fingers, I'd be thrilled!

mdorazio3y ago· 3 in thread

Since the author didn't have access to Midjourney, here's the first two prompts in MJ with default settings (not upscaled):

https://imgur.com/a/siQG06O

https://imgur.com/a/vp2oOHu

muhammadusmanOP3y ago

thanks for sharing this, do you mind if I include this in the post? I will credit you of course (let me know what you'd like linked to).

update: I've edited the post to include these results as well

gl-prod3y ago

something something AI generated cannot be copyrighted [/s]

mdorazio3y ago

Go for it! Happy to help. Let me know if you want upscales.

snowe20103y ago· 3 in thread

not sure this is a good comparison. midjourney likes much shorter prompts, and honestly they're all absolutely terrible for anything that isn't 'photo' based. E.g. ask it to generate a word bubble of the most common programming languages and it will fail every time, no matter what you try. I love it for photo stuff, but for photoshop you'd expect it to be able to do other things as well.

jw12243y ago

That’s not a fair comparison, as Midjourney is outstanding at a wide range of styles beyond photography.

Generating a “word bubble” is going to look terrible in every major diffusion model. Cohesive words and writing in image models is still highly specialised.

capybara_20203y ago

Curious, midjourney does great art and cartoon/comic styles too. Not just realistic images.

Most image AI tools are terrible with words.

I am curious, what images did you try generating with midjourney?

TheOtherHobbes3y ago

In my first few hours with DiffusionBee I made a couple of very credible semi-abstract portraits by mashing up the styles of unrelated artists. And some splashy watercolours. And some logo line art.

And the inevitable booby cheesy rendered forest fairy.

I don't think they're terrible at all. They absolutely can make original art with decent production values.

They can't write text yet, but I'm sure that's coming soon.

Skywalker133y ago· 1 in thread

And here with BlueWillow https://www.bluewillow.ai/

1: https://media.discordapp.net/attachments/1060989219432054835...

2: https://media.discordapp.net/attachments/1060989219432054835...

3: https://media.discordapp.net/attachments/1060989219432054835...

kj_setup3y ago

Seems a lot better than some of the ones in the post

mdorazio3y ago· 1 in thread

Kind of strange to me that they didn't test any prompts with people in them. In my experience that tends to show the limitations of various models pretty quickly.

usaar3333y ago

Lighting also tends to be pretty bad in complex scenes. I find the unrealistic shadows tend to break the photorealism of scenes with few light sources.

1 more reply

theobromananda3y ago· 1 in thread

All three of these are horrible, and running Stable Diffusion locally produces incredibly better results as seen in this comment section.

fumar3y ago

MidJourney produces more consistent and usable results. I am running SD and also pay for MJ. I've tried several checkpoints and loras, but the output is often disappointing or uses the prompts incorrectly.

pdntspa3y ago· 1 in thread

Why didn't this person include Stable Diffusion?

qiller3y ago

OpenJourney is fine tuned SD

whatscooking3y ago· 1 in thread

I like how simple Firefly’s images are, like something you’d want to work with in Photoshop. Dalle-2 looks terrible. Midjourney is still my favorite.

chankstein383y ago

As someone who has spent hours playing with it in Photoshop (Beta) Firefly is actually pretty damned cool!

FanaHOVA3y ago

I had done a similar comparison a couple months back but used Lexica instead of DALL-E.

Seems clear to me that Midjourney has by far the best "vibes" understanding. Most models get the items right but not the lighting. Firefly seems focused on realism which makes sense for a photography audience.

https://twitter.com/fanahova/status/1639325389955952640?s=46...

dvt3y ago

Adobe Firefly is actually extremely competent, especially since it doesn't use copyrighted images in its training set. Using MidJourney (which is fantastic) commercially will be a quagmire for the unlucky company that draws a lawsuit.

MediumD3y ago

*Shameless Plug*

If you want to play around with OpenJourney (or any other fine-tuned StableDiffusion model), I made my own UI with a free tier at https://happyaccidents.ai/.

It supports all open-sourced fine-tuned models & loras and I recently added ControlNet.

famouswaffles3y ago

Should be compared using Bing Image Creator (a better version of Dall-e) rather than the Dalle-2 site.

abeppu3y ago

Is it intentional that each of the prompts is given twice in that blockquote? It's done without a space, so e.g. in the 2nd example, the word "centeredvalley" appears because of the way the last/first words of the first/second repetition were mashed together. Does that indicate what was actually given to the engines, or was that a copy-paste issue made only while putting together the article? I could imagine that non-words like "cornera" in the last example could throw things off?
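To make the concern concrete, here is a small sketch of how one might check whether a prompt string is a single prompt pasted twice with no separator. The function name and the sample prompt are hypothetical stand-ins, not the article's actual prompts.

```python
def split_doubled(text):
    """Return one copy if `text` is a string repeated twice back to back.

    Returns None when the text is not an exact doubling.
    """
    half, remainder = divmod(len(text), 2)
    if remainder == 0 and half > 0 and text[:half] == text[half:]:
        return text[:half]
    return None


if __name__ == "__main__":
    # Hypothetical prompt that ends in "centered" and starts with "valley".
    prompt = "valley at sunrise, centered"
    doubled = prompt + prompt  # the seam produces the non-word "centeredvalley"
    print("centeredvalley" in doubled)
    print(split_doubled(doubled))
```

A check like this only catches exact doubling; it says nothing about whether the doubled text is what the generators actually received.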

throwaway7423y ago

My result for prompt 2 using Dreamshaper Stable Diffusion model.

https://i.imgur.com/ipnf3f5.png

rgbrgb3y ago

For those curious, I tried the same prompts with Kandinsky 2.1 [0]. In my experience it kind of blends the conceptual understanding of DALL-E with the higher quality image generation of Stable Diffusion. Like Midjourney, though, it kind of injects its own style and allows you to get "satisfying" results from short prompts.

The flaw with these comparisons is that you really shouldn't use the same prompt with different generators. If you want to get best results you do have to play with the prompts and do a bunch of iteration to kind of explore the latent space and find what you're looking for. The first super long prompt looks like it's tuned for stable diffusion for instance. Different generators also have different syntax (e.g. with stable diffusion you can surround a phrase with parens to give it extra emphasis).

[0]: https://iterate.world/s/clj4n19u20000jv08iqygiaqw
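As an illustration of the parenthesis syntax mentioned above, here is a sketch based on the convention used by the AUTOMATIC1111 web UI, where `(phrase)` multiplies attention by 1.1 and `(phrase:1.3)` sets an explicit weight. Other front ends may parse prompts differently, and the helper names are hypothetical.

```python
def emphasize(phrase, weight=None):
    """Wrap a phrase in AUTOMATIC1111-style emphasis syntax."""
    if weight is None:
        return f"({phrase})"        # implicit emphasis, one level of parens
    return f"({phrase}:{weight})"   # explicit attention weight


def nested_weight(depth, factor=1.1):
    """Effective multiplier for `depth` levels of plain nested parens."""
    return factor ** depth


if __name__ == "__main__":
    # Placeholder prompt fragments, not taken from the blog post.
    prompt = ", ".join([emphasize("haunted house"), emphasize("fog", 1.3), "night"])
    print(prompt)
    print(round(nested_weight(2), 2))
```

A prompt tuned this way is specific to one generator, which is part of why reusing the same prompt across tools can be misleading.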

cubefox3y ago

Here is what the haunted house looks like with Dall-E ~3 (Bing Image Creator): https://www.bing.com/images/create/a-haunted-house-with-ghos...

Generally, this model is much better than Dall-E 2, and it beats Firefly in some areas (I didn't try Midjourney or Stable Diffusion). Firefly usually produces photos with significantly fewer visual mistakes (like the wrong number of fingers or messed up faces) than the Bing Dall-E. But the latter usually understands prompts much better and more often produces something that matches it well. Firefly also doesn't "know" a lot of pop culture or history things, e.g. Marilyn Monroe, or what Coca-Cola is.

SoKamil3y ago

Can we appreciate how well that lightbox works on this site in a mobile browser, especially Safari? Also, the gestures are smooth and do not cause any quirks like an unintended refresh gesture.

personjerry3y ago

The analysis at the end seems to be lacking. From my perspective, Photoshop and Midjourney come out on top in terms of aesthetic and accuracy, with kouteiheika's Stable Diffusion results[0] a close second. Dall-E falls far behind, which makes sense considering all the work that's gone into the other systems to fine-tune and build ecosystems around them.

[0]: https://news.ycombinator.com/item?id=36408744

senko3y ago

For comparison, these were generated using Stability.ai API: https://postimg.cc/gallery/MQfkgP7/ce388adf

I used stable-diffusion-xl-beta-v2-2-2 model, copypasted prompts from the blog post, one-shot for each prompt. I chose style presets that closely matched the prompt (added as suffixes in image filenames).

Aeolun3y ago

> small windows opening onto the garden

Literally all of the examples have floor to ceiling windows across the entire length of the wall…

dahwolf3y ago

I'm glad it's not just me getting unusable garbage out of Dall-E and glorious results from MidJourney.

muhammadusmanOP3y ago

Author here: I updated the post to include the generated results from Stable Diffusion and Midjourney (thanks to kouteiheika and mdorazio).
