A prompt engineering guide for DALLE-2

(dallery.gallery)

242 points · keveman · 3y ago · 62 comments

48 comments · 13 top-level

o_____________o · 3y ago · 15 in thread

Great document.

Damn I am salivating to get access to Dall-E for some projects. Been on the waiting list for quite a while.

I've been experimenting with Midjourney, which is amazing for spooky/ethereal artwork, but it struggles with complex prompts and realism.

zitterbewegung · 3y ago

You can get free access to DALL-E Mega at https://replicate.com/nicholascelestin/dalle-mega and to DALL-E mini at https://huggingface.co/spaces/dalle-mini/dalle-mini

pjgalbraith · 3y ago

That's an open source recreation based on DALL-E 1. It's different from DALL-E 2; if you want that, look for DALLE2-pytorch, but note that it hasn't been fully trained yet.

konfusinomicon · 3y ago

My prompt of 'penguin smoking a bong' does not disappoint on either, although Hugging Face more accurately portrayed the act of smoking, while Replicate gave me images of penguin-shaped bongs.

Nursie · 3y ago

DALL-E mini/Craiyon is fantastic, but it doesn't compare to DALL-E2 at present, when you're talking about photorealism.

That said, some styles (Comic book spreads) seem to come out better on Craiyon. And DALLE 2 does not know what a Crungus is.

Centmo · 3y ago

Is this significantly different from Dall-E2?

DecayingOrganic · 3y ago

Hang in there — I only got my invitation a couple days ago. They're still rolling out invitations at a steady pace. But, just as a side note, one of the first things they tell you is that they own the full copyright for any images you generate.

You definitely have to play around with prompts to get a feel for how it works and to maximize the chance of getting something closer to what you want.

woojoo666 · 3y ago

When did you sign up? I just signed up, and it sounds like it takes a year to get access, probably longer now. It's a bit frustrating because I didn't sign up when it came out because I didn't need it at the time, but now I'm afraid of waiting a year when I do. These types of waitlist systems encourage everybody to sign up for everything on the off-chance that they might need it later. Wish they just went with a simple pay-as-you-go model (with free access for researchers and other special cases who request it), like how Copilot does it.

nyanpasu64 · 3y ago

I don't think the provider of an AI image generator service can decide that it owns the copyright to the output (perhaps it can require you to assign the copyright, though the output may not even be copyrightable?). Only courts can decide that (and they decided the person setting up cameras for monkeys didn't own the copyright to the monkey photos)?

noduerme · 3y ago

I've played with Midjourney for a while and just got my invite to DALL-E last night. One thing I think is really cool about Midjourney is the ability to give it image URLs as part of the prompts. I can't say I've had tremendous success with it, and it still feels a little half-baked, but I wish DALL-E had something along those lines (unless it does and I'm missing it). It's much easier to show examples of a particular style than to try to describe it, especially if it isn't something specifically named in the AI's training set.

pjgalbraith · 3y ago

You can upload an image to DALL-E, edit it and add a prompt to it as well.

kromem · 3y ago

Just got mine last night. I think they have been scaling up invites over the past few days.

raunak · 3y ago

Same here - I got mine 2 days ago. Signed up when it first dropped.

kriro · 3y ago

I'm also waiting but only put myself on the waitlist recently. I want to use it to generate synthetic image datasets from text descriptions. Very curious to explore the depth of what can be generated.

margoguryan · 3y ago

Get familiar with CLIP regardless! I have very little interest in DALL-E as an artist/prompter but as a futurist it is quite exciting.

muzani · 3y ago

I use both too. Dall E has heavy restrictions. It's basically G rated, so no horror. And no real world stuff like "Donald Trump with a mohawk".

MJ falls apart when you ask for fine detail. It's a bit of the AI cliche where you have to describe the colour, shape, etc in detail to mold what you want. Asking for a "monkey, gorilla, and chimp riding a bicycle" might have a chimp riding a monkey-gorilla as a bicycle.

Dall E is a lot better with words. It seems to "smooth" some stuff. Like asking for a bone axe will still show regular axes.

But MJ is probably the best choice if you want to do landscapes and stuff, especially horror/dystopian themed.

indiv0 · 3y ago · 4 in thread

I was expecting some clickbait/spam (the layout of the website has that feel) but this was surprisingly super in-depth and 100% matches up with my experience doing prompt engineering.

There's a fine line between so descriptive that the AI hits an edge case and can't get out of it (so every attempt looks the same) and not being descriptive enough (so you can't capture the output you're looking for). DALL-E is already incredibly fast compared to public models and I can't wait for the next order-of-magnitude improvement in generation speed.

Real-time traversal of the generation space is absolutely key for getting the output you want. The feedback loop needs to be as quick as possible, just like with programming.

muzani · 3y ago

I'm surprised at the artistic skill of the person who wrote the book, in contrast with the terrible web UI skill of the person who designed the site.

eru · 3y ago

Wouldn't surprise me too much, if they were the same person, but had vastly different amounts of experience with the different media?

margoguryan · 3y ago

As someone who makes very weird and experimental stuff, DALL-E is like a Segway and CLIP is like a horse (especially with those edge cases that tend to self-engorge/get worse if you aren't clever). It's a shame compute costs aren't much different between the two (correct me if I'm wrong) - I don't think there is much of a purely artistic process with DALL-E, although I do like to use DALL-E Mini thumbnails as start images or upscale testers.

>Real-time traversal of the generation space is absolutely key for getting the output you want.

I've been sketching around a two-person browser game where a pair of prompters can plug things in together in real-time :D

jsiaajdsdaa · 3y ago

Another interesting thing with prompt engineering is that attempt #1 with prompt x might yield something you don't want, but attempt n might yield something you do :)

yoyopa · 3y ago · 4 in thread

i really don't understand how people can appreciate something like this. to me it's just filling the world with literally mindless garbage.

andybak · 3y ago

I really don't understand how someone wouldn't find this incredibly fascinating as well as intensely fun.

estevaoam · 3y ago

Sure, because having a synthetic intelligence that seems to understand complex concepts to create coherent visual art is something humans are used to.

oxplot · 3y ago

Mindless garbage is what the majority of humans create in every field.

muzani · 3y ago

It's more similar to photography/fishing than other art forms.

aantix · 3y ago · 4 in thread

Dall-e still has a lot of work to be done with face construction.

Maybe that’s a feature not a bug.

muzani · 3y ago

It seems to be by far the best of any drawing AI besides the "this person does not exist" series, but those are quite specialized.

You could be right though. It does "digital art" well, but realistic faces poorly, and they slap down lots of restrictions to avoid deepfaking.

astrange · 3y ago

Google's internal models (Imagen and Parti) are much better. It looks like DALLE2 is just not big enough to accurately draw faces, which are very detailed things.

"This person doesn't exist" uses StyleGAN which can definitely do faces, but can't do general pictures.

gfodor · 3y ago

I think they’re not training on faces on purpose.

pjgalbraith · 3y ago

You are probably right. Having used it I sometimes get images with white polygons covering the faces of people as if they have been blanked out.

skybrian · 3y ago · 2 in thread

Nice! I was wondering why there are example images of real-looking people, but it seems this is allowed now:

https://www.vice.com/en/article/g5vbx9/dall-e-is-now-generat...

IshKebab · 3y ago

Hmm I signed up 2 days ago and it still says "Please don't share images of realistic faces." when you sign up.

skybrian · 3y ago

Yeah, I saw that too, but it doesn't seem to be in the terms of use?

codeshaunted · 3y ago · 2 in thread

This would be super useful if I actually had access :P

skybrian · 3y ago

When did you sign up? They seem to be opening the gates more:

https://mobile.twitter.com/sama/status/1547212678644371457

ccmcarey · 3y ago

I signed up on day 2, still no access :')

alexjray · 3y ago · 1 in thread

The OpenAI clear content policy is quite interesting to me. It's reasonable but clearly controlling.

teaearlgraycold · 3y ago

They’re trying to walk a fine line. Maximizing revenue while avoiding regulation.

godmode2019 · 3y ago · 1 in thread

Can anybody recommend a prompt engineering resource for language models?

Interesting topic

eru · 3y ago

Perhaps https://arxiv.org/pdf/2102.07350.pdf

Also Gwern has done a lot on this.

seydor · 3y ago · 1 in thread

What's the copyright situation for images from dalle/imagen?

jazzyjackson · 3y ago

AFAIK the only case that tests this so far was a kind of weird one where the programmer was trying to register his algorithm as the creator of the image, as a "work-for-hire". The Copyright Office's reasoning, however, banged on about the necessity of "human authorship".

> The Office also stated that it would not “abandon its longstanding interpretation of the Copyright Act, Supreme Court, and lower court judicial precedent that a work meets the legal and formal requirements of copyright protection only if it is created by a human author.”

https://www.copyright.gov/rulings-filings/review-board/docs/...

trention · 3y ago · 1 in thread

Calling this "engineering" is just beyond parody.

wnkrshm · 3y ago

It's as much engineering as SEO. Though with 'prompt engineering' it's the human brain trying to coax something out of the black box. Ironically, an algorithm might be better at generating the prompts after being given points in its parameter space that fit the aesthetic direction the user wants to explore.

tracyhenry · 3y ago

Based on this, an interesting project would be paraphrasing any regular prompt into a prompt that works for DALLE-2.

alana314 · 3y ago

This is great, lots of good ideas in the deck.

totetsu · 3y ago

There are some shared Google Docs in the dalle2 Discord community about this too.
