
dougmwne 4y ago

These are fun discussions because words like "artistic creativity" have a colloquial meaning that could only apply to humans since the dawn of humanity. Now you have an image of Kermit in Wall-E. I have never seen or conceived of an image of Kermit in Wall-E. Let's assume that adorable robot Kermits do not exist in the training data to be spit out like a search algorithm.

The image is new; it did not previously exist. It is a creation: a very vague idea of a few words, brought to full realization.

So it seems like the only difference between the "Not creativity" that Dall-E is doing and "Real Creativity" that humans do is that humans are the ones doing it?

I agree there's this concept of expanding the frontiers of human aesthetic capability that has slow-marched from cave paintings to post-modernism, and that there are very few artists who invent completely new styles that the rest of us copy and remix. It's questionable whether Dall-E can do that, but I'm also not sure that it can't.



orbital-decay 4y ago

> So it seems like the only difference between the "Not creativity" that Dall-E is doing and "Real Creativity" that humans do is that humans are the ones doing it?

The differentiator is whether the result is worth looking at for humans, that's all.

In the case of the OP, you forget that the human had to predict that the combination of two would be interesting for other humans, and then construct the prompt, possibly selecting the best pictures. That's who did most of the work here, and it was effortless for the neural network of a human. Could DALL-E analyze the world autonomously without human intervention at all? No, it's an open-loop system.

Novelty hinges on the ability to conceptualize things, not the execution. Sure, DALL-E 2 shows a glimpse of conceptualization internally as it works with compressed descriptions (concepts, abstractions) of things it draws. But it's super limited and not flexible enough to create new ideas, it doesn't have either short-term or long-term memory, it doesn't change, it has all knowledge about Kermit and Blade Runner pre-baked, and so on. You have to re-train it from scratch every time you want it to remember something truly new, there's no feedback loop to do that. Human ability is still much more powerful.

DALL-E 2 is almost at the point where it can supplement the human conceptualization with AI execution, though. Possibly in a couple iterations it will be there, with more believable results. In very limited cases, of course - as it's temporally unstable (it makes a totally new image each time), cannot correct the output from new details provided by a human (like a professional concept artist could), etc.

treesprite82 4y ago

> you forget that the human had to predict that the combination of two would be interesting for other humans, and then construct the prompt, possibly selecting the best pictures. That's who did most of the work here

If the Twitter user claimed that the text prompts themselves were generated by asking GPT-3 for "An interesting sequence of text prompts to feed an image generation AI" or something like that, I would have believed them.

I imagine it's harder to create a model that generates images matching a given human-language input prompt than to create an image generation model with no language component and have it pick its own scenarios internally. I don't think the former is done to palm off "most of the work" to humans, but rather because people want an easy way to see their own ideas created, so there's more demand for it.

> You have to re-train it from scratch every time you want it to remember something truly new, there's no feedback loop to do that.

As far as I'm aware, this isn't true. Deep learning is perfectly compatible with fine-tuning an existing model using new data. OpenAI/MS have been doing this with Codex to improve it based on Copilot telemetry and code from new languages/libraries.
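
To make the fine-tuning point concrete, here's a toy sketch. It is not how DALL-E or Codex is trained; it's just a two-parameter linear model with plain gradient descent, and every name and number in it (`train`, `mse`, the datasets, the step counts) is made up for illustration. The idea it shows: continuing training from existing weights on new data gets you to a good fit in far fewer steps than starting over from zero.

```python
# Toy illustration only: a linear model y = w*x + b fitted by gradient descent.
# All names and numbers here are hypothetical, chosen to keep the example small.

def train(weights, data, lr=0.1, epochs=200):
    """Run full-batch gradient descent on mean squared error, starting from `weights`."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)

def mse(weights, data):
    """Mean squared error of the model on `data`."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

xs = [0.0, 0.25, 0.5, 0.75, 1.0]
old_data = [(x, 2 * x + 1.0) for x in xs]   # what the model originally learned
new_data = [(x, 2 * x + 1.3) for x in xs]   # slightly different "new" knowledge

# Long initial training run on the old data.
pretrained = train((0.0, 0.0), old_data, epochs=2000)

# Same small budget of 20 steps on the new data, from two starting points.
finetuned = train(pretrained, new_data, epochs=20)   # continue from existing weights
scratch = train((0.0, 0.0), new_data, epochs=20)     # throw everything away first

print("pretrained on new data:", mse(pretrained, new_data))
print("fine-tuned on new data:", mse(finetuned, new_data))
print("from scratch on new data:", mse(scratch, new_data))
```

With the same 20-step budget, the fine-tuned model ends up with a lower error on the new data than the one trained from scratch, because it only has to learn the difference rather than everything again.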
