There is no equivalent for illustrators.
My friends who studied some specific language are all unemployed or doing unqualified jobs. Their peers from a generation before are teachers or work in some embassy.
That said, before some unicorn really starts doing some serious polishing, you'll still want some illustrators to piece art together. Taking the output of these models won't easily deliver a ready-made product.
I guess lots of artists will move into teaching art.
I can see tools like this increasing public interest in making their own art, and some of those people will want to be taught.
That is, a surface-level view might show these things as equivalent, but the skills required to produce a decent result are not encapsulated in the averages that models contain.
But I'm more bothered by the societal effects when art is automated. I believe it'll expedite the effects we saw when the internet short-circuited the feedback loop for creators, killing any gaps where a humane creative force that isn't optimizing for revenue could thrive. Not to mention the crazy mimetic positive feedback loops tearing the discourse apart.
Text-to-image algos did the same thing for a while, but you look at the latest full-size DALL-E and it's pretty much flawless.
If I were considering art school, I'd certainly be reconsidering my options. Maybe there are some defects in the output, but nothing Photoshop can't fix.
I think where humans win out (for now) is where a high degree of specificity/precision is needed (e.g. graphic design), or where certain legal requirements apply, such as logo design, since AI art can't be copyrighted at this time.
But it's not going to tell you in clear words if your prompt was bad to begin with, like a human would, hopefully :).
Further, artists have a host of skills that DALL-E doesn’t, like “take that image, but change the colors a bit to make it more acceptable to the client, and move the cartoon bird a little further down”. Or “make an image that will look as good in a print as it does on a small screen”.
It was originally a fractal image generation app but it's expanded over time and now has a fairly foolproof installer for all the models you're likely to have heard of (those that have been released anyway).
Needs a Google sub to run Colab (DALL-E itself needs Colab Pro, but other models run on the free version).
edit: not local! but very handy.
<caveat emptor>
bare minimum GPU: NVIDIA 2080 with 8GB VRAM
300 GB of disk space
</caveat emptor>
It's great if you want to run more "classic" AI algorithms as well!
There are tools to convert to intermediate formats, like ONNX, but they are limited and don't work all the time. The automatic conversion tools usually assume that you can trace execution of the model for a dummy input and usually only work well if there isn't any complex logic (e.g. conditions can be problematic). Some operations aren't supported well, etc.
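To make the tracing limitation concrete, here is a toy sketch in plain Python. It is not how ONNX export or any real tracer is implemented; `model`, `trace`, `replay`, and `OPS` are hypothetical names made up for illustration. The point is only that recording the operations executed for one dummy input bakes in whichever branch that input happened to take.

```python
# Toy illustration only: a hypothetical "tracer" that records which operations
# ran for one dummy input and then replays that recording for other inputs.

OPS = {"double": lambda v: v * 2, "negate": lambda v: -v}

def model(x, record=None):
    """A stand-in 'model' whose control flow depends on the input value."""
    op = "double" if x > 0 else "negate"   # data-dependent branch
    if record is not None:
        record.append(op)                  # note which op actually executed
    return OPS[op](x)

def trace(fn, dummy_input):
    """Run fn once on a dummy input and return the list of ops it executed."""
    record = []
    fn(dummy_input, record)
    return record

def replay(recorded_ops, x):
    """Apply the recorded ops in order, ignoring the original condition."""
    for name in recorded_ops:
        x = OPS[name](x)
    return x

traced = trace(model, 3.0)      # only the x > 0 path gets recorded
print(model(-3.0))              # 3.0  (original takes the "negate" branch)
print(replay(traced, -3.0))     # -6.0 (traced version always doubles)
```

The traced version agrees with the original for positive inputs and silently diverges for negative ones, which is the kind of failure you only notice by comparing outputs.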
This isn't always technically difficult, but it's tedious because it usually involves double checking that at all steps, the model produces identical outputs for a given input. An additional challenge when transferring weights is that models are fragile and minor differences might have large effects on the predictions (even though if you trained from scratch, you might get similar results).
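The step-by-step checking could look roughly like the sketch below. It assumes you have already captured each layer's output from both implementations for the same input as flat lists of floats; that data layout, the layer names, and `first_mismatch` itself are all hypothetical, and the tolerances are arbitrary starting points rather than recommended values.

```python
import math

def first_mismatch(reference, ported, rel_tol=1e-5, abs_tol=1e-6):
    """Return the name of the first layer whose outputs disagree, or None.

    `reference` and `ported` map layer names to flat lists of floats captured
    from the two implementations for the same input (assumed layout).
    Layers are checked in the insertion order of `reference`.
    """
    for name, ref_values in reference.items():
        new_values = ported[name]
        if len(ref_values) != len(new_values):
            return name
        for a, b in zip(ref_values, new_values):
            if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                return name
    return None

# Made-up activations: the port drifts at "attn", and "out" is wrong as a result.
reference = {"embed": [0.10, 0.20], "attn": [1.50, -0.30], "out": [0.70, 0.05]}
ported = {"embed": [0.10, 0.20], "attn": [1.50, -0.31], "out": [0.90, 0.01]}
print(first_mismatch(reference, ported))  # attn
```

Reporting the first diverging layer rather than just the final output is what makes this less tedious: everything downstream of the first mismatch is expected to be off.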
Also for deployment, the less cruft in the repository the better. A lot of research repositories end up pulling in all kinds of crazy dependencies to do evaluation, multiple big frameworks etc.
Is it a problem of accumulation of floating-point errors in operations that are done in a different order and with different kinds of arithmetic optimisations (so that they would be identical if they used un-optimised symbolic operations), or is there something else in the implementation of a neural network that I'm missing?
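Order of operations alone is enough to change a floating-point result, which is at least one ingredient of the question above. A minimal sketch in plain Python (double precision, nothing framework-specific; it does not show that this is the only reason two implementations of a model diverge):

```python
# Floating-point addition is not associative: grouping changes the rounding.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)
print(left_first == right_first)   # False
print(left_first, right_first)     # 0.6000000000000001 0.6

# With values of very different magnitude, the small term can vanish entirely.
big = 1e16
absorbed = (big + 1.0) - big       # the 1.0 is rounded away before subtracting
kept = (big - big) + 1.0           # the cancellation happens first, 1.0 survives
print(absorbed, kept)              # 0.0 1.0
```

A framework that fuses, reorders, or parallelises a reduction is effectively choosing a different grouping, so bit-identical outputs are not guaranteed even when both implementations are mathematically the same.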
Not to discourage the OP of course, great work.
On my desktop, running the example
> python image_from_text.py --text='alien life' --seed=7
results in
> RuntimeError: This version of jaxlib was built using AVX instructions, which your CPU and/or operating system do not support. You may be able work around this issue by building jaxlib from source.
Unfortunately, following the instructions to build jaxlib from source (https://jax.readthedocs.io/en/latest/developer.html#building...) results in several 404 Not Found errors, which later cause the build to stop when it tries to do something with the nonexistent files.
Unfortunately, it looks like I won't be running this today.
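For anyone wanting to check up front whether their CPU advertises AVX before installing, a rough sketch follows. It assumes Linux on x86, where `/proc/cpuinfo` has a `flags` line; `cpu_flags` and `has_avx` are helper names made up here, not part of jaxlib or any other package.

```python
def cpu_flags(cpuinfo_text):
    """Collect CPU feature flags from /proc/cpuinfo-style text (Linux x86)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags

def has_avx(cpuinfo_text):
    """True if the exact flag 'avx' is listed (not just 'avx2' etc.)."""
    return "avx" in cpu_flags(cpuinfo_text)

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX supported:", has_avx(f.read()))
    except OSError:
        print("Could not read /proc/cpuinfo; this check only works on Linux.")
```

This only tells you what the kernel reports; a virtual machine can hide AVX even when the host CPU has it.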
It's not listed in the requirements
I've posted it as an issue
Weights & Biases
> Why does it keep asking for an API key
From the README:
the Weight & Biases python package is used to download the DALL·E Mini and DALL·E Mega transformer models
It might not be obvious you need an account if you aren't in the field though.
I'd prefer to download it myself and choose where I put it too.
It currently stores the download under some hashed filename in a config directory in your home directory. I dislike this: I want control over where models go, so the tool is self-contained instead of spreading files across random directories all over the OS, and I want to pass models in by file path.
This feedback is really about dalle mini playground, but this project does the same thing. If this one is stripped to the bare essentials, I'd expect this kind of dependency to be stripped too.
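What I have in mind is something like the sketch below. To be clear, the `--model-dir` flag, the `MODEL_DIR` environment variable, and both function names are hypothetical; neither project exposes them as far as I know. It only shows the lookup order I'd want: an explicit path first, then an environment variable, then a local `pretrained` directory.

```python
import argparse
import os
from pathlib import Path

def resolve_model_dir(cli_value=None, env=None):
    """Pick the model directory: CLI value, then MODEL_DIR, then ./pretrained.

    Both the flag and the variable name are assumptions for illustration.
    """
    env = os.environ if env is None else env
    chosen = cli_value or env.get("MODEL_DIR") or "pretrained"
    return Path(chosen).expanduser()

def parse_args(argv=None):
    """Parse a hypothetical command line with a single --model-dir option."""
    parser = argparse.ArgumentParser(description="hypothetical CLI sketch")
    parser.add_argument(
        "--model-dir",
        default=None,
        help="directory that already contains the downloaded weights",
    )
    return parser.parse_args(argv)

args = parse_args(["--model-dir", "/data/models/dalle-mini"])
print(resolve_model_dir(args.model_dir, env={}))
```

With that in place the weights live wherever the user puts them, and nothing has to be written to a hidden directory behind their back.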
Edit: I don't want to seem like I'm complaining too much, though; I'm very happy with these open models and the tooling for them. Thanks!
EDIT: since I'm being accused of inventing this, I'll quote the terms of agreement and license. Maybe the founder hasn't read them, or someone without training in drafting proper terms wrote them, but the restrictive usage of "Materials" does apply to the software hosted there.
Note that there is no formal definition of "Materials" or "Service", so that it applies to all the contents of the webpage including the software stored there: https://wandb.ai/site/terms
I quote it:
2. Use License Whether you are accessing the Services for personal, non-commercial transitory viewing only (our free license for individuals only), for academic use, or for commercial purposes (our subscription package for businesses), permission is granted to temporarily download one copy of the information or software (the “Materials”) from our website. This is the grant of a license, not a transfer of title, and under this license, you may not: a. Modify or copy the Materials; b. Use the Materials for any commercial purpose, or for any public display (commercial or non-commercial); c. Attempt to decompile or reverse engineer any software contained in the Materials; d. Remove any copyright or other proprietary notations from the Materials; or e. Transfer the Materials to another person or "mirror" the Materials on any other server. This license shall automatically terminate if you violate any of these restrictions and may be terminated by us at any time. Upon terminating your viewing of these materials or upon the termination of this license, you must destroy any downloaded materials in your possession whether in electronic or printed format. f. Utilize our personal license for individuals for commercial purposes and any such use of our personal license for commercial purposes (e.g. using your corporate email) may result in immediate termination of your license.
Why do you think that?
EDIT: I'll respond in an edit, since you did. Look at sections 3b and 3c of the terms; they cover Models and other user content specifically. Those are user property, not our property. But I can see how this is confusing. We will clarify it.
https://www.smithsonianmag.com/smart-news/us-copyright-offic...
> But copyright law only protects “the fruits of intellectual labor” that “are founded in the creative powers of the [human] mind.” COMPENDIUM (THIRD) § 306 (quoting Trade-Mark Cases, 100 U.S. 82, 94 (1879)); see also COMPENDIUM (THIRD) § 313.2 (the Office will not register works “produced by a machine or mere mechanical process” that operates “without any creative input or intervention from a human author” because, under the statute, “a work must be created by a human being”). So Thaler must either provide evidence that the Work is the product of human authorship or convince the Office to depart from a century of copyright jurisprudence.
[0] https://www.copyright.gov/rulings-filings/review-board/docs/...
https://github.com/kuprel/min-dalle/issues/1#issuecomment-11...
python3 image_from_text.py --text='a happy giraffe eating the world' --seed=7 154.61s user 22.18s system 262% cpu 1:07.40 total
WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
As you can see, it took 1 min 7 s to complete. I assume it would be much faster with a grunty graphics card.
Mini = 5.33 s
Mega = 14.7 s
Update: about 1/2 that time is just loading the model, so if you load the model and then generate multiple images, it drops to:
Mini = 3.91 s
Mega = 8.86 s
If it depends on the hardware, what would be the limit when one rents the biggest machine available in the cloud?
Edit: actually it's easier to open a terminal and move /content/pretrained/vqgan to /content/min-dalle/pretrained/vqgan
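If you'd rather do the move from a notebook cell than a terminal, something like this should work. `relocate` is a small helper written for this comment, not part of the repo; it relies on the destination not existing yet, because `shutil.move` would otherwise place the source inside the existing directory.

```python
import shutil
from pathlib import Path

def relocate(src, dst):
    """Move directory `src` to `dst`, creating dst's parent if needed."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"nothing to move at {src}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the final destination path
    return Path(shutil.move(str(src), str(dst)))

# In the Colab case described above, the call would be:
# relocate("/content/pretrained/vqgan", "/content/min-dalle/pretrained/vqgan")
```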
UnfilteredStackTrace                      Traceback (most recent call last)
<ipython-input-2-0e20e3adf861> in <module>()
      2
----> 3 image = generate_image_from_text("alien life", seed=7)
      4 display(image)
67 frames
UnfilteredStackTrace: TypeError: lax.dynamic_update_slice requires arguments to have the same dtypes, got float16, float32.
The stack trace below excludes JAX-internal frames. The preceding is the original exception that occurred, unmodified.
--------------------
The above exception was the direct cause of the following exception:
TypeError                                 Traceback (most recent call last)
/content/min-dalle/min_dalle/models/dalle_bart_decoder_flax.py in __call__(self, decoder_state, keys_state, values_state, attention_mask, state_index)
     38 keys_state,
     39 self.k_proj(decoder_state).reshape(shape_split),
---> 40 state_index
     41 )
     42 values_state = lax.dynamic_update_slice(
TypeError: lax.dynamic_update_slice requires arguments to have the same dtypes, got float16, float32.
Clone before kill.
I get this error detokenizing image:
Traceback (most recent call last):
  File "/home/ubuntu/work/min-dalle/image_from_text.py", line 44, in <module>
    image = generate_image_from_text(
  File "/home/ubuntu/work/min-dalle/min_dalle/generate_image.py", line 74, in generate_image_from_text
    image = detokenize_torch(image_tokens)
  File "/home/ubuntu/work/min-dalle/min_dalle/min_dalle_torch.py", line 107, in detokenize_torch
    params = load_vqgan_torch_params(model_path)
  File "/home/ubuntu/work/min-dalle/min_dalle/load_params.py", line 11, in load_vqgan_torch_params
    params: Dict[str, numpy.ndarray] = serialization.msgpack_restore(f.read())
  File "/usr/local/lib/python3.10/dist-packages/flax/serialization.py", line 350, in msgpack_restore
    state_dict = msgpack.unpackb(
  File "msgpack/_unpacker.pyx", line 201, in msgpack._cmsgpack.unpackb
msgpack.exceptions.ExtraData: unpack(b) received extra data.
TypeError: lax.dynamic_update_slice requires arguments to have the same dtypes, got float32, float16.