Show HN: InvokeAI, an open source Stable Diffusion toolkit and WebUI

(github.com)

414 points · sophrocyne · 3y ago · 102 comments

Hey everyone!

Excited to be able to share the release of `InvokeAI 2.0 - A Stable Diffusion Toolkit`, an open source project that aims to provide both enthusiasts and professionals a suite of robust image creation tools. Optimized for efficiency, InvokeAI needs only ~3.5GB of VRAM to generate a 512x768 image (and less for smaller images), and is compatible with Windows/Linux/Mac (M1 & M2).

InvokeAI was one of the earliest forks off of the core CompVis repo (formerly lstein/stable-diffusion), and recently evolved into a full-fledged community driven and open source stable diffusion toolkit titled InvokeAI. The new version of the tool introduces an entirely new WebUI Front-end with a Desktop mode, and an optimized back-end server that can be interacted with via CLI or extended with your own fork.

This version of the app improves in-app workflows leveraging GFPGAN and Codeformer for face restoration, and RealESRGAN upscaling. Additionally, the CLI also supports a large variety of features:

- Inpainting

- Outpainting

- Prompt Unconditioning

- Textual Inversion

- Improved Quality for Hi-Resolution Images (Embiggen, Hi-res Fixes, etc.)

- And more...

Planned future updates include UI-driven outpainting/inpainting, robust Cross Attention support, and an advanced node workflow for automating and sharing your workflows with the community.

We're excited by the release, and about the future of democratizing the ability to create. Check out the repo (https://github.com/invoke-ai/InvokeAI) to get started, and join us on Discord (https://discord.gg/ZmtBAhwWhy)!

102 comments

80 comments · 19 top-level

swyx3y ago· 17 in thread

[OT] its been hard for me to trace the universe of stable diffusion forks so ive been maintaining a list here: https://github.com/sw-yx/prompt-eng#sd-major-forks

please let me know/send PRs if i missed anything, its been a couple months so i'm overdue for a round of cleanup/reorganizing

capableweb3y ago

I'm personally working on a UI as well, that is using InvokeAI :) It has a bit of a different focus, namely organization of generated images and facilitating generating a lot of images quickly via randomization. Here is the current page for it: https://patreon.com/auto_sd_workflow

Currently expanding it a lot with some fun features:

- multi-gpu support (first UI that would support that I think)

- no-click installer (installs everything when you start it up) that works on Windows, Linux and macOS

- A cloud version where you can "rent" access to the UI + very powerful GPU instances without having to run anything locally yourself.

Been waiting to submit it all to HN as a Show HN but have to wait a bit for everything to get into place first :)

swyx3y ago

since its paid, i'd classify you as a distro :) https://github.com/sw-yx/prompt-eng/blob/main/README.md#sd-d...


jayd16163y ago

Here's another one for you: https://github.com/brycedrennan/imaginAIry

stavros3y ago

Imaginairy is great.

grosswait3y ago

Maybe I missed it but didn’t see https://github.com/divamgupta/diffusionbee-stable-diffusion-...

swyx3y ago

i have diffusionbee.com

hleszek3y ago

I see you've got my https://github.com/leszekhanusz/diffusion-ui gui but it seems to be linked to a completely unrelated face-swapping interface?

And in communities, you can probably add https://stablehorde.net

swyx3y ago

uhhh... copy paste brainfart sorry. thanks for correction

hafriedlander3y ago

Since everyone's jumping on the plug wagon, allow me to throw on my project, www.stablecabal.org.

We've got a GRPC server that's backwards compatible with the official grpc.stability.ai but adds advanced features, a Flutter based infinite canvas web client in heavy development, a Discord server and a Krita & Photoshop plugin (all except the last are open source under various licenses).

The outpainting the server supports is (IMHO) the best I've seen, and recently got a tiny glimpse of attention when combined with the Photoshop plugin - https://twitter.com/NicolayMausz/status/1577767106384433156

swyx3y ago

great submission, thank you. god its so hard to keep on top of who is doing what.

am reworking my readme now with all the updates

hexomancer3y ago

Shameless plug: I was frustrated with the poor UI of notebook-based frontends so I wrote a desktop version here: https://github.com/ahrm/UnstableFusion .

Here is a video of some of its features: https://www.youtube.com/watch?v=XLOhizAnSfQ&t=1s

westoncb3y ago

Here's a cross-platform desktop GUI[0] for img2img/txt2img that takes a unique angle by remaining independent of specific models/scripts, though it was originally designed to work with Stable Diffusion.

[0] https://github.com/westoncb/generation-q

cmdr23y ago

https://github.com/cmdr2/stable-diffusion-ui is pretty popular, and is a 1-click installer for Win and Linux (Mac coming soon). Quite a lot of features, and well-liked by users for its easy-to-install and user-friendly GUI.

swyx3y ago

ah thanks, i've seen you around but somehow forgot to add. I've put you in as a distro since you bundle SD: https://github.com/sw-yx/prompt-eng/blob/main/README.md#sd-d...


wyldfire3y ago

Is "dreambooth" a fork? Or another feature that has been created by composing Stable Diffusion w/something else?

swyx3y ago

yeah you're right, more the latter. i should split it out.

OG dreambooth is proprietary Google code. what everyone's using is a third party replication of it using SD


zenlikethat3y ago

Another feature. You can "teach" SD a new concept, e.g., a new person, with a limited number of training images.

cercatrova3y ago· 13 in thread

Speaking of SD, I wonder if 1.4 will be the last truly open release, as Emad said 1.5 would release a while ago but it's been held up for "compliance" reasons. Maybe they got legal threats due to using artists' works and stock images. If so, that would be sad to see.

In a way it reminds me of people who make unofficial remakes of games but get cease and desists if they show gameplay while in development. The correct move is to fully develop the game and release it, then if you get C&Ds, too late, the game is already available to download.

zamber3y ago

There was an AMA with Emad yesterday on discord. He got asked this. The promise is that 1.5 will be released in the following week.

The slowdown has numerous causes. They got legal threats, death threats, and threats from some congresswoman to have them banned by the NSA (1).

Stability.ai workers (except for one) have a clause that they can open-source anything they're working on. They do and supposedly will open-source everything because they want to build an ecosystem, not do a cash grab in the model of DALL-E.

Also they don't have one central place for all their projects and will scale from 100 to 250 employees in the following year so things should speed up.

1) https://eshoo.house.gov/media/press-releases/eshoo-urges-nsa...

nl3y ago

> banned by the NSA

Note that this NSA is the National Security Advisor, not the National Security Agency.


swyx3y ago

make what you will of it but as of yesterday this was his answer to one of my readers: https://twitter.com/EMostaque/status/1579204017636667392

> No actually dev decision. Generative models are complex to release responsibly and team still working on release guidelines as they get much better, 1.5 is only a marginal FID improvement.

illuminati19113y ago

Honestly this whole "responsible AI" thing is a sad last attempt to run away from the inevitable. The reality is that our politicians will end up in fake photos/videos/audio recordings, people who can't even draw a straight line will be able to make crazy online memes with any kind of imaginable image in just seconds no matter how offensive it is to some people and there is absolutely nothing we can do about it.

When these companies and OS projects create "responsible" or restricted AIs they are at the same time creating a demand for AIs that have no limitations and eventually open source or even commercial AIs will respond to this demand.

I hope while they still play this "responsible" game, they are at least using the time to figure out ways how we can live with this kind of advanced AI in a future where everything is fake/false by default.


londons_explore3y ago

Sounds like some middle manager is trying to put his foot down and is saying things like "no more releases till we have designed and tested a 37 step release signoff procedure"

sophrocyneOP3y ago

My take is that the “genie is out of the bottle”

Single source “massive models” may be more difficult to get out, but Emad said they’re working on licensing a ton of content to train future models. Even then, anyone can train new models now. The output from Dreambooth and Textual Inversion is already impressive, and seems like just the beginning.

Going to be an interesting road ahead.

jeffparsons3y ago

Question from an ML-illiterate:

Are there known ways the training for these models could be distributed/decomposed? E.g. SETI-style distribution of a homogeneous centrally-defined task, or, much more exciting, recombination of several different models / sets of weights? (I'm just throwing words around here without really understanding them.)

I'm imagining a world in which one group of enthusiasts could work together to train a model on all images on Wikipedia, another group could work on training a model that understands hands really well, and then later yet another group could combine the work of the other two without doing all that training from scratch.

Is that even remotely plausible?
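For readers wondering what "recombination of sets of weights" could look like mechanically, here is a minimal sketch of the simplest version: linear interpolation between two weight sets. It assumes both models share exactly the same architecture and parameter names; the toy dictionaries and the `merge_weights` helper are hypothetical stand-ins for real checkpoints, and nothing here shows that the Wikipedia-plus-hands scenario would actually work well.

```python
from typing import Dict, List

# Parameter name -> flat list of values (a toy stand-in for real tensors).
Weights = Dict[str, List[float]]


def merge_weights(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two weight sets: (1 - alpha) * a + alpha * b."""
    if a.keys() != b.keys():
        raise ValueError("both models must have the same parameter names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [(1 - alpha) * x + alpha * y
                        for x, y in zip(a[name], b[name])]
    return merged


# Hypothetical toy "models"; real checkpoints hold millions of values.
wikipedia_model = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
hands_model = {"layer.weight": [4.0, 2.0], "layer.bias": [3.0]}
combined = merge_weights(wikipedia_model, hands_model, alpha=0.5)
```

With `alpha=0.5` each merged value is the plain average of the two inputs; moving `alpha` toward 0 or 1 biases the result toward one model.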


cmxch3y ago

Definitely is out of the bottle, especially when training-capable cards are getting within reach of regular people.

Sort of hinted at it upthread, but it would be interesting if this eventually brings competition to the GPU compute space (AMD, Intel?).

judge20203y ago

Just train on the output of existing models minus any photos with watermarks - being twice removed is sure to make it even harder to claim copyright :)

F2hP18Foam3y ago

I wonder if the same will happen to Midjourney or Dall-E. I have generated images on Midjourney that literally had a 'Shutterstock' watermark plastered across them. This watermark was conspicuously missing when the image was upscaled.

throwaway0x7E63y ago

not to Dall-E, because OpenAI is the very source of all these "ethical concerns" about unwashed masses having access to these tools

klyrs3y ago

I've seen models amplify textures into noise, which attracts towards periodic noise, which has a goodly chance of turning into a watermark. Usually the watermark is gibberish proto-letters but yeah I've seen straight-up Shutterstock watermarks appear too.

noduerme3y ago

I've had stock photo watermarks show up repeatedly in SD generations as well.


KaoruAoiShiho3y ago· 6 in thread

Is there anything new here that might interest an existing user of auto's GUI to switch?

sophrocyneOP3y ago

To be fair, Auto has been acquiring features at an insane clip (recently getting banned from the SD discord for accusations of code theft lol)

I think Invoke is competitive for now, but biggest advantage is an improved UX, and a large community with an ambitious roadmap focused more on enthusiasts/pros.

I’d give it a whirl and see where you end up preferring to do your SD projects :)

KaoruAoiShiho3y ago

Oh I see that my comment might be interpreted as snarky. I was literally just asking to please list the stuff that is new or different, cause that would be very helpful for everyone.


hleszek3y ago

You can use https://diffusionui.com/b/automatic1111 once the automatic1111 webui is running:

- better inpainting

- a gallery to easily compare generations and easily regenerate images with small modifications

- responsive design --> works great on mobile, swipe left/right to switch between pictures in same generation and up/down to switch to another generation to compare

Here is the repo: https://github.com/leszekhanusz/diffusion-ui

Also if you don't have the hardware, you can also get images for free using the Stable Horde (https://stablehorde.net), a cluster of backends provided for free by volunteers.

You can test it here: https://diffusionui.com/b/stable_horde

nobo1223y ago

Automatic1111's webui is what 90% of the StableDiffusion community uses but he recently made the decision to use a company's proprietary code after it was leaked by a hacker and when confronted about it, instead of removing it as requested he chose to lie about it despite the git history evidence and the fact that the paper he claimed to have used as reference wasn't related at all to the techniques used by the stolen code.

The company whose code was stolen works closely with the man behind SD and the decision was made to merely ban him from the community instead of torpedo-ing the repo via DMCA.


nikkwong3y ago

One thing is that invoke-ai can be run via CLI or possibly programmatically. I haven't found a good way to do that with the automatic GUI. Personally, I've also found features in automatic to be buggy. For example, batching seems to always break the UI for me. With the invoke-ai fork, I can run the CLI and produce images all night if I want to.

The bee's knees would be being able to use automatic as a CLI or with some programmatic interface, because it is more feature rich. But I haven't seen anything that allows me to do that yet, so I'm stuck with its clunky UI or with invoke-ai.
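As a rough illustration of the "produce images all night" workflow, here is a minimal sketch that prepares a reproducible batch of randomized prompts as a plain text file, one prompt per line. The word lists, the function names and the file name are all made up for the example; how a given InvokeAI version consumes such a file (a file option, stdin, or a wrapper loop) is an assumption to check against the CLI's own help output.

```python
import random
from pathlib import Path
from typing import List

# Hypothetical building blocks for randomized prompts.
SUBJECTS = ["a lighthouse", "a red fox", "an old library"]
STYLES = ["oil painting", "35mm photo", "watercolor"]


def build_prompts(count: int, seed: int = 0) -> List[str]:
    """Return `count` prompts, each a random subject/style pairing."""
    rng = random.Random(seed)  # seeded so the same batch can be rebuilt later
    return [f"{rng.choice(SUBJECTS)}, {rng.choice(STYLES)}"
            for _ in range(count)]


def write_prompt_file(path: Path, prompts: List[str]) -> None:
    """Write one prompt per line, a common shape for batch input files."""
    path.write_text("\n".join(prompts) + "\n", encoding="utf-8")


prompts = build_prompts(200)
# To save the batch: write_prompt_file(Path("prompts.txt"), prompts)
```

Keeping the prompt generation separate from the rendering step means the same file can be replayed later or handed to a different frontend.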

cmdr23y ago

Like I mentioned elsewhere, https://github.com/cmdr2/stable-diffusion-ui is pretty popular, and is a 1-click installer for Win and Linux (Mac coming soon). Quite a lot of features, and well-liked by users for its easy-to-install and user-friendly GUI.

pdntspa3y ago· 6 in thread

I am super stoked to see all these Stable Diffusion forks floating around, and I don't want to shit on the authors and their work that hard, but I swear the installation and packaging of these things is INSANE.

* Every single one of these seems to be a web UI, when this is desktop software that needs a desktop computer or workstation to run. Have we all collectively forgotten how to program PyGTK?

* Model files always go in the code repo. Have we forgotten how home folders work or what their purpose is? At the very least this one instructs you to make a shortcut/symlink if you don't want to copy the ckpt file yet again

* On that note, everything is autodownloaded to wherever the hell the programmer wants (once again, usually in the code repo itself). I must have four or five different copies of ESRGAN, and I spent a bunch of time monkeying around with automatic1111's fork trying to get it to correctly see everything when I ripped out the models folder and symlinked one in from a different place on my hard drive.

To the authors: can you all please get together and standardize some of this stuff? Models should go in user's homefolders, or at a customizable location, and NOT within the scope of stuff that can be touched by git pull. (Doing so causes git to freak out in many circumstances)
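A minimal sketch of what a "customizable location, defaulting to the home folder" lookup could look like. The `SD_MODEL_DIR` variable name, the `stable-diffusion/models` subfolder and the `resolve_model_dir` function are hypothetical, not something any of the forks is known to use; `XDG_DATA_HOME` is the freedesktop.org convention for per-user data.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_model_dir(cli_value: Optional[str] = None,
                      env: Mapping[str, str] = os.environ) -> Path:
    """Pick the model directory: CLI value, then env var, then a home default."""
    if cli_value:
        return Path(cli_value).expanduser()
    if env.get("SD_MODEL_DIR"):  # hypothetical variable name
        return Path(env["SD_MODEL_DIR"]).expanduser()
    # Fall back to the per-user data directory, outside any git checkout.
    data_home = env.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(data_home) / "stable-diffusion" / "models"
```

Because the default lives outside the repository, a `git pull` never touches the checkpoints, and several frontends could share one copy of each model.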

The breakneck pace of innovation here is awesome, but it feels like all gas no brakes on the usability front.

In the Bad Old Days(tm) you ran an install script which generates a desktop icon and you click that to run it. Meanwhile with this, on Windows, one has to open an anaconda prompt, activate the anaconda venv (or whatever it is), then manually invoke the whole thing with 'python scripts/invoke.py --web'. And if there's a one-click install script included (which invoke doesn't, but I am not knocking it for this!), half the time they seem to try and pull down the entire world all over again (a la sd-webui).

Like I get this need to make it easy to use, but it's like c'mon, there is existing convention for all these things. Folks, please follow it!

If I had a wishlist, or the wherewithal to fork my own version, it would have:

* an actual GUI made with an actual windowing toolkit. I don't know why the hell everyone is so afraid of GTK, but I would use that. pyGTK is pretty simple IME, you can even read the C++ docs and it all maps over really nice to python. It doesn't need to be pretty!

* configurable model locations, preferably in an agreed-upon standardized hierarchy

* a standardized way of embedding prompt data into the PNG, a la automatic1111

* an uncomplicated but not overly optimistic setup process. An install.py and run.py, both with sensible defaults so that you don't need any command-line switches to run it except for special circumstances, and if it wants to autodownload updates then CHECK WITH ME FIRST! And preferably one that doesn't try to move my entire world (here's looking at you, sd-webui). And it will load the venv/conda environment for me.

And yes, for all the "put your money where your mouth is", I've been thinking about forking. But I don't know if I have the time or energy to keep up with all the developments in this space. But hey you never know...

nullc3y ago

Gotta love the tools that edit your bashrc! I'm not sure if that's worse or the mystery meat background auto-downloading. 0_o

Once the pace of innovation slows down I'm sure we'll see more effort from people with traditional software engineering experience come in to clean things up.

pdntspa3y ago

Oh god don't even get me started on that....

sophrocyneOP3y ago

- A web app is more versatile, as many users are running this and then accessing the client via other laptops & devices in the house. Understand your point, but not a priority. We do have a GUI mode that runs it in flask, but I don't think that's what you're getting at ;)

- 1 Click Install & Run is in the works to make this easier to install. We agree.

- Model locations are a valid point - We're working on being able to hot swap models mid-session, so I'll bring this up into the convo.

- Invoke has aligned on its own metadata structure for the ability to easily pull those parameters into future invocations. We're not worried about compatibility with Automatic.

No need to fork - Just join us on discord and complain loudly until we make things better. :)

pdntspa3y ago

I wasn't expecting this list of complaints to be read so positively. So, props for that. Might just see you guys on discord...


drawingthesun3y ago

If you're using an M1 Mac, DiffusionBee is a one-click app install that I've been using to generate high quality renders in seconds.

The devs recently added img2img support too.

I had tried to install the more complete versions but I just end up in a wormhole of python and conda errors.

nl3y ago

FYI, the HuggingFace diffusers does the download of models sensibly.

It's probably worth following that.

Timwi3y ago· 4 in thread

Sounds awesome! Unfortunately, it says that it requires a GPU. Please consider making it accessible to people without a GPU, for example using OpenVino like this (command line only) project does:

https://github.com/bes-dev/stable_diffusion.openvino

Thanks!

hleszek3y ago

Those who don't have a GPU could use the Stable Horde: https://stablehorde.net

geuis3y ago

What you're asking for isn't entirely possible for local installs. Yes, you can run SD on a cpu, but each image takes minutes at a time vs seconds via gpu.

For example, it's not possible to run SD on my 2 year old 16" Intel MacBook Pro. This is because PyTorch doesn't have support for the slightly older AMD GPU on board. There's a newer framework called ROCm for AMD cards that allows them to work with recent versions of PyTorch.

Given all that, the requirement to have an Nvidia card is entirely acceptable, and for the most part a technical requirement.

nullc3y ago

Minutes isn't really that big a deal though, one could give it a list of prompts to round-robin through and come back in the morning to a huge collection of images to explore. It's just a different workflow.

Ironically, cpu support would be faster for me (in terms of throughput, at least) because I have on the order of a thousand zen cores but only a couple CUDA compatible GPUs with enough ram to run SD.
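The round-robin overnight workflow described above can be sketched in a few lines. This is only an illustration: `fake_generate` is a stand-in for whatever slow CPU renderer is actually used, and the 3 minutes per image figure is a hypothetical number chosen to make the arithmetic concrete.

```python
import itertools
from typing import Callable, Iterable, List, Tuple


def round_robin(prompts: Iterable[str], total_images: int,
                generate: Callable[[str, int], str]) -> List[Tuple[str, str]]:
    """Cycle through the prompts until total_images have been produced."""
    results = []
    for index, prompt in zip(range(total_images), itertools.cycle(prompts)):
        results.append((prompt, generate(prompt, index)))
    return results


def fake_generate(prompt: str, index: int) -> str:
    """Stand-in for a slow render; returns the would-be output filename."""
    return f"out_{index:05d}.png"


# At a hypothetical 3 minutes per image, an 8-hour night gives
# 8 * 60 // 3 = 160 images.
images_per_night = 8 * 60 // 3
queue = round_robin(["a castle", "a forest", "a harbor"],
                    images_per_night, fake_generate)
```

Cycling through the list rather than finishing one prompt before starting the next means that an interrupted run still leaves a few images for every prompt.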

Timwi3y ago

> What you're asking for isn't entirely possible for local installs.

The project I linked to does it, so it's clearly possible. I didn't ask for speed, I only ask to be able to run it at all.

cmxch3y ago· 3 in thread

How hard of a requirement is the NVidia graphics chip? Polaris era AMD chips do work decently at the 4gb level (although a bit finicky) and Navi/Big Navi AMD cards work reasonably well with modern ROCm.

pja3y ago

Stable Diffusion works for me with a Polaris GPU. Had to compile my own local copy of Tensorflow to use it, but everything runs.

cmxch3y ago

Which documentation/build environment are you using?

I’m using Ubuntu(to follow what AMD has for ROCm) and building the entirety of (gfx803 patched) ROCm from source.

It works with some forks but not others.


wasyl3y ago

I ran some SD fork on Radeon Pro Vega 20 GPU, I'm not familiar with the whole setup but it was running "torch mps backend"? Anyway it was pretty fast and worked well, so I'm a bit surprised at lack of Intel macs support from all those SD forks

pdntspa3y ago· 3 in thread

Min requirements say 12gb, I take it this doesn't have the optimizations that automatic1111 has for <8gb cards?

teolandon3y ago

It says 12GB RAM, not VRAM. Right above that it says that it can work on 4GB VRAM cards.

capableweb3y ago

You can run it with lower VRAM for sure. Up until some weeks ago, I was using that repository with an 11GB card.

pja3y ago

Yeah. Stable diffusion runs fine on my 8gb Polaris card (rx580) & I've heard of forks that will let you run it in 6 or even 4gb VRAM at a small cost in render time.

tehsauce3y ago· 2 in thread

A shameless plug: if anyone is interested in building apps using stable diffusion and wants to keep things as cheap as possible, I built a very user-friendly API that is 1/4 the cost of the official stable diffusion API. There is also a free demo.

You can try it out:

https://computerender.com.

capableweb3y ago

The page at https://computerender.com/cost.html has the title "How is computerender 4x cheaper than other services hosting Stable Diffusion?" but doesn't actually explain how/why it is cheaper, just that "crowd-sourced servers are much more difficult to work" without elaborating on how what you're doing is different than that.

Care to shine some light on it? Using something like runpod/vast.ai would be my guess?

tehsauce3y ago

Hi! Your guess is correct. I monitor prices on vast.ai and runpod to get the best possible GPU power per dollar.

I updated the link you mentioned with some of this info!
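To make "best possible GPU power per dollar" concrete, here is a minimal sketch of ranking rental offers by throughput per dollar. It is not computerender's actual code; the `Offer` fields, the provider and GPU names, and every number are made up for illustration, and real marketplaces would need their own price and benchmark data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Offer:
    provider: str
    gpu: str
    images_per_hour: float   # measured throughput on this GPU
    dollars_per_hour: float  # current rental price


def images_per_dollar(offer: Offer) -> float:
    """Throughput bought by one dollar of rental time."""
    return offer.images_per_hour / offer.dollars_per_hour


def best_offer(offers: Iterable[Offer]) -> Offer:
    """Return the offer that renders the most images per dollar."""
    return max(offers, key=images_per_dollar)


# Made-up numbers for illustration only; not real prices or benchmarks.
offers = [
    Offer("provider-a", "gpu-x", images_per_hour=1200, dollars_per_hour=0.40),
    Offer("provider-b", "gpu-y", images_per_hour=2000, dollars_per_hour=0.80),
    Offer("provider-c", "gpu-z", images_per_hour=900, dollars_per_hour=0.25),
]
cheapest = best_offer(offers)
```

Note that the fastest card in this toy list is not the winner: the slowest one comes out ahead because its price is low enough to offset the lower throughput.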

iFire3y ago· 2 in thread

Can you make the InvokeAI UI as easy to install as running a Windows 11 command line script?

I couldn't get it to work following https://invoke-ai.github.io/InvokeAI/installation/INSTALL_WI...

sophrocyneOP3y ago

One click install is a goal, we just need a contributor who is confident taking it on as a project.

cmdr23y ago

Hi, I'm the author of the cmdr2 UI and installer (that iFire linked to). I'd be happy to contribute the 1-click installer used by my project. It's a battle-hardened installer (over 100k installations on all kinds of PCs and networks), and I'm finishing up a rewrite in python, so that the installer code is easier to maintain for others.

I'd be happy to submit a PR to your project, if you're interested in using it. I actually got it working with your project a few weeks ago, so I know it works with your repo.

I've opened a github issue as well, so we can talk there if you'd like: https://github.com/invoke-ai/InvokeAI/issues/1042

ionwake3y ago· 2 in thread

I was unable to get this to run on the Mac M1 over the last week - has anyone here had any success?

wokwokwok3y ago

Yes.

File an issue if it’s not working for you; it’s working fine for me.

(See for example https://github.com/invoke-ai/InvokeAI/issues/1021 ; if you had a previous install, delete it entirely)

ionwake3y ago

Thank you for the reply. I already did, so it must be my setup.

lucasfcosta3y ago· 1 in thread

This is much needed. Even for a software engineer like me, it was quite cumbersome to use Stable Diffusion locally without such a UI.

I feel like there's just so much to improve though. Maybe SD is the definitive proof that one single feature can trickle down into many others just by adding good UI on top of it.

sophrocyneOP3y ago

Luckily, the team has a pretty jampacked roadmap. This is v1 of the full WebUI.

hda23y ago· 1 in thread

What about safety filters? All the safety filters in the SD interfaces/services I used so far are too false-positive happy. Can these filters be disabled or at least toned down in InvokeAI? If so, how easily?

capableweb3y ago

It ships without any filters.

paulirish3y ago· 1 in thread

PSA: You can email support@github to ask them to "detach my repo as a fork", in case the repo has matured so much it shouldn't have the "forked from …" treatment.

suyash3y ago

That's all good but it's nice to give credit where credit is due. I like how they do it in the README.

lawik3y ago

Oh, I used the dream.py script to back a Telegram bot. It later ended up in my demo for my talk Chat Bots as User Interfaces (with Elixir): https://www.youtube.com/watch?v=DFGHaER6_j4

I primarily used the InvokeAI release because I found it was easy to get going with on Linux and then it was simple enough to hack around with.

Also the first tool I've ever used where I've ridden on the ragged edge of what my 3070 is okay with. I've had graphical glitches due to occupying all the video memory (KDE doesn't like it). I've had to quit apps to make it work.

Thanks for making a useful thing of all this Stable Diffusion stuff. I've enjoyed it.

nohat3y ago

I've been using a modified version of lstein's fork since almost the beginning. Recommended! It does lack some of the features of e.g. automatic1111, but it has a good CLI, and actually has a license, which is pretty important (as novelai has learned).

neilv3y ago

Nice! lstein is the SD fork that I ended up using, and I'm delighted to see it evolve into InvokeAI and keep getting better.

Uke3y ago

How good are solutions like stable diffusion at inpainting nowadays? What about the watermarks of Getty et al. that have been part of some of the dall-e 2.0 images? Could one feasibly remove such watermarks or stuff like a white grid array with these solutions?

So how convincing are these solutions in the worst case is what i am asking.

cmsj3y ago

Yay! I built an IRC bot for SD using lstein's repo because it was the first one that I could get to work reliably on M1, so I'm really glad to see the process continue really well with InvokeAI!

gernb3y ago

This is great but it requires lots of "geek" (installing dependencies, borking your system with brew, etc...)

Vs DiffusionBee which just works

https://diffusionbee.com/

Maybe the two projects can merge?

j / k navigate · click thread line to collapse

102 comments

80 comments · 19 top-level

swyx3y ago· 17 in thread

[OT] its been hard for me to trace the universe of stable diffusion forks so ive been maintaining a list here: https://github.com/sw-yx/prompt-eng#sd-major-forks

please let me know/send PRs if i missed anything, its been a couple months so i'm overdue for a round of cleanup/reorganizing

capableweb3y ago

Currently expanding it a lot with some fun features:

- multi-gpu support (first UI that would support that I think)

- no-click installer (installs everything when you start it up) that works on Windows, Linux and macOS

- A cloud version where you can "rent" access to the UI + very powerful GPU instances without having to run anything locally yourself.

Been waiting to submitting it all to HN as a Show HN but have to wait a bit for everything to get into place first :)

swyx3y ago

since its paid, i'd classify you as a distro :) https://github.com/sw-yx/prompt-eng/blob/main/README.md#sd-d...

1 more reply

jayd16163y ago

Here's another one for you: https://github.com/brycedrennan/imaginAIry

stavros3y ago

Imaginairy is great.

grosswait3y ago

Maybe I missed it but didn’t see https://github.com/divamgupta/diffusionbee-stable-diffusion-...

swyx3y ago

i have diffusionbee.com

hleszek3y ago

I see you've got my https://github.com/leszekhanusz/diffusion-ui gui but it seems to be linked to a completely unrelated face-swapping interface?

And in communities, you can probably add https://stablehorde.net

swyx3y ago

uhhh... copy paste brainfart sorry. thanks for correction

hafriedlander3y ago

Since everyone's jumping on the plug wagon, allow me to throw on my project, www.stablecabal.org.

swyx3y ago

great submission, thank you. god its so hard to keep on top of who is doing what.

am reworking my readme now with all the updates

hexomancer3y ago

Shameless plug: I was frustrated with the poor UI of notebook-based frontends so I wrote a desktop version here: https://github.com/ahrm/UnstableFusion .

Here is a video of some of its features: https://www.youtube.com/watch?v=XLOhizAnSfQ&t=1s

westoncb3y ago

[0] https://github.com/westoncb/generation-q

cmdr23y ago

swyx3y ago

ah thanks, i've seen you around but somehow forgot to add. I've put you in as a distro since you bundle SD: https://github.com/sw-yx/prompt-eng/blob/main/README.md#sd-d...

1 more reply

wyldfire3y ago

Is "dreambooth" a fork? Or another feature that has been created by composing Stable Diffusion w/something else?

swyx3y ago

yeah you're right, more the latter. i should split it out.

OG dreambooth is proprietary Google code. what everyone's using is a third party replication of it using SD

1 more reply

zenlikethat3y ago

Another feature. You can "teach" SD a new concept, e.g., a new person, with a limited number of training images.

cercatrova3y ago· 13 in thread

zamber3y ago

There was an AMA with Emad yesterday on discord. He got asked this. The promise is that 1.5 will be released in the following week.

The slowdown has numerous issues. They got legal threats, death threats, and threats from some congresswoman to have them banned by the NSA (1).

Also they don't have one central place for all their projects and will scale from 100 to 250 employees in the following year so things should speed up.

1) https://eshoo.house.gov/media/press-releases/eshoo-urges-nsa...

nl3y ago

> banned by the NSA

Note that this NSA is the National Security Advisor, not the National Security Agency.

1 more reply

swyx3y ago

make what you will of it but as of yesterday this was his answer to one of my readers: https://twitter.com/EMostaque/status/1579204017636667392

> No actually dev decision. Generative models are complex to release responsibly and team still working on release guidelines as they get much better, 1.5 is only a marginal FID improvement.

illuminati19113y ago

3 more replies

londons_explore3y ago

Sounds like some middle manager is trying to put his foot down and is saying things like "no more releases till we have designed and tested a 37 step release signoff procedure"

sophrocyneOP3y ago

My take is that the “genie is out of the bottle”

Going to be an interesting road ahead.

jeffparsons3y ago

Question from an ML-illiterate:

Is that even remotely plausible?

3 more replies

cmxch3y ago

Definitely is out of the bottle, especially when training capable cards are getting within reach of regular people.

Sort of hinted at it upthread, but it would be interesting if this eventually brings competition to the GPU compute space (AMD, Intel?).

judge20203y ago

Just train on the output of existing models minus any photos with watermarks - being twice removed is sure to make it even harder to claim copyright :)

throwaway0x7E63y ago

not to Dall-E, because OpenAI is the very source of all these "ethical concerns" about unwashed masses having access to these tools

noduerme3y ago

I've had stock photo watermarks show up repeatedly in SD generations as well.

KaoruAoiShiho3y ago· 6 in thread

Is there anything new here that might interest an existing user of auto's GUI in switching?

sophrocyneOP3y ago

To be fair, Auto has been acquiring features at an insane clip (recently getting banned from the SD discord for accusations of code theft lol)

I think Invoke is competitive for now, but the biggest advantage is an improved UX, and a large community with an ambitious roadmap focused more on enthusiasts/pros.

I’d give it a whirl and see where you end up preferring to do your SD projects :)

KaoruAoiShiho3y ago

Oh, I see that my comment might be interpreted as snarky. I was literally just asking you to please list the stuff that is new or different, because that would be very helpful for everyone.

hleszek3y ago

You can use https://diffusionui.com/b/automatic1111 once the automatic1111 webui is running:

- better inpainting

- a gallery to easily compare generations and easily regenerate images with small modifications

- responsive design --> works great on mobile, swipe left/right to switch between pictures in same generation and up/down to switch to another generation to compare

Here is the repo: https://github.com/leszekhanusz/diffusion-ui

Also if you don't have the hardware, you can also get images for free using the Stable Horde (https://stablehorde.net), a cluster of backends provided for free by volunteers.

You can test it here: https://diffusionui.com/b/stable_horde

nobo1223y ago

The company whose code was stolen works closely with the man behind SD and the decision was made to merely ban him from the community instead of torpedo-ing the repo via DMCA.

pdntspa3y ago· 6 in thread

* Every single one of these seems to be a web UI, when this is desktop software that needs a desktop computer or workstation to run. Have we all collectively forgotten how to program PyGTK?

The breakneck pace of innovation here is awesome, but it feels like all gas no brakes on the usability front.

Like, I get this need to make it easy to use, but c'mon, there's existing convention for all these things. Folks, please follow it!

If I had a wishlist, or the wherewithal to fork my own version, it would have:

* configurable model locations, preferably in an agreed-upon standardized hierarchy

* a standardized way of embedding prompt data into the PNG, a la automatic1111
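For the PNG wishlist item, here is a minimal sketch of one way prompt data could ride along inside the image file, using only the Python standard library. It writes a plain `tEXt` chunk as defined by the PNG format. The keyword `prompt`, the helper names, and the tiny test image are all assumptions made for illustration; this is not automatic1111's or InvokeAI's actual metadata layout.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk in a PNG byte string."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        yield chunk_type, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def embed_text(png: bytes, keyword: str, text: str) -> bytes:
    """Return a copy of the PNG with a tEXt chunk inserted right after IHDR."""
    payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    out = [PNG_SIGNATURE]
    for chunk_type, data in iter_chunks(png):
        out.append(make_chunk(chunk_type, data))
        if chunk_type == b"IHDR":
            out.append(make_chunk(b"tEXt", payload))
    return b"".join(out)


def read_text(png: bytes) -> dict:
    """Collect all tEXt chunks as {keyword: text}."""
    found = {}
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
    return found


def tiny_png() -> bytes:
    """A 1x1 grayscale PNG built by hand, used only for the demo."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))
```

Calling `embed_text(tiny_png(), "prompt", "a cat, steps=50")` and then `read_text(...)` on the result should hand the string back. Whatever keyword and field layout tools agree on matters more than the mechanism, which is why a shared convention would help.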

nullc3y ago

Gotta love the tools that edit your bashrc! I'm not sure if that's worse or the mystery meat background auto-downloading. 0_o

Once the pace of innovation slows down I'm sure we'll see more effort from people with traditional software engineering experience come in to clean things up.

pdntspa3y ago

Oh god don't even get me started on that....

sophrocyneOP3y ago

- 1 Click Install & Run is in the works to make this easier to install. We agree.

- Model locations are a valid point - We're working on being able to hot swap models mid-session, so I'll bring this up into the convo.

- Invoke has aligned on its own metadata structure for the ability to easily pull those parameters into future invocations. We're not worried about compatibility with Automatic.

No need to fork - Just join us on discord and complain loudly until we make things better. :)

pdntspa3y ago

I wasn't expecting this list of complaints to be read so positively. So, props for that. Might just see you guys on discord...

drawingthesun3y ago

If you're using an M1 Mac, DiffusionBee is a one-click app install that I've been using to generate high quality renders in seconds.

The devs recently added img2img support too.

I had tried to install the more complete versions but I just end up in a wormhole of python and conda errors.

nl3y ago

FYI, the HuggingFace diffusers does the download of models sensibly.

It's probably worth following that.

Timwi3y ago· 4 in thread

Sounds awesome! Unfortunately, it says that it requires a GPU. Please consider making it accessible to people without a GPU, for example using OpenVino like this (command line only) project does:

https://github.com/bes-dev/stable_diffusion.openvino

Thanks!

hleszek3y ago

Those who don't have a GPU could use the Stable Horde: https://stablehorde.net

geuis3y ago

What you're asking for isn't entirely possible for local installs. Yes, you can run SD on a cpu, but each image takes minutes at a time vs seconds via gpu.

Given all that, the requirement to have an Nvidia card is entirely acceptable, and for the most part a technical requirement.

nullc3y ago

Ironically, CPU support would be faster for me (in terms of throughput, at least) because I have on the order of a thousand Zen cores but only a couple of CUDA-compatible GPUs with enough RAM to run SD.

Timwi3y ago

> What you're asking for isn't entirely possible for local installs.

The project I linked to does it, so it's clearly possible. I didn't ask for speed, I only ask to be able to run it at all.

cmxch3y ago· 3 in thread

pja3y ago

Stable Diffusion works for me with a Polaris GPU. Had to compile my own local copy of Tensorflow to use it, but everything runs.

cmxch3y ago

Which documentation/build environment are you using?

I’m using Ubuntu (to follow what AMD has for ROCm) and building the entirety of (gfx803 patched) ROCm from source.

It works with some forks but not others.

pdntspa3y ago· 3 in thread

Min requirements say 12gb, I take it this doesn't have the optimizations that automatic1111 has for <8gb cards?

teolandon3y ago

It says 12GB RAM, not VRAM. Right above that it says that it can work on 4GB VRAM cards.

capableweb3y ago

You can run it with lower VRAM for sure; up until some weeks ago, I was using that repository with an 11GB card.

pja3y ago

Yeah. Stable diffusion runs fine on my 8gb Polaris card (rx580) & I've heard of forks that will let you run it in 6 or even 4gb VRAM at a small cost in render time.

tehsauce3y ago· 2 in thread

You can try it out:

https://computerender.com.

capableweb3y ago

Care to shine some light on it? Using something like runpod/vast.ai would be my guess?

tehsauce3y ago

Hi! Your guess is correct. I monitor prices on vast.ai and runpod to get the best possible GPU power per dollar.

I updated the link you mentioned with some of this info!
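The price monitoring described above boils down to ranking rental offers by throughput per dollar. A rough sketch, assuming you already have offers in hand; the `Offer` fields and helper names are hypothetical, and nothing here calls the vast.ai or runpod APIs or reflects how computerender actually does it.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    provider: str           # marketplace name
    gpu: str                # GPU model
    images_per_hour: float  # measured or estimated throughput (assumed input)
    usd_per_hour: float     # rental price


def images_per_dollar(offer: Offer) -> float:
    """Throughput bought by one dollar of rental time."""
    return offer.images_per_hour / offer.usd_per_hour


def best_offer(offers):
    """Pick the offer with the highest throughput per dollar."""
    priced = [o for o in offers if o.usd_per_hour > 0]
    if not priced:
        raise ValueError("no offers with a positive price")
    return max(priced, key=images_per_dollar)
```

A faster card is not automatically the better deal: a slower GPU at a much lower hourly price can come out ahead on this metric, which is presumably why watching prices across marketplaces pays off.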

iFire3y ago· 2 in thread

Can you make the InvokeAI UI as easy to install as running a Windows 11 command line script?

I couldn't get it to work following https://invoke-ai.github.io/InvokeAI/installation/INSTALL_WI...

sophrocyneOP3y ago

One click install is a goal, we just need a contributor who is confident taking it on as a project.

cmdr23y ago

I'd be happy to submit a PR to your project, if you're interested in using it. I actually got it working with your project a few weeks ago, so I know it works with your repo.

I've opened a github issue as well, so we can talk there if you'd like: https://github.com/invoke-ai/InvokeAI/issues/1042

ionwake3y ago· 2 in thread

I was unable to get this to run on the Mac M1 over the last week - has anyone here had any success?

wokwokwok3y ago

Yes.

File an issue if it’s not working for you; it’s working fine for me.

(See for example https://github.com/invoke-ai/InvokeAI/issues/1021 ; if you had a previous install, delete it entirely)

ionwake3y ago

Thank you for the reply. I already did; it must be my setup.

lucasfcosta3y ago· 1 in thread

This is much needed. Even for a software engineer like me, it was quite cumbersome to use Stable Diffusion locally without such a UI.

I feel like there's just so much to improve though. Maybe SD is the definitive proof that one single feature can trickle down into many others just by adding good UI on top of it.

sophrocyneOP3y ago

Luckily, the team has a pretty jam-packed roadmap. This is v1 of the full WebUI.

hda23y ago· 1 in thread

capableweb3y ago

It ships without any filters.

paulirish3y ago· 1 in thread

PSA: You can email support@github to ask them to "detach my repo as a fork", in case the repo has matured so much it shouldn't have the "forked from …" treatment.

suyash3y ago

That's all good but it's nice to give credit where credit is due. I like how they do it in the README.

lawik3y ago

Oh, I used the dream.py script to back a Telegram bot. It later ended up in my demo for my talk Chat Bots as User Interfaces (with Elixir): https://www.youtube.com/watch?v=DFGHaER6_j4

I primarily used the InvokeAI release because I found it was easy to get going with on Linux and then it was simple enough to hack around with.

Thanks for making a useful thing of all this Stable Diffusion stuff. I've enjoyed it.

neilv3y ago

Nice! lstein is the SD fork that I ended up using, and I'm delighted to see it evolve into InvokeAI and keep getting better.

Uke3y ago

So how convincing are these solutions in the worst case, is what I am asking.

cmsj3y ago

Yay! I built an IRC bot for SD using lstein's repo because it was the first one that I could get to work reliably on M1, so I'm really glad to see the process continue really well with InvokeAI!

gernb3y ago

This is great but it requires lots of "geek" (installing dependencies, borking your system with brew, etc...)

Vs DiffusionBee which just works

https://diffusionbee.com/

Maybe the two projects can merge?
