For practical context, Stable Diffusion 2.X was trained on LAION-5B as opposed to LAION-400M for Stable Diffusion 1.X.
At the least, Stable Diffusion 2.X is better at pain points of image generation such as text legibility and hands, potentially due to having more data points.
But that's the same for everything that has structure. A small section of an arm is much more likely to have another small section of an arm next to it than to have a hand, yet SD's arms are usually well-proportioned.
SD 2 also removed a lot of images of humans from its training data out of fear of people generating CSAM, so quality has actually gotten worse than SD 1 for anything resembling humans.
Double-checked, and both the initial comment and the correction are incorrect: the original v1.1 was trained on LAION-2B, then subsequent versions were finetuned on the aesthetics subset.
Either way, the main point is the same: more training data gives better results.