259 sample images
I personally selected the images from her Instagram and subreddit, then cropped and tagged all of them manually. They are a good mixture of personal photos, photoshoots, press photos, movie stills, etc.
Description
2 vectors. The vectors were trained individually and then merged. Training a third vector (or more) could probably improve things, but I'm pretty satisfied with the results here. The main weakness is face distortion in more distant shots, although using "restore faces" or img2img at a higher resolution can often fix any weirdness there.
I will likely work on an SD 2.1-based version soon and upload that as well.
Batch size 24, 20,000 steps for each vector, trained on a 3080 Ti.
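To put those numbers in perspective, here is a rough back-of-the-envelope calculation of how many times each sample image is seen per vector. It assumes every step draws a full batch of 24 images and that gradient accumulation is 1 (as stated in the comments); the variable names are only for illustration.

```python
# Rough estimate of dataset passes per vector.
# Assumptions: every step uses a full batch, gradient accumulation = 1.
images = 259        # sample images in the dataset
batch_size = 24     # images per training step
steps = 20_000      # training steps per vector

samples_seen = batch_size * steps   # total images processed per vector
passes = samples_seen / images      # approximate passes over the dataset

print(f"{samples_seen} samples, about {passes:.0f} passes over the dataset")
```

So each vector sees on the order of 1,850 passes over the 259 images, which is one reason the learning rate is decayed so aggressively toward the end.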
Custom learning rate schedule:
8e-3:40, 7e-3:160, 6e-3:320, 5e-3:520, 4e-3:780, 3e-3:1140, 2e-3:1720, 1e-3:2300, 9e-4:2440, 8e-4:2620, 7e-4:2820, 6e-4:3060, 5e-4:3360, 4e-4:3760, 3e-4:4320, 2e-4:5320, 1e-4:6340, 9e-5:6620, 8e-5:6960, 7e-5:7340, 6e-5:7820, 5e-5:8440, 4e-5:9260, 3e-5:10420, 2e-5:12160, 1e-5:13560, 9e-6:13860, 8e-6:14200, 7e-6:14560, 6e-6:14960, 5e-6:15400, 4e-6:15920, 3e-6:16560, 2e-6:17420, 1e-6:18100, 9e-7:18260, 8e-7:18440, 7e-7:18620, 6e-7:18840, 5e-7:19080, 4e-7:19380, 3e-7:19760, 2e-7
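For anyone who wants to inspect or plot this schedule, here is a minimal parsing sketch. It assumes the string uses the `rate:step` syntax of the AUTOMATIC1111 web UI, where each rate applies up to the listed step and a final rate with no step applies until training ends. `parse_schedule` and `rate_at` are hypothetical helper names, not part of any tool, and the exact behaviour at a boundary step is an assumption.

```python
from typing import List, Optional, Tuple

SCHEDULE = (
    "8e-3:40, 7e-3:160, 6e-3:320, 5e-3:520, 4e-3:780, 3e-3:1140, 2e-3:1720, "
    "1e-3:2300, 9e-4:2440, 8e-4:2620, 7e-4:2820, 6e-4:3060, 5e-4:3360, "
    "4e-4:3760, 3e-4:4320, 2e-4:5320, 1e-4:6340, 9e-5:6620, 8e-5:6960, "
    "7e-5:7340, 6e-5:7820, 5e-5:8440, 4e-5:9260, 3e-5:10420, 2e-5:12160, "
    "1e-5:13560, 9e-6:13860, 8e-6:14200, 7e-6:14560, 6e-6:14960, 5e-6:15400, "
    "4e-6:15920, 3e-6:16560, 2e-6:17420, 1e-6:18100, 9e-7:18260, 8e-7:18440, "
    "7e-7:18620, 6e-7:18840, 5e-7:19080, 4e-7:19380, 3e-7:19760, 2e-7"
)

Entry = Tuple[float, Optional[int]]  # (learning rate, last step it applies to)


def parse_schedule(text: str) -> List[Entry]:
    """Split 'rate:step, ..., rate' into (rate, end_step) pairs."""
    entries: List[Entry] = []
    for chunk in text.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        if ":" in chunk:
            rate, step = chunk.split(":")
            entries.append((float(rate), int(step)))
        else:
            # A bare rate has no end step: it runs until training stops.
            entries.append((float(chunk), None))
    return entries


def rate_at(entries: List[Entry], step: int) -> float:
    """Return the rate for a 1-based step (boundary handling is assumed)."""
    for rate, end_step in entries:
        if end_step is None or step <= end_step:
            return rate
    return entries[-1][0]  # past the last listed step: keep the final rate


pairs = parse_schedule(SCHEDULE)
for s in (1, 1000, 10000, 20000):
    print(s, rate_at(pairs, s))
```

Read this way, the rate falls from 8e-3 to 1e-3 over the first 2,300 steps, then drops by roughly a factor of ten every few thousand steps until it settles at 2e-7 for the last 240 steps.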
Comments (10)
Not to be rude, but I genuinely don't understand how you can manually crop and tag dozens of images of someone and then view these previews as looking like them. Would you be willing to upload your dataset somewhere, so we can take a gander and maybe contribute?
Having followed Karen Gillan since her Doctor Who days, I feel like these images look just like her. It's not 100%, and getting 100% with an embedding alone, especially one with only two vectors, is pretty much impossible. Still, it's close enough that most people seeing one of these images would easily recognize who it's supposed to be. Is there something specific about them that you find to be off? Like I mentioned, I can certainly try training another vector to see if things improve any more, but I was pretty satisfied with the results here. Of course, this is my first upload here, so I'm still pretty new at this.
I don't plan to upload my dataset, sorry. I want to avoid any potential copyright issues. If you take a look at her Instagram, that's where the majority of the images I gathered came from.
You made this the way I make a sandwich when I'm too lazy to make a good-looking one: I just put ketchup between 2 bread slices and eat it.
What do you mean? I put a lot of effort into cropping and tagging the sample images, and coming up with a learning rate schedule and workflow I was happy with. I think the results are pretty good. Do you disagree?
I still haven't tested your embed, but I'm really curious about your process. How do you manage a batch size of 24? I get a CUDA error with anything greater than 4... Is there a way to override this limitation? My graphics card is an RTX 3080 (12 GB). And do you leave the gradient accumulation steps at 1?
Yes, gradient accumulation is at 1. I'm using xformers with "Use cross attention optimizations while training" enabled. Training at 512x512.
On SD 2.1 at 768x768, I can use a batch size of 10.
@fudefrak Thanks! One more thing to try next time. :)
I think that even without the embedding, Deliberate responds pretty well to "karen gillan".
I haven't tried Deliberate, but when I typed Karen Gillan into the base model, I got someone who looked like a very different person, with only vague similarities.
I think it looks alright; not sure what the complaint is.