    DPO (Direct Preference Optimization) LoRA for XL and 1.5 - OpenRail++ - SDXL - V1.0

    What is DPO?

DPO stands for Direct Preference Optimization, the process by which a diffusion model is fine-tuned on pairs of images chosen by humans. Meihua Dang et al. trained Stable Diffusion 1.5 and Stable Diffusion XL with this method using the Pick-a-Pic v2 dataset, which can be found at https://huggingface.co/datasets/yuvalkirstain/pickapic_v2, and wrote a paper about it at https://huggingface.co/papers/2311.12908.
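
For background, the objective from the original DPO paper (Rafailov et al., 2023), which the Diffusion-DPO work adapts to the denoising setting, looks roughly like this. The notation below is the generic preference-pair formulation, not the exact loss used to train these checkpoints:

\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]

where y_w and y_l are the preferred and rejected samples for prompt x, \pi_{\mathrm{ref}} is the frozen reference (base) model, and \beta controls how far the fine-tuned model may drift from it.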

    What does it Do?

The DPO-trained models have been observed to produce higher-quality images than their untuned counterparts, most notably in how closely they adhere to your prompt. These LoRA can bring that prompt adherence to other fine-tuned Stable Diffusion models.
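
As a rough illustration (not part of the original upload notes), here is how one might load one of these LoRA on top of an SDXL checkpoint with the diffusers library. The weight filename below is a guess and should be checked against the files actually listed in the repository:

import torch
from diffusers import StableDiffusionXLPipeline

# Any SDXL checkpoint should work; the DPO LoRA is layered on top of it.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Repository from the description; weight_name is an assumption, check the
# HuggingFace page for the real filename.
pipe.load_lora_weights(
    "benjamin-paine/sd-dpo-offsets",
    weight_name="sd_xl_dpo_offset.safetensors",
)

image = pipe(
    "a corgi wearing a top hat, studio photograph, sharp focus",
    num_inference_steps=30,
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength, adjust to taste
).images[0]
image.save("dpo-test.png")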

    Who Trained This?

These LoRA are based on the work of Meihua Dang (https://huggingface.co/mhdang) at

    https://huggingface.co/mhdang/dpo-sdxl-text2image-v1 and https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1, licensed under OpenRail++.

    How were these LoRA Made?

    They were created using Kohya SS by extracting them from other OpenRail++ licensed checkpoints on CivitAI and HuggingFace.

    1.5: https://civarchive.com/models/240850/sd15-direct-preference-optimization-dpo extracted from https://huggingface.co/fp16-guy/Stable-Diffusion-v1-5_fp16_cleaned/blob/main/sd_1.5.safetensors.

XL: https://civarchive.com/models/238319/sd-xl-dpo-finetune-direct-preference-optimization extracted from https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0_0.9vae.safetensors.

    These are also hosted on HuggingFace at https://huggingface.co/benjamin-paine/sd-dpo-offsets/
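
For anyone curious what "extracting" means here: conceptually, the extraction takes the weight difference between the DPO fine-tune and its base checkpoint and compresses it into a low-rank pair of matrices, which is what a LoRA stores. The sketch below shows that idea for a single layer using SVD; it is a simplification with made-up names, not the actual Kohya SS script:

import torch

def extract_lora_pair(w_tuned: torch.Tensor, w_base: torch.Tensor, rank: int = 64):
    """Approximate (w_tuned - w_base) as lora_up @ lora_down with the given rank."""
    delta = (w_tuned - w_base).float()                # the change the LoRA should reproduce
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]    # keep the strongest components
    sqrt_s = torch.sqrt(s)
    lora_up = u * sqrt_s                              # shape (out_features, rank)
    lora_down = sqrt_s.unsqueeze(1) * vh              # shape (rank, in_features)
    return lora_up, lora_down

# At inference time, w_base + lora_up @ lora_down ≈ w_tuned for that layer,
# which is why the same "DPO offset" can be applied on top of other checkpoints.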

    Comments (28)

wes · Dec 25, 2023 · 25 reactions
    CivitAI

    Sorry, I'm too lazy to look up external links. Can you please write one sentence that explains what this actually does?

    Edit: OK, I ended up reading the external link. Maybe a before and after comparison image would be helpful to demonstrate the impact it has and how it improves the image to match human preferences.

456477696e581 · Dec 25, 2023 · 5 reactions

    I get the feeling it's one of those things where "If you have to ask, you're not nerdy enough to use it."

    enfugue
    Author
Dec 25, 2023 · 8 reactions

Hello, thank you for the suggestions, and my apologies for the confusion. I've edited in a blurb with some details about the model and its intended effects. The authors provided this comparison, which isn't great but is a start: https://huggingface.co/mhdang/dpo-sdxl-text2image-v1/blob/main/01.gif. I'm working on better comparisons as we speak; I'll post some as soon as they're done.

wes · Dec 26, 2023

    Thanks, I appreciate it! Merry Christmas!

klotz · Dec 25, 2023
    CivitAI

    So, I guess this is some kind of detail enhancer.

shapeshifter83 · Dec 27, 2023 · 2 reactions

No, it mainly helps with prompt accuracy.

amazingbeauty · Dec 26, 2023 · 4 reactions
    CivitAI

Basically, looking at this page, the one thing that comes to mind is that this generates only photos of cats and dogs.

Dam* lack of samples and clear explanations.

    enfugue
    Author
Dec 26, 2023 · 1 reaction

Hello, I have added more examples with varying kinds of prompts for both LoRA. The effects of the training are difficult to summarize, and I myself haven't explored the edges of the model or the LoRA yet (I did not train it, I just extracted the LoRA).

Rather than targeting a specific style or intention, the model was trained on 850,000 "A vs. B" image pairs that were chosen by humans. So all we can really say for sure is that the training made the images "more aligned with human preferences." What that means in practice will only really bear out with time.

1704178 · Dec 27, 2023 · 8 reactions
    CivitAI

    What settings do you recommend for using this Lora?

ReLeVaNCeAI · Dec 28, 2023 · 4 reactions
    CivitAI

    It's really good so far actually!

parallelepipedon · Dec 30, 2023 · 6 reactions
    CivitAI

    "finetuned based on human-chosen images" Does this imply that other finetunes were trained with images chosen by... animals? ;> At random?

LAION's Common Crawl-derived data could be considered non-human-chosen. But many finetuners claim they chose only the best images, which I highly doubt, particularly for datasets with hundreds of thousands of them. One early model, I used to joke, might've been trained on images of mail-order brides. ;> Though those could still be considered human-chosen.

    enfugue
    Author
Dec 30, 2023 · 4 reactions

Most datasets are curated by algorithms, not people (or animals, or randomness). You can argue that the algorithms were written by people, but that discussion seems unproductive; the point of Pick-a-Pic is that it's a massive dataset that is 100% human-curated, and that is a rarity.

Erilaz · Jan 4, 2024 · 4 reactions

I think both of you are missing the point. DPO can be automated; it doesn't have to rely on human input. Likewise, a dataset isn't necessarily cherry-picked by hand. DPO optimizes the PREFERENCE, because it trains the model with differential data, much like RLHF. The difference is, RLHF uses a reward model trained to prefer the same things the authors prefer, while DPO doesn't use one; it uses the trained model itself. I don't know how Diffusion-DPO works in image generation models, but in LLMs it relies on token probabilities - the higher the probability of the desired output, the higher the reward. In layman's terms, of course. Hence DIRECT preference optimization. But the source of preference is arbitrary. Preference is a bias, and bias is arbitrary more often than not. We can use DPO to appeal to humans and use human data. Or we can use DPO to make a model work similarly to another model, like Midjourney, another diffusion algorithm. The source of preference is irrelevant in this case.

parallelepipedon · Jan 21, 2024

    @Erilaz Interesting. Thanks for your reply.

Flexability · Jan 1, 2024
    CivitAI

    Great work! Really excellent results

155956 · Jan 8, 2024 · 73 reactions
    CivitAI

For all of the people who are confused about what this is: I'm no expert or anything, but the LoRA is mainly meant to be baked into other models, and its effect is that it makes Stable Diffusion take your prompt more seriously. The LoRA is mainly meant for model trainers, and it will help with using more natural language in your prompts instead of a million keywords.

Hope this comment was useful. Again, I'm not an expert, but people seem to be very confused about this, and I don't want that to take away from how cool this really is!
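
(For what it's worth, "baking in" can also be done after the fact with the diffusers library. A rough sketch, with the checkpoint path and weight filename as placeholders rather than anything confirmed in this thread:)

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "path/to/your-finetuned-sdxl-model",              # placeholder checkpoint
    torch_dtype=torch.float16,
)
pipe.load_lora_weights(
    "benjamin-paine/sd-dpo-offsets",
    weight_name="sd_xl_dpo_offset.safetensors",       # filename is an assumption
)
pipe.fuse_lora(lora_scale=0.8)                        # merge the LoRA into the weights
pipe.save_pretrained("sdxl-with-dpo-baked-in")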

Suzanne · Mar 19, 2024
    CivitAI

Does it work with Pony V6?

GogetaSSGSS3 · Mar 21, 2024 · 4 reactions

I've seen some people talk about this; they use this LoRA with AutismMix, which is a model based on Pony V6, so I think it does work. I'm not 100% sure, though.

Suzanne · Mar 21, 2024

    @GogetaSSGSS3 ok, thanks

boolosoi · Mar 23, 2024 · 3 reactions

    Works

Suzanne · Mar 23, 2024

    @boolosoi thank you

Enigmata · Mar 24, 2024 · 2 reactions
    CivitAI

Unfortunately, I don't see a significant difference. I tested a simple sentence, and the model does a good job both with DPO and without it - https://civitai.com/images/8497276

ReyArtAge · Apr 6, 2024 · 2 reactions
    CivitAI

Thanks very much, it makes my SDXL model listen to prompts better.

satangel · Apr 29, 2024 · 15 reactions
    CivitAI

May I know what the best setting is for the LoRA strength?

thebrownsauce184 · Jul 5, 2024 · 6 reactions
    CivitAI

    Are we supposed to use this as a LoRA in our image generations to make the images better? Or is this just for creators to put in their models?

Trixies · Jul 5, 2024 · 7 reactions

I have tested with and without. For me, I find it follows my prompts better, as well as removing those unsightly bumps on bodies, creating a smoother look. But, then again, it could just be me.

thebrownsauce184 · Jul 5, 2024 · 2 reactions

    @Trixies Sweet! tnx

LORAfiend69 · Dec 28, 2024 · 1 reaction
    CivitAI

    Brilliant.