Dwayne Johnson aka The Rock FLUX Dev Fine-Tuning / DreamBooth Model for Educational and Research Purposes - Dwayne Johnson aka The Rock FLUX Dev LoRA Model for Educational and Research Purposes - Full Tutorial

Dwayne Johnson aka The Rock FLUX Dev Fine-Tuning / DreamBooth Model for Educational and Research Purposes - Dwayne Johnson aka The Rock FLUX Dev LoRA Model for Educational and Research Purposes - Full Tutorial - FP16 Version

I am sharing how I trained this model in full detail, including the dataset. Please read the entire post very carefully.

This model was trained purely for educational and research purposes, for SFW and ethical image generation only.

The workflow and the config used in this tutorial can be used to train clothing, items, animals, pets, objects, styles, or practically anything else.

The uploaded images contain SwarmUI metadata and can be regenerated exactly. The FP16 model was used for the generations, but FP8 should yield almost the same quality. Keep in mind that the prompts use a YOLO face masking model.

How To Use

Download the model into the diffusion_models folder of SwarmUI. You also need the Clip-L and T5-XXL models; I recommend the T5-XXL FP16 or Scaled FP8 version.

The newest fully public tutorial on how to use it is here:

I have trained both a FLUX LoRA model and a Fine-Tuning / DreamBooth model.

Activation token / trigger word : ohwx man

Each training ran for up to 200 epochs. A checkpoint was saved every 10 epochs and shared in the Hugging Face repo below: https://huggingface.co/MonsterMMORPG/Model_Training_Experiments_As_A_Baseline

This model contains experimental results comparing Fine-Tuning / DreamBooth and LoRA training approaches.

Additional Resources

Installers and Config Files : https://www.patreon.com/posts/112099700
FLUX Fine-Tuning / DreamBooth Zero-to-Hero Tutorial : https://youtu.be/FvpWy1x5etM
FLUX LoRA Training Zero-to-Hero Tutorial : https://youtu.be/nySGu12Y05k
Complete Dataset, Training Config Json Files and Testing Prompts : https://www.patreon.com/posts/114972274
Use the link below to download all trained LoRA and Fine-Tuning / DreamBooth checkpoints for free:
https://huggingface.co/MonsterMMORPG/Model_Training_Experiments_As_A_Baseline/tree/main

Environment Setup

Kohya GUI Version: 021c6f5ae3055320a56967284e759620c349aa56
Torch: 2.5.1
xFormers: 0.0.28.post3

Dataset Information

Resolution: 1024x1024
Dataset Size: 28 images
Captions: "ohwx man" (nothing else)
Activation Token/Trigger Word: "ohwx man"
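
A minimal sketch of how a caption set like this could be prepared, assuming the trainer reads a sidecar .txt caption file next to each image. Whether the author used caption files or another mechanism is not stated here; the helper name write_captions, the extension list, and the folder layout are hypothetical.

```python
from pathlib import Path

# Assumption: these are the image types present in the dataset folder.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def write_captions(dataset_dir: str, caption: str = "ohwx man") -> int:
    """Write one sidecar .txt caption per image and return how many were written."""
    written = 0
    # sorted() materializes the listing first, so new .txt files are not re-visited.
    for image_path in sorted(Path(dataset_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip anything that is not an image
        image_path.with_suffix(".txt").write_text(caption, encoding="utf-8")
        written += 1
    return written
```

For the dataset described above, a call such as write_captions("path/to/dataset") would be expected to return 28, with every caption file containing only the trigger phrase.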

Fine-Tuning / DreamBooth Experiment

Configuration

Config File: 48GB_GPU_28200MB_6.4_second_it_Tier_1.json
Training: Up to 200 epochs with consistent config
Optimal Result: Epoch 170 (subjective assessment)

Results

LoRA Experiment

Configuration

Config File: Rank_1_29500MB_8_85_Second_IT.json
Training: Up to 200 epochs
Optimal Result: Epoch 160 (subjective assessment)
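
A rough sketch of the training time implied by the two config file names, assuming that "6.4_second_it" and "8_85_Second_IT" mean 6.4 and 8.85 seconds per iteration on the author's hardware, and using the 28 images and batch size 1 listed on this page. The function names are hypothetical, and actual speed depends on the GPU.

```python
import math

# Assumption: seconds per iteration as read from the config file names.
SECONDS_PER_STEP = {
    "Fine-Tuning / DreamBooth": 6.4,
    "LoRA": 8.85,
}


def total_steps(images: int, epochs: int, batch_size: int = 1) -> int:
    """Steps for a run: batches per epoch times number of epochs."""
    return math.ceil(images / batch_size) * epochs


def estimated_hours(seconds_per_step: float, steps: int) -> float:
    """Convert a per-step duration into total hours."""
    return seconds_per_step * steps / 3600


if __name__ == "__main__":
    steps = total_steps(images=28, epochs=200, batch_size=1)
    for name, speed in SECONDS_PER_STEP.items():
        print(f"{name}: {steps} steps, about {estimated_hours(speed, steps):.1f} hours")
```

Under these assumptions a full 200-epoch run is 5,600 steps, which works out to roughly 10 hours for Fine-Tuning / DreamBooth and roughly 13.8 hours for LoRA.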

Results

Comparison Results

LoRA 90 vs 160 vs Fine-Tuning 170 Comparison

Key Observations

LoRA demonstrates excellent realism but shows more obvious overfitting when generating stylized images.
Fine-Tuning / DreamBooth is better than LoRA as expected.

Model Naming Convention

Fine-Tuning Models

Dwayne_Johnson_FLUX_Fine_Tuning-000010.safetensors
- 10 epochs
- 280 steps (28 images × 10 epochs)
- Batch size: 1
- Resolution: 1024x1024
Dwayne_Johnson_FLUX_Fine_Tuning-000020.safetensors
- 20 epochs
- 560 steps (28 images × 20 epochs)
- Batch size: 1
- Resolution: 1024x1024

LoRA Models

Dwayne_Johnson_FLUX_LoRA-000010.safetensors
- 10 epochs
- 280 steps (28 images × 10 epochs)
- Batch size: 1
- Resolution: 1024x1024
Dwayne_Johnson_FLUX_LoRA-000020.safetensors
- 20 epochs
- 560 steps (28 images × 20 epochs)
- Batch size: 1
- Resolution: 1024x1024
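
The naming convention above can be decoded programmatically. The sketch below assumes that the six-digit suffix is always the epoch number and that the dataset size (28 images) and batch size (1) match the values listed on this page; the helper name checkpoint_info is hypothetical.

```python
import math
import re

# Matches the zero-padded epoch suffix, e.g. "-000170.safetensors".
CHECKPOINT_PATTERN = re.compile(r"-(\d{6})\.safetensors$")


def checkpoint_info(filename: str, images: int = 28, batch_size: int = 1) -> dict:
    """Return the epoch count and step count encoded in a checkpoint file name."""
    match = CHECKPOINT_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"no epoch suffix found in {filename!r}")
    epochs = int(match.group(1))
    steps = math.ceil(images / batch_size) * epochs
    return {"epochs": epochs, "steps": steps}


if __name__ == "__main__":
    print(checkpoint_info("Dwayne_Johnson_FLUX_Fine_Tuning-000010.safetensors"))
    print(checkpoint_info("Dwayne_Johnson_FLUX_LoRA-000020.safetensors"))
```

The same rule gives 4,760 steps for the epoch 170 Fine-Tuning checkpoint and 4,480 steps for the epoch 160 LoRA checkpoint.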

Description

For Full Details, Training Dataset, Tutorial, Guide, Configs, Training Json Files, Workflows, Installers, Resources and All Checkpoints > https://huggingface.co/MonsterMMORPG/Model_Training_Experiments_As_A_Baseline

FAQ

Comments (26)

9ball · Nov 2, 2024 · 4 reactions

CivitAI

22GB for The Rock? Yea..

SECourses (Author) · Nov 2, 2024

An FP8 version also exists, but sadly I couldn't find out how to make it the default. I am asking the CivitAI team.

Triple_Headed_Monkey · Nov 3, 2024

@SECourses Most creators would extract a lora from the dreambooth model and upload this instead.

SECourses (Author) · Nov 3, 2024 · 1 reaction

@Triple_Headed_Monkey I will post that too. I trained LoRA models as well and I will hopefully publish them all: the LoRA and the LoRA extraction.

SECourses (Author) · Nov 2, 2024 · 1 reaction

CivitAI

The FP8 model is also there. Sadly it is not set as the default, and I am asking the CivitAI team how to make it the default.

Triple_Headed_Monkey · Nov 3, 2024 · 1 reaction

Yeah, no. Apparently the site still doesn't allow for you to choose the order of files uploaded on a single model page without uploading as an entirely new version.

SECourses (Author) · Nov 3, 2024

@Triple_Headed_Monkey yes, sadly it works that way. I added them as separate models for now.

eurotaku · Nov 3, 2024

@Triple_Headed_Monkey @SECourses working as intended. All community members can choose their favourite precision in their account settings, so someone preferring fp16 will see that as the default, and someone with fp8 set as default will be shown that one first instead. So you can put both checkpoints into one version, and each visitor will see what they prefer. :)

SECourses (Author) · Nov 4, 2024

@eurotaku but it is set to fp16 by default, and a private window shows that too. I think the user should be able to override the default behavior.

Triple_Headed_Monkey · Nov 5, 2024

@eurotaku Yes it is totally working as intended when you choose to upload a CLIP model and it shows up as the default because it is the higher precision model :D And not to mention when you try and change the precision there is no value lower than FP8 and setting it the same it will just tell you "there is already a model of this type uploaded"

HavoFX · Nov 3, 2024 · 2 reactions

CivitAI

Keep up the good work!

SECourses (Author) · Nov 3, 2024

Thanks a lot for the comment.

sevenof9247 · Nov 3, 2024 · 1 reaction

CivitAI

A question about LoRA training: have you found out whether it is better to remove the background or to describe it, e.g. for portraits?

SECourses (Author) · Nov 3, 2024

Well, I tested full captions, and they reduce training accuracy. To be fair, though, I didn't test captions that describe only the background.

ranjeet3939 · Nov 3, 2024 · 1 reaction

CivitAI

I am getting error: [ComfyUI-0/STDERR] ValueError: Model face_yolov9c.pt not found, or yolov8 folder path not defined

SECourses (Author) · Nov 3, 2024

True. It works in SwarmUI: https://huggingface.co/Bingsu/adetailer/blob/main/face_yolov9c.pt

ranjeet3939 · Nov 4, 2024 · 1 reaction

@SECourses thanks, where should I put this file in SwarmUI?

SECourses (Author) · Nov 4, 2024 · 1 reaction

@ranjeet3939 make a folder named yolov8 inside the models folder and put the file there.
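
The placement described in this reply can be sketched as follows. This assumes face_yolov9c.pt has already been downloaded and that you know the path of your SwarmUI models folder; the helper name install_yolo_model and both example paths are hypothetical and should be adjusted to your install.

```python
import shutil
from pathlib import Path


def install_yolo_model(downloaded_file: str, models_dir: str) -> Path:
    """Copy a downloaded YOLO model into a yolov8 subfolder of the models folder."""
    target_dir = Path(models_dir) / "yolov8"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    target = target_dir / Path(downloaded_file).name
    shutil.copy2(downloaded_file, target)
    return target


# Example call (paths are placeholders, not taken from the thread):
# install_yolo_model("Downloads/face_yolov9c.pt", "SwarmUI/Models")
```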

ranjeet3939 · Nov 5, 2024

@SECourses thanks champion. One more thing: in the article, could you please list the settings that need to be selected during image generation in SwarmUI, for example sampler, clips, flux guidance, etc.? I am trying to replicate your images and I am a huge fan of yours :)

SECourses (Author) · Nov 5, 2024

@ranjeet3939 please watch this tutorial, which was recorded just recently and shows everything: https://youtu.be/-zOKhoO9a5s

SECourses (Author) · Nov 5, 2024

CivitAI

For Full Details, Training Dataset, Tutorial, Guide, Configs, Training Json Files, Workflows, Installers, Resources > https://huggingface.co/MonsterMMORPG/Model_Training_Experiments_As_A_Baseline

Triple_Headed_Monkey · Nov 5, 2024 · 2 reactions

CivitAI

I'm going to be kind here. Flux training is case sensitive. So while it is cool that you managed to show that Flux is easier than other models to train in a way that contradicts how it expects to be trained, if you were to repeat this experiment you should at least use the tags/captions like so:

Owhx Man

This should reduce the amount of time it takes and the rank/dim needed to achieve decent results.

SECourses (Author) · Nov 5, 2024

It is true that it is case sensitive, but we still don't have the full tokenizer. Have you found one? I found the T5 tokenizer and it had so few words.

Triple_Headed_Monkey · Nov 6, 2024

@SECourses I've not seen a decent one around off the top of my head either.

T5 is basically a mini LLM. I'm fairly certain that CLIP is still handling the majority of the heavy lifting all round including the tokenization process. I've been trying to work it out for a little while but I think the process is something like:

Text input > T5 breaks it down contextually based on the sentence structure and attempts to feed it to the CLIP tokenizer without giving it room to mistake intent > CLIP converts tokens into a vector with spatial/visual information > Transformer uses these vectors to generate an image.

There are a couple of other possible configurations, but the simple takeaway from all of the different variations was that T5 is not capable of interpreting or producing the output necessary to caption or generate imagery. It is inherently fully text and token based. Therefore its contributions to the process must also be restricted to the domain of text and/or improving the tokenization process with semantic context.

By itself T5 is basically an autofill model, which would generate text based on simple inputs/parameters: things like finishing a sentence you've started writing, or responding to simple questions relating to additional information being provided to it, for example when interrogating an image using a Vision adapter model or something like BLIP.

In other words I'm not sure it really matters much in this case. CLIP having been created originally for the purpose of captioning and sorting images into categories was accidentally found to have the ability, when used in reverse, to generate the image information it was trained to caption.

So instead of converting images to vectors and plotting them to category tags, it became possible to input category tags and it would output vectors.

The transformer is then trained on top of the vector inputs with the same, or similar enough, datasets that were used to train the CLIP models, which bridges the gap between the Transformer's ability to generate/manipulate pixels and CLIP's output of vector information.

In the case of FLUX specifically, the architecture seems to include a secondary CLIP model inside the transformer model layers themselves, which handles the translation between the training done on the transformer and the untrained CLIP and T5 models, and allows results to be closer to what you would get if you had trained them in conjunction.

This is my take on it anyway.

wikeeyang · Nov 14, 2024 · 1 reaction

CivitAI

A very good teacher! I learned a lot from you, especially about the De-distill model, thank you a lot.

SECourses (Author) · Nov 17, 2024

@wikeeyang thanks a lot

Checkpoint

Flux.1 D

by SECourses

celebrity

Details

Platform: CivitAI
Platform Status: Deleted
Created: 11/2/2024
Updated: 5/7/2026
Deleted: 5/23/2025
Trigger Words: ohwx man

Files

dwayneJohnsonAkaTheRockFLUXDevFineTuning_fp16Version.safetensors

Size: 22.17 GB
SHA256: f8c7fece9293b028090cd1d4af86f8f7593e0e446bd7cff602c63e0d6255dbbb
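
A downloaded copy of the file can be checked against the SHA256 value listed above. The sketch below uses only the Python standard library; the function names and the example path are hypothetical, and the file is read in chunks because it is about 22 GB.

```python
import hashlib

# SHA256 as listed on this page for the FP16 checkpoint.
EXPECTED_SHA256 = "f8c7fece9293b028090cd1d4af86f8f7593e0e446bd7cff602c63e0d6255dbbb"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so it never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path).lower() == expected.lower()


# Example call (the path is a placeholder):
# matches_expected("dwayneJohnsonAkaTheRockFLUXDevFineTuning_fp16Version.safetensors")
```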

Mirrors

HuggingFace (3 mirrors)

dwayneJohnsonAkaTheRockFLUXDev_fp16Version.safetensors

Dwayne_Johnson_FLUX_Fine_Tuning-000170.safetensors

CivitAI (1 mirrors)

dwayneJohnsonAkaTheRockFLUXDevFineTuning_fp16Version.safetensors

Available On (1 platform)

Same model published on other platforms. May have additional downloads or version variants.

SeaArt
