NOTE: This model has its own VAE, which is baked into the model. For best results, please ensure that the selected VAE in Automatic1111 is set to "Automatic". If you've never poked around in the VAE settings, this will be the default.
NextPhoto is the result of a whole lot of training, data curation, and block merging. The model is designed exclusively for generating photorealistic images, and as such it cannot generate non-photo images (even if prompted to do so). For more details about version 3.0, check out the "About this version" section.
All sample images were generated using the ESRGAN_4x upscaling model at 2x upscaling, with 0.45 denoising strength. I'm not gonna upload a 32-bit model, as the v3 model was trained using 16-bit precision, so it would literally just be a waste of space.
Usage Guide
Negative Prompt (highly recommended): The negative prompt is quite important for photorealism, but you don't really ever have to change it to get great results. I'd recommend the following negative prompt as a base: (worst quality:0.8), cartoon, halftone print, burlap, (cinematic:1.2), (verybadimagenegative_v1.3:0.3), (surreal:0.8), (modernism:0.8), (art deco:0.8), (art nouveau:0.8)
This prompt uses the verybadimagenegative_v1.3 textual inversion embedding. You'll need to download it first. Place the downloaded file into the "embeddings" folder of the SD WebUI root directory, then restart Stable Diffusion.
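As a rough sketch of the install step above, the copy can be scripted. The function name, the file extension, and both paths in the usage comment are hypothetical placeholders, not values from this guide; only the "embeddings" folder name comes from the text.

```python
import shutil
from pathlib import Path


def install_embedding(embedding_file: Path, webui_root: Path) -> Path:
    """Copy a textual inversion file into <webui_root>/embeddings."""
    target_dir = webui_root / "embeddings"
    # Create the folder if it is missing (a fresh install normally has it).
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / embedding_file.name
    shutil.copy2(embedding_file, destination)
    return destination


# Hypothetical usage; adjust both paths to your own setup:
# install_embedding(
#     Path("~/Downloads/verybadimagenegative_v1.3.pt").expanduser(),
#     Path("~/stable-diffusion-webui").expanduser(),
# )
```

The WebUI still needs a restart afterwards, as described above.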
Positive Prompts: You don't need to think about the positive prompt a whole ton - the model works quite well with simple positive prompts.
Examples:
A well-lit photograph of woman at the train station
A perfect well-lit medium photograph of an old married couple sitting on their porch
A poorly lit photograph of a man walking on the trail at night
For more examples of positive prompts you can look at the sample photos for the model.
Upscaling: This model will still generate photorealistic images without upscaling, but upscaling is strongly recommended for photorealism. You'll need to use the ESRGAN_4x upscaling model (not R-ESRGAN) in the hires fix section for decent results. Set the denoising strength anywhere from 0.3 to 0.5 for best results, and the upscale amount to 2. I normally set my denoising strength to 0.5 or 0.45.
Sampler: I use DPM++ 2M Karras, and generally don't stray from it. While the other samplers can still produce good results, DPM++ 2M Karras is the most consistent in my experience with this model.
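The settings above can be collected in one place. The following is a minimal sketch, assuming the WebUI is started with its `--api` flag and that you post the result to its `/sdapi/v1/txt2img` endpoint. The function name is hypothetical, and newer WebUI versions may expect the sampler and the scheduler as separate fields, so treat the field values as a starting point rather than a guaranteed recipe.

```python
import json

# Base negative prompt recommended in this guide.
NEGATIVE_PROMPT = (
    "(worst quality:0.8), cartoon, halftone print, burlap, (cinematic:1.2), "
    "(verybadimagenegative_v1.3:0.3), (surreal:0.8), (modernism:0.8), "
    "(art deco:0.8), (art nouveau:0.8)"
)


def build_txt2img_payload(prompt, cfg_scale=7.0, denoising_strength=0.45):
    """Build a txt2img request body from the settings described above."""
    return {
        "prompt": prompt,
        "negative_prompt": NEGATIVE_PROMPT,
        "sampler_name": "DPM++ 2M Karras",
        "cfg_scale": cfg_scale,
        # Hires fix: ESRGAN_4x at 2x, denoising strength in the 0.3-0.5 range.
        "enable_hr": True,
        "hr_upscaler": "ESRGAN_4x",
        "hr_scale": 2,
        "denoising_strength": denoising_strength,
    }


payload = build_txt2img_payload(
    "A well-lit photograph of woman at the train station"
)
print(json.dumps(payload, indent=2))
```

The block only builds and prints the request body; sending it is left out so that no host, port, or HTTP library has to be assumed.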
For further improvements:
Reduce your CFG scale: The default classifier-free guidance (CFG) scale of 7 works well, but occasionally this can be too high. Reduce the CFG scale until you like the results - I generally bottom out at 4.0, as with anything lower than that the negative prompt starts getting ignored. Increasing the CFG scale past 7 or 8 will result in more "dramatized" photos (not in a good way), but will also result in the model listening more to the prompts, so balance as needed. High CFG scales can work well for specific situations, but lower CFG scales work great quite consistently.
Avoid excess LoRA and Textual Inversion use: As v2 and v3 of this model are custom trained and not purely block merged, LoRAs and Textual Inversions may not work as well as they do in other models. In my experience, you can still get good results with them, but I'd recommend treading lightly: take an additive approach where you add LoRAs or inversions selectively when needed.
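To see why the CFG scale and the negative prompt interact, here is a small sketch of the standard classifier-free guidance formula. It assumes, as the WebUI does, that the negative prompt takes the place of the empty "unconditional" prompt. The function name and the toy numbers are illustrative only, and real samplers apply this to full noise tensors at every step.

```python
def apply_cfg(noise_positive, noise_negative, cfg_scale):
    """Combine two noise predictions with classifier-free guidance.

    guided = negative + cfg_scale * (positive - negative)
    """
    return [
        neg + cfg_scale * (pos - neg)
        for pos, neg in zip(noise_positive, noise_negative)
    ]


positive = [1.0, 2.0]  # toy prediction conditioned on the positive prompt
negative = [0.0, 1.0]  # toy prediction conditioned on the negative prompt

print(apply_cfg(positive, negative, 1.0))  # [1.0, 2.0]
print(apply_cfg(positive, negative, 7.0))  # [7.0, 8.0]
```

At a scale of 1 the negative term cancels out entirely and only the positive prediction remains, which is consistent with the negative prompt fading as the scale drops. Larger scales push the result further away from the negative prediction, which lines up with the stronger, more "dramatized" look at high values.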
Description
Initial Release
Comments (22)
This is a beautiful model. When used right, it creates very amateur-looking photos, which is great, as realistic models usually look too "perfect". Great model, and I hope you keep it updated, as I love it.
Yep, it has a very natural touch, less filmic than most of the Sony directions.
Many don't like the too-high chroma outputs as well; you can often turn them down on any model with a lower CFG scale, though.
Overdone contrast can also heighten the effect of perceptual artificialness.
Some very old models come closer to simulating real eye perception than an artistic camera lens, so I highly advise a look at older models.
Also, you get the impression that in many models there is no real hair color anymore; every hair color starts to look unnaturally fake in its light response.
That's why I tend to call most photo models "Photo Artistic" by now rather than "Photo Real".
Someone should generate some Car examples
As requested, I added a post with some car generations
Be the change you want to see
@Stagnation
What, stop reading my mind! Whoa, that was poetic.
It's a very honest model; I really like its visual perceptual space.
PS: Stagnation, I already see the change in (and by) many, and it's beautiful.
I just hope it isn't too late.
By far the best model I have used. It can do lots of nice things without it looking fake.
Tremendous. I really appreciate your work.
The best model I have seen so far with Degenerate_Realism, so glad that I have found this gem. Also, this one can do NSFW/porn stuff very accurately, xxxx times better than URPM and consorts.
Do you have an unpruned version? It gives much better results with multidiffusion upscaling.
I don't presently have one unfortunately as this was created as a merge of pruned models. I could apply the weights of this model against an unpruned model though. I'd want to find a full unpruned f32 model+ema weights that has no weird license on it to apply the weights against prior to publishing though - you wouldn't happen to know of one would you?
@bigbeanboiler would you be able to share your block weight recipe? V1 really is a great model!
@MixerPerson If I still had it around I would, but I don't unfortunately. Generally speaking though, the key was to use my other model - Flat2D-animerge - as the base for everything except the attention layers. The attention layers (specifically, the down/up blocks that include self attention and cross attention) came from merging a variety of different photoreal models.
Best model so far, when compared to RV 2.0. Thank you.
Thank you, best realistic photograph model. What VAE do you use in your gens?
I use vae-ft-mse-840000-ema-pruned for all my generations (the OG stable diffusion one)
The best realistic model: good for portraits, good hands... and everything you want. Well done!
This model gives much higher quality and more realistic results than the realisticVisionV20_v20NoVAE, rundiffusionFX_v10, and amIReal_V3 models. Thanks, this is the best model for photorealism.
"CFG Scale" stands for "Context Free Guidance Scale", not the shortened form of "configuration scale". Your instructions still mostly make sense but could be more clear.
Oh huh, that makes a lot more sense than configuration. Thank you, I'll update the description
Classifier Free Guidance
The best so far. Basic skin, hair, and eye details are insane. No hires required. No "bestqual... 8k... unity... award winning..." etc. required. If your semi-real model needs something realistic, merge NextPhoto into it and you can see the wonder without much distortion. Love it ✨
P.S. It also works with faceLora; just use it with a slightly high weight.