Ponydiffusion is an excellent model for 2D content, but it seems rather inconsistent with 3D. This model is designed to produce photorealistic 3D images of a variety of subjects more consistently. The current beta version still produces a CGI-like look, which I believe is because I do not yet have enough sample images, but hopefully future versions will be more realistic. For more detailed info, I recommend checking the description of each version to see what it does and what its current drawbacks are.
Description
Improved image quality by using better captions on more images, but the output still has a CGI feel to it.