    saoirse's character workflow for z-image turbo - v1.0

    welcome to my workflow! version 1.95!

    first of all i want to thank everyone who has appreciated my images, which were generated using some version of this workflow, and everyone who has downloaded the workflow. please leave a comment if you have any questions, i am happy to help! secondly i want to apologize for the noodles. there are so many of them. you will probably want to hide the noodles...

    if you have any trouble with finding or placing some of the weirder nodes and files, or you're getting errors that are super unhelpful, please please let me know! i am happy to assist if i can.

    --sft

    version 1.95 - i've integrated the padding option into the load image node and exposed the qwenvl prompt setup options and more of the seedvr options.

    version 1.9 - i've removed the flux guidance setting because it really doesn't do anything with any of the models i tested. but in exchange i've added an option to pad (extend) a loaded image to move a pose figure around the frame. instructions are in area 4! i have upgraded the face detailer to use sam3. this means you need to have triton installed as well. brief instructions are in the area. you can find more detailed instructions online (in the install readme for sageattention):

    https://github.com/ScalierBullet63/ComfyUI_EasySageAttention/blob/main/guide_en.md

    what is this workflow for?

    this workflow is for human and human-like character-focused images. it uses three basic ideas:

    1. controlnet preprocessors that can pull randomly from images (poses) you have collected, and - with version 1.5 of the workflow - the vnccs pose studio node to custom build your own

    2. a QwenVL pose descriptor that can create a prompt element from the randomly selected image

    3. lots and lots and lots of fine-tuning and options

    the goal is to make it possible to set up a basic prompt (possibly without a description of pose or action) and then generate a bunch of images using the pose template images you have organized. people often say that z image turbo lacks diversity or "imagination," and this workflow tries to overcome that limitation while still giving you a lot of control.

    you can use it on non-human-like characters but you'll want to turn off most of the detailers, and make sure any controlnet poses you use don't make it weird!

    how is this workflow organized?

    there are six bookmarked areas, plus this (seventh) instruction node. when you're first getting used to the workflow, go through each bookmarked area one by one and get things set up. later, most of your attention will go to bookmarks 4, 5, and 6. the rest of these instructions step you through each bookmarked area, starting with...

    area 1: loaders

    this is where you load in the models and other items you're going to use.

    - there are GGUF, regular model and checkpoint loaders. you can go ahead and put models in all three - when you get to area 3 you'll be able to toggle among them.

    - just below those loaders is the loader node for the clip, vae and patch. i use the qwen_3_4b_abliterated_v2 clip and the regular z image turbo vae. you will also need the Z-Image-Turbo-Fun-Controlnet-Union-2.1-lite-2601-8steps model patch in order for the controlnet setup to work.

    - below this is the loader node for the basic detailers. the checkpoint i'm using is gonzalomoXLFluxPony_v60PhotoXLDMD, and you'll also need the sam_vit_b_01ec64.pth file. at the bottom of this node are two text fields. the first one is the positive prompt for the detailers ("realistic high detail" is what i put in there) and the second is the negative ("deformed, ugly, missing, inhuman"). the basic detailers don't really change the shape of anything, just the texture, so they aren't really able to fix the number of fingers on a hand, for example.

    - on the right of area 1 is the lora loader. i use LoraManager here, which is pleasant, but you should replace it with whatever makes you most comfortable. if you are using LoraManager, you will get a button along the top of the screen to load the web page for adding loras to the workflow. don't worry about the text field - that gets updated when you add the loras.

    area 2: image folders

    this section is going to take a little bit of work outside of ComfyUI. you will need to set up a folder somewhere to put all of the images you want to use as sources for your controlnet poses. inside that folder you will need a landscape folder and a portrait folder. you can also put the images in subfolders of those folders. i have the following subfolders specified: all fours, crouching, kneeling, lying on back, lying on front, other, sitting, sitting on floor, and standing. remember that you will want one set of subfolders each for the landscape and portrait folders.

    if this is too tedious or unnecessary, then you can just have the landscape and portrait folders and only use "load landscape - all" and "load portrait - all" when you get to area 4.
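    if you'd rather not click out all those folders by hand, a short python snippet can build the whole tree in one go. the base path here is just an example - point it wherever you keep your pose images, and trim or extend the category list to taste:

```python
from pathlib import Path

# example location - change this to wherever you keep your pose images
base = Path("poses")

# the subfolder names i use; edit the list to match your own collection
categories = [
    "all fours", "crouching", "kneeling", "lying on back",
    "lying on front", "other", "sitting", "sitting on floor", "standing",
]

# one set of category subfolders each for landscape and portrait
for orientation in ("landscape", "portrait"):
    for category in categories:
        (base / orientation / category).mkdir(parents=True, exist_ok=True)
```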

    you won't really need to touch most of what's in area 2 of the workflow itself. most of the nodes will be muted. since you can't use a remote mute switch on nodes inside subgraphs i had to keep them all at the top level. so i tried to make them look pretty.

    however, at the bottom of area 2 (the green node) you do need to put in the path to the top level folder where the images are (the folder that has landscape and portrait in it).

    i recommend using images that have a longest side no bigger than 1024. the controlnet area will automatically tinker with the size anyway to make sure it's not too big or small.
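    the 1024 rule of thumb is easy to apply in bulk. here's a sketch of just the math (pure python, no image library - plug the result into whatever resizer you use):

```python
def fit_longest_side(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """shrink (width, height) so the longest side is at most max_side,
    keeping the aspect ratio. images already small enough are untouched."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return round(width * scale), round(height * scale)

print(fit_longest_side(2048, 1365))  # a typical 3:2 photo -> (1024, 682)
```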

    area 3: toggles and settings

    now it gets complicated! this is where you can fine-tune how you want the workflow to operate. the red nodes are toggles. the green nodes are settings. here is what each one does:

    - model switcher: this node lets you switch between a GGUF model and a regular model. the workflow will not load the one you don't have selected, so it's ok to have both of them assigned in area 1!

    - select first pass sampler: you can pick between ZSampler Turbo and the iTools version of KSampler. ZSampler Turbo is a nifty settings prepackage that gets good results quickly, but you can't do any tweaking. the iTools version of KSampler is the one i prefer, because it lets me output stats - otherwise it's basically KSampler.

    - second sampler toggle: my workflow does its sampling in two main stages: the first stage gets the overall sense of the image and the second one refines it to look good. but some samplers are decent just on the first pass, especially when the steps are high enough, so if you don't want the second sampler you can turn it off. (you can't turn off the first sampler because that just breaks the workflow haha.)

    - QwenVL prompt expander toggle: i personally usually keep this turned off, but if you want the workflow to make the prompt you've written more detailed, you can turn this on.

    - prompt expander instructions: if you want to change the instructions given to QwenVL for the prompt enhancement, change this text.

    - enable mid-workflow sounds: after the first and second sample stages there are sound nodes that play a little alert to let you know they're done. mostly you'll want this if you have the stop/go nodes turned on (see below). if you don't, you may as well turn the mid-workflow sounds off, because they can get annoying. there is always an end-of-flow sound.

    - save prompt as text file: with this turned on, the processed prompt (with pose and enhancements, if they are used) will be saved to a text file alongside the image.

    - post-processor toggles: after the second sample stage, there are five more steps that you can choose to include. the first one is the basic detailer step, which has toggles a little farther down. the other steps do different types of beautification, but they do add time to the generation:

    - face detail inpaint: a much better option than the basic one, but it takes longer and sometimes it goes weird and makes the character's face sweaty. i've done some tweaking to reduce the chances of that happening, but unfortunately i haven't been able to eliminate it entirely.

    - SeedVR2 upscaler: my default settings will upscale the image to 4K (the long edge will be 3840 pixels). i don't use ultimate sd upscale because it makes my gaming laptop cry.

    - optical realism & physics (down and to the right): this does some subtle lens distortion type stuff, like adding grain, vignetting, chromatic aberration and so on.

    - apply LUT settings: this lets you apply a LUT to the final image.

    - activate stop/go options: my computer is not very powerful so image generation takes a while. i don't like having to wait for the workflow to finish when i already know it's going to look bad. so i added toggles that let you stop the process after the first and/or second stages. when we get to area 6 you'll see how it's set up. if you have a zippy computer, leave these turned off, so the workflow will go from start to finish without interruption.

    - detailer bypass options: this node lets you toggle the basic detailers. there are six of them: face, nipples, hands, pussy, feet, and teeth. if you're going to use the face inpaint detailer, keep the face basic detailer off. and if you like not having ComfyUI crash sometimes, keep the teeth detailer off. (for some reason that one likes to blow up occasionally.)

    - upscale first latent by: if you are planning to only use one sampling stage, you may want to make your latent bigger to get more detail. i recommend setting this to 1.5. if you are using both sampling stages, i recommend leaving this at 1.0.

    - first sampler (ZTurbo): this is the pre-packaged sampler that you can toggle. you can adjust the number of steps (default is 9). you can't actually change the denoising that much, only down to 0.98.

    - first sampler (iTools): this is the other first sampler option. you have more control here. i personally prefer dpmpp_2m / beta with 10 steps, but euler / simple works pretty well too. bear in mind that if you are using the two stages, the second stage will have a lot more impact on the look of the final image.

    - second sampler: this is the second sampler. i like using dpmpp_2m / beta with 6 steps. i also set the scale_by to 1.5, so that the second stage works on a bigger latent than the first (allowing for more detail). with my defaults, stage 2 at 1.5x takes about the same amount of time as stage 1. set your denoising the way you set img2img. i prefer 0.70, but anything from 0.50 to 0.80 tends to look good.

    - sampler seed behavior: if you set it to randomize each time (the seed will show -1), both samplers will randomize. the seed will be the same but that doesn't really matter. if you set it to new fixed random, the node will select a single seed and keep it constant until you click new fixed random again.

    - SeedVR2 upscaler long edge size: set this to whatever you want the longest edge of your final image to be - for landscape, it'll be width, and for portrait, it'll be height. i set the default to 3840, which is 4K.

    - SeedVR2 upscaler: here are some more options to tweak the upscaler. if you have a better computer than mine you can go into the subgraph and swap in the 2.5 version of the node.

    - optical realism & physics: here you can tweak all of the settings for the pre-LUT postprocessor. you can get some really pleasant results that look a little more like actual photographs. i recommend keeping most of the numbers fairly low, unless you actually want some funky visuals.

    - apply LUT settings: this handles the very last step in the workflow. you can apply any LUT that you have placed in models/luts. lut files end in .cube and you can find them in various places on the internet. my favorite is tweed 71, set to 0.50 strength.

    - (basic) detailers (upper right side of area): the six basic detailers have default setups for which bbox file they use, but here you can change them around and adjust the strength (denoise) of the detailing. remember: the teeth detailer (JTeethSeg.pt) really likes to crash. it works...but it likes to crash.
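    a footnote on the scale_by setting above: the second-stage latent size is just the first-pass size times the scale factor, snapped so each side stays divisible by 8 (the usual sd-style vae constraint - i'm assuming z image turbo behaves the same way). a quick sketch:

```python
def second_stage_size(width: int, height: int,
                      scale_by: float = 1.5, multiple: int = 8) -> tuple[int, int]:
    """scale the first-pass pixel dimensions for the second sampler, snapping
    each side to a multiple of `multiple` to keep the vae happy.
    (assumes the usual sd-style divisible-by-8 constraint.)"""
    def snap(value: float) -> int:
        return round(value * scale_by / multiple) * multiple
    return snap(width), snap(height)

print(second_stage_size(832, 1216))  # -> (1248, 1824)
```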

    area 4: controlnet

    this area looks complicated but it's actually fairly simple. most of it is toggles.

    - controlnet activator: if this is toggled on, this area will run during the workflow. if it's toggled off, the workflow will bypass most of it. however, if you do toggle it off, you must also turn on controlnet off (img only) and you must turn off remove background and Enable QwenVL pose description in the nodes below.

    - controlnet option selector: here is where you pick what type of image you want to bring into controlnet. load image lets you select an individual image of your choice using the load image node on the far right. it doesn't have to be one from your pose image folders. load landscape pose will pick a random image from the landscape folders and load portrait pose will pick a random image from the portrait folders. you will be able to specify subfolders next. vnccs pose studio (new for 1.5!) lets you use the node on the far right to manually design the pose you want. see below for a little bit more detail. controlnet off (img only) must be turned on if the controlnet activator is off. when you use this option, you can set the size of the latent you want on the right side of the area, at the top. i set the default to 1024 (longest side) but you can change that, and you can also set the aspect ratio you want. images generated using controlnet will use the dimensions of the image you load (manually or randomly).

    - remove background?: this toggle will remove the background from the loaded or randomly selected image used by controlnet. (remember to toggle this off if you have controlnet engine set to no / img size only set to yes.) you can keep the background if you want but if there's too much going on in the image the samplers will suck that in and use it, which might not be what you want. also, if you're using the QwenVL pose description option, removing the background makes it easier for the model to ignore everything except the pose.

    - QwenVL pose description toggle: if you turn this on (turn it off if the controlnet is off!), when the workflow loads the pose image for controlnet it will also generate a description of the pose and attach it to the end of your prompt. this can sometimes be a helpful guide for the samplers, especially the second one, which doesn't use controlnet.

    - controlnet strength: this manages how intensely you want the samplers to pay attention to the pose. i've set the default to 0.85 but if you want the generation to be looser you can drop it down.

    - QwenVL pose description settings: you can adjust the pose description prompt and the model used here. you want to make sure the result you get only describes the pose and not the clothes etc., so that it doesn't interfere with anything else you're adding to the main prompt.

    - preprocessor: here is where you select the preprocessor that controlnet will use to tinker with the pose image you feed it. i find that the binary preprocessor has the most consistent results for pose images that are photographs of people who aren't your character, because it takes out most of the detail but leaves just enough to get a sense of the body's position. dw and openpose are good, too, but sometimes they come up with goofy solutions. although lots of people recommend canny, i find that it can mess up body proportions and faces, because it's too closely tied to the original image. but find what works for you!

    - load landscape type and load portrait type: here you can be more specific about which pose image folder the workflow uses. whichever type you picked on the left side of the area will determine which node gets used here. setting the "off" one to anything won't matter, so you don't need them to match. if you don't have subfolders or want the workflow to pick from any of them, use the all option.

    - second sampler controlnet toggle: here you can decide whether you want the second sampler to use the model with controlnet added or only interpret the pose based on the latent it receives from the first sampler. running controlnet twice can lead to really strange results (like the wrong body proportions), and in workflow versions prior to 1.6, it was always turned off for the second sampler. but now you can pick! it tends to work best if you're making your own poses with the vnccs pose studio. reading in a pose image twice tends to over-represent the pose model's look. using dw and openpose may help. another option is to use something like the zit fat slider lora to increase the size of your character at the first stage so that in the second stage it will come back to the proportions you want. with that lora set to a little larger than your intended character's size, it seems to work out pretty well.

    - image resolution: this gets used when you have controlnet off (img size only) set to yes.

    - load image: this gets used when you have load image in the controlnet option selector node set to yes. below the loaded image, you have the option to add space to your loaded image along one or more sides, which will move the pose figure. helpful if your pose figure is in the center of the image and you want it more to one side, for example. if you don't want to pad your image, leave all the settings at zero.

    - vnccs pose studio (to the right): (new for 1.5!) this gets used when you have vnccs pose studio in the controlnet option selector node set to yes. this is kind of experimental. i learned about it, freaked out at how cool it is, added it, tested it once, uploaded the workflow. you can use the tool to design your pose. the pose will be sent to the controlnet system. a full tutorial is beyond the scope of this help text but it should be pretty easy to learn by tinkering and messing around.

    - pose studio lighting toggle (to the lower left): (new for 1.6!) when vnccs pose studio is on you can turn the lighting toggle on to let you use the lighting section of the pose studio. this will insert a lighting prompt element at the end of the prompt you make in area 5.
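    to illustrate what the pad option in the load image node is doing: padding one side pushes the figure toward the opposite side. this toy version works on a grayscale image stored as a list of pixel rows, just to show the geometry - the real node does the same thing to an actual image:

```python
def pad_image(rows, left=0, right=0, top=0, bottom=0, fill=255):
    """pad a grayscale image (a list of pixel rows) with `fill` pixels on
    each side. padding only the left, for example, shifts the content right."""
    new_width = len(rows[0]) + left + right
    blank_row = [fill] * new_width
    body = [[fill] * left + list(row) + [fill] * right for row in rows]
    return ([list(blank_row) for _ in range(top)]
            + body
            + [list(blank_row) for _ in range(bottom)])
```

    for example, pad_image([[1, 2], [3, 4]], left=1, top=1) returns a 3x3 image with the original pixels pushed into the lower right.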

    area 5: prompt builder

    now we get to the fun stuff!

    i have separated the prompt creation area into seven text fields. you are welcome to use just one, but i find that separating the sections of my prompt makes it easier to change just one part without having to hunt for it. the seven sections are:

    - composition (camera, position, etc.)
    - character description (face, skin, body)
    - character clothing (including accessories)
    - location and setting (anything that isn't the character)
    - character pose and action (if you have controlnet off, use this to describe what your character is doing)
    - lighting/mood/style (artistic elements of the camera work)
    - LoRA triggers (any triggers you need from your loras - this part of the prompt always gets put in front)
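    the way the fields get combined is simple: everything is joined in order, with the lora triggers moved to the front and empty fields skipped. a sketch of the idea (the actual workflow does this with nodes, and the comma separator is my assumption - adjust to taste):

```python
def build_prompt(sections: dict) -> str:
    """join the seven prompt fields into one prompt string. lora triggers
    always go first; fields left empty are skipped. the comma separator
    is an assumption, not necessarily what the workflow's nodes use."""
    order = [
        "lora triggers", "composition", "character description",
        "character clothing", "location and setting",
        "character pose and action", "lighting/mood/style",
    ]
    parts = [sections[key].strip() for key in order if sections.get(key, "").strip()]
    return ", ".join(parts)

print(build_prompt({
    "lora triggers": "myCharacterTrigger",
    "composition": "full body shot",
    "location and setting": "sunlit forest clearing",
}))
```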

    there is also a face detail inpaint prompt text field. if you are using the face inpainting detailer (*not* the basic face detailer), put in your character's face and skin description so that the inpainting will have a guide. this text field does not get added to the main prompt, so it should just be a subset of the character description box.

    the seed for QwenVL node is used if you have the QwenVL pose description and/or prompt enhancer turned on. if you are running a batch of images but keeping the pose or prompt the same, setting this to new fixed random will let the batch run faster after the first image, since the workflow will reuse the previous QwenVL results. but you can set it to be random every time, and then QwenVL will run every time.

    the blackboard/scratch paper node is just for taking notes or clipping prompt snippets. it's not connected to anything, it's just for you.

    area 6: observer

    we're in the home stretch! this is the area you will come to when you start your run. because my computer is slow i like to sit here and watch. depending on how you have things set, some or all of this area will update while the workflow goes through. if some parts are turned off, they won't update, and will just show the last thing that did pass through them.

    in the top left are two images from the controlnet area: the first one is the image that got loaded in (with or without background removed) and the second one is the processed image based on the preprocessor you selected.

    to the right of these are the prompt fields. the first one is the prompt you typed in with no adjustment. the second one is the pose description that QwenVL came up with, if you turned that on. and the third one is the actual prompt used by the workflow, with the pose description added (if turned on), modified by the prompt enhancer if you used it.

    below the controlnet images are the first pass and second pass images. if you have the stop/go toggles turned on, those will be active below each image. the one on the left will pause the workflow after the first pass, and you will have to click continue or cancel. the one on the right does the same after the second pass.

    the big image in the bottom center is a comparison node between the second pass and the final result. if you have any of the detailers turned on, etc., those will be reflected here.

    finally, on the right, i have two stats blocks for the first pass and the second pass. the first pass one only works if you're using the regular KSampler. the ZSampler Turbo node doesn't send stats.

    final notes

    and that's everything! if you have any questions i will do my best to answer them. this is my first real workflow and it's probably got some issues and it has way too many noodles running all over. but i think it works pretty well and i hope it's easy to understand. thank you!

    Description

    1.0 - after lots of tinkering, i think it's ready for people to see!


    Details

    Downloads
    32
    Platform
    CivitAI
    Platform Status
    Available
    Created
    2/14/2026
    Updated
    4/29/2026
    Deleted
    -

    Files

    saoirsesCharacter_v10.zip
