Detailed usage of Flux.1 ControlNet + IP-Adapter in ComfyUI
Five months have passed since the release of Flux.1, and the surrounding ecosystem has matured. This article explains how to use ControlNet with Flux.1. Canny and Depth can also be used through “Flux.1 Tools”, released last November, but this article covers only ControlNet. “Flux.1 Tools” will be covered in a separate article.
Custom Node Installation
As a prerequisite to using ControlNet with ComfyUI, you need to install a custom node for the preprocessors. Use the “Custom Nodes Manager” in ComfyUI Manager to search for and install ComfyUI's ControlNet Auxiliary Preprocessors.
If you have never used ControlNet with ComfyUI, please read through the article below for a basic explanation of how to use it.
ControlNet Model Installation
Currently, ControlNet models are available from Xlabs, InstantX, and Jasper AI. Each of these models is introduced below.
Xlabs
Xlabs has released three models, “Canny”, “Depth”, and “HED”, trained at 1024×1024, and a beta version of “ip-adapter”. The first three models should be placed in ComfyUI\models\xlabs\controlnets.
The “ip-adapter” model should be placed in ComfyUI\models\xlabs\ipadapters.
InstantX
InstantX offers “ControlNet Union”, which bundles Canny, Depth, Tile, Blur, Pose, Gray, and Low quality into a single model, as well as standalone “Canny” and “IP-Adapter” models. Older versions of ComfyUI required custom nodes, but the built-in nodes are now sufficient. The downloaded model is named diffusion_pytorch_model.safetensors, so rename it to something recognizable, such as flux-dev-controlnet-union.safetensors, and place it in \ComfyUI\models\controlnet.
Jasper AI
Jasper AI provides three models: Upscaler, Surface-Normals, and Depth. Place the downloaded model in \ComfyUI\models\controlnet. These models are also named diffusion_pytorch_model.safetensors, so rename each one to something recognizable, such as jasperaiFlux.1-dev-Controlnet-Depth.safetensors.
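The rename-and-place step can also be scripted. The following is a minimal sketch, not part of any official tooling: the function name install_controlnet and the example paths are hypothetical, and it assumes the standard ComfyUI folder layout described above.

```python
import shutil
from pathlib import Path


def install_controlnet(downloaded: Path, comfy_root: Path, new_name: str) -> Path:
    """Copy a downloaded ControlNet file into models/controlnet under a clearer name.

    Hypothetical helper for illustration; it only copies files.
    """
    target_dir = comfy_root / "models" / "controlnet"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    target = target_dir / new_name
    if target.exists():
        # Refuse to overwrite so an existing model is never silently replaced.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(downloaded, target)
    return target


# Example call (paths are placeholders, adjust them to your setup):
# install_controlnet(
#     Path("Downloads/diffusion_pytorch_model.safetensors"),
#     Path("ComfyUI"),
#     "flux-dev-controlnet-union.safetensors",
# )
```

Copying rather than moving keeps the original download in place until you have confirmed that ComfyUI lists the renamed model.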
Try using Flux.1’s ControlNet
Once the model is installed, let’s use ControlNet. The workflow will be explained as you go through the process.
Xlabs
If you use the Xlabs ControlNet model, you will need to install a dedicated custom node.
Custom Node Installation
Open ComfyUI Manager and use the “Custom Nodes Manager” to search for and install x-flux-comfyui.
Explanation of Official Workflow
Once the installation is complete, the official workflows can be found in \ComfyUI\custom_nodes\x-flux-comfyui\workflows. This article explains flux-controlnet-canny-v3-workflow.json.
About Nodes
This section describes the nodes dedicated to the use of ControlNet.
Load Flux ControlNet: Loads the Flux.1 ControlNet model from Xlabs. For model_name, select the Flux.1 model the ControlNet will be applied to. For controlnet_path, select the ControlNet model you want to use.
Apply Flux ControlNet: Combines the ControlNet model with the preprocessed image that matches it (in this case, the image extracted by Canny Edge) and outputs the result as controlnet_condition. The controlnet_condition input is used when you want to apply multiple ControlNets. The strength parameter controls how strongly the ControlNet is applied.
Xlabs Sampler: A sampler that uses the Xlabs ControlNet. Note that, unlike the regular KSampler, it does not let you choose a sampler or scheduler type. It has the following three key parameters.
- timestep_to_start_cfg: Sets the step from which CFG and the negative prompt are applied.
- true_gs: Corresponds to the CFG scale.
- image_to_image_strength: Sets how strongly the original image is retained when used for img2img.
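To make the relationship between timestep_to_start_cfg and true_gs concrete, here is a minimal sketch. It assumes the sampler uses the standard classifier-free guidance formula and skips guidance before the start step; the function name and the use of plain floats in place of tensors are illustrative assumptions, not the custom node's actual code.

```python
def guided_prediction(cond: float, uncond: float, step: int,
                      timestep_to_start_cfg: int, true_gs: float) -> float:
    """Combine positive and negative predictions for one sampling step.

    cond / uncond stand in for the model outputs for the positive and
    negative prompt (real samplers work on tensors, not floats).
    """
    if step < timestep_to_start_cfg:
        # Before the start step, the negative prompt is ignored entirely.
        return cond
    # Standard classifier-free guidance: push the result away from uncond.
    return uncond + true_gs * (cond - uncond)


# With timestep_to_start_cfg=1, step 0 uses the positive prediction as is,
# and later steps are scaled by true_gs.
print(guided_prediction(1.0, 0.0, step=0, timestep_to_start_cfg=1, true_gs=3.5))
print(guided_prediction(1.0, 0.0, step=1, timestep_to_start_cfg=1, true_gs=3.5))
```

Under this assumption, a larger timestep_to_start_cfg delays the point at which the negative prompt starts to influence the image.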
InstantX
InstantX can be used in the usual ControlNet configuration; an official workflow is available at Civitai.
The official workflow is an OpenPose example. It extracts the OpenPose data with “Optional: Extract Openpose from image”; if you only need the data, it is available on the DCAI drive.
Jasper AI
Normal sample
Jasper AI can be used in almost the same configuration as InstantX described above. The difference is that “SetUnionControlNetType” is not required, so “Load ControlNet Model” connects directly to “Apply ControlNet”. There is no official workflow for this one, so a simple workflow is available on the drive.
Upscaler sample
The Upscaler model is configured a little differently than the others, so here is an example workflow.
The workflow uses a blank prompt. Depending on the input image, switching upscale_method in “Upscale Image” to lanczos may produce a cleaner result.
The upscaler is designed to upscale very small images, so upscaling images larger than 1024 pixels will either result in an OOM (out-of-memory) error or, even if it does not, will take much longer to generate and the results will be unusable.
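A quick size check before queuing can avoid a wasted run. This is a minimal sketch with a hypothetical helper name; it assumes the 1024-pixel limit mentioned above applies to the longest side of the input image, which the article does not state explicitly.

```python
def fits_upscaler(width: int, height: int, max_side: int = 1024) -> bool:
    """Return True if the input image is small enough for the Upscaler model.

    Assumption: the limit is measured on the longest side of the image.
    """
    return max(width, height) <= max_side


print(fits_upscaler(512, 512))    # small input, suitable for upscaling
print(fits_upscaler(1280, 720))   # longest side exceeds 1024, likely too large
```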
Introduction of practical workflows using ControlNet
The following is a practical workflow built on the ControlNet models introduced here. It incorporates the following features:
- Use GGUF to reduce load on VRAM
- Use LoRA to determine the direction of the illustration
- Fix characters using IP adapter
- Implement 2nd pass to increase scale & detail
The workflow is available on Patreon, but only paid supporters can view and download it.
Even if you cannot download the workflow, you can configure it yourself by looking at the explanation.
Required Custom Node Installation
The following custom nodes and models must be installed to run this workflow. If you have already installed them, please update to the latest version to avoid problems.
- ComfyUI-GGUF: Custom node to load Unet and CLIP in GGUF format
- ComfyUI-IPAdapter-Flux: Custom node to run the IPAdapter published by InstantX
Required Model Installation
Download the models below.
GGUF Models
This example uses the base model flux1-devQ8_0.gguf. If generation fails with an OOM (out-of-memory) error, switch from Q8_0 to Q6_K or Q5_K_S, although the quality will change.
For more information, please refer to the following article.
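The fallback order for quantization levels can be written down as a small lookup. This is only a sketch: the helper name is hypothetical, and the list contains just the three levels mentioned in this article, ordered from largest to smallest.

```python
from typing import Optional

# Quantization levels named in this article, from largest to smallest.
QUANT_ORDER = ["Q8_0", "Q6_K", "Q5_K_S"]


def next_smaller_quant(current: str) -> Optional[str]:
    """Return the next smaller quantization to try after an OOM error.

    Returns None when there is nothing smaller left in QUANT_ORDER.
    """
    index = QUANT_ORDER.index(current)  # raises ValueError for unknown levels
    if index + 1 < len(QUANT_ORDER):
        return QUANT_ORDER[index + 1]
    return None


print(next_smaller_quant("Q8_0"))    # Q6_K
print(next_smaller_quant("Q5_K_S"))  # None
```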
Base Model: Place the downloaded model in \ComfyUI\models\unet.
Text Encoder: Place the downloaded model in \ComfyUI\models\clip.
LoRA Models
Place the downloaded LoRA model in \ComfyUI\models\loras.
IPAdapter Model
Place the downloaded IPAdapter model in \ComfyUI\models\ipadapter-flux. If there is no ipadapter-flux folder, create one.
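The folders used in this section can be checked and created in one go. The following is a minimal sketch; the helper name ensure_model_folders is hypothetical, and the folder names are the ones listed above.

```python
from pathlib import Path
from typing import List

# Model folders referenced in this section, relative to ComfyUI/models.
MODEL_FOLDERS = ["unet", "clip", "loras", "ipadapter-flux"]


def ensure_model_folders(comfy_root: Path) -> List[Path]:
    """Create any missing model folders and return the ones that were created."""
    created = []
    for name in MODEL_FOLDERS:
        folder = comfy_root / "models" / name
        if not folder.is_dir():
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


# Example call (the path is a placeholder for your ComfyUI directory):
# for folder in ensure_model_folders(Path("ComfyUI")):
#     print("created", folder)
```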
Explanation of Graphs
Let’s take a look at each group.
IP Adapter
The IP Adapter group is a group of IP Adapter-related nodes.
Load a reference image with “Load Image”. The reference image used here was generated with SDXL’s Animagine XL V3.1 and serves as a reference for the illustration style and the character. However, the accuracy is not very high.
“Upscale Image” resizes the reference image to 1024 pixels, suitable for the IP Adapter.
Load the downloaded ip-adapter.bin into the ipadapter field of “Load IPAdapter Flux Model” and select google/siglip-so400m-patch14-384 for clip_vision. This model will be installed automatically when the custom node is installed. The provider can be selected from “cuda”, “cpu”, and “mps” (Apple’s Metal Performance Shaders backend).
The “Apply IPAdapter Flux Model” node controls how strongly the IP Adapter is applied through the “weight” parameter, while “start_percent” and “end_percent” determine during which part of the generation steps it is applied. At the time of writing, setting end_percent to 1.0 causes noise in the generated results, so set it to 0.8 or less.
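As a rough illustration of what start_percent and end_percent mean, the sketch below maps them linearly onto step indices. This linear mapping is an assumption made for illustration: the custom node may convert the percentages to noise levels instead, so the exact steps can differ. The function name is hypothetical.

```python
def active_steps(total_steps: int, start_percent: float, end_percent: float) -> range:
    """Return the step indices during which the IP Adapter would be active.

    Assumes a simple linear mapping from percentage to step index.
    """
    first = int(round(start_percent * total_steps))
    last = int(round(end_percent * total_steps))
    return range(first, last)


# With 20 steps and end_percent=0.8, the last four steps run without the IP Adapter.
steps = active_steps(20, 0.0, 0.8)
print(steps.start, steps.stop)  # 0 16
```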
Load Basic Models
The Load Basic Models group loads the basic models.
Load the base model flux1-devQ8_0.gguf in the “Unet Loader (GGUF)” node. If generation fails with an OOM (out-of-memory) error, change Q8_0 to Q6_K or Q5_K_S, although the quality will change.
Load the text encoder in the “DualCLIPLoader (GGUF)” node. Load t5-v1-xxl-encoder-Q8_0.gguf instead of the usual “t5xxl_fp16”. Here, too, switch to Q6_K or Q5_K_S to match your VRAM.
Load the LoRAs with two “LoraLoaderModelOnly” nodes, one per LoRA. In this case, strength_model is set to 0.50 for both.
Basic Info
Basic Info summarizes the information necessary for generation.
It is almost the default setting, but the prompt is written as follows.
A masterful highly intricate detailed anime.
A girl looking at viewer.
in the In the European medieval fantasy era.
The generation size is set to 720p (1280×720 pixels); the image will be enlarged later.
Upscale
In the Upscale group, the image generated in the 1st pass is first enlarged 4x with 4x-UltraSharp.pth and then scaled to the target size with the “Scale Image to Total Pixels” node. In this case, 3.0MP is selected, but if you get an OOM (out-of-memory) error, reduce this number.
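The resulting image size can be estimated ahead of time. The sketch below assumes that the node treats one megapixel as 1024×1024 pixels and scales both sides by the same factor; this reflects my understanding of the node and should be treated as an assumption. The function name is hypothetical.

```python
import math
from typing import Tuple


def scale_to_total_pixels(width: int, height: int, megapixels: float) -> Tuple[int, int]:
    """Estimate the output size after scaling an image to a total pixel count.

    Assumption: 1 megapixel = 1024 * 1024 pixels, aspect ratio is preserved.
    """
    total = int(megapixels * 1024 * 1024)
    scale = math.sqrt(total / (width * height))
    return round(width * scale), round(height * scale)


# 1280x720 enlarged 4x is 5120x2880; scaling that to 3.0MP gives the final size.
print(scale_to_total_pixels(1280 * 4, 720 * 4, 3.0))  # (2365, 1330)
```

Lowering the megapixel value shrinks both sides by the square root of the ratio, so going from 3.0MP to 2.0MP reduces each side by roughly 18%.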
2nd Pass Info
2nd Pass Info is the setting for the 2nd pass. The most important value is the denoise value, which can be lowered to get a result similar to the 1st pass.
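One way to picture why a lower denoise value keeps the result close to the 1st pass: the sketch below assumes the scheduler builds a longer noise schedule and runs only its final part, so the 2nd pass starts from a partially noised image. Whether the workflow's scheduler behaves exactly this way is an assumption, and the function name is hypothetical.

```python
from typing import Tuple


def second_pass_schedule(steps: int, denoise: float) -> Tuple[int, int]:
    """Return (total schedule length, number of early steps skipped).

    Assumption: the schedule is built for steps / denoise steps and only
    the last `steps` of it are actually sampled.
    """
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in the range (0, 1]")
    total_steps = int(steps / denoise)
    skipped = total_steps - steps
    return total_steps, skipped


print(second_pass_schedule(20, 1.0))  # (20, 0): full generation from pure noise
print(second_pass_schedule(20, 0.5))  # (40, 20): the noisier first half is skipped
```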
How to use Workflow
Basically, pressing the “Queue” button is all it takes to generate results. If you want to randomize the “RandomNoise” node and quickly check the result of the 1st pass, mute or bypass the “Preview Image (2nd Pass)” node and press the “Queue” button. With this setting, processing stops once the 1st pass has been generated, so keep going until you get the result you want.
When you are satisfied with the result of the 1st pass, right-click on the image of the result you like from the Queue list in the side menu and load it from “Load Workflow”.
After loading, first set the “RandomNoise” node to fixed.
Remove the mute or bypass on the “Preview Image (2nd Pass)” node and click the “Queue” button to resume the generation from Upscale. (If the 1st pass data has been released from memory, the image will be generated from the beginning.)
Since this workflow does not use the “Save Image” node, either right-click “Preview Image (2nd Pass)” and choose Save Image, or replace the node with a “Save Image” node.
Final Result
The final result is as follows. Flux.1 is not good at generating anime-style illustrations (faces and eyes are not rendered well). However, by using IP Adapter and LoRA to steer the style, as shown in this workflow, it is possible to create detailed illustrations.
Conclusion
In this article, we introduced the use of ControlNet in Flux.1. ControlNet makes it possible to create illustrations that are difficult to achieve with the standard Flux.1 base model alone, and to raise their quality further. Flux.1 is capable of producing high-quality photorealistic images, but SDXL’s “Animagine XL” and “NoobAI-XL (Illustrious)” produce higher quality illustrations. However, by combining ControlNet with IP Adapter, as in the workflow introduced here, Flux.1 can also produce high-quality illustrations.