
[ComfyUI] Detailed usage of Flux.1 [dev] Also introduces lightweighting using GGUF

⏱️20min read
📅 Nov 13, 2024
🔄 Nov 25, 2024

We have already introduced Flux.1 [schnell], and now we will explain Flux.1 [dev]. Flux.1 [dev] can produce high-quality illustrations, although it takes longer to generate and has higher hardware requirements than Flux.1 [schnell]. Also, unlike Flux.1 [schnell], it is released under a non-commercial license and is aimed at developers and AI researchers. The generated output can be used for personal, scientific, or commercial purposes, but offering the model as a paid service or selling it is prohibited. For more information about the license, please check the official 🔗Black Forest Labs license.

PR
ASUS TUF Gaming GeForce RTX 4090 24GB Gaming Graphics Card (DLSS 3, PCIe 4.0, 24GB GDDR6X, HDMI 2.1a, DisplayPort 1.4a, TUF-RTX4090-24G-GAMING)
🔗Amazon-Usa Link
MSI Gaming RTX 4070 Ti Super 16G Gaming X Slim Graphics Card (NVIDIA RTX 4070 Ti Super, 256-Bit, Extreme Clock: 2685 MHz, 16GB GDRR6X 21 Gbps, HDMI/DP, Ada Lovelace Architecture)
🔗Amazon-Usa Link

Flux.1 [dev] Features

Flux.1 [dev] is a 12-billion-parameter guidance-distilled model that offers more flexibility than the latent adversarial diffusion distillation used for Flux.1 [schnell], and it also supports extensions such as ControlNet.

How to install Flux.1 [dev]

Before using Flux.1 [dev], update ComfyUI to the latest version, as some of the standard nodes used here are unavailable if ComfyUI is out of date.

Download Models

To install Flux.1 [dev], you need to download the base model and the text encoders. Let’s go through the steps. If you have previously downloaded and used the Flux.1 [schnell] model, you only need to download the base model.

Explanation of the official ComfyUI workflow for Flux.1 [dev]

Download the image of the fox girl with the cake from the Flux Dev section of the official ComfyUI documentation and drag and drop it into ComfyUI, or load it from the Open button in the Workflow menu. If the models are placed in the correct folders, clicking the “Queue” button will generate an image identical to the downloaded one. The first generation takes a long time because the models have to be loaded into memory. From here, each node will be explained.

Load Diffusion Model

This node loads the model. Please note that the normal “Load Checkpoint” node cannot be used for loading. Set unet_name to flux1-dev.safetensors and weight_dtype to default (FP16). However, since generation takes time, you may want to use fp8_e4m3fn instead.

DualCLIPLoader

Loads the text encoder models. Choose t5xxl_fp16.safetensors and clip_l.safetensors for the clip_name fields. (Use t5xxl_fp8_e4m3fn.safetensors instead of the fp16 version if your PC memory is insufficient.) Set type to flux.

Load VAE

Load VAE; select ae.safetensors for vae_name.
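If you prefer scripting over the UI, the same loader settings can also be written in ComfyUI’s API (prompt) JSON format. The fragment below is only a minimal sketch: the node IDs are arbitrary, and the class names (UNETLoader, DualCLIPLoader, VAELoader) are the internal identifiers that should correspond to the three loader nodes above.

```python
# Sketch of the three loader nodes in ComfyUI's API (prompt) format.
# Node IDs ("1"-"3") are arbitrary; file names must match what you downloaded.
loaders = {
    "1": {  # Load Diffusion Model
        "class_type": "UNETLoader",
        "inputs": {"unet_name": "flux1-dev.safetensors", "weight_dtype": "default"},
    },
    "2": {  # DualCLIPLoader
        "class_type": "DualCLIPLoader",
        "inputs": {
            "clip_name1": "t5xxl_fp16.safetensors",
            "clip_name2": "clip_l.safetensors",
            "type": "flux",
        },
    },
    "3": {  # Load VAE
        "class_type": "VAELoader",
        "inputs": {"vae_name": "ae.safetensors"},
    },
}
```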

BasicGuider

As in SD3, Flux uses BasicGuider to pass the model and conditioning to SamplerCustomAdvanced.

FluxGuidance

Set the guidance (CFG) for Flux. Use the default of 3.5.

CLIP Text Encode (Positive Prompt)

Flux does not use negative prompts, only positive prompts.

EmptySD3LatentImage

This node creates an empty latent image for Stable Diffusion 3, but the generated result is the same as with the normal “Empty Latent Image” node. width and height are converted to inputs so they can be shared with “ModelSamplingFlux”, which will be explained later.

RandomNoise

Specifies the seed for generation. If you want the same result as the sample image, set noise_seed to 219670278747233 and control_after_generate to fixed.

KSamplerSelect

Select a sampler. Basically, euler is fine.

BasicScheduler

Sets up the sampling schedule: scheduler is set to simple and steps to 20. denoise is left at 1.00 because there is no source image.

ModelSamplingFlux

This is the Flux version of the timestep scheduling shift used in Stable Diffusion 3. max_shift is the maximum shift value, base_shift is the base shift value, and width / height are the size of the generated image. In this example we use the default values.

SamplerCustomAdvanced

A sampler node for next-generation models. Its parameters are split out into separate inputs to allow finer control.
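For reference, here is a rough end-to-end sketch of the graph above in the same API format, queued on a locally running ComfyUI instance via its /prompt endpoint. The node IDs are arbitrary, and the prompt text, resolution, and shift values are placeholders rather than the exact contents of the official workflow, so adapt them to the image you downloaded.

```python
import json
import urllib.request

# Rough sketch of the official Flux.1 [dev] graph in API format.
# "1"-"3" repeat the loader nodes from the earlier sketch so this block runs on its own.
graph = {
    "1": {"class_type": "UNETLoader",
          "inputs": {"unet_name": "flux1-dev.safetensors", "weight_dtype": "default"}},
    "2": {"class_type": "DualCLIPLoader",
          "inputs": {"clip_name1": "t5xxl_fp16.safetensors",
                     "clip_name2": "clip_l.safetensors", "type": "flux"}},
    "3": {"class_type": "VAELoader", "inputs": {"vae_name": "ae.safetensors"}},
    "4": {"class_type": "CLIPTextEncode",  # placeholder prompt, not the official one
          "inputs": {"clip": ["2", 0], "text": "cute anime fox girl holding a cake"}},
    "5": {"class_type": "FluxGuidance",
          "inputs": {"conditioning": ["4", 0], "guidance": 3.5}},
    "6": {"class_type": "ModelSamplingFlux",  # default shift values
          "inputs": {"model": ["1", 0], "max_shift": 1.15, "base_shift": 0.5,
                     "width": 1024, "height": 1024}},
    "7": {"class_type": "BasicGuider",
          "inputs": {"model": ["6", 0], "conditioning": ["5", 0]}},
    "8": {"class_type": "RandomNoise", "inputs": {"noise_seed": 219670278747233}},
    "9": {"class_type": "KSamplerSelect", "inputs": {"sampler_name": "euler"}},
    "10": {"class_type": "BasicScheduler",
           "inputs": {"model": ["6", 0], "scheduler": "simple",
                      "steps": 20, "denoise": 1.0}},
    "11": {"class_type": "EmptySD3LatentImage",
           "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "12": {"class_type": "SamplerCustomAdvanced",
           "inputs": {"noise": ["8", 0], "guider": ["7", 0], "sampler": ["9", 0],
                      "sigmas": ["10", 0], "latent_image": ["11", 0]}},
    "13": {"class_type": "VAEDecode",
           "inputs": {"samples": ["12", 0], "vae": ["3", 0]}},
    "14": {"class_type": "SaveImage",
           "inputs": {"images": ["13", 0], "filename_prefix": "flux_dev"}},
}

# Queue the graph on a locally running ComfyUI instance (default port 8188).
req = urllib.request.Request("http://127.0.0.1:8188/prompt",
                             data=json.dumps({"prompt": graph}).encode("utf-8"),
                             headers={"Content-Type": "application/json"})
print(urllib.request.urlopen(req).read().decode())
```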

Customize the official workflow

Here we would like to customize the official workflow in a practical way. The items we want to incorporate are as follows:

  • GGUF, to mitigate the long generation time
  • Multiple LoRAs
  • The Image chooser custom node, to check the result of the 1st Pass. This makes it easy to rerun the 1st Pass, which has a short generation time, until you get a satisfactory result.
  • Upscaling via a 2nd Pass

Implementation of GGUF

First, let’s install GGUF. Download the 8-bit version of the GGUF model of Flux.1 [dev] published by city96 from the link below. The download location is ComfyUI\models\unet.

If you already installed the GGUF version of the T5-XXL text encoder in the previous article, you do not need to download it again. The download location is ComfyUI\models\clip.

Once you have downloaded the models, the next step is to find and install ComfyUI-GGUF using the “Custom Nodes Manager”.

Now, let’s insert the nodes. Replace the “Load Diffusion Model” node with “Unet Loader (GGUF)” from ComfyUI-GGUF that you just installed. Select flux1-dev-Q8_0.gguf for the unet_name.

Next, replace “DualCLIPLoader” with “DualCLIPLoader (GGUF)”. The order does not matter, but select clip_l.safetensors for clip_name1 and the t5-v1_1-xxl-encoder-Q8_0.gguf downloaded earlier for clip_name2.

Node Locations
  • Unet Loader (GGUF):bootleg > Unet Loader (GGUF)
  • DualCLIPLoader (GGUF):bootleg > DualCLIPLoader (GGUF)
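Continuing the API-format sketch from earlier, the GGUF swap amounts to replacing the two loader entries. The class names UnetLoaderGGUF and DualCLIPLoaderGGUF are my assumption of the identifiers registered by ComfyUI-GGUF; check the installed node pack if they differ.

```python
# Swap the loaders in the API-format graph from the earlier sketch for their GGUF
# counterparts (class names assumed from the ComfyUI-GGUF node pack).
graph["1"] = {
    "class_type": "UnetLoaderGGUF",
    "inputs": {"unet_name": "flux1-dev-Q8_0.gguf"},
}
graph["2"] = {
    "class_type": "DualCLIPLoaderGGUF",
    "inputs": {
        "clip_name1": "clip_l.safetensors",
        "clip_name2": "t5-v1_1-xxl-encoder-Q8_0.gguf",
        "type": "flux",
    },
}
```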

Implementation of LoRA

We would like to implement two LoRAs in this project. Download the following LoRAs. The download location is the usual ComfyUI\models\loras. (When using multiple LoRAs, it is convenient to use something like Power Lora Loader from the rgthree custom nodes, but the DCAI workflow is kept as simple as possible, so we minimize the use of custom nodes.)

Adding LoRA to Flux.1 is not difficult: just insert “LoraLoaderModelOnly” between the “Unet Loader (GGUF)” you just added and the “ModelSamplingFlux” it connects to. Since we want to use two LoRAs, we chain two of these nodes.

There is no particular order in which to load the LoRAs. In the first “LoraLoaderModelOnly” node, select aidmaImageUprader-FLUX-v0.3.safetensors for lora_name and set strength_model to 0.25. In the second node, select sifw-annihilation-fluxd-lora-v013-Beta-000015.safetensors for lora_name and set strength_model to 0.85.

Each LoRA used in this project has its own trigger word, so add the following prompt at the end of “CLIP Text Encode (Positive Prompt)”. *The LoRAs can also be used without their trigger words.

aidmaimageupgrader, sifwastyle, anime
Node Locations
  • LoraLoaderModelOnly:loaders > LoraLoaderModelOnly
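In the API-format sketch, the same chaining could look like the fragment below (node IDs 20 and 21 are arbitrary; “1” is the GGUF UNet loader and “6” is ModelSamplingFlux from the earlier sketches).

```python
# Chain two LoraLoaderModelOnly nodes between the UNet loader and ModelSamplingFlux.
graph["20"] = {
    "class_type": "LoraLoaderModelOnly",
    "inputs": {"model": ["1", 0],
               "lora_name": "aidmaImageUprader-FLUX-v0.3.safetensors",
               "strength_model": 0.25},
}
graph["21"] = {
    "class_type": "LoraLoaderModelOnly",
    "inputs": {"model": ["20", 0],
               "lora_name": "sifw-annihilation-fluxd-lora-v013-Beta-000015.safetensors",
               "strength_model": 0.85},
}
# ModelSamplingFlux now takes its model from the end of the LoRA chain.
graph["6"]["inputs"]["model"] = ["21", 0]
```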

Installing the Image chooser

Use the “Image chooser” node familiar from DCAI workflows. If you do not have it installed, use the “Custom Nodes Manager” to search for and install the Image chooser. For detailed installation instructions, please refer to the following article.

After installation is complete, connect “Preview Chooser” after the “VAE Decode” that follows the sampler. The “Save Image” node will be used at the very end, so move it out of the way for now.

Node Locations
  • Preview Chooser:image_chooser > Preview Chooser

Upscale with 2nd Pass implementation

From here, things get a bit more complicated. First, let’s implement the upscale nodes. Place “Load Upscale Model” and “Upscale Image (using Model)” and connect UPSCALE_MODEL to upscale_model. Select 4x-UltraSharp.pth for the model_name of “Load Upscale Model”.

Then place “Scale Image to Total Pixels” and connect the IMAGE output of “Upscale Image (using Model)” to image. Set upscale_method to lanczos and megapixels to 3.00.

Connect “Scale Image to Total Pixels” to “VAE Encode” to encode into a latent image for 2nd Pass.

Next, implement the 2nd Pass sampler. Copy the 1st Pass “BasicGuider,” “FluxGuidance,” “ModelSamplingFlux,” “CLIP Text Encode (Positive Prompt),” “BasicScheduler,” and “SamplerCustomAdvanced” nodes with Ctrl + C and paste them with Ctrl + Shift + V so that their inputs are kept. Once pasted, place them where you like.

Next, rewrite the copied “CLIP Text Encode (Positive Prompt)” as follows:

very detailed, masterpiece, intricate details, UHD, 8K

Change the denoise of “BasicScheduler” to 0.35. And change max_shift of “ModelSamplingFlux” to 0.25 and base_shift to 0.00.

Copy the 1st Pass “VAE Decode” using the same Ctrl + C / Ctrl + Shift + V method as before and connect it to the denoised_output of the 2nd Pass “SamplerCustomAdvanced”.

Finally, reconnect the “Save Image” node that we set aside during the Image chooser installation, and the customization is complete.
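For readers following the API-format sketches, the upscale chain and the 2nd Pass could be wired roughly as follows. This is only a sketch: the interactive Preview Chooser is omitted, node IDs are arbitrary, and width/height are placeholders to be matched to your 1st Pass settings.

```python
# Upscale chain and 2nd Pass, continuing the API-format graph from the earlier
# sketches ("13" is the 1st Pass VAE Decode, "3" the VAE, "2" the CLIP loader).
second_pass = {
    "30": {"class_type": "UpscaleModelLoader",
           "inputs": {"model_name": "4x-UltraSharp.pth"}},
    "31": {"class_type": "ImageUpscaleWithModel",
           "inputs": {"upscale_model": ["30", 0], "image": ["13", 0]}},
    "32": {"class_type": "ImageScaleToTotalPixels",
           "inputs": {"image": ["31", 0], "upscale_method": "lanczos", "megapixels": 3.0}},
    "33": {"class_type": "VAEEncode",
           "inputs": {"pixels": ["32", 0], "vae": ["3", 0]}},
    # Copies of the 1st Pass sampling nodes with the values changed above:
    # denoise 0.35, max_shift 0.25, base_shift 0.00, and the new prompt.
    "34": {"class_type": "CLIPTextEncode",
           "inputs": {"clip": ["2", 0],
                      "text": "very detailed, masterpiece, intricate details, UHD, 8K"}},
    "35": {"class_type": "FluxGuidance",
           "inputs": {"conditioning": ["34", 0], "guidance": 3.5}},
    "36": {"class_type": "ModelSamplingFlux",
           "inputs": {"model": ["21", 0], "max_shift": 0.25, "base_shift": 0.0,
                      "width": 1024, "height": 1024}},
    "37": {"class_type": "BasicGuider",
           "inputs": {"model": ["36", 0], "conditioning": ["35", 0]}},
    "38": {"class_type": "BasicScheduler",
           "inputs": {"model": ["36", 0], "scheduler": "simple",
                      "steps": 20, "denoise": 0.35}},
    "39": {"class_type": "SamplerCustomAdvanced",  # reuses the 1st Pass noise and sampler
           "inputs": {"noise": ["8", 0], "guider": ["37", 0], "sampler": ["9", 0],
                      "sigmas": ["38", 0], "latent_image": ["33", 0]}},
    "40": {"class_type": "VAEDecode",  # output index 1 = denoised_output
           "inputs": {"samples": ["39", 1], "vae": ["3", 0]}},
    "41": {"class_type": "SaveImage",
           "inputs": {"images": ["40", 0], "filename_prefix": "flux_dev_2nd_pass"}},
}
graph.update(second_pass)
```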

Click on the “Queue” button to generate the first pass results, then select the image and proceed. After a few moments, the final result will be generated.

Node Locations
  • Load Upscale Model:loaders > Load Upscale Model
  • Upscale Image (using Model):image > upscaling > Upscale Image (using Model)
  • Scale Image to Total Pixels:image > upscaling > Scale Image to Total Pixels
  • VAE Encode:latent > VAE Encode

Final Results

Seed: 219670278747233

The workflow is available on Patreon, but only paid supporters can view and download it. Becoming a paid supporter, even for just one month, encourages me to write more, so please join if you like.

Even if you cannot download the workflow, you can build it yourself by following the explanations above, so there is no need to force yourself to download it.

Bonus

As an added bonus, let’s generate an eye-catching illustration for this article using the custom workflow we just created.

LoRA changes

To begin, add another “LoraLoaderModelOnly” after “Unet Loader (GGUF)”. Once added, download the following two LoRAs.

After the download is complete, change the LoRA settings as follows. (The order is not particularly important.)

  • Flux.1_Turbo_Detailer.safetensors:0.70
  • aidmaFLUXpro1.1-FLUX-V0.2.safetensors:0.75
  • sifw-annihilation-fluxd-lora-v013-Beta-000015.safetensors:0.90

Parameter Changes

  • Rewrite the prompt as follows:
    A masterful highly intricate detailed cinematic photo.
    (In the European medieval fantasy era:1.4), medium close shot of a very cute anime high wizard girl with light-pink-haired and blue-eyes  is looking at viewer. She wears a white and dark-blue magic robe.
    A vibrant diverse people. A wide variety of people faces. 
    The marketplace is offering a wide variety of fruits, vegetables, meats, breads, cheese, spices, flowers, and daily commodities. 
    In the shoppers are adventurers with various armor, swords, magic sticks, and other equipment, as well as residents. 
    In the background is a magnificent castle, and behind the castle is a mountain.
    The weather is blue with a summer-like sky and birds are flying.
    
    A hyper realistic, very detailed, masterpiece, intricate details, 50mm lens shot, soft edge line for girl's face, correct perspective, upper-body
  • Change the values of width and height; let’s change width to 1280 and height to 720.
  • Change noise_seed in “RandomNoise” to 303013184412751.
  • Change the scheduler of “BasicScheduler” for 1st Pass and 2nd Pass to beta.
  • Change max_shift to 1.50 and base_shift to 0.25 in “ModelSamplingFlux” for 1st Pass.
  • Change the guidance of “FluxGuidance” for 2nd Pass to 2.0.
  • Finally, raise the steps in “BasicScheduler” for 2nd Pass to 30 and you are done. (These changes are summarized in the sketch after this list.)
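Applied to the API-format sketches from earlier, the bonus changes boil down to a handful of value updates (node IDs follow those sketches and the new prompt text is omitted for brevity; adjust the IDs to your own graph).

```python
# Apply the bonus parameter changes to the API-format graph from the earlier sketches.
graph["11"]["inputs"].update(width=1280, height=720)        # EmptySD3LatentImage
graph["6"]["inputs"].update(width=1280, height=720,          # 1st Pass ModelSamplingFlux
                            max_shift=1.50, base_shift=0.25)
graph["8"]["inputs"]["noise_seed"] = 303013184412751         # RandomNoise
graph["10"]["inputs"]["scheduler"] = "beta"                   # 1st Pass BasicScheduler
graph["38"]["inputs"].update(scheduler="beta", steps=30)      # 2nd Pass BasicScheduler
graph["35"]["inputs"]["guidance"] = 2.0                       # 2nd Pass FluxGuidance
```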

Click on the “Queue” button to generate the first pass results, then select the image and proceed. After a few moments, the final result will be generated.

Final Results

Seed: 303013184412751

This workflow is also available on Patreon, but only paid supporters can view and download it.

Conclusion

How was Flux.1 [dev]? Compared to Flux.1 [schnell] it takes much longer to generate, but we hope you can see that it is still reasonably usable if you use the GGUF or FP8 version to reduce VRAM consumption. The quality of the generation also generally seems better with Flux.1 [dev] than with Flux.1 [schnell], although in some cases you may get lower-quality results. In particular, the quality of illustrations did not seem that different from photo-realistic images, and hand generation, which AI generation is not very good at, failed fairly often in this workflow. However, since ControlNet and negative prompts are available for Flux.1 [dev], we would like to introduce these on DCAI another time.

PR
ASUS ROG G16CH (2024) Gaming Desktop PC, Intel® Core™ i7-14700F, NVIDIA® GeForce RTX™ 4060Ti Dual, 1TB PCIe® Gen4 SSD, 32GB DDR5 RAM, Windows 11, G16CHR-AS766Ti
🔗Amazon-Usa Link
MSI Gaming RTX 4080 Super 16G Expert Graphics Card (NVIDIA RTX 4080 Super, 256-Bit, Extreme Clock: 2625 MHz, 16GB GDRR6X 23 Gbps, HDMI/DP, Ada Lovelace Architecture)
🔗Amazon-Usa Link