DCAI

[ComfyUI] How to use the latest high-performance model Flux.1 [schnell]

⏱️13min read
📅 Aug 08, 2024
🔄 Sep 10, 2024

This time we would like to introduce Flux.1, developed by 🔗Black Forest Labs as a competitor to Stability AI’s 🔗Stable Diffusion 3 and Fal AI’s 🔗AuraFlow. The model comes in three variants: a closed-weight [pro] version available only through the API, an open-weight [dev] version for non-commercial use (commercial licensing available by contacting Black Forest Labs), and an open-weight [schnell] version released under the Apache 2.0 license, which is the one we introduce here. Built on a 12-billion-parameter base, about twice the size of SDXL, it is the most advanced image-generation AI model currently available for local environments.


Flux.1 [schnell] Features

Flux.1 [schnell] is said to outperform Midjourney v6.0 among currently available AI image-generation models. The chart below is based on the 🔗ELO score, a quality rating collected from over 100,000 users and published by 🔗Artificial Analysis. It shows that Black Forest Labs’ flagship Flux.1 [pro] and Flux.1 [dev] even exceed Stability AI’s flagship Stable Image Ultra. (The graph is labeled SD3-Ultra, but we think it probably refers to Stable Image Ultra.)

Graph of ELO scores (image from Black Forest Labs)

The spider chart below benchmarks prompt following, size/aspect variability, typography, output diversity, and visual quality. Flux.1 [schnell], which is also included, scores a little lower on output diversity, but its overall score is higher.

Spider chart of the benchmark (image from Black Forest Labs)
  • Parameters: 12 billion
  • Text encoders: OpenAI CLIP ViT-L/14 & Google T5-XXL
  • Generates high-quality AI illustrations in 1–4 fast steps ([schnell] version only)
  • Prompt comprehension further improved compared to SDXL

The recommended specifications for Flux.1 [schnell] are quite high: the model weighs 23.8 GB, so 24 GB of GPU memory is ideal for loading everything into VRAM. (Overflow spills into shared memory, so it can still run even if it doesn’t all fit.) However, if you use the 🔗FP8 model or 🔗pruned models, or free up memory as you go, you can run it with 16 GB of GPU memory.
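As a rough illustration, this rule of thumb can be sketched as a tiny helper (the function name is ours; the thresholds simply restate the numbers above):

```python
def pick_weight_dtype(vram_gb: float) -> str:
    """Suggest a weight_dtype for the Load Diffusion Model node.
    Rule of thumb: the 23.8 GB FP16 weights need about 24 GB to sit
    entirely in VRAM; the FP8 variants run on about 16 GB."""
    return "default" if vram_gb >= 24 else "fp8_e4m3fn"
```
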

How to install Flux.1 [schnell]

Before using Flux.1 [schnell], update ComfyUI to the latest version, as some of the required standard nodes are unavailable in older versions. (v0.0.4 or higher recommended)

The installation procedure below is written with reference to the 🔗official ComfyUI documentation.

Download Models

To install Flux.1 [schnell], you need to download the base model, text encoders, and other files. Let’s go through it step by step.
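As a convenience, the downloads can also be scripted with the huggingface_hub library. The repository names, filenames, and target folders below are our assumptions based on the commonly used Hugging Face mirrors; verify them against the links in this article before running:

```python
from pathlib import Path

# (repo_id, filename) -> ComfyUI subfolder. These mappings are assumptions;
# check them against the official download links before use.
MODEL_FILES = {
    ("black-forest-labs/FLUX.1-schnell", "flux1-schnell.safetensors"): "models/unet",
    ("black-forest-labs/FLUX.1-schnell", "ae.safetensors"): "models/vae",
    ("comfyanonymous/flux_text_encoders", "clip_l.safetensors"): "models/clip",
    ("comfyanonymous/flux_text_encoders", "t5xxl_fp16.safetensors"): "models/clip",
}

def download_all(comfy_root: str) -> list[Path]:
    """Download every file into its ComfyUI subfolder and return the paths."""
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub
    paths = []
    for (repo_id, filename), subdir in MODEL_FILES.items():
        dest = Path(comfy_root) / subdir
        dest.mkdir(parents=True, exist_ok=True)
        paths.append(Path(hf_hub_download(repo_id=repo_id, filename=filename,
                                          local_dir=dest)))
    return paths
```
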

Download Basic Workflow

Download and load the official workflow or a slightly modified DCAI workflow.

The official version can be loaded by dragging the image of the bottle from the Flux Schnell example in the 🔗official ComfyUI documentation into ComfyUI.

The DCAI version is available on Patreon for your reference.

Explanation of the basic workflow for Flux.1 [schnell]

This explanation is based on DCAI’s basic workflow, which adds only one new node to the official workflow.

Load Diffusion Model

This node loads the model. Note that the model cannot be loaded with the normal Load Checkpoint node.

  • unet_name: Select the model. In this case, select flux1-schnell.safetensors.
  • weight_dtype: The default is fine, but since generation takes time, fp8_e4m3fn or fp8_e5m2 can speed it up.
ModelSamplingFlux

This node is not used in the official workflow; it is the Flux version of the timestep-scheduling shift used in Stable Diffusion 3. (Since ComfyUI v0.0.4, more detailed settings are available.)

DualCLIPLoader

Loads the text encoder models. Select t5xxl_fp16 and clip_l for clip_name. (Use t5xxl_fp8_e4m3fn instead of fp16 if you do not have enough system memory.)

BasicGuider

Flux uses BasicGuider to feed SamplerCustomAdvanced, as in SD3.

SamplerCustomAdvanced

In this sampler node for the next-generation models, the parameters have been externalized as inputs for finer control.

About setting up Flux.1 [schnell]

When using ComfyUI, note the base_shift value in ModelSamplingFlux and the guidance value in FluxGuidance: these values take effect in Flux.1 [dev], but not in Flux.1 [schnell].

We have not dug deeply into this, but since Flux.1 [schnell] is timestep-distilled while Flux.1 [dev] is guidance-distilled, we suspect this difference is the cause.

Sampling Settings

  • sampler: euler
  • scheduler: simple
  • steps: 4

Resolution settings

Resolutions from 0.1 to 2.0 megapixels are supported. In this case, 1 megapixel is used to generate a 19:13 image at 1216 × 832.
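To see where 1216 × 832 comes from: for a target pixel budget and aspect ratio, latent-friendly dimensions can be computed by snapping to multiples of 64. A quick sketch (this helper is our own, not part of the workflow):

```python
def flux_resolution(aspect_w: int, aspect_h: int,
                    megapixels: float = 1.0, step: int = 64) -> tuple[int, int]:
    """Pick a width/height near the target megapixel budget for a given
    aspect ratio, snapped to multiples of `step`."""
    target = megapixels * 1_000_000
    # uniform scale s so (aspect_w*s) * (aspect_h*s) is about `target` pixels
    s = (target / (aspect_w * aspect_h)) ** 0.5

    def snap(v: float) -> int:
        return max(step, round(v / step) * step)

    return snap(aspect_w * s), snap(aspect_h * s)
```

For a 19:13 ratio at 1 megapixel this yields 1216 × 832, the size used in this article.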

Flux.1 [schnell] Customizing a Basic Workflow

Have you ever used the basic workflow and felt that the quality of the generated illustrations was not high enough? Here we will show how to upscale to improve quality and how to free up memory for low-VRAM GPUs. The workflow is available on Patreon.

Required Custom Nodes

You can find several workflow examples on the web, but DCAI has created a sample with a minimal number of custom nodes to keep things simple. In this case, we use the following custom nodes because we want a process that frees GPU memory as it generates.

  • ComfyUI Layer Style: uses the LayerUtility: Purge VRAM node; LayerColor: ColorAdapter corrects color shifts after upscaling.
  • ntdviet/comfyui-ext: uses the LatentGarbageCollector node.

If you do not know how to install a custom node for ComfyUI, please refer to the following article.

Custom Steps

  • Change VAE Decode: First, change the VAE Decode after SamplerCustomAdvanced in the basic workflow to “VAE Decode (Tiled)” to prevent memory shortages. The default tile size of 512 is fine even if memory is low.
  • Enlarge with Upscale Image (using Model): Enlarge the image with “Upscale Image (using Model)”; the model is loaded with “Load Upscale Model”.
  • Reduce the enlarged image to the desired size: In this case we used a 4× upscale model, so we set “Upscale Image By” to 0.30 to shrink it back.
  • Re-generate the image with i2i based on the resized image: For the second SamplerCustomAdvanced, noise and sampler share the values of the first pass. Connect a new prompt (CONDITIONING) to “BasicGuider”, because we want a prompt tailored to the upscale pass. We use “CLIPTextEncodeFlux” here; since we want to feed the same prompt to clip_l and t5xxl, we externalize it with Convert Widget to Input and supply it from a Primitive node. guidance has no effect in Flux.1 [schnell], so the default is fine. In “BasicScheduler”, set the scheduler to sgm_uniform, steps to 1, and denoise to 0.20.
  • Release generated data from memory: Use “LatentGarbageCollector” and “LayerUtility: Purge VRAM” to release data.
  • Color correction: The enlarged image came out faded, so “LayerColor: ColorAdapter” is used to correct it.

The process above seems to improve quality through the upscale. What do you think?
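As a side note, once a customized workflow like this is exported with “Save (API Format)”, it can be queued against a locally running ComfyUI server over HTTP. A minimal sketch assuming the default server address (error handling omitted):

```python
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # default ComfyUI server address

def build_payload(workflow: dict, client_id: str = "dcai-example") -> dict:
    """Wrap an API-format workflow dict in the body ComfyUI's /prompt expects."""
    return {"prompt": workflow, "client_id": client_id}

def queue_workflow(workflow: dict) -> dict:
    """POST the workflow to the ComfyUI server and return its JSON response
    (it contains the prompt_id, which can be used to poll /history)."""
    data = json.dumps(build_payload(workflow)).encode("utf-8")
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt", data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```
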

Final generated illustration (final result of the workflow)
prompt: A girl knight standing at hill side. Under the blue sky. horizonin view, 50mm lens shot.The big word statues of \"DCAI\" is behind girl.beautiful girl, cute face, looking at viewer, medival, (latest japanese comic style:1.1),ultra detailed

About Flux.1 [schnell] FP8 Checkpoint Edition

The FP8 checkpoint version can be used as a regular checkpoint model and works well on mid-range GPUs.

How to use Flux.1 [schnell] FP8 Checkpoint Edition

Download the model from the link below into ComfyUI/models/checkpoints/, then load the checkpoint model in ComfyUI’s default workflow. You can use KSampler with the recommended settings of steps 1–4 and CFG 1.0. If you get an error, update ComfyUI to the latest version.

We also tried it with the A1111 WebUI (v1.10.1), but it gave an error and did not work. We have not tried them ourselves, but it seems to work with 🔗StableSwarmUI and 🔗Forge.

Conclusion

This article was an introduction to Flux.1 [schnell]. It has only been available for a short time, so we look forward to seeing more of this model in the future. Black Forest Labs, the developer of Flux.1 [schnell], appears to be developing a text-to-video model next, so we would like to keep an eye on that as well. We would like to write another article on Flux.1 [schnell] when more information is available.
