
How to use LoRA with ComfyUI Flux.1 [schnell]

⏱️14min read
📅 Sep 08, 2024
🔄 Sep 08, 2024

It has been about a month since Flux.1 launched, and excitement keeps growing with the arrival of ControlNet support. In this article, I will explain how to use LoRA in your Flux.1 [schnell] workflow to take your image quality to the next level.


How to use LoRA with Flux.1 [schnell]

To use LoRA in Flux.1 [schnell], implement it in your workflow using the “LoraLoaderModelOnly” node.

To make LoRA models available to “LoraLoaderModelOnly”, place them in \ComfyUI\models\loras inside the ComfyUI directory. If you share models with the A1111 WebUI, they are loaded from \stable-diffusion-webui\models\Lora instead.
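
Model sharing with the A1111 WebUI is configured through ComfyUI's extra_model_paths.yaml (copy extra_model_paths.yaml.example in the ComfyUI root folder and rename it). Below is a minimal sketch; the base_path is an assumption, so point it at your own WebUI install:

# extra_model_paths.yaml - tell ComfyUI where the A1111 WebUI models live.
# base_path is an assumption; adjust it to your own WebUI folder.
a111:
    base_path: C:\stable-diffusion-webui\

    checkpoints: models/Stable-diffusion
    vae: models/VAE
    loras: |
        models/Lora
        models/LyCORIS
    upscale_models: |
        models/ESRGAN
        models/RealESRGAN
        models/SwinIR
    embeddings: embeddings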

If you are new to Flux.1 [schnell], please refer to the following article for a detailed explanation.

How to implement LoraLoaderModelOnly in ComfyUI

“LoraLoaderModelOnly” simply connects to the MODEL output of “Load Diffusion Model”. CLIP is not routed through the LoRA in this workflow, so the standard “Load LoRA” node is not used. If you don’t have the node, update ComfyUI to the latest version.
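
In ComfyUI's API (prompt) JSON format, the wiring looks roughly like the fragment below, written here as a Python dict. This is a minimal sketch: the node ids are placeholders, and the LoRA settings are the ones used later in this article.

# Minimal sketch of the MODEL-only LoRA chain in ComfyUI's API prompt format.
# Node ids ("1", "2") are placeholders chosen for illustration.
lora_chain = {
    "1": {  # Load Diffusion Model
        "class_type": "UNETLoader",
        "inputs": {"unet_name": "flux-schnell.safetensors",
                   "weight_dtype": "default"},
    },
    "2": {  # LoraLoaderModelOnly: patches only the MODEL, not CLIP
        "class_type": "LoraLoaderModelOnly",
        "inputs": {"model": ["1", 0],  # MODEL output of node "1"
                   "lora_name": "aki_anime.safetensors",
                   "strength_model": 0.85},
    },
    # Downstream nodes (ModelSamplingFlux, BasicGuider, ...) connect to
    # ["2", 0] instead of ["1", 0]; CLIP comes straight from DualCLIPLoader.
}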

Introducing the LoRA used in this article

The LoRA we will use this time is AkiLora’s “Aki Anime.” When searching for LoRA on Civitai, note that [dev] and [schnell] have different models. To check, open the LoRA’s page and look at the Base Model section: Flux.1 [schnell] is marked “Flux.1 S” and Flux.1 [dev] is marked “Flux.1 D”.

  • Creator: 🔗AkiLora
  • File Size: 164 MB
  • Uploaded: 2024/9/4
  • File Name: aki_anime.safetensors

AkiLora’s “Aki Anime” is a LoRA that applies an anime style to Flux.1 [schnell]. Please note that the license is for non-commercial use only.

With the standard Flux.1 [schnell], anime styles are not stable and dated-looking styles sometimes appear. This LoRA gives the anime style a clear direction, so the output is more consistent. However, as with all LoRAs, the stronger that direction becomes, the less variety and flexibility you get.

Using the Flux.1 [schnell] workflow with LoRA

Here we introduce a text2image + upscaler workflow built on Flux.1 [schnell]. ComfyUI’s memory management has recently improved, so first update ComfyUI to the latest version. The workflow’s default settings generate high-quality images in a small number of steps but require a large amount of VRAM. If you don’t have enough VRAM, generation will take a long time, so set the “Load Diffusion Model” weight_dtype to fp8_e4m3fn and the “Empty Latent Image” resolution to 768 x 512.

The workflow is available on Patreon, but only paid members can view and download it. Becoming a paid member, even for just one month, encourages us to write more, so please consider joining.

Even if you cannot download the workflow, you can build it yourself by following the explanation below, so please read on.

Required Custom Node Installation

The following three custom nodes must be installed to execute this workflow.

  • Image chooser: Used to review the results of the 1st pass; it lets you quickly rerun the short, low-resolution pass until you get a composition you are happy with.
  • Crystools: Used to check the seed of the low-resolution pass (ComfyUI moves on to the next seed after each run) and to inspect metadata.
  • ComfyUI Layer Style: Provides the LayerColor: ColorAdapter node used for post-process color correction.

When you open the downloaded JSON file in ComfyUI with ComfyUI-Manager installed, you will be warned about missing custom nodes. You can install them from “Install Missing Custom Nodes” in the Manager menu.

Please refer to the following article for a detailed explanation of how to install custom nodes.

Basic Info

The Basic Info group brings together the basic input nodes of Text2image.

  • Load Diffusion Model: Select flux-schnell.safetensors for unet_name. Leaving weight_dtype at the default (fp16) gives high quality at a heavy computational cost; fp8_e4m3fn / fp8_e5m2 are lighter, with high-to-medium quality.
  • DualCLIPLoader: Loads the text encoder models; choose clip_l and t5xxl_fp16 for the clip_name slots. (Use t5xxl_fp8_e4m3fn instead of t5xxl_fp16 if you don’t have enough system memory.)
  • Load VAE: Select ae.safetensors for vae_name.
  • LoraLoaderModelOnly: Loads the LoRA. Set lora_name to aki_anime.safetensors and strength_model to 0.85 so that a little of the base model still shows through.
  • ModelSamplingFlux: Controls the timestep-scheduling shift. With Flux.1 [schnell], set max_shift in the range 0.0 to 2.0. base_shift is not reflected, so use 0 or the default 0.5. In some cases, bypassing this node gives better results.
  • Empty Latent Image: This time, set the size to 0.9 MP at a 16:9 ratio, 1264 x 712. Set batch_size to 3, or to 1 if the 1st pass is too heavy.
  • RandomNoise: To reproduce the results in this article, set noise_seed to 526038417924851 and control_after_generate to fixed; the image shown is the third one in the batch.
  • CLIP Text Encode (Prompt): Since T5XXL is good at natural language, you can write the prompt in plain sentences; CLIP L is also used, so Danbooru-style tags work as well. This article uses the following prompt.
    A beautiful girl knight is standing at hill side, Under the blue sky. and she is looking at viewer.
    
    the color theme is the teal and orange.
    
    In The background big word of ("DCAI":1.3) made by (detailed ruined stone:1.15) with moss and plants.
    (The old castle is on top of a hill:0.85) .
    
    horizonin view, 50mm lens portrait, (anime face with detailed eyes:1.3), medival fantasy, water fall, authentic (no credits, no signature.:1.1)
  • KSamplerSelect: Select a sampler. This time, select the officially recommended euler.
  • BasicScheduler: Selects the scheduler type; choose either beta or simple. For this illustration, steps is set to 2, because too many steps result in a lack of detail. (A sketch of this group in API format follows the list.)
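
Put together, the Basic Info group corresponds roughly to the API-format fragment below, continuing the numbering of the earlier sketch ("1" = UNETLoader, "2" = LoraLoaderModelOnly). The ids are placeholders, and the max_shift value is just one assumption within the 0.0 to 2.0 range suggested above:

# Sketch of the Basic Info group in ComfyUI's API prompt format.
basic_info = {
    "3": {"class_type": "DualCLIPLoader",
          "inputs": {"clip_name1": "clip_l.safetensors",
                     "clip_name2": "t5xxl_fp16.safetensors",
                     "type": "flux"}},
    "4": {"class_type": "VAELoader",
          "inputs": {"vae_name": "ae.safetensors"}},
    "5": {"class_type": "ModelSamplingFlux",      # bypass if results look better
          "inputs": {"model": ["2", 0], "max_shift": 0.5,  # assumed value
                     "base_shift": 0.5, "width": 1264, "height": 712}},
    "6": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1264, "height": 712, "batch_size": 3}},
    "7": {"class_type": "RandomNoise",
          "inputs": {"noise_seed": 526038417924851}},
    "8": {"class_type": "CLIPTextEncode",         # the prompt shown above
          "inputs": {"clip": ["3", 0],
                     "text": "A beautiful girl knight is standing at hill side, ..."}},
    "9": {"class_type": "KSamplerSelect",
          "inputs": {"sampler_name": "euler"}},
    "10": {"class_type": "BasicScheduler",
           "inputs": {"model": ["5", 0], "scheduler": "beta",
                      "steps": 2, "denoise": 1.0}},
}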

1st Pass

The 1st Pass group generates the initial low-resolution images.

  • BasicGuider: Combines the model and conditioning into a guider.
  • SamplerCustomAdvanced: Sampler node for next-generation models; its parameters are externalized as separate input nodes for finer control.
  • VAE Decode: Decodes the latent images into pixel images. (The group is sketched in API format below.)
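
The 1st Pass wiring, plus queueing the whole graph over ComfyUI's HTTP API, looks roughly like this sketch. It assumes a ComfyUI server running locally on the default port 8188 and reuses the placeholder ids from the earlier fragments:

import json
import urllib.request

# Sketch of the 1st Pass group (ids continue the earlier fragments).
first_pass = {
    "11": {"class_type": "BasicGuider",            # model + conditioning -> guider
           "inputs": {"model": ["5", 0], "conditioning": ["8", 0]}},
    "12": {"class_type": "SamplerCustomAdvanced",  # externalized sampler inputs
           "inputs": {"noise": ["7", 0], "guider": ["11", 0],
                      "sampler": ["9", 0], "sigmas": ["10", 0],
                      "latent_image": ["6", 0]}},
    "13": {"class_type": "VAEDecode",
           "inputs": {"samples": ["12", 0], "vae": ["4", 0]}},
    "14": {"class_type": "PreviewImage",           # a graph needs an output node
           "inputs": {"images": ["13", 0]}},
}

# Queue the combined graph on a locally running ComfyUI instance.
prompt = {**lora_chain, **basic_info, **first_pass}
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": prompt}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode("utf-8"))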

Preview

The Preview group is where you review the generated illustrations. This workflow does not use the “Save Image” node; to save a result, right-click the preview image or change a “Preview Image” node to “Save Image”.

  • Preview Chooser: Displays the results of the 1st Pass. If you are satisfied, select an image and click the “Progress selected image” button to proceed.
  • Latent From Batch: Use this when you want to re-generate only one image from a batch. It is bypassed by default, but since ComfyUI uses a single seed for the whole batch, an individual image cannot be reproduced from the seed alone; use this node to pick it out (see the sketch after this list).
  • 🪛 Preview from image: Obtains the metadata of the image selected in the Preview Chooser; it is included so you can check the seed that was actually used, since ComfyUI switches to the next seed when a run is executed.
  • 🪛 Show any value to console/display: Used to view the metadata extracted by Preview from image.
  • Preview Image – finish 2nd Pass: Previews the illustration generated by the 2nd Pass.
  • Preview Image – 4K/8K image: Previews the output of the Image Sharpener.
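
As a sketch of the Latent From Batch idea: with the batch of three above, picking out the third image looks like this (batch_index is 0-based; the ids are placeholders):

# Sketch: pick one latent out of the batch so only that image is re-generated.
# batch_index is 0-based, so 2 selects the third image of a batch of three.
pick_one = {
    "15": {"class_type": "LatentFromBatch",
           "inputs": {"samples": ["12", 0],  # batched latents from the 1st pass
                      "batch_index": 2,
                      "length": 1}},
}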

Upscale

The Upscale group enlarges the image in preparation for the 2nd Pass.

  • Load Upscale Model: Loads the upscaler model; in this case, 4x-UltraSharp.
  • Upscale Image (using Model): Uses the upscaler model to enlarge the image by a factor of 4. Bypass this node if it is too heavy.
  • ImageScaleToTotalPixels: Scales the 4x-enlarged image back down to 2 megapixels. Select lanczos for upscale_method. (The group is sketched below.)
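
A sketch of the Upscale group in the same API format (the ids and the .pth file name are assumptions):

# Sketch of the Upscale group (ids and the .pth file name are placeholders).
upscale = {
    "16": {"class_type": "UpscaleModelLoader",
           "inputs": {"model_name": "4x-UltraSharp.pth"}},
    "17": {"class_type": "ImageUpscaleWithModel",   # 4x; bypass if too heavy
           "inputs": {"upscale_model": ["16", 0], "image": ["13", 0]}},
    "18": {"class_type": "ImageScaleToTotalPixels", # back down to 2 MP
           "inputs": {"image": ["17", 0], "upscale_method": "lanczos",
                      "megapixels": 2.0}},
}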

2nd Pass

The 2nd Pass group uses img2img to clean up the image based on the image enlarged by Upscale.

  • CLIP Text Encode (Prompt): Use a simple prompt for the 2nd Pass. In some cases, reusing the 1st Pass prompt gives better results.
  • ModelSamplingFlux: In the 2nd Pass, max_shift is set to 0.15 because we do not want to change the composition significantly. width / height are left at their defaults, since they have little effect when max_shift is low.
  • BasicScheduler: denoise is set to 0.35: we don’t want to change the composition much, but if the value is too low the picture won’t change at all. steps is set to 3 to increase the amount of detail drawn, and scheduler is set to beta.
  • VAE Encode: Converts the pixel image enlarged by Upscale back into a latent image.
  • BasicGuider: Combines the model and conditioning into a guider.
  • SamplerCustomAdvanced: Sampler node for the second sampling pass. (The group is sketched below.)
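
A sketch of the 2nd Pass group, continuing the placeholder ids; the short 2nd-pass prompt text here is an assumption:

# Sketch of the 2nd Pass (img2img) group; ids continue the earlier fragments.
second_pass = {
    "19": {"class_type": "VAEEncode",              # pixels back to latent
           "inputs": {"pixels": ["18", 0], "vae": ["4", 0]}},
    "20": {"class_type": "CLIPTextEncode",         # simple 2nd-pass prompt (assumed)
           "inputs": {"clip": ["3", 0], "text": "anime illustration, detailed"}},
    "21": {"class_type": "ModelSamplingFlux",      # low shift keeps the composition
           "inputs": {"model": ["2", 0], "max_shift": 0.15,
                      "base_shift": 0.5, "width": 1024, "height": 1024}},
    "22": {"class_type": "BasicScheduler",
           "inputs": {"model": ["21", 0], "scheduler": "beta",
                      "steps": 3, "denoise": 0.35}},
    "23": {"class_type": "BasicGuider",
           "inputs": {"model": ["21", 0], "conditioning": ["20", 0]}},
    "24": {"class_type": "SamplerCustomAdvanced",
           "inputs": {"noise": ["7", 0], "guider": ["23", 0],
                      "sampler": ["9", 0], "sigmas": ["22", 0],
                      "latent_image": ["19", 0]}},
}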

Post Process

The Post Process group makes the final fine-tuning of the generation.

  • VAE Decode (Tiled): Decodes the latent image generated by the 2nd Pass into a pixel image using tiled decoding. The normal “VAE Decode” is also fine, since ComfyUI automatically switches to tiled decoding when memory runs short.
  • LayerColor: ColorAdapter: The color tones may shift between the 1st and 2nd pass, so this node is used to correct them.

Image Sharpener

The Image Sharpener group enlarges the image to 8K and then reduces it to 4K to sharpen it. If the operation is too slow, bypass the entire group.

  • Load Upscale Model: Again, we use 4x-UltraSharp.
  • Upscale Image (using Model): Enlarges the image to 8K.
  • ImageScaleToTotalPixels: Reduces the 8K image to 4K. upscale_method changes the clarity of the result, so experiment with different methods.

Once the settings have been made up to this point, click the “Queue Prompt” button to start generation.

Once the 1st Pass finishes, the results appear in the “Preview Chooser”. Generate as many times as needed until you are satisfied; when a good image appears, send it on to the upscaling stages. After a while, the final result will be produced.

Final result of workflow

Conclusion

In this article, we explained how to use LoRA with Flux.1 [schnell] models in ComfyUI. LoRA support for Flux.1 [schnell] is still not as common as for Flux.1 [dev]. We expect the Flux.1 [schnell] model to gain momentum, since it offers better quality than SDXL and can be used commercially. We will continue to follow Flux.1 in the future.
