StableDiffusion webUI SDXL Model Usage Basics
In this article, we explain the SDXL model. At the time of writing, the announcement of Stable Diffusion 3 has generated a lot of buzz; SDXL is its immediate predecessor.
Differences from SD 1.5 model
The main differences between Stable Diffusion's SD1.5 and SDXL models are as follows.
SD1.5
- Base resolution: 512×512
- Text Encoder: OpenAI CLIP ViT-L/14
- Parameters: approx. 1 billion (0.98B)
Key Features:
- SD1.5 generates images at a relatively low resolution. While the prompts are intuitive and easy to handle, the detail of the generated images is limited.
- AI illustrations can be generated with less memory than SDXL.
- Many SD1.5 models are available to the public, and if they are used properly, a wide variety of expressions are possible.
- Many users are accustomed to quality modifier prompts such as “masterpiece”.
SDXL
- Base resolution: 1,024×1,024
- Text Encoders: OpenCLIP ViT-bigG/14 & OpenAI CLIP ViT-L/14
- Parameters: approx. 6.6 billion (refiner included)
Key Features:
- The base and refiner together have been significantly expanded to a 6.6B parameter base, enabling processing of more complex data.
- Maximum size output of 1,024 x 1,024 is possible, enabling the generation of precise AI illustrations.
- Two stages of generation, “Base” and “Refine,” are available to generate higher-quality AI illustrations. (The base model can also be used on its own.)
- The accuracy of image generation is high, and high-quality images can be generated without complex prompts.
- It works faster than SD1.5, reducing the time required to generate images without compromising image quality.
- Improved training speed for custom LoRA and checkpoint models
Recommended Specs
At least 8 GB of GPU memory is required to generate AI illustrations using the SDXL model.
The official recommendation is to use “xformers”.
We recommend changing the launch options of the “Automatic1111 Stable Diffusion Web UI” according to your VRAM capacity:
- Nvidia (12 GB+): --xformers
- Nvidia (8 GB): --medvram-sdxl --xformers
- Nvidia (4 GB): --lowvram --xformers
How to apply xformers
1. Open the \stable-diffusion-webui folder in File Explorer.
2. Right-click webui-user.bat and open it in Notepad or your favorite text editor.
3. Find the line that says set COMMANDLINE_ARGS=.
4. Append the command-line arguments described above after set COMMANDLINE_ARGS=. For example, with 8 GB of VRAM: set COMMANDLINE_ARGS=--medvram-sdxl --xformers
5. Save the file and launch the WebUI.
Check the command prompt at startup; if xformers is listed among the applied optimizations, the change has taken effect.
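Putting the steps together, a complete webui-user.bat for an 8 GB GPU might look like the following (the surrounding lines are the defaults shipped with the WebUI):

```bat
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--medvram-sdxl --xformers

call webui.bat
```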
Download SDXL Model
Let’s start with the models published by “stability.ai”:
- SDXL-base-1.0
- SDXL-refiner-1.0
- SDXL-VAE
Also, please refer to the recommended checkpoints for SDXL 1.0.
How to use the SDXL model
As with SD1.5, move the downloaded models into place: put the checkpoint and refiner models in \stable-diffusion-webui\models\Stable-diffusion, and the VAE in \stable-diffusion-webui\models\VAE.
Return to the browser and press the “🔄” button next to the checkpoint selection dropdown in the upper left. When the list refreshes, the checkpoint model you just moved into the folder will appear; select it to load it. Select the VAE in the same way as with SD1.5.
How to use Refiner
The Refiner feature is required to run the second generation stage of SDXL models. Note that WebUI version 1.5.2 and earlier does not support it, so we recommend either updating to version 1.6.0 or later, or applying the refiner model via the img2img function.
Directions for use:
- Activate the “Refiner” option in the parameter area.
- Under “Checkpoint,” select the refiner model.
- The “Switch at” value (0 to 1) specifies when to switch models. For example, at 0.5 the process switches to the refiner model at the midpoint.
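As a rough sketch of what “Switch at” means in terms of sampling steps (the WebUI's exact rounding may differ), the step split can be computed like this:

```python
def refiner_split(total_steps: int, switch_at: float) -> tuple[int, int]:
    """Split the sampling steps between the base and refiner models.

    switch_at is the 'Switch at' fraction (0..1): the base model runs
    the first portion and the refiner finishes the remainder.
    """
    base_steps = round(total_steps * switch_at)
    return base_steps, total_steps - base_steps

# With 30 steps and Switch at = 0.8, the base model runs 24 steps
# and the refiner runs the final 6.
print(refiner_split(30, 0.8))
```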
About Resolution
SDXL models are designed to output around 1024×1024 pixels, so use one of the following resolutions.
- 1:1: 1024 x 1024
- 9:7: 1152 x 896
- 19:13: 1216 x 832
- 7:4: 1344 x 768
- 12:5: 1536 x 640
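These presets all keep the pixel count close to that of 1024×1024 and keep both sides divisible by 64; a quick check confirms the aspect ratios listed above:

```python
from math import gcd

# SDXL preset resolutions from the list above.
resolutions = [(1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)]

for w, h in resolutions:
    g = gcd(w, h)
    assert w % 64 == 0 and h % 64 == 0  # both sides divisible by 64
    print(f"{w}x{h}: ratio {w // g}:{h // g}, {w * h:,} pixels")
```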
Try to generate with SDXL model
Let’s generate an image with the following settings.
Prompt: A cat in armor stands on a hillside,
Medieval fantasy, award winning water color, full body,
Negative prompt: worst quality, ugly, deformed, bad anatomy,
Sampling steps: 30
Sampling method: DPM++ 2M
Schedule type: Karras
CFG Scale: 7
Clip skip: 2
Refiner: on
Model: sd_xl_refiner_1.0.safetensors
Switch at: 0.8
Width: 1344
Height: 768
VAE: sdxl_vae.safetensors
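If you prefer scripting, roughly the same settings can be sent to the WebUI's built-in txt2img API (the WebUI must be launched with the --api argument; note that the “scheduler” field only exists in newer WebUI versions, and older ones fold it into the sampler name):

```python
import json

# Settings from the walkthrough above, as an Automatic1111 txt2img payload.
payload = {
    "prompt": ("A cat in armor stands on a hillside, "
               "Medieval fantasy, award winning water color, full body"),
    "negative_prompt": "worst quality, ugly, deformed, bad anatomy",
    "steps": 30,
    "sampler_name": "DPM++ 2M",
    "scheduler": "Karras",  # newer WebUI versions; older ones use "DPM++ 2M Karras"
    "cfg_scale": 7,
    "width": 1344,
    "height": 768,
}
print(json.dumps(payload, indent=2))

# To actually generate (requires the 'requests' package and a running WebUI):
# import requests
# r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
# images = r.json()["images"]  # base64-encoded PNGs
```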
What do you think? Reusing the same Seed value makes it possible to create similar illustrations, though the result may differ depending on the WebUI version. The style is specified by the “water color” part of “award winning water color,” so try swapping in various styles:
anime artwork,
concept art,
cinematic film still,
comic,
line art drawing,
photographic,
art deco style,
art nouveau style,
cubist artwork,
hyperrealistic art,
pop Art style,
surrealist art,
vaporwave style,
Text Generation
The success rate is still low, and it is not practical without the assistance of a LoRA model, but it can be used to display text on signs and the like.
Prompt: A cat in armor stands on a hillside,
(holding a sign that says “ (CAT) ”:1.8),
Medieval fantasy, fantasy-core, Award-Winning, anime artwork, dramatic, key visual, vibrant, studio anime, highly detailed, full body,
Negative prompt: worst quality, ugly, deformed, bad anatomy,
Sampling steps: 30
Sampling method: DPM++ 2M
Schedule type: Karras
CFG Scale: 7
Clip skip: 2
Width: 1344
Height: 768
VAE: sdxl_vae.safetensors
Refiner: on
Model: sd_xl_refiner_1.0.safetensors
Switch at: 0.7
Seed: 3632954274
ADetailer: on
Hires. Fix: on
Upscaler: R-ESRGAN 4x+
Hires steps: 10
Denoising strength: 0.5
Upscale by: 1.5
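For reference, the final output size with Hires. fix is the base resolution multiplied by the upscale factor (a simple sketch; the WebUI rounds to whole pixels):

```python
def hires_size(width: int, height: int, upscale_by: float) -> tuple[int, int]:
    """Final resolution after Hires. fix upscaling."""
    return round(width * upscale_by), round(height * upscale_by)

# 1344x768 upscaled by 1.5 gives a 2016x1152 final image.
print(hires_size(1344, 768, 1.5))
```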
Conclusion
In this article, we have explained the basic operation of the SDXL model; without the Refiner, it can be operated in much the same way as SD1.5. If you have a PC with appropriate specifications, please give it a try. Quality modifiers such as “masterpiece” are generally unnecessary with SDXL, but since they can still affect the output, you may want to experiment with them. With a better understanding of prompts, we are sure you will find its natural-language prompting even easier to use.