Basic usage of Stable Diffusion Web UI Image-to-image section
- Description of img2img interface
- About img2img mode
- How to use Canvas
- About Resize mode of img2img
- About Denoising strength
- About img2img’s Soft inpainting
- About color correction of img2img
- About the transparent part of the input image
- About the canvas size in Sketch mode
- Only masked and target size in inpainting mode
- Example of img2img mode usage
- Example of Sketch mode usage
- Example of Inpaint mode usage
- Example of Inpaint Sketch mode usage
- Example of Inpaint upload mode usage
- Example of Batch mode usage
- Conclusion
In this article, I would like to explain the basic usage of the A1111 Stable Diffusion web UI’s “Image to image (img2img)” feature. img2img generates a new illustration from an input image and your prompts. You can use it to turn a rough sketch into a clean drawing, or feed a generated illustration back in to fix hands and faces or remove unwanted elements.
Description of img2img interface
Explanation of each area
About Checkpoint / Prompt Area
About Generate button Area
About Preview Area
About page switching area
About image input area
About the Generation Parameter Area
Mask mode:
- Inpaint masked: Inpaints the area within the mask selection.
- Inpaint not masked: Inpaints the area outside of the mask selection.
Masked content:
- fill: Fills the masked area with the blurred, averaged colors of the input image.
- original: The input image is used as is.
- latent noise: Fills the masked area with seed-based noise. It can be used to mask an object and rewrite the background. A Denoising strength of `1` is recommended.
- latent nothing: Fills the masked area with the average color of the masked area. It can be used to mask an object and rewrite the background. A Denoising strength of `0.8` or higher is recommended.
Inpaint area:
- Whole picture: Applies inpainting to the entire image, including outside the mask area.
- Only masked: Applies inpainting only within the mask area.
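As a side note, if you drive the WebUI through its optional API (launch it with the --api flag), the options above are passed as fields of the /sdapi/v1/img2img payload. The mapping below is my own sketch of how the UI choices correspond to those fields in recent WebUI versions; treat the field names as assumptions and check the API docs for your version.

```python
# Rough mapping of the inpaint options above to A1111 WebUI API payload fields.
# Assumption: field names as exposed by /sdapi/v1/img2img in recent versions.
MASKED_CONTENT = {"fill": 0, "original": 1, "latent noise": 2, "latent nothing": 3}
MASK_MODE = {"Inpaint masked": 0, "Inpaint not masked": 1}

payload_fragment = {
    "inpainting_fill": MASKED_CONTENT["original"],          # Masked content
    "inpainting_mask_invert": MASK_MODE["Inpaint masked"],  # Mask mode
    "inpaint_full_res": True,  # Inpaint area: True = Only masked, False = Whole picture
}
```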
About img2img mode
The A1111 Stable Diffusion web UI’s Image to image has six modes. Each mode is explained in detail with usage examples later in this article, so here is a brief overview.
- img2img: Default mode. Generates illustrations with a composition similar to the input image.
- Sketch: Upload an image with a single background color, draw a sketch on the canvas, and generate an illustration from it.
- Inpaint: Specify a mask on the input illustration, and a new illustration is generated based on it.
- Inpaint sketch: Similar to Inpaint mode, but while Inpaint only lets you specify a mask, in Inpaint sketch the sketch acts as the mask and its colors also guide the generation.
- Inpaint upload: Upload an input image and a mask image; the illustration is generated based on both. Use this mode when you want to inpaint with a mask created in software such as Photoshop.
- Batch: Processes multiple input images through img2img at once.
How to use Canvas
Canvas is a tool for drawing sketches and specifying masks on the input image. Use the mouse to draw with a brush.
- Alt + Mouse wheel: Zooms in and out on the canvas.
- Ctrl + Mouse wheel: Adjusts the brush size.
- R key: Resets the canvas zoom.
- S key: Puts the canvas in full-screen mode; press the R or S key again to return to normal mode.
- F key: Hold down the F key and drag the mouse to move the canvas.
- Undo button: Undoes the last brush stroke.
- Clear button: Deletes all brush strokes on the canvas.
- Remove button: Deletes the input image.
- Use brush button: Sets the brush size.
- Select brush color button: Sets the color of the brush. (Sketch and Inpaint sketch only)
About Resize mode of img2img
Resize mode determines how the input image is scaled when the output size differs from the input size.
- Just Resize: The aspect ratio is not maintained, so the image stretches or shrinks if the aspect ratio of the target size is different.
- Crop & Resize: The aspect ratio is maintained and any overhanging areas are cropped off.
- Resize & Fill: The image is scaled to fit the target size while maintaining the aspect ratio. The excess area is filled with the colors at the edges of the input image.
- Just Resize (Latent Upscale): Same as Just Resize, but uses Latent Upscale for scaling.
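Likewise, when using the optional API, the Resize mode is a single integer field. The values below are assumed to follow the order of the UI options in recent WebUI versions.

```python
# Resize mode values for the A1111 WebUI API's "resize_mode" field
# (assumed to follow the order of the UI options above).
RESIZE_MODE = {
    "Just resize": 0,
    "Crop and resize": 1,
    "Resize and fill": 2,
    "Just resize (latent upscale)": 3,
}
```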
About Denoising strength
Denoising strength determines how closely the generated image follows the input image; it is the same parameter as the Denoising strength in txt2img’s Hires.fix. The sample above was generated in img2img mode by changing the prompt from `black chair` to `white chair`.
- 0.0: No change
- 0.35: Slight change
- 0.75: Quite a change
- 1.0: Most changes
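If you want to reproduce this kind of comparison without clicking through the UI, the optional WebUI API can do it in a loop. The sketch below is a minimal example that assumes the WebUI is running locally with the --api flag and that chair.png is a hypothetical input image; the prompt mirrors the black chair → white chair change described above.

```python
# Sweep Denoising strength over one input image via the A1111 WebUI API.
# Minimal sketch: assumes the WebUI is running locally with the --api flag
# and that "chair.png" exists next to this script.
import base64
import requests

URL = "http://127.0.0.1:7860/sdapi/v1/img2img"

with open("chair.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode()

for strength in (0.0, 0.35, 0.75, 1.0):
    payload = {
        "init_images": [init_image],
        "prompt": "white chair",
        "negative_prompt": "worst quality, low quality",
        "denoising_strength": strength,
        "seed": 1,  # fix the seed so only the strength changes between runs
    }
    result = requests.post(URL, json=payload).json()
    with open(f"chair_denoise_{strength}.png", "wb") as out:
        out.write(base64.b64decode(result["images"][0]))
```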
About img2img’s Soft inpainting
Soft inpainting blends the border between the masked area and the background so that it looks natural when using Inpaint mode. Setting Mask blur to a high value is recommended, e.g. Mask blur: 20.
About Soft inpainting parameters
- Schedule bias: Controls at which stage of the sampling process the input image is preserved. The default is 1. Setting it higher than 1 preserves the input image from the early sampling steps, but if the value is too high, less new content is added. If it is too low, the input image is only preserved in the later sampling steps, so the border between the input image and the inpainted area becomes more noticeable.
- Preservation strength: This is the intensity of the maintenance of the input image. The higher the value, the more the input image is maintained.
- Transition contrast boost: Adjusts the difference between the input image and the inpainting. The default is 4. The higher the number, the sharper the border. Conversely, the lower the number, the smaller the inpainted object will be, but the smoother the border will be.
- Mask influence: This is the degree of influence of the mask. The higher the number, the greater the influence of the mask. 0 means that the mask is ignored. *When tested with v1.10.1, changing this value did not change the result.
- Difference threshold: This is the threshold for the difference from the input image. The default value is 0.5. The larger the value, the more transparent the inpainted area becomes.
- Difference contrast: Adjusts the difference between the input image and the inpainted area. Default is 2. The smaller the value, the more transparent it becomes.
Examples of Soft inpainting usage
In the sample, soft inpainting is applied with default values except for CFG and Size; without soft inpainting, the mask borders are noticeable, but with soft inpainting, the borders are barely discernible.
About color correction of img2img
The colors of the generated illustration may shift when using img2img; in that case, use the color correction feature of the A1111 WebUI.
- Open the settings page on the Settings tab.
- Select img2img in Stable Diffusion from the list on the left.
- Check the box marked “Apply color correction to img2img results to match original colors.”
- Press the Apply settings button to apply the settings.
This completes the settings, but if you switch this option frequently, you can add it to the Quicksettings list as follows.
- Select User Interface under User Interface from the list on the left.
- Enter `img2img_color_correction` in the Quicksettings list field and select it from the suggestions.
- Press the Apply settings button to apply the settings.
- Click the Reload UI button to restart the UI. If a check box appears at the top of the UI after the restart, the setup is complete.
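For reference, the same setting can also be toggled programmatically through the WebUI’s optional API; the sketch below assumes the WebUI was launched with the --api flag on its default local address and uses the same img2img_color_correction key shown above.

```python
# Toggle "Apply color correction to img2img results" via the A1111 WebUI API.
# Minimal sketch: assumes the WebUI is running locally with the --api flag.
import requests

BASE = "http://127.0.0.1:7860"

# Enable color correction (set to False to turn it back off).
requests.post(f"{BASE}/sdapi/v1/options", json={"img2img_color_correction": True})

# Confirm the current value.
print(requests.get(f"{BASE}/sdapi/v1/options").json()["img2img_color_correction"])
```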
About the transparent part of the input image
If the input image is a PNG or another format with transparent areas, those areas are filled with white by default. This can be changed to a different color in the settings.
- Open the settings page on the Settings tab.
- Select img2img in Stable Diffusion from the list on the left.
- Change the color under “With img2img, fill transparent parts of the input image with this color.”
- Press the Apply settings button to apply the settings.
About the canvas size in Sketch mode
Note that the canvas in Sketch mode is affected by the display’s custom scale setting. If the custom scale is set to anything other than 100% in Windows, the canvas size changes accordingly.
For instance, if the custom scale is set to 150% and a 512 x 512 pixel image is loaded, the canvas becomes 1.5 times larger, i.e. 768 x 768.
If you use “Resize to” to set the desired resolution explicitly, there is no problem, but if you use “Resize by” with Scale 1, the custom scale results in a 768 x 768 output. In that case, change the display’s custom scale setting to 100%.
Only masked and target size in inpainting mode
As DarkStooM explains in his 🔗Gist, when using Only masked mode, the masked area is enlarged to the target size with the Upscaler model before inpainting is applied, which is why Only masked mode produces more detail. The comparison image above was generated at 512 x 512, the inpainting mask was placed only over the eyes, and the image was generated at 1:1 scale.
The Upscaler model is not selectable by default, but you can make it selectable as follows.
- Open the settings page on the Settings tab.
- Select User Interface from the list on the left under User Interface.
- Enter `upscaler_for_img2img` in the Quicksettings list field and select it from the suggestions.
- Press the Apply settings button to apply the settings.
- Click the Reload UI button to restart the UI. If Upscaler for img2img appears at the top of the UI after the restart, the setup is complete.
Example of img2img mode usage
The img2img mode generates images based on the input image. Depending on the Denoising strength value, the composition of the generated image stays close to that of the original image.
Let’s input a photo image to generate an illustration.
The input image is a 🔗photo of a London back street borrowed from Unsplash. Drag the photo onto the canvas, or click the canvas and select the file, to load it.
The SD1.5 model used in this article (darkSushiMixMix_225D) favors Danbooru-style tags, so let’s generate a prompt from the image by clicking the prompt generation button (DeepBooru) in the Generate button area.
Once the analysis is complete, insert the following at the beginning of the prompt to push the result toward an illustration.
(anime style:1.5),
Then paste the following into the negative prompt; weighting `realistic` negatively pushes the result toward an illustrative tone.
(realistic:1.4),worst quality, low quality, normal quality, lowres
Change the following parameters.
- Set Sampling steps to `30` because we want to make sure the drawing is well defined.
- Since the input image is 1920 x 1281 pixels, we set the Resize mode to `Crop and Resize` and the Resize to size to `768 (W) x 512 (H)` to accommodate the SD 1.5 model we are using.
- Change the CFG Scale to `3` because we want to keep the style of the model.
- We want to maintain the composition of the input image, so we set Denoising strength to `0.5`.
Finally, generate it with the “Generate” button and you are done.
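If you prefer to script this walkthrough, the same generation can be requested through the WebUI’s optional API. The sketch below is a minimal example assuming the WebUI runs locally with the --api flag; london.png is a hypothetical filename for the downloaded Unsplash photo, and the tags after the style token are only a stand-in for whatever DeepBooru produced for your image.

```python
# img2img via the A1111 WebUI API, mirroring the settings used above.
# Minimal sketch: assumes the WebUI is running locally with the --api flag
# and that "london.png" is the input photo.
import base64
import requests

URL = "http://127.0.0.1:7860/sdapi/v1/img2img"

with open("london.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode()

payload = {
    "init_images": [init_image],
    # Prepend the style tag to whatever DeepBooru produced for your photo.
    "prompt": "(anime style:1.5), street, building, outdoors, scenery",
    "negative_prompt": "(realistic:1.4),worst quality, low quality, normal quality, lowres",
    "steps": 30,
    "cfg_scale": 3,
    "denoising_strength": 0.5,
    "resize_mode": 1,  # 1 = Crop and resize
    "width": 768,
    "height": 512,
}

result = requests.post(URL, json=payload).json()
with open("img2img_result.png", "wb") as out:
    out.write(base64.b64decode(result["images"][0]))
```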
Example of Sketch mode usage
Sketch mode generates an illustration based on a sketch drawn on the canvas. In this case, we would like to generate a “giant tree standing in a meadow”. Be careful with the display scale as described in “About the canvas size in Sketch mode”.
First, switch to sketch mode and create a solid color image of the desired size (768 x 512 in this case) in Photoshop or other software and load it. In this case, we will upload a light blue image as the sky color because it will be easier to use the background color as the single color later on.
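If you do not have Photoshop at hand, the same solid-color base image can be created with a few lines of Python using Pillow; the light-blue RGB value below is just an assumption, so pick whatever sky color you prefer.

```python
# Create a 768 x 512 solid light-blue PNG to use as the Sketch mode base image.
# Requires Pillow: pip install pillow
from PIL import Image

Image.new("RGB", (768, 512), (173, 216, 230)).save("sketch_base.png")
```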
On the canvas, roughly sketch a meadow, a tree, and mountains in the background, switching brush colors as you go.
Once the sketch is done, paste the following into the prompt.
big tree, plane field, mountains, blue sky, masterpiece, ultra detailed
Then paste the following prompt into the negative prompt.
worst quality, low quality, normal quality, lowres
Switch the parameter from Resize to to `Resize by` and set Scale to `1`.
Use the remaining settings at their defaults. If you want the composition to follow the sketch more closely, lower the Denoising strength; if you lower it too much, the result will still look like a sketch, so find a good balance.
Finally, generate it with the “Generate” button and you are done.
Example of Inpaint mode usage
Inpaint mode is the most used mode in img2img and allows you to modify illustrations generated by txt2img.
First, use txt2img to generate illustrations with the following settings.
Prompt: 1girl, upper body, waving, smile, looking at viewer, medival, village, masterpiece, ultra detailed
Negative Prompt: worst quality, low quality, normal quality, lowres
Steps: 20
Sampler: DPM++ 2M
Schedule type: Karras
CFG scale: 7
Seed: 3546912850
Size: 768x512
Model: darkSushiMixMix_225D
VAE: vae-ft-mse-840000-ema-pruned.safetensors
Clip skip: 2
Once generation is finished, press the “send to img2img inpaint” button (red frame) in the preview area to switch to Inpaint mode.
In this case, the fingers are incorrectly placed, so we will correct them. You can fix both hands at once, but in most cases, you will not be satisfied with the result after one generation, so we recommend that you fix one hand at a time.
Now we will correct the left hand. The little finger is out of place, so mask both the little finger itself and the position where it should end up after the correction.
The prompt is taken over from txt2img, so insert the following prompt at the beginning.
five fingers,
Switch the parameter from Resize to to `Resize by` and set Scale to `1`. The other parameters are left at their defaults.
Press the “Generate” button and continue generating until you are satisfied with the results. If you do not get a good result, try changing the CFG value.
Once you are satisfied with the result, move on to the opposite hand. First load the corrected image onto the canvas with the “send to img2img inpaint” button, just as you did from txt2img. The mask from earlier is still there, so erase it with the Clear button.
On the right hand, the tips of the ring and pinky fingers are not correct, and there is a finger-like object next to the thumb.
In this case, you could modify the ring and pinky fingers separately, but let’s modify them together. Mask the tip of the pinky finger and the whole ring finger of the right hand on the canvas.
The prompts and parameters are the same as before.
If you are satisfied with the result, send it to canvas again with the “Send to img2img inpaint” button.
Finally, erase the finger-like object next to the thumb.
If you are removing an element with inpainting, switch Masked content to `fill`. Also, a higher CFG Scale increases the probability that the original object is painted over flat; conversely, a lower CFG Scale increases the chance of another, similarly shaped object appearing instead.
That completes the hand correction. As you can see, there is an element of luck involved; if you are not getting good results, you can also try a LoRA or a Negative Embedding.
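For reference, an equivalent inpainting call can be made through the WebUI’s optional API if you export the mask as its own image (for example, a white-on-black PNG painted in any editor). The sketch below is a minimal example under that assumption; hand.png and hand_mask.png are hypothetical filenames, and the payload mirrors the default settings used in this walkthrough.

```python
# Inpainting via the A1111 WebUI API: regenerate only the masked fingers.
# Minimal sketch: assumes the WebUI is running locally with the --api flag,
# "hand.png" is the txt2img result and "hand_mask.png" is a white-on-black mask.
import base64
import requests

URL = "http://127.0.0.1:7860/sdapi/v1/img2img"

def b64(path):
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "init_images": [b64("hand.png")],
    "mask": b64("hand_mask.png"),
    "prompt": "five fingers, 1girl, upper body, waving, smile, looking at viewer, medival, village, masterpiece, ultra detailed",
    "negative_prompt": "worst quality, low quality, normal quality, lowres",
    "denoising_strength": 0.75,    # img2img default, as in the walkthrough
    "mask_blur": 4,                # default Mask blur
    "inpainting_fill": 1,          # 1 = original (Masked content)
    "inpaint_full_res": False,     # False = Whole picture (Inpaint area)
    "inpainting_mask_invert": 0,   # 0 = Inpaint masked (Mask mode)
    "resize_mode": 0,
    "width": 768,
    "height": 512,
    "seed": -1,                    # random; regenerate until you like the result
}

result = requests.post(URL, json=payload).json()
with open("hand_fixed.png", "wb") as out:
    out.write(base64.b64decode(result["images"][0]))
```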
Example of Inpaint Sketch mode usage
Inpaint Sketch mode is a combination of inpainting and sketching, allowing you to add new elements to the input image by sketching them. Be careful with the display scale as described in the section “About the canvas size in Sketch mode”.
Let’s add a treehouse to the final result of the sketch mode.
The A1111 WebUI does not have a “send to Inpaint sketch” button, so drag and drop the generated image directly, or click the canvas and select the file to load it.
In Inpaint sketch mode, the sketch itself is used as the mask, so when adding something it works well to first paint a solid-color silhouette and then draw the outline of the desired picture on top of it. In this case, we wanted to add a treehouse, so we drew the following sketch.
Once the sketch is done, paste the following into the prompt. Since the Inpaint area is set to Only masked, only the added element matters, so no prompts other than the treehouse are used.
small tree house, masterpiece, ultra detailed
Then paste the following prompt into the negative prompt.
worst quality, low quality, normal quality, lowres
Change the following parameters.
- Set Mask transparency to `10` to blend in with the background.
- Select `Only masked` for Inpaint area because we want to bring out details.
- Set the Soft inpainting checkbox to ✅ to further blend in with the background. The default settings are fine.
- Since we want to generate the image at the same size, we switch from Resize to to `Resize by` and set Scale to `1`.
- We want to get as close to the sketch as possible, so we set the Denoising strength to `0.5`.
Finally, generate it with the “Generate” button and you are done.
Example of Inpaint upload mode usage
Inpaint upload mode allows you to upload an input image and a mask image separately for inpainting. This mode has various uses, such as when you want to create a clean mask in Photoshop, or when you want to inpaint using a mask exported from 3DCG software.
In this example, we will add a background to an image exported by 3DCG software.
We will proceed using the following input image and mask.
Load the above images into the input and mask, respectively.
Enter the prompt for the background as follows.
blue sky, meadows, moutains, masterpiece, ultra detailed
Paste the following prompt into the negative prompt.
worst quality, low quality, normal quality, lowres
Change the following parameters.
- To invert the masked area so that the background is generated, set Mask mode to `Inpaint not masked`.
- The background should be generated based on the colors of the input image, so set Masked content to `original`.
- Change the Sampling method to `DDIM CFG++` and the Schedule type to `DDIM` so that the input image is composited naturally and seamlessly. *If these are not listed, the A1111 WebUI must be updated to v1.10 or higher.
- Set Sampling steps to `30` for better background rendering.
- Since we want to generate the image at the same scale, switch from Resize to to `Resize by` and set Scale to `1`.
- Since the monotone background of the input image has a strong influence, increase the CFG Scale to `10` to strengthen the influence of the prompt.
Finally, generate it with the “Generate” button and you are done.
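Since the mask is already a separate file here, this workflow is also easy to script through the WebUI’s optional API. The sketch below is a minimal example assuming the WebUI runs locally with the --api flag; render.png and render_mask.png are hypothetical filenames for the 3DCG render and its mask, the width and height are assumptions to be matched to your render, and the scheduler field name is my assumption for newer WebUI versions.

```python
# Inpaint upload via the A1111 WebUI API: generate a background behind a 3DCG render.
# Minimal sketch: assumes the WebUI is running locally with the --api flag.
import base64
import requests

URL = "http://127.0.0.1:7860/sdapi/v1/img2img"

def b64(path):
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "init_images": [b64("render.png")],
    "mask": b64("render_mask.png"),
    # Prompts copied from the walkthrough above.
    "prompt": "blue sky, meadows, moutains, masterpiece, ultra detailed",
    "negative_prompt": "worst quality, low quality, normal quality, lowres",
    "sampler_name": "DDIM CFG++",  # requires WebUI v1.10+
    "scheduler": "DDIM",           # assumed field name for the Schedule type
    "steps": 30,
    "cfg_scale": 10,
    "inpainting_mask_invert": 1,   # 1 = Inpaint not masked (Mask mode)
    "inpainting_fill": 1,          # 1 = original (Masked content)
    "resize_mode": 0,
    "width": 768,                  # match these to your render's resolution
    "height": 512,
}

result = requests.post(URL, json=payload).json()
with open("render_with_background.png", "wb") as out:
    out.write(base64.b64decode(result["images"][0]))
```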
Example of Batch mode usage
Batch mode is used when you want to apply img2img to multiple images at once. It is not very effective, but it can be used when you want to apply a style to a group of images.
This time, let’s convert the three images generated in the previous examples into watercolor-style sketches. However, this usage is not very flexible, because it is not possible to adjust each image individually.
The following images, created in the previous examples, are loaded into the input.
Enter the prompt as follows.
(water color, sketch:1.5), flat color, masterpiece, ultra detailed
Paste the following prompt into the negative prompt.
worst quality, low quality, normal quality, lowres
Change the following parameters.
- Change the Sampling method to `DDIM CFG++`, which is compatible with img2img, and the Schedule type to `DDIM`. *If these are not listed, the A1111 WebUI must be updated to v1.10 or higher.
- Since we want to generate the image at the same size, we switch from Resize to to `Resize by` and set Scale to `1`.
- We want the prompt to take precedence, so we set the CFG Scale to `9`.
- We want to preserve as many elements of the input image as possible, so we set the Denoising strength to `0.3`.
Finally, generate it with the “Generate” button and you are done.
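Batch mode itself is a UI feature, but the same bulk conversion can be scripted by looping over a folder and calling the img2img API once per file. This is a minimal sketch assuming the WebUI runs locally with the --api flag, with hypothetical ./inputs and ./outputs folders and a size matched to the 768 x 512 images from the earlier examples.

```python
# Apply the same watercolor-style img2img conversion to every PNG in a folder.
# Minimal sketch: assumes the WebUI is running locally with the --api flag,
# input images live in ./inputs and results are written to ./outputs.
import base64
import pathlib
import requests

URL = "http://127.0.0.1:7860/sdapi/v1/img2img"
IN_DIR = pathlib.Path("inputs")
OUT_DIR = pathlib.Path("outputs")
OUT_DIR.mkdir(exist_ok=True)

for path in sorted(IN_DIR.glob("*.png")):
    payload = {
        "init_images": [base64.b64encode(path.read_bytes()).decode()],
        "prompt": "(water color, sketch:1.5), flat color, masterpiece, ultra detailed",
        "negative_prompt": "worst quality, low quality, normal quality, lowres",
        "sampler_name": "DDIM CFG++",  # requires WebUI v1.10+; swap for any sampler you have
        "cfg_scale": 9,
        "denoising_strength": 0.3,
        "resize_mode": 0,
        "width": 768,    # match these to your input images
        "height": 512,
    }
    result = requests.post(URL, json=payload).json()
    (OUT_DIR / path.name).write_bytes(base64.b64decode(result["images"][0]))
```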
Conclusion
That covers the basic usage of the A1111 Stable Diffusion Web UI’s “Image to Image (img2img)”. The img2img feature is very useful for brushing up existing artwork with AI, creating professional-looking illustrations from simple sketches, and fine-tuning generated images. It is a handy tool for making your creative work more efficient, so please give it a try.