Steps

How to adjust inference/sampling steps in Stable Diffusion

More steps generally improve output quality, but only up to a point. Decide what result you need before raising the step count, because extra steps cost time and may not add anything visible.

Before getting into specifics, it helps to understand what the steps parameter does in Stable Diffusion and in diffusion models more generally. These models work iteratively: generation starts from random noise, and each step removes some of that noise, guided by the text prompt, so the image becomes cleaner and more detailed as the steps progress. The process stops once the chosen number of steps has been run.
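To make the iterative idea concrete, here is a minimal toy sketch in Python. It is not Stable Diffusion: the names toy_denoise, rms_error, and rate are made up for illustration, and the "model" cheats by already knowing the clean target instead of predicting noise from a prompt. It also removes a fixed fraction of the remaining noise per step, whereas real samplers rescale their noise schedule to the chosen step count. It only shows that each step removes some noise and that later steps change less than early ones.

```python
import math
import random


def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def toy_denoise(target, num_steps, rate=0.2, seed=0):
    """Toy stand-in for a diffusion sampler (assumption: NOT the real algorithm).

    Starts from random noise and, at every step, removes a fixed fraction
    (`rate`) of the remaining difference to `target`.
    Returns the final values and the error recorded after each step.
    """
    rng = random.Random(seed)
    # Start from pure random noise, one value per "pixel".
    x = [rng.gauss(0.0, 1.0) for _ in target]
    errors = [rms_error(x, target)]
    for _ in range(num_steps):
        # A real model would predict the noise from the prompt; here the
        # remaining noise is computed directly from the known target.
        x = [xi + rate * (ti - xi) for xi, ti in zip(x, target)]
        errors.append(rms_error(x, target))
    return x, errors


if __name__ == "__main__":
    _, errors = toy_denoise([0.5, -1.0, 2.0], num_steps=60)
    for step in (0, 5, 10, 30, 60):
        print(f"step {step:2d}: remaining error {errors[step]:.6f}")
```

In this toy, the error shrinks by the same factor every step, so almost all of the visible change happens early and the last steps alter the result very little.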

Around 30 sampling steps are usually enough to produce a high-quality image. Adding more steps may yield a slightly different image without making it better. Because each step is a separate pass through the model, generation time grows with the step count. There is little reason to wait longer if fewer steps already give the result you want; the goal is a balance between image quality and the time and resources needed to produce it.

Below is the sequence of steps from 1 to 60 for an image of "a chicken wearing a hat":

Notice that most of the transformation happens between steps 3 and 7, as the blob takes on the shape of a chicken wearing a hat. Quality typically peaks around steps 20 to 30. Steps beyond 30 contribute little: the chicken's form keeps changing, but no new detail is added.
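If you want to run a similar comparison yourself, one option is the Hugging Face diffusers library, where the step count is the num_inference_steps argument. The sketch below is an assumption about your setup, not the method used to make the sequence above: sweep_steps is a hypothetical helper, model_id must be a Stable Diffusion checkpoint you have access to, and the step counts and seed are arbitrary. It renders the same prompt with the same seed at several step counts and records how long each run takes.

```python
def sweep_steps(model_id, prompt, step_counts=(5, 10, 20, 30, 60), seed=42):
    """Render one prompt at several step counts; return seconds per run.

    Hypothetical helper: `model_id` is whatever Stable Diffusion checkpoint
    you use, and the defaults are arbitrary illustration values.
    """
    if any(steps < 1 for steps in step_counts):
        raise ValueError("every step count must be at least 1")

    # Imported lazily so the argument check above runs without the heavy deps.
    import time

    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    timings = {}
    for steps in step_counts:
        # Re-seed before every run so only the step count differs.
        generator = torch.Generator(device=device).manual_seed(seed)
        start = time.perf_counter()
        result = pipe(prompt, num_inference_steps=steps, generator=generator)
        timings[steps] = time.perf_counter() - start
        result.images[0].save(f"steps_{steps:03d}.png")
    return timings
```

Calling sweep_steps with your model ID and the prompt "a chicken wearing a hat" writes one PNG per step count to the current directory. Comparing the saved images next to the returned timings shows where extra steps stop paying off for your prompt; the first run may be slower than the rest because of warm-up.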
