API Documentation

This document provides a detailed summary of the available API endpoints, including explanations, a complete list of supported parameters, and `curl` examples.

Note: Replace your_api_key with your actual API key and MODEL_NAME with the name of the model you want to use.

Main Docs

V1 API Endpoints

Base path: /v1

Text Generation

Parameters

Parameter	Type	Required	Description
model	string	required	The chosen model to use for generation.
multi_models	array of strings	optional	List of models to use for generation. Chosen randomly during the request.
multiplier	number	optional	Multiplier for a finetuned model's LoRA alpha value. Higher value means stronger effect from the finetune. 0.5x, 2x and 4x supported.
n	integer	optional	Number of output sequences to return for the given prompt.
min_tokens	integer	optional	Minimum number of tokens to generate per output sequence before EOS or stop tokens are generated.
presence_penalty	number	optional	Float that penalizes new tokens based on whether they appear in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.
frequency_penalty	number	optional	Float that penalizes new tokens based on their frequency in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.
repetition_penalty	number	optional	Float that penalizes new tokens based on their frequency in the generated text so far. Freq_pen is applied additively while rep_pen is applied multiplicatively. Must be in [1, inf). Set to 1 to disable the effect.
no_repeat_ngram_size	integer	optional	(Aphrodite only) Size of the n-grams to prevent repeating. 1 would mean no token can appear twice. 2 would mean no pair of consecutive tokens can appear twice.
temperature	number	optional	Float that controls the randomness of the sampling. Lower values make the model more deterministic, while higher values make the model more random. Zero means greedy sampling.
top_p	number	optional	Float that controls the cumulative probability of the top tokens to consider. Must be in (0, 1]. Set to 1 to consider all tokens.
top_k	integer	optional	Integer that controls the number of top tokens to consider. Set to -1 to consider all tokens.
top_a	number	optional	(Aphrodite only) Float that controls the cutoff for Top-A sampling. Exact cutoff is top_amax_prob*2. Must be in [0, inf], 0 to disable.
min_p	number	optional	Float that controls the cutoff for min-p sampling. Exact cutoff is min_p*max_prob. Must be in [0, 1], 0 to disable.
tfs	number	optional	(Aphrodite only) Float that controls the cumulative approximate curvature of the distribution to retain for Tail Free Sampling. Must be in (0, 1]. Set to 1 to disable.
eta_cutoff	number	optional	(Aphrodite only) Float that controls the cutoff threshold for Eta sampling (a form of entropy adaptive truncation sampling). Threshold is computed as min(eta, sqrt(eta)*entropy(probs)). Specified in units of 1e-4. Set to 0 to disable.
epsilon_cutoff	number	optional	(Aphrodite only) Float that controls the cutoff threshold for Epsilon sampling (simple probability threshold truncation). Specified in units of 1e-4. Set to 0 to disable.
typical_p	number	optional	(Aphrodite only) Float that controls the cumulative probability of tokens closest in surprise to the expected surprise. Must be in (0, 1]. Set to 1 to disable.
dynatemp_min	number	optional	(Aphrodite only) Minimum temperature for dynamic temperature sampling. Range [0, inf).
dynatemp_max	number	optional	(Aphrodite only) Maximum temperature for dynamic temperature sampling. Range [0, inf).
dynatemp_exponent	number	optional	(Aphrodite only) Exponent for dynamic temperature sampling. Range [0, inf).
smoothing_factor	number	optional	(Aphrodite only) Smoothing factor for Quadratic Sampling.
smoothing_curve	number	optional	(Aphrodite only) Smoothing curve for Cubic Sampling.
seed	integer	optional	Random seed to use for the generation.
length_penalty	number	optional	(Aphrodite only) Penalizes sequences based on their length. Used in beam search.
stop	string or array of strings	optional	List of strings that stop the generation when they are generated. The returned output will not contain the stop strings.
stop_token_ids	array of integers	optional	List of token IDs that stop the generation when they are generated. The returned output will contain the stop tokens unless they are special tokens.
include_stop_str_in_output	boolean	optional	Whether to include the stop strings in the output text. Defaults to False.
ignore_eos	boolean	optional	Whether to ignore the EOS token and continue generating tokens after the EOS token is generated.
max_tokens	integer	optional	Maximum number of tokens to generate per output sequence.
logprobs	integer	optional	Number of log probabilities to return per output token. When set to None, no probability is returned.
prompt_logprobs	integer	optional	Number of log probabilities to return per prompt token.
detokenize	boolean	optional	Whether to detokenize the output. Defaults to True.
skip_special_tokens	boolean	optional	Whether to skip special tokens in the output. Defaults to True.
spaces_between_special_tokens	boolean	optional	Whether to add spaces between special tokens in the output. Defaults to True.
logits_processors	array	optional	List of functions that modify logits based on previously generated tokens and optionally prompt tokens.
logit_bias	object	optional	List of LogitsProcessors to change the probability of token prediction at runtime.
truncate_prompt_tokens	integer	optional	If set to an integer k, will use only the last k tokens from the prompt (left-truncation). Default: None (no truncation).
xtc_threshold	number	optional	(Aphrodite only) In XTC sampling, if 2 or more tokens have a probability above this threshold, consider removing all but the last one. Disabled: 0.
xtc_probability	number	optional	(Aphrodite only) The probability that the removal will happen in XTC sampling. Set to 0 to disable. Default: 0.
guided_json	string or object	optional	If specified, the output will follow the JSON schema. Can be a JSON string or a Python dictionary.
guided_regex	string	optional	If specified, the output will follow the regex pattern.
guided_choice	array of strings	optional	If specified, the output will be exactly one of the provided choices (a list of strings).
guided_decoding_backend	string	optional	Overrides the default guided decoding backend for this specific request. Must be either "outlines" or "lm-format-enforcer".
guided_whitespace_pattern	string	optional	Overrides the default whitespace pattern for guided JSON decoding.
nsigma	number	optional	(Aphrodite only) Number of standard deviations from the maximum logit to use as a cutoff threshold. Tokens with logits below (max_logit - nsigma * std_dev) are filtered out. Higher values (e.g. 3.0) keep more tokens, lower values (e.g. 1.0) are more selective. Must be positive. 0 to disable.
dry_multiplier	number	optional	(Aphrodite only) Float that controls the magnitude of the DRY sampling penalty. Higher values create stronger penalties against repetition. The penalty is multiplied by this value before being applied. Must be non-negative. 0 disables the sampler.
dry_base	number	optional	(Aphrodite only) Base for the exponential growth of the DRY sampling penalty. Controls how quickly the penalty increases with longer repeated sequences. Must be greater than 1. Higher values (e.g. 2.0) create more aggressive penalties for longer repetitions. Defaults to 1.75.
dry_allowed_length	integer	optional	(Aphrodite only) Maximum number of tokens that can be repeated without incurring a DRY sampling penalty. Sequences longer than this will be penalized exponentially. Must be at least 1. Defaults to 2.
dry_sequence_breaker_ids	array of integers	optional	(Aphrodite only) List of token IDs that stop the matching of repeated content. These tokens will break up the input into sections where repetition is evaluated separately. Common examples are newlines, quotes, and other structural tokens. Defaults to None.
dry_range	integer	optional	(Aphrodite only) The range of tokens (input + output) to apply the DRY sampler.
skew	number	optional	(Aphrodite only) Bias the token selection towards higher or lower probability tokens. Defaults to 0 (disabled).
sampler_priority	array	optional	(Aphrodite only) A list of integers to control the order in which samplers are applied.
allowed_token_ids	array of integers	optional	If provided, the engine will construct a logits processor which only retains scores for the given token ids.
bad_words	array of strings	optional	List of words that are not allowed to be generated.
banned_phrases_token_ids	array of array of integers	optional	(Aphrodite only) List of token sequences that are not allowed to be generated.
best_of	integer	optional	Number of output sequences that are generated from the prompt. From these `best_of` sequences, the top `n` sequences are returned. `best_of` must be greater than or equal to `n`. By default, `best_of` is set to `n`.
custom_token_bans	array of integers	optional	(Aphrodite only) List of token IDs to ban from generating.
deepconf_threshold	number	optional	(Aphrodite only) Confidence threshold for early stopping in DeepConf.
deepconf_window_size	integer	optional	(Aphrodite only) Size of the sliding window for confidence calculation in DeepConf.
dry_early_exit_match_len	integer	optional	(Aphrodite only) If we find this large a match in DRY sampling, we stop searching.
dry_max_ngram	integer	optional	(Aphrodite only) Maximum length of match to check in DRY sampling.
dry_max_occurrences	integer	optional	(Aphrodite only) How many occurrences of last_token we analyze in DRY sampling.
early_stopping	boolean or string	optional	(Aphrodite only) Controls the stopping condition for beam search. It accepts `True`, `False`, or `"never"`.
enable_deepconf	boolean	optional	(Aphrodite only) Enable DeepConf for confidence-based early stopping.
extra_args	object	optional	Extra arguments to pass to the engine.
guided_decoding	object	optional	If provided, the engine will construct a guided decoding logits processor from these parameters.
mirostat_eta	number	optional	(Aphrodite only) Rate at which mirostat updates its internal surprisal value. Range [0, inf).
mirostat_mode	integer	optional	(Aphrodite only) Can either be 0 (disabled) or 2 (Mirostat v2).
mirostat_tau	number	optional	(Aphrodite only) Target "surprisal" that mirostat works towards. Range [0, inf).
output_kind	string	optional	The type of output to generate. Can be "cumulative", "delta", or "final_only".
structured_outputs	object	optional	(vLLM only) Parameters for configuring structured outputs.
temperature_last	boolean	optional	(Aphrodite only) Whether to use temperature as the last sampler in the sampling pipeline.
token_ban_ranges	array	optional	(Aphrodite only) List of tuples (tokens, start, length) to ban from generating.
use_beam_search	boolean	optional	(Aphrodite only) Whether to use beam search instead of sampling.

POST /chat/completions

Handles chat completions. This endpoint takes a list of messages and returns a generated response.

POST /chat/completions (with Image)

This endpoint allows you to send a text prompt along with an image for Vision Language Models (VLM). The user message content should be an array containing both the text and the image URL (base64 encoded).

POST /completions

Handles text completions. This endpoint takes a prompt and returns a generated response.

POST /tokenize

Tokenizes the given text.

Parameters

Parameter	Type	Required	Description
model	string	optional	The model to use for tokenization.
multi_models	array of strings	optional	A list of models to choose from.
prompt	string	optional	The prompt to tokenize.
messages	array	optional	The messages to tokenize.

Image Generation

POST /img2img

Handles image-to-image generation.

Parameters

Parameter	Type	Required	Description
sd_model_checkpoint	string	required	The name of the model checkpoint.
prompt	string	required	The text prompt.
init_images	array of strings	required	Base64-encoded initial images.
negative_prompt	string	optional	The negative prompt.
steps	integer	optional	Number of sampling steps.
sampler_name	string	optional	Sampling method.
width	integer	optional	Image width.
height	integer	optional	Image height.
clip_skip	integer	optional	Number of CLIP layers to skip.
seed	integer	optional	Random seed.
cfg_scale	number	optional	Classifier-Free Guidance scale.
stream	boolean	optional	Whether to stream the response.
batch_size	integer	optional	Number of images to generate in a batch.
denoising_strength	number	optional	Denoising strength for img2img.
mask	string	optional	Base64-encoded mask for inpainting.
mask_blur	integer	optional	Mask blur for inpainting.
inpainting_fill	integer	optional	Inpainting fill mode.
inpaint_full_res	boolean	optional	Whether to inpaint at full resolution.
inpaint_full_res_padding	integer	optional	Padding for full-resolution inpainting.
inpainting_mask_invert	integer	optional	Whether to invert the inpainting mask.
initial_noise_multiplier	number	optional	Initial noise multiplier.
detailer_enabled	boolean	optional	Enable the detailer.
detailer_prompt	string	optional	Prompt for the detailer.
detailer_negative	string	optional	Negative prompt for the detailer.
detailer_steps	integer	optional	Steps for the detailer.
detailer_strength	number	optional	Strength of the detailer.
detailer_model	string	optional	Model for the detailer.
detailer_classes	string	optional	Classes for the detailer.
detailer_conf	number	optional	Confidence for the detailer.
detailer_max	integer	optional	Max detections for the detailer.
detailer_iou	number	optional	IoU for the detailer.
detailer_padding	integer	optional	Padding for the detailer.
detailer_blur	integer	optional	Blur for the detailer.
detailer_merge	boolean	optional	Merge mode for the detailer.
schedulers_rescale_betas	boolean	optional	Rescale betas for schedulers.
schedulers_use_thresholding	boolean	optional	Use thresholding for schedulers.
schedulers_sigma	number	optional	Sigma for schedulers.
schedulers_beta_schedule	string	optional	Beta schedule for schedulers.
scheduler_eta	number	optional	ETA for schedulers.
schedulers_solver_order	integer	optional	Solver order for schedulers.
schedulers_beta_start	number	optional	Beta start for schedulers.
schedulers_beta_end	number	optional	Beta end for schedulers.
schedulers_timesteps_range	string	optional	Timesteps range for schedulers.
schedulers_shift	number	optional	Shift for schedulers.
schedulers_sigma_adjust	boolean	optional	Sigma adjustment for schedulers.
schedulers_sigma_adjust_min	number	optional	Min sigma adjustment.
schedulers_sigma_adjust_max	number	optional	Max sigma adjustment.

POST /txt2img

Handles text-to-image generation.

Parameters

Parameter	Type	Required	Description
sd_model_checkpoint	string	required	The name of the model checkpoint.
prompt	string	required	The text prompt.
negative_prompt	string	optional	The negative prompt.
steps	integer	optional	Number of sampling steps.
sampler_name	string	optional	Sampling method.
width	integer	optional	Image width.
height	integer	optional	Image height.
clip_skip	integer	optional	Number of CLIP layers to skip.
seed	integer	optional	Random seed.
cfg_scale	number	optional	Classifier-Free Guidance scale.
stream	boolean	optional	Whether to stream the response.
batch_size	integer	optional	Number of images to generate in a batch.
hr_sampler_name	string	optional	Sampler name for high-res fix.
detailer_enabled	boolean	optional	Enable the detailer.
detailer_prompt	string	optional	Prompt for the detailer.
detailer_negative	string	optional	Negative prompt for the detailer.
detailer_steps	integer	optional	Steps for the detailer.
detailer_strength	number	optional	Strength of the detailer.
detailer_model	string	optional	Model for the detailer.
detailer_classes	string	optional	Classes for the detailer.
detailer_conf	number	optional	Confidence for the detailer.
detailer_max	integer	optional	Max detections for the detailer.
detailer_iou	number	optional	IoU for the detailer.
detailer_padding	integer	optional	Padding for the detailer.
detailer_blur	integer	optional	Blur for the detailer.
detailer_merge	boolean	optional	Merge mode for the detailer.
schedulers_rescale_betas	boolean	optional	Rescale betas for schedulers.
schedulers_use_thresholding	boolean	optional	Use thresholding for schedulers.
schedulers_sigma	number	optional	Sigma for schedulers.
schedulers_beta_schedule	string	optional	Beta schedule for schedulers.
scheduler_eta	number	optional	ETA for schedulers.
schedulers_solver_order	integer	optional	Solver order for schedulers.
schedulers_beta_start	number	optional	Beta start for schedulers.
schedulers_beta_end	number	optional	Beta end for schedulers.
schedulers_timesteps_range	string	optional	Timesteps range for schedulers.
schedulers_shift	number	optional	Shift for schedulers.
schedulers_sigma_adjust	boolean	optional	Sigma adjustment for schedulers.
schedulers_sigma_adjust_min	number	optional	Min sigma adjustment.
schedulers_sigma_adjust_max	number	optional	Max sigma adjustment.

POST /upscale-img

Upscales a single image.

Parameters

Parameter	Type	Required	Description
image	string	required	The base64-encoded image to upscale.
upscaler_1	string	optional	The name of the upscaler to use.
resize_mode	integer	optional	The resize mode.
upscaling_resize	number	optional	The factor by which to resize the image.

Model Information

GET /models/textgen-models

Retrieves available text generation models. No parameters.

GET /models/image-models

Retrieves available image generation models. No parameters.

GET /upscalers

Retrieves available upscalers. No parameters.

GET /img-options

Retrieves image generation options. No parameters.

GET /img-samplers

Retrieves available image samplers. No parameters.

GET /parallel-requests

Retrieves parallel request limits. No parameters.

SD API V1 Endpoints

Base path: /sdapi/v1

The endpoints under /sdapi/v1 are designed for compatibility with the Stable Diffusion API and share the same parameters as their /v1 counterparts.

POST /sdapi/v1/img2img

See /v1/img2img for parameters.

POST /sdapi/v1/txt2img

See /v1/txt2img for parameters.

POST /sdapi/v1/extra-single-image

See /v1/upscale-img for parameters.

GET /sdapi/v1/sd-models

Retrieves available image models.

GET /sdapi/v1/upscalers

Retrieves available upscalers.

GET /sdapi/v1/options

Retrieves image generation options.

GET /sdapi/v1/samplers

Retrieves available image samplers.