API Documentation

This document provides a detailed summary of the available API endpoints, including explanations, a complete list of supported parameters, and `curl` examples.

Note: Replace your_api_key with your actual API key and MODEL_NAME with the name of the model you want to use.

Main Docs

V1 API Endpoints

Base path: /v1

Text Generation

Parameters

ParameterTypeRequiredDescription
modelstringrequiredThe chosen model to use for generation.
multi_modelsarray of stringsoptionalList of models to use for generation. Chosen randomly during the request.
multipliernumberoptionalMultiplier for a finetuned model's LoRA alpha value. Higher value means stronger effect from the finetune. 0.5x, 2x and 4x supported.
nintegeroptionalNumber of output sequences to return for the given prompt.
min_tokensintegeroptionalMinimum number of tokens to generate per output sequence before EOS or stop tokens are generated.
presence_penaltynumberoptionalFloat that penalizes new tokens based on whether they appear in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.
frequency_penaltynumberoptionalFloat that penalizes new tokens based on their frequency in the generated text so far. Values > 0 encourage the model to use new tokens, while values < 0 encourage the model to repeat tokens.
repetition_penaltynumberoptionalFloat that penalizes new tokens based on their frequency in the generated text so far. Freq_pen is applied additively while rep_pen is applied multiplicatively. Must be in [1, inf). Set to 1 to disable the effect.
no_repeat_ngram_sizeintegeroptional(Aphrodite only) Size of the n-grams to prevent repeating. 1 would mean no token can appear twice. 2 would mean no pair of consecutive tokens can appear twice.
temperaturenumberoptionalFloat that controls the randomness of the sampling. Lower values make the model more deterministic, while higher values make the model more random. Zero means greedy sampling.
top_pnumberoptionalFloat that controls the cumulative probability of the top tokens to consider. Must be in (0, 1]. Set to 1 to consider all tokens.
top_kintegeroptionalInteger that controls the number of top tokens to consider. Set to -1 to consider all tokens.
top_anumberoptional(Aphrodite only) Float that controls the cutoff for Top-A sampling. Exact cutoff is top_a*max_prob**2. Must be in [0, inf], 0 to disable.
min_pnumberoptionalFloat that controls the cutoff for min-p sampling. Exact cutoff is min_p*max_prob. Must be in [0, 1], 0 to disable.
tfsnumberoptional(Aphrodite only) Float that controls the cumulative approximate curvature of the distribution to retain for Tail Free Sampling. Must be in (0, 1]. Set to 1 to disable.
eta_cutoffnumberoptional(Aphrodite only) Float that controls the cutoff threshold for Eta sampling (a form of entropy adaptive truncation sampling). Threshold is computed as min(eta, sqrt(eta)*entropy(probs)). Specified in units of 1e-4. Set to 0 to disable.
epsilon_cutoffnumberoptional(Aphrodite only) Float that controls the cutoff threshold for Epsilon sampling (simple probability threshold truncation). Specified in units of 1e-4. Set to 0 to disable.
typical_pnumberoptional(Aphrodite only) Float that controls the cumulative probability of tokens closest in surprise to the expected surprise. Must be in (0, 1]. Set to 1 to disable.
dynatemp_minnumberoptional(Aphrodite only) Minimum temperature for dynamic temperature sampling. Range [0, inf).
dynatemp_maxnumberoptional(Aphrodite only) Maximum temperature for dynamic temperature sampling. Range [0, inf).
dynatemp_exponentnumberoptional(Aphrodite only) Exponent for dynamic temperature sampling. Range [0, inf).
smoothing_factornumberoptional(Aphrodite only) Smoothing factor for Quadratic Sampling.
smoothing_curvenumberoptional(Aphrodite only) Smoothing curve for Cubic Sampling.
seedintegeroptionalRandom seed to use for the generation.
length_penaltynumberoptional(Aphrodite only) Penalizes sequences based on their length. Used in beam search.
stopstring or array of stringsoptionalList of strings that stop the generation when they are generated. The returned output will not contain the stop strings.
stop_token_idsarray of integersoptionalList of token IDs that stop the generation when they are generated. The returned output will contain the stop tokens unless they are special tokens.
include_stop_str_in_outputbooleanoptionalWhether to include the stop strings in the output text. Defaults to False.
ignore_eosbooleanoptionalWhether to ignore the EOS token and continue generating tokens after the EOS token is generated.
max_tokensintegeroptionalMaximum number of tokens to generate per output sequence.
logprobsintegeroptionalNumber of log probabilities to return per output token. When set to None, no probability is returned.
prompt_logprobsintegeroptionalNumber of log probabilities to return per prompt token.
detokenizebooleanoptionalWhether to detokenize the output. Defaults to True.
skip_special_tokensbooleanoptionalWhether to skip special tokens in the output. Defaults to True.
spaces_between_special_tokensbooleanoptionalWhether to add spaces between special tokens in the output. Defaults to True.
logits_processorsarrayoptionalList of functions that modify logits based on previously generated tokens and optionally prompt tokens.
logit_biasobjectoptionalList of LogitsProcessors to change the probability of token prediction at runtime.
truncate_prompt_tokensintegeroptionalIf set to an integer k, will use only the last k tokens from the prompt (left-truncation). Default: None (no truncation).
xtc_thresholdnumberoptional(Aphrodite only) In XTC sampling, if 2 or more tokens have a probability above this threshold, consider removing all but the last one. Disabled: 0.
xtc_probabilitynumberoptional(Aphrodite only) The probability that the removal will happen in XTC sampling. Set to 0 to disable. Default: 0.
guided_jsonstring or objectoptionalIf specified, the output will follow the JSON schema. Can be a JSON string or a Python dictionary.
guided_regexstringoptionalIf specified, the output will follow the regex pattern.
guided_choicearray of stringsoptionalIf specified, the output will be exactly one of the provided choices (a list of strings).
guided_decoding_backendstringoptionalOverrides the default guided decoding backend for this specific request. Must be either "outlines" or "lm-format-enforcer".
guided_whitespace_patternstringoptionalOverrides the default whitespace pattern for guided JSON decoding.
nsigmanumberoptional(Aphrodite only) Number of standard deviations from the maximum logit to use as a cutoff threshold. Tokens with logits below (max_logit - nsigma * std_dev) are filtered out. Higher values (e.g. 3.0) keep more tokens, lower values (e.g. 1.0) are more selective. Must be positive. 0 to disable.
dry_multipliernumberoptional(Aphrodite only) Float that controls the magnitude of the DRY sampling penalty. Higher values create stronger penalties against repetition. The penalty is multiplied by this value before being applied. Must be non-negative. 0 disables the sampler.
dry_basenumberoptional(Aphrodite only) Base for the exponential growth of the DRY sampling penalty. Controls how quickly the penalty increases with longer repeated sequences. Must be greater than 1. Higher values (e.g. 2.0) create more aggressive penalties for longer repetitions. Defaults to 1.75.
dry_allowed_lengthintegeroptional(Aphrodite only) Maximum number of tokens that can be repeated without incurring a DRY sampling penalty. Sequences longer than this will be penalized exponentially. Must be at least 1. Defaults to 2.
dry_sequence_breaker_idsarray of integersoptional(Aphrodite only) List of token IDs that stop the matching of repeated content. These tokens will break up the input into sections where repetition is evaluated separately. Common examples are newlines, quotes, and other structural tokens. Defaults to None.
dry_rangeintegeroptional(Aphrodite only) The range of tokens (input + output) to apply the DRY sampler.
skewnumberoptional(Aphrodite only) Bias the token selection towards higher or lower probability tokens. Defaults to 0 (disabled).
sampler_priorityarrayoptional(Aphrodite only) A list of integers to control the order in which samplers are applied.
allowed_token_idsarray of integersoptionalIf provided, the engine will construct a logits processor which only retains scores for the given token ids.
bad_wordsarray of stringsoptionalList of words that are not allowed to be generated.
banned_phrases_token_idsarray of array of integersoptional(Aphrodite only) List of token sequences that are not allowed to be generated.
best_ofintegeroptionalNumber of output sequences that are generated from the prompt. From these `best_of` sequences, the top `n` sequences are returned. `best_of` must be greater than or equal to `n`. By default, `best_of` is set to `n`.
custom_token_bansarray of integersoptional(Aphrodite only) List of token IDs to ban from generating.
deepconf_thresholdnumberoptional(Aphrodite only) Confidence threshold for early stopping in DeepConf.
deepconf_window_sizeintegeroptional(Aphrodite only) Size of the sliding window for confidence calculation in DeepConf.
dry_early_exit_match_lenintegeroptional(Aphrodite only) If we find this large a match in DRY sampling, we stop searching.
dry_max_ngramintegeroptional(Aphrodite only) Maximum length of match to check in DRY sampling.
dry_max_occurrencesintegeroptional(Aphrodite only) How many occurrences of last_token we analyze in DRY sampling.
early_stoppingboolean or stringoptional(Aphrodite only) Controls the stopping condition for beam search. It accepts `True`, `False`, or `"never"`.
enable_deepconfbooleanoptional(Aphrodite only) Enable DeepConf for confidence-based early stopping.
extra_argsobjectoptionalExtra arguments to pass to the engine.
guided_decodingobjectoptionalIf provided, the engine will construct a guided decoding logits processor from these parameters.
mirostat_etanumberoptional(Aphrodite only) Rate at which mirostat updates its internal surprisal value. Range [0, inf).
mirostat_modeintegeroptional(Aphrodite only) Can either be 0 (disabled) or 2 (Mirostat v2).
mirostat_taunumberoptional(Aphrodite only) Target "surprisal" that mirostat works towards. Range [0, inf).
output_kindstringoptionalThe type of output to generate. Can be "cumulative", "delta", or "final_only".
structured_outputsobjectoptional(vLLM only) Parameters for configuring structured outputs.
temperature_lastbooleanoptional(Aphrodite only) Whether to use temperature as the last sampler in the sampling pipeline.
token_ban_rangesarrayoptional(Aphrodite only) List of tuples (tokens, start, length) to ban from generating.
use_beam_searchbooleanoptional(Aphrodite only) Whether to use beam search instead of sampling.

POST /chat/completions

Handles chat completions. This endpoint takes a list of messages and returns a generated response.

POST /completions

Handles text completions. This endpoint takes a prompt and returns a generated response.

POST /tokenize

Tokenizes the given text.

Parameters

ParameterTypeRequiredDescription
modelstringoptionalThe model to use for tokenization.
multi_modelsarray of stringsoptionalA list of models to choose from.
promptstringoptionalThe prompt to tokenize.
messagesarrayoptionalThe messages to tokenize.

Image Generation

POST /img2img

Handles image-to-image generation.

Parameters

ParameterTypeRequiredDescription
sd_model_checkpointstringrequiredThe name of the model checkpoint.
promptstringrequiredThe text prompt.
init_imagesarray of stringsrequiredBase64-encoded initial images.
negative_promptstringoptionalThe negative prompt.
stepsintegeroptionalNumber of sampling steps.
sampler_namestringoptionalSampling method.
widthintegeroptionalImage width.
heightintegeroptionalImage height.
clip_skipintegeroptionalNumber of CLIP layers to skip.
seedintegeroptionalRandom seed.
cfg_scalenumberoptionalClassifier-Free Guidance scale.
streambooleanoptionalWhether to stream the response.
batch_sizeintegeroptionalNumber of images to generate in a batch.
denoising_strengthnumberoptionalDenoising strength for img2img.
maskstringoptionalBase64-encoded mask for inpainting.
mask_blurintegeroptionalMask blur for inpainting.
inpainting_fillintegeroptionalInpainting fill mode.
inpaint_full_resbooleanoptionalWhether to inpaint at full resolution.
inpaint_full_res_paddingintegeroptionalPadding for full-resolution inpainting.
inpainting_mask_invertintegeroptionalWhether to invert the inpainting mask.
initial_noise_multipliernumberoptionalInitial noise multiplier.
detailer_enabledbooleanoptionalEnable the detailer.
detailer_promptstringoptionalPrompt for the detailer.
detailer_negativestringoptionalNegative prompt for the detailer.
detailer_stepsintegeroptionalSteps for the detailer.
detailer_strengthnumberoptionalStrength of the detailer.
detailer_modelstringoptionalModel for the detailer.
detailer_classesstringoptionalClasses for the detailer.
detailer_confnumberoptionalConfidence for the detailer.
detailer_maxintegeroptionalMax detections for the detailer.
detailer_iounumberoptionalIoU for the detailer.
detailer_paddingintegeroptionalPadding for the detailer.
detailer_blurintegeroptionalBlur for the detailer.
detailer_mergebooleanoptionalMerge mode for the detailer.
schedulers_rescale_betasbooleanoptionalRescale betas for schedulers.
schedulers_use_thresholdingbooleanoptionalUse thresholding for schedulers.
schedulers_sigmanumberoptionalSigma for schedulers.
schedulers_beta_schedulestringoptionalBeta schedule for schedulers.
scheduler_etanumberoptionalETA for schedulers.
schedulers_solver_orderintegeroptionalSolver order for schedulers.
schedulers_beta_startnumberoptionalBeta start for schedulers.
schedulers_beta_endnumberoptionalBeta end for schedulers.
schedulers_timesteps_rangestringoptionalTimesteps range for schedulers.
schedulers_shiftnumberoptionalShift for schedulers.
schedulers_sigma_adjustbooleanoptionalSigma adjustment for schedulers.
schedulers_sigma_adjust_minnumberoptionalMin sigma adjustment.
schedulers_sigma_adjust_maxnumberoptionalMax sigma adjustment.

POST /txt2img

Handles text-to-image generation.

Parameters

ParameterTypeRequiredDescription
sd_model_checkpointstringrequiredThe name of the model checkpoint.
promptstringrequiredThe text prompt.
negative_promptstringoptionalThe negative prompt.
stepsintegeroptionalNumber of sampling steps.
sampler_namestringoptionalSampling method.
widthintegeroptionalImage width.
heightintegeroptionalImage height.
clip_skipintegeroptionalNumber of CLIP layers to skip.
seedintegeroptionalRandom seed.
cfg_scalenumberoptionalClassifier-Free Guidance scale.
streambooleanoptionalWhether to stream the response.
batch_sizeintegeroptionalNumber of images to generate in a batch.
hr_sampler_namestringoptionalSampler name for high-res fix.
detailer_enabledbooleanoptionalEnable the detailer.
detailer_promptstringoptionalPrompt for the detailer.
detailer_negativestringoptionalNegative prompt for the detailer.
detailer_stepsintegeroptionalSteps for the detailer.
detailer_strengthnumberoptionalStrength of the detailer.
detailer_modelstringoptionalModel for the detailer.
detailer_classesstringoptionalClasses for the detailer.
detailer_confnumberoptionalConfidence for the detailer.
detailer_maxintegeroptionalMax detections for the detailer.
detailer_iounumberoptionalIoU for the detailer.
detailer_paddingintegeroptionalPadding for the detailer.
detailer_blurintegeroptionalBlur for the detailer.
detailer_mergebooleanoptionalMerge mode for the detailer.
schedulers_rescale_betasbooleanoptionalRescale betas for schedulers.
schedulers_use_thresholdingbooleanoptionalUse thresholding for schedulers.
schedulers_sigmanumberoptionalSigma for schedulers.
schedulers_beta_schedulestringoptionalBeta schedule for schedulers.
scheduler_etanumberoptionalETA for schedulers.
schedulers_solver_orderintegeroptionalSolver order for schedulers.
schedulers_beta_startnumberoptionalBeta start for schedulers.
schedulers_beta_endnumberoptionalBeta end for schedulers.
schedulers_timesteps_rangestringoptionalTimesteps range for schedulers.
schedulers_shiftnumberoptionalShift for schedulers.
schedulers_sigma_adjustbooleanoptionalSigma adjustment for schedulers.
schedulers_sigma_adjust_minnumberoptionalMin sigma adjustment.
schedulers_sigma_adjust_maxnumberoptionalMax sigma adjustment.

POST /upscale-img

Upscales a single image.

Parameters

ParameterTypeRequiredDescription
imagestringrequiredThe base64-encoded image to upscale.
upscaler_1stringoptionalThe name of the upscaler to use.
resize_modeintegeroptionalThe resize mode.
upscaling_resizenumberoptionalThe factor by which to resize the image.

Model Information

GET /models/textgen-models

Retrieves available text generation models. No parameters.

GET /models/image-models

Retrieves available image generation models. No parameters.

GET /upscalers

Retrieves available upscalers. No parameters.

GET /img-options

Retrieves image generation options. No parameters.

GET /img-samplers

Retrieves available image samplers. No parameters.

GET /parallel-requests

Retrieves parallel request limits. No parameters.


SD API V1 Endpoints

Base path: /sdapi/v1

The endpoints under /sdapi/v1 are designed for compatibility with the Stable Diffusion API and share the same parameters as their /v1 counterparts.

POST /sdapi/v1/img2img

See /v1/img2img for parameters.

POST /sdapi/v1/txt2img

See /v1/txt2img for parameters.

POST /sdapi/v1/extra-single-image

See /v1/upscale-img for parameters.

GET /sdapi/v1/sd-models

Retrieves available image models.

GET /sdapi/v1/upscalers

Retrieves available upscalers.

GET /sdapi/v1/options

Retrieves image generation options.

GET /sdapi/v1/samplers

Retrieves available image samplers.