Learn how to use our powerful text generation tools and API features.
Arli AI Text Generation is powered by either Aphrodite-Engine or vLLM, depending on the model. As a result, most of our available generation parameters are similar to those available in Aphrodite-Engine.
Aphrodite-Engine: https://github.com/aphrodite-engine/aphrodite-engine
vLLM: https://github.com/vllm-project/vllm
All Text Generation API endpoints require Bearer authentication via the Authorization header. Replace {ARLIAI_API_KEY} in the examples with your actual API key.
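As a rough illustration of Bearer authentication, the sketch below builds (but does not send) a request carrying the Authorization header. The base URL and the use of GET are assumptions made for the example; the /v1/models/textgen-models path is the one named later on this page.

```python
import urllib.request

# Assumed base URL for illustration; confirm the real one in the API Reference.
BASE_URL = "https://api.arliai.com"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build (without sending) a request that uses Bearer authentication."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/models/textgen-models",
        # The key goes in the Authorization header, prefixed with "Bearer ".
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",  # assumed HTTP method
    )


# Replace the placeholder with your actual API key before sending.
request = build_models_request("{ARLIAI_API_KEY}")
print(request.get_header("Authorization"))
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access.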
Ensure you have access granted to the specific Text Generation models you intend to use. Free accounts are able to use each model for a maximum of 5 requests every 2 days for testing purposes. Text generation requests are subject to rate limits and concurrency limits based on your account plan. Exceeding limits may result in temporary account restrictions.
API Key parameter overrides (set in your account settings) will merge with and take precedence over parameters sent in the request body for allowed parameters.
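The merge rule above can be pictured with a small dictionary sketch. This is not the server's implementation: the function name, the allowed set, and the choice to silently ignore non-allowed overrides are all assumptions made for illustration.

```python
def merge_parameters(request_body: dict, key_overrides: dict, allowed: set) -> dict:
    """Merge API key overrides into a request body.

    Overrides win over request values, but only for parameter names in
    `allowed`. Overrides outside that set are ignored (an assumption).
    """
    merged = dict(request_body)  # copy so the caller's dict is not mutated
    for name, value in key_overrides.items():
        if name in allowed:
            merged[name] = value
    return merged


# Hypothetical values, using parameter names mentioned on this page.
allowed = {"temperature", "top_p", "top_k", "repetition_penalty"}
body = {"temperature": 1.0, "top_k": 40}
overrides = {"temperature": 0.7, "top_p": 0.9, "not_an_allowed_param": 1}

effective = merge_parameters(body, overrides, allowed)
print(effective)
```

Here temperature comes from the key override, top_k is kept from the request, and top_p is added by the override.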
The number of requests you can make at the same time for a model is determined by the parallel requests allowed for your account.
If you try to send more requests in parallel than allowed, the request will be blocked.
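One way to stay under the parallel request limit is to cap concurrency on the client side. The sketch below uses a semaphore for that; MAX_PARALLEL is a hypothetical value (use the limit shown for your account), and send_request sleeps instead of making a real HTTP call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 2  # hypothetical limit; use your account's parallel request allowance

_slots = threading.BoundedSemaphore(MAX_PARALLEL)
_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of simultaneous "requests" observed


def send_request(prompt: str) -> str:
    """Run one request while holding a slot, so at most MAX_PARALLEL run at once."""
    global _active, peak_active
    with _slots:
        with _lock:
            _active += 1
            peak_active = max(peak_active, _active)
        try:
            time.sleep(0.01)  # stand-in for the real HTTP call
            return f"response to {prompt}"
        finally:
            with _lock:
                _active -= 1


# More worker threads than slots: the semaphore, not the pool, enforces the cap.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(send_request, [f"prompt {i}" for i in range(6)]))

print(len(results), "requests completed; peak concurrency:", peak_active)
```

Requests beyond the cap wait for a free slot rather than being sent and blocked by the API.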
Your API keys are for more than authentication. From the Account page, you can configure powerful overrides and settings that apply to every API request made with that key. This is perfect for using our API with third-party clients that may not support all of our unique features.
You can set default generation parameters directly on your API key. These settings will override any parameters sent in an API request. This allows you to enforce specific settings or use advanced features not exposed in other interfaces.
Parameters that can be set this way include temperature, top_p, top_k, repetition_penalty, and more.
The multi_models field takes a list of models. The API will randomly select one model from this list for each request, which is great for variety or A/B testing.
The multiplier setting adjusts the LoRA alpha value, controlling the strength of the fine-tune.
The hide_thinking checkbox ensures the model's reasoning process (content within <think>...</think> tags) is stripped from the final output.
Use the "Model Filter" option to specify a whitelist of models that can be accessed with a particular API key. When this key is used to query the /v1/models/textgen-models endpoint, only the models from your filtered list will be returned.
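To make the multi_models and hide_thinking behaviours concrete, here is a client-side sketch of what they amount to. It is an approximation under stated assumptions, not the service's code: the function names and model names are hypothetical, and the regular expression assumes well-formed, non-nested <think>...</think> blocks.

```python
import random
import re

# Matches a <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove reasoning blocks so only the final answer remains."""
    return THINK_BLOCK.sub("", text).strip()


def pick_model(multi_models: list, rng=random) -> str:
    """Pick one model at random from the list, once per request."""
    return rng.choice(multi_models)


# Hypothetical model names and output, for illustration only.
models = ["model-a", "model-b"]
chosen = pick_model(models)
raw_output = "<think>2 + 2 is basic arithmetic.</think>\nThe answer is 4."
cleaned = strip_thinking(raw_output)
print(chosen, "->", cleaned)
```

Setting these options on the API key means a third-party client gets the same effect without implementing any of this itself.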
The Advanced Chat is a powerful interface for interacting with a single model. It offers extensive control over the chat process and session management.
These are the core parameters that control the text generation process. They are available in both Advanced Chat and Arena Chat modes.
DRY (Don't Repeat Yourself) sampling is a set of parameters designed to prevent the model from repeating sequences of tokens.
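The idea behind DRY can be sketched as follows: a token is penalized if choosing it would extend a sequence that already appeared earlier in the context, and the penalty grows with the length of the repeated sequence. The parameter names (multiplier, base, allowed_length), their default values, and the exponential formula follow the commonly described DRY formulation and are assumptions here, not values taken from this page.

```python
def dry_penalties(tokens, multiplier=0.8, base=1.75, allowed_length=2):
    """Return {token: penalty} for tokens that would continue a repetition.

    For every earlier position i, measure how many tokens ending at i match
    the tokens ending at the current end of the context. If that match is at
    least `allowed_length` long, the token that followed position i is
    penalized by multiplier * base ** (match_length - allowed_length).
    """
    penalties = {}
    last = len(tokens) - 1
    for i in range(last):
        # Count matching tokens walking backwards from i and from the end.
        length = 0
        while length <= i and tokens[i - length] == tokens[last - length]:
            length += 1
        if length >= allowed_length:
            candidate = tokens[i + 1]  # token that followed the earlier match
            penalty = multiplier * base ** (length - allowed_length)
            penalties[candidate] = max(penalties.get(candidate, 0.0), penalty)
    return penalties


# The context ends with 1, 2, 3, which earlier was followed by 4.
context = [1, 2, 3, 4, 1, 2, 3]
penalties = dry_penalties(context)
print(penalties)
```

In a sampler, each penalty would be subtracted from the corresponding token's logit before sampling, making the repeated continuation less likely without forbidding it.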
The Arena Chat provides a split-screen interface to compare two models or two different sets of settings simultaneously.