GLM-4.5-Air-Iceblink-v2

Creative model

View on Hugging FaceBack to Models

Hourly Usage

Performance Metrics

Avg. Total Time

26.71s

Avg. TTFT

5.75s

Avg. Prefill TPS

4965.90

Avg. Gen TPS

20.03

Model Information

Context Size

131072

Quantization

r32

Engine

aphrodite

Creation Method

LoRA Finetune

Model Type

GLM45A

Chat Template

GLM4

Reasoning

Yes

Vision

No

Parameters

106B

Added At

11/11/2025


license: mit datasets:

  • zerofata/Instruct-Anime
  • zerofata/Roleplay-Anime-Characters
  • zerofata/Instruct-Anime-CreativeWriting
  • zerofata/Summaries-Anime-FandomPages base_model:
  • zai-org/GLM-4.5-Air

ICEBLINK

VERSION 2

image

Overview

Another re-attempt at GLM 4.5 Air. This time using a different training framework, some updated data and better hyperparameters.

This model is a creative writing and RP model. It's pretty verbose. The intent is to keep the behavior of the original model, but to improve writing, dialogue & creativity.

Compared to the original Iceblink, the effect on this one is more pronounced, with hopefully minimal impact on the intelligence.

SillyTavern Settings

Recommended Roleplay Format

> Actions: In plaintext
> Dialogue: "In quotes"
> Thoughts: *In asterisks*

Recommended Samplers

> Temp: 0.8 - 0.9
> MinP: 0.05
> TopP: 0.95 - 1.00

Instruct

GLM4.5 (no thinking): SillyTavern Preset

Quantizations

Creation Process

Creation Process: SFT

SFT on approx 13 million tokens, SFW / NSFW RP, stories, creative instruct & chat data. Some of the SFW datasets are public and can be found in the model datasets list.

I've switched over from Axolotl to MS-Swift w/ Megatron to train MoE models now. There's a roughly 5-10x speedup in training the models, thanks to escaping the naive MoE implementation in TRL. The training time for this run took only 40 minutes, excluding environment setup time.

A low LR for GLM Air appears to be king. Going any higher, I've found it extremely easy to begin overcooking the model.

Special Thanks

A shoutout to the people in BeaverAI discord that helped me test this model and my intermediate versions.

ddh0 (Madison), Ambius, Dysfunctional & my dude.