GLM-4.5-Air-Iceblink

Creative model

Performance Metrics

| Metric | Value |
| --- | --- |
| Avg. Total Time | 55.51 s |
| Avg. TTFT (time to first token) | 8.26 s |
| Avg. Prefill TPS (tokens/sec) | 2489.70 |
| Avg. Gen TPS (tokens/sec) | 16.42 |

Model Information

| Field | Value |
| --- | --- |
| Context Size | 131072 tokens |
| Quantization | r32 |
| Engine | aphrodite |
| Creation Method | LoRA Finetune |
| Model Type | GLM45A |
| Chat Template | GLM4 |
| Reasoning | Yes |
| Vision | No |
| Parameters | 106B |
| Added At | 10/2/2025 |


license: mit
datasets:
  - zerofata/Instruct-Anime
  - zerofata/Roleplay-Anime-Characters
  - zerofata/Instruct-Anime-CreativeWriting
  - zerofata/Summaries-Anime-FandomPages
base_model:
  - zai-org/GLM-4.5-Air

ICEBLINK

Overview

An experimental GLM-4.5-Air finetune.

I had this one in the works for a while, but struggled to find the right hyperparameters to get the model to behave nicely. Thank you to TheDrummer for helping me out with them.

This is a creative writing and RP model, and it's pretty verbose. The intent is to keep the behavior of the original model while slightly improving writing, dialogue, and creativity.

SillyTavern Settings

Recommended Roleplay Format

> Actions: In plaintext
> Dialogue: "In quotes"
> Thoughts: *In asterisks*
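
An example turn in this format (an illustrative sample, not actual model output):

> She slides the cup across the counter without looking up. "You're late again." *He always is when something's gone wrong.*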

Recommended Samplers

> Temp: 0.8
> MinP: 0.05
> TopP: 0.95
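
If you're running the model outside SillyTavern, the same samplers can be passed at request time. A minimal sketch, assuming an aphrodite-engine (or other OpenAI-compatible) endpoint; the URL, served model name, and the `min_p` extra parameter are assumptions to adapt for your setup:

```python
# Sketch: applying the recommended samplers via an OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:2242/v1", api_key="unused")

response = client.chat.completions.create(
    model="GLM-4.5-Air-Iceblink",           # served model name (assumption)
    messages=[{"role": "user", "content": "Write an opening scene."}],
    temperature=0.8,                         # Temp: 0.8
    top_p=0.95,                              # TopP: 0.95
    extra_body={"min_p": 0.05},              # MinP: 0.05 (engine extension)
)
print(response.choices[0].message.content)
```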

Instruct

GLM4.5 (no thinking): SillyTavern Preset

Quantizations

Creation Process

Creation Process: SFT

SFT on approximately 10 million tokens of SFW / NSFW RP, stories, creative instruct, and chat data.

MoE models are brutal to train even with a small dataset like mine, so I took a different approach than usual: I used a very low learning rate in an effort to avoid having to apply DPO / KTO training afterwards.
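
For reference, here's a hypothetical sketch of that kind of low-LR LoRA SFT setup using peft and transformers; every value (rank, alpha, target modules, the learning rate itself) is an illustrative assumption, not the exact config used here:

```python
# Hypothetical low-LR LoRA SFT configuration sketch.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,                                   # adapter rank (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="glm45-air-iceblink-sft",
    learning_rate=5e-6,                     # deliberately low LR, per the note above
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    bf16=True,
)
```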

I think there's likely a better config to be found, but experimenting with the model to find it is quite draining.