Avg. Total Time: 47.57s
Avg. TTFT: 8.43s
Avg. Prefill TPS: 4290.15
Avg. Gen TPS: 18.40
Context Size: 202752
Quantization: INT8-INT4
Engine: vllm
Creation Method: FFT
Model Type: GLM46F
Chat Template: GLM4
Reasoning: Yes
Vision: No
Parameters: 355B
Added At: 9/26/2025
Join our Discord community.
Check out the GLM-4.6 technical blog, technical report (GLM-4.5), and Zhipu AI technical documentation.
Use GLM-4.6 API services on Z.ai API Platform.
One click to GLM-4.6.
Compared with GLM-4.5, GLM-4.6 brings several key improvements:
We evaluated GLM-4.6 across eight public benchmarks covering agents, reasoning, and coding. Results show clear gains over GLM-4.5, with GLM-4.6 also holding competitive advantages over leading domestic and international models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.

Both GLM-4.5 and GLM-4.6 use the same inference method.
You can check our GitHub for more details.
For general evaluations, we recommend using a sampling temperature of 1.0.
For code-related evaluation tasks (such as LCB), it is further recommended to set:
top_p = 0.95
top_k = 40
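The sampling recommendations above can be collected into a small helper. This is a minimal sketch, not part of the GLM-4.6 release: the function name `recommended_sampling` and the task labels `"general"` and `"code"` are hypothetical, and it assumes the code-task settings are applied on top of the general temperature of 1.0.

```python
from typing import Any, Dict


def recommended_sampling(task: str = "general") -> Dict[str, Any]:
    """Return sampling settings that follow the recommendations above.

    `task` is a hypothetical label: "general" or "code".
    """
    # General evaluations: sampling temperature of 1.0.
    params: Dict[str, Any] = {"temperature": 1.0}
    # Code-related evaluations (such as LCB): additionally set top_p and top_k.
    if task == "code":
        params.update({"top_p": 0.95, "top_k": 40})
    return params


if __name__ == "__main__":
    print(recommended_sampling("general"))
    print(recommended_sampling("code"))
```

The resulting dictionary uses the parameter names `temperature`, `top_p`, and `top_k`, so it should be possible to unpack it into an inference engine's sampling configuration that accepts those names, for example vLLM's `SamplingParams`.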