gemma-4-31B-K1-v5

Creative model

Performance Metrics

Avg. Total Time: 32.26s
Avg. TTFT: 8.65s
Avg. Prefill TPS: 616.33
Avg. Gen TPS: 25.59

Model Information

Context Size: 262144
Quantization: r64
Engine: vllm
Creation Method: LoRA Finetune
Model Type: Gemma31B
Chat Template: Gemma4
Reasoning: Yes
Vision: Yes
Parameters: 31B
Added At: 5/2/2026


license: apache-2.0
language:
  • zh
  • en
base_model:
  • huihui-ai/Huihui-gemma-4-31B-it-abliterated-v2
pipeline_tag: image-text-to-text
library_name: transformers
tags:
  • RL
  • GRPO
  • cognitive ability
  • conversational

K1 is a model post-trained on datasets including iannicity/KIMI-K2.5-1000000x, iannicity/Hunter-Alpha-SFT, and stepfun-ai/Step-3.5-Flash-SFT.

It was subsequently trained with GRPO-based reinforcement learning on Chinese logic problems, resulting in more consistent reasoning and stronger cognition in complex scenarios.

I have also observed side effects from GRPO, such as improved numerical computation ability, which appears related to its influence on layers such as the MLP blocks.
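For context, the core idea behind GRPO is to estimate advantages by normalizing each sampled completion's reward against the other completions in its group, rather than training a separate value model. The sketch below is purely illustrative of that group-relative normalization step; the function name and example rewards are my own and are not taken from K1's actual training code:

```python
from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-8):
    """Group-relative advantages: normalize each reward by the mean
    and (population) std of its sampling group -- the core idea of GRPO."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Example: four sampled answers to one logic problem, scored 1/0 by a verifier
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = grpo_advantages(rewards)
# Correct answers receive positive advantages, incorrect ones negative
```

Because the advantages are centered within each group, they sum to (approximately) zero, so the policy gradient pushes probability mass from below-average completions toward above-average ones for the same prompt.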

If you like my work, you are welcome to support me by buying me a coffee on Ko-fi.

Every bit of your support directly helps me continue creating and allows me to spend more time producing better work:

https://ko-fi.com/ogodwin10