Avg. Total Time: 6.59s
Avg. TTFT: 5.56s
Avg. Prefill TPS: 6.48
Avg. Gen TPS: 24.13
Context Size: 32768
Quantization: r64
Engine: aphrodite
Creation Method: Merge
Model Type: Llama70B
Chat Template: Llama 3
Reasoning: No
Vision: No
Parameters: 70B
Added At: 4/9/2025
A furry finetune model based on L3.3-Cu-Mai-R1-70b, chosen for its exceptional features. *Tail swish*
The following templates are recommended on the original Cu-Mai model page; adjust if needed:
Start Reply With:
'<think> OK, as an objective, detached narrative analyst, let's think this through carefully:'
Reasoning Formatting (no spaces):
'<think>''</think>'
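As a rough illustration of how these two settings fit together, the sketch below prepends the recommended reply prefix to the model's continuation and then separates the reasoning block from the visible reply. It assumes the two quoted strings under "Reasoning Formatting" are the opening and closing tags; the function name `split_reasoning` and the sample completion are hypothetical and are not taken from the model page or any frontend.

```python
# Values copied from the recommendations above.
START_REPLY_WITH = (
    "<think> OK, as an objective, detached narrative analyst, "
    "let's think this through carefully:"
)
THINK_OPEN = "<think>"    # assumed to be the reasoning prefix
THINK_CLOSE = "</think>"  # assumed to be the reasoning suffix


def split_reasoning(completion: str, prefill: str = START_REPLY_WITH) -> tuple[str, str]:
    """Return (reasoning, reply) from the prefill plus the model's continuation.

    Hypothetical helper for illustration only.
    """
    text = prefill + completion
    start = text.find(THINK_OPEN)
    end = text.find(THINK_CLOSE, start + len(THINK_OPEN)) if start != -1 else -1
    if start == -1 or end == -1:
        # No complete reasoning block: treat everything as the visible reply.
        return "", text.strip()
    reasoning = text[start + len(THINK_OPEN):end].strip()
    reply = (text[:start] + text[end + len(THINK_CLOSE):]).strip()
    return reasoning, reply


if __name__ == "__main__":
    # Made-up continuation, standing in for what the model might return.
    sample = " The scene needs tension.</think>The door creaks open."
    reasoning, reply = split_reasoning(sample)
    print("Reasoning:", reasoning)
    print("Reply:", reply)
```

Because the tags are matched exactly, any space inside them (for example `< think>`) would leave the reasoning unparsed and shown as part of the reply, which is presumably why the recommendation says "no spaces".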