Avg. Total Time
18.64s
Avg. TTFT
11.45s
Avg. Prefill TPS
291.91
Avg. Gen TPS
21.61
Context Size
32768
Quantization
r64
Engine
aphrodite
Creation Method
Merge
Model Type
Llama70B
Chat Template
Llama 3
Reasoning
No
Vision
No
Parameters
70B
Added At
4/9/2025
thumbnail: >- https://cdn-uploads.huggingface.co/production/uploads/67c10cfba43d7939d60160ff/o9eVw82xaajZx_zCO9qyJ.png language:
A furry finetune model based on L3.3-Cu-Mai-R1-70b, chosen for its exceptional features. *Tail swish*
The following templates are recommended from the original Cu-Mai model page, Adjust if needed:
Start Reply With:
'<think> OK, as an objective, detached narrative analyst, let's think this through carefully:'
Reasoning Formatting (no spaces):
'<think>''</think>'