Avg. Total Time: 25.70s
Avg. TTFT: 14.85s
Avg. Prefill TPS: 516.31
Avg. Gen TPS: 20.72
Context Size: 32768
Quantization: r64
Engine: aphrodite
Creation Method: Unknown
Model Type: Llama70B
Chat Template: Llama 3
Reasoning: No
Vision: No
Parameters: 70B
Added At: 12/22/2024

I applied the last step of my continuous finetuning method to the Nemotron-70b model from Nvidia. More details below:
Quants: (Coming Soon)
Open-LLM-Leaderboard scores: (Coming Soon)