Avg. Total Time: 21.76s
Avg. TTFT: 6.92s
Avg. Prefill TPS: 2020.50
Avg. Gen TPS: 19.21
Context Size: 32768
Quantization: r64
Engine: aphrodite
Creation Method: LoRA Finetune
Model Type: Llama70B
Chat Template: Llama 3
Reasoning: No
Vision: No
Parameters: 70B
Added At: 12/22/2024
license: cc-by-nc-4.0

Llama-3.1-70B-Hanami-x1
This is an experiment on top of Euryale v2.2, which I think worked out nicely.
It feels different from Euryale, in a good way, and from testing I prefer it over both v2.2 and v2.1.
As usual, the Euryale v2.1 and v2.2 settings work on it.
A min_p of at least 0.1 is recommended for Llama 3-based models.
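For context, min_p sampling keeps only tokens whose probability is at least min_p times the probability of the single most likely token, which prunes the low-probability tail without a fixed cutoff. A minimal plain-Python sketch of that filtering step (the function name is illustrative, not part of any sampler API):

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Return indices of tokens that survive min_p filtering.

    A token survives if its softmax probability is at least
    min_p * (probability of the most likely token).
    """
    # Softmax with the usual max-subtraction for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Scale the cutoff by the top token's probability: a confident
    # distribution prunes aggressively, a flat one keeps more options.
    threshold = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= threshold]

# With min_p=0.1, the clearly unlikely third token is dropped.
print(min_p_filter([5.0, 4.0, 0.0], min_p=0.1))  # [0, 1]
```

In practice you would set this through your backend's sampler settings (e.g. a `min_p` parameter) rather than filtering by hand; the sketch just shows what the 0.1 cutoff does.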
I like it, so try it out?