Avg. Total Time: 6.29s
Avg. TTFT: 6.03s
Avg. Prefill TPS: 249.42
Avg. Gen TPS: 14.04
Context Size: 32768
Quantization: r64
Engine: aphrodite
Creation Method: Merge
Model Type: Llama70B
Chat Template: Llama 3
Reasoning: No
Vision: No
Parameters: 70B
Added At: 12/22/2024
This is a merge of pre-trained language models created using mergekit.
This model was merged using the task arithmetic merge method, with nvidia/Llama-3.1-Nemotron-70B-Instruct-HF as the base.
The following models were included in the merge:
- NeverSleep/Lumimaid-v0.2-70B
The following YAML configuration was used to produce this model:
merge_method: task_arithmetic
base_model: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
models:
  - model: nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
    parameters:
      weight: 1.0
  - model: NeverSleep/Lumimaid-v0.2-70B
    parameters:
      weight: 1.0
dtype: bfloat16
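
For readers unfamiliar with task arithmetic, the following is a minimal sketch of the textbook formulation: each model contributes a "task vector" (its weights minus the base weights), and the weighted task vectors are added back onto the base. The function name `task_arithmetic_merge`, the toy tensors, and the parameter name `layer.weight` are hypothetical and chosen only for illustration. This is not mergekit's code, and mergekit's actual behavior may differ, for example in how it treats a base model that also appears in the `models` list or in any normalization it applies.

```python
from typing import Dict, List

# Toy stand-ins for real tensors: each parameter is a flat list of floats.
Tensor = List[float]
StateDict = Dict[str, Tensor]


def task_arithmetic_merge(
    base: StateDict, models: List[StateDict], weights: List[float]
) -> StateDict:
    """Textbook task arithmetic: merged = base + sum_i w_i * (model_i - base)."""
    merged: StateDict = {}
    for name, base_tensor in base.items():
        out = list(base_tensor)  # start from a copy of the base weights
        for model, weight in zip(models, weights):
            # Add this model's weighted task vector, element by element.
            for i, (m, b) in enumerate(zip(model[name], base_tensor)):
                out[i] += weight * (m - b)
        merged[name] = out
    return merged


# Hypothetical toy weights, mirroring the shape of the config above:
# the base appears in the model list alongside one other model.
base = {"layer.weight": [1.0, 2.0, 3.0]}
other = {"layer.weight": [1.5, 2.0, 2.0]}

merged = task_arithmetic_merge(base, [base, other], [1.0, 1.0])
print(merged)
```

In this sketch, a model that is identical to the base has a task vector of all zeros, so listing it with any weight leaves the result unchanged. Whether that holds for the real merge depends on mergekit's implementation.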