| Field | Value |
|---|---|
| Avg. Total Time | 31.19s |
| Avg. TTFT | 21.18s |
| Avg. Prefill TPS | 22.61 |
| Avg. Gen TPS | 19.78 |
| Context Size | 32768 |
| Quantization | r64 |
| Engine | aphrodite |
| Creation Method | FFT |
| Model Type | Llama70B |
| Chat Template | Llama 3 |
| Reasoning | No |
| Vision | No |
| Parameters | 70B |
| Added At | 1/16/2025 |
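As a rough reading aid for the latency figures above, the sketch below estimates how many tokens an average request involved. It assumes that TTFT covers only prompt prefill, that the remaining time is pure decoding, and that the averages can be combined directly; none of these definitions are stated on the page, so treat the results as order-of-magnitude estimates only.

```python
# Values copied from the metadata above.
avg_total_time_s = 31.19
avg_ttft_s = 21.18
avg_prefill_tps = 22.61
avg_gen_tps = 19.78

# Assumption: everything after the first token is decode time.
decode_time_s = avg_total_time_s - avg_ttft_s

# Assumption: tokens = time * throughput, applied to the averages.
approx_generated_tokens = decode_time_s * avg_gen_tps
approx_prompt_tokens = avg_ttft_s * avg_prefill_tps

print(f"decode time: {decode_time_s:.2f}s")
print(f"approx. generated tokens: {approx_generated_tokens:.0f}")
print(f"approx. prompt tokens: {approx_prompt_tokens:.0f}")
```

Multiplying averages is not the same as averaging per-request products, so these numbers only indicate the scale of the benchmark requests.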
License: other
We introduce Llama3-Athene-70B, an open-weights LLM trained with RLHF on top of Llama-3-70B-Instruct. Athene-70B scores 77.8% on Arena-Hard-Auto, a proxy benchmark for Chatbot Arena, which places it close to the leading proprietary models in the table below.
| Model | Arena-Hard |
|---|---|
| Claude-3.5-Sonnet (Proprietary) | 79.3% |
| GPT-4o (Proprietary) | 79.2% |
| Athene-70B (Open) | 77.8% |
| Gemini-Pro-1.5 (Proprietary) | 72.0% |
| Gemma-2-27B (Open) | 57.0% |
| Llama-3-70B (Open) | 46.6% |
Athene-70B uses the same chat template as Llama-3-70B-Instruct. Below is a simple usage example with the Transformers library.
```python
import transformers
import torch

model_id = "Nexusflow/Athene-70B"

# Load the model in bfloat16 and let Accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Chat messages; the pipeline applies the Llama 3 chat template to them.
messages = [
    {"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
    {"role": "user", "content": "Whooo are you?"},
]

# Stop generation on either the tokenizer's EOS token or <|end_of_text|>.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The last message in the returned conversation is the assistant's reply.
print(outputs[0]["generated_text"][-1])
```
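To make the shared chat template concrete, here is a minimal sketch of how a message list is laid out in the Llama 3 prompt format. The helper `render_llama3_prompt` is hypothetical and written only for illustration; it is not part of Transformers. In practice the tokenizer's own template should be used, for example via `pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`.

```python
from typing import Dict, List


def render_llama3_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Lay out chat messages in the Llama 3 prompt format (illustrative helper)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You can only speak with owl sounds."},
    {"role": "user", "content": "Whooo are you?"},
]
prompt = render_llama3_prompt(example_messages)
print(prompt)
```

Because the format is identical to Llama-3-70B-Instruct, tooling that already builds Llama 3 prompts should work with Athene-70B without template changes.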
We would like to thank the LMSYS Organization for their support in testing the model. We would also like to thank Meta AI and the open-source community for providing the datasets and base models.
@misc{Athene2024,
title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
url = {https://nexusflow.ai/blogs/athene},
author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},
month = {July},
year = {2024}
}