| Field | Value |
|---|---|
| Avg. Total Time | 12.71 s |
| Avg. TTFT (time to first token) | 5.85 s |
| Avg. Prefill TPS (tokens/s) | 648.20 |
| Avg. Gen TPS (tokens/s) | 23.11 |
| Context Size | 32768 |
| Quantization | r64 |
| Engine | aphrodite |
| Creation Method | FFT |
| Model Type | Llama70B |
| Chat Template | Llama 3 |
| Reasoning | No |
| Vision | No |
| Parameters | 70B |
| Added At | 12/22/2024 |
| License | llama3.1 |
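As a rough sanity check, the serving metrics above are mutually consistent if one assumes total time splits cleanly into a prefill phase (up to the first token) and a generation phase. A back-of-the-envelope sketch; the variable names and the two-phase split are our assumptions, not from the source, and real TTFT also includes queueing:

```python
# Rough consistency check on the serving metrics above.
# Assumption: total time = prefill phase (until first token) + generation phase.
avg_total_time = 12.71  # seconds, end to end
avg_ttft = 5.85         # seconds until the first generated token
prefill_tps = 648.20    # prompt tokens processed per second
gen_tps = 23.11         # new tokens produced per second

prompt_tokens = avg_ttft * prefill_tps                    # ~3792 tokens
generated_tokens = (avg_total_time - avg_ttft) * gen_tps  # ~159 tokens
print(round(prompt_tokens), round(generated_tokens))
```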
OpenMath2-Llama3.1-70B is obtained by finetuning Llama3.1-70B-Base on the OpenMathInstruct-2 dataset.
The model outperforms Llama3.1-70B-Instruct on MATH by 3.9%.
| Model | GSM8K | MATH | AMC 2023 | AIME 2024 | Omni-MATH |
|---|---|---|---|---|---|
| Llama3.1-8B-Instruct | 84.5 | 51.9 | 9/40 | 2/30 | 12.7 |
| OpenMath2-Llama3.1-8B (nemo \| HF) | 91.7 | 67.8 | 16/40 | 3/30 | 22.0 |
| + majority@256 | 94.1 | 76.1 | 23/40 | 3/30 | 24.6 |
| Llama3.1-70B-Instruct | 95.8 | 67.9 | 19/40 | 6/30 | 19.0 |
| OpenMath2-Llama3.1-70B (nemo \| HF) | 94.9 | 71.9 | 20/40 | 4/30 | 23.1 |
| + majority@256 | 96.0 | 79.6 | 24/40 | 6/30 | 27.6 |
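The majority@256 rows report majority voting (self-consistency): sample 256 solutions per problem, extract each final \boxed{} answer, and score the most frequent one. A minimal sketch of that aggregation step; the helper names are ours, and the simplified regex deliberately ignores nested braces:

```python
import re
from collections import Counter

def extract_boxed(solution: str) -> str | None:
    # Grab the last \boxed{...} in a generated solution (no nested braces).
    matches = re.findall(r"\\boxed\{([^{}]*)\}", solution)
    return matches[-1] if matches else None

def majority_answer(solutions: list[str]) -> str | None:
    # Majority voting: the most frequent extracted answer across samples wins.
    answers = [a for a in map(extract_boxed, solutions) if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None
```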
The pipeline we used to produce the data and models is fully open-sourced!
See our paper for more details!
Our models are trained with the same "chat format" as Llama3.1-instruct models (same system/user/assistant tokens). Please note that these models have NOT been instruction-tuned on general data and thus might not provide good answers outside of the math domain.
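Because the chat format is unchanged, the stock tokenizer's chat template can be applied directly. A small sketch of what that formatting step looks like, using the standard Transformers chat-template API (the example message is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nvidia/OpenMath2-Llama3.1-70B")

messages = [{"role": "user", "content": "What is 2 + 2?"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the formatted string, not token ids
    add_generation_prompt=True,  # append the assistant header so the model answers next
)
print(prompt)  # Llama 3.1-style <|start_header_id|>...<|end_header_id|> formatting
```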
We recommend following the instructions in our repo to run inference with these models, but here is an example of how to do it through the Transformers API:
```python
import transformers
import torch

model_id = "nvidia/OpenMath2-Llama3.1-70B"

# Build a text-generation pipeline in bfloat16, sharded across available GPUs.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {
        "role": "user",
        "content": "Solve the following math problem. Make sure to put the answer "
                   "(and only answer) inside \\boxed{}.\n\n"
                   "What is the minimum value of $a^2+6a-7$?",
    },
]

outputs = pipeline(
    messages,
    max_new_tokens=4096,
)

# The pipeline returns the full chat history; print the assistant's reply.
print(outputs[0]["generated_text"][-1]["content"])
```
We provide all instructions to fully reproduce our results.
If you find our work useful, please consider citing us!
```bibtex
@article{toshniwal2024openmath2,
  title   = {OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data},
  author  = {Shubham Toshniwal and Wei Du and Ivan Moshkov and Branislav Kisacanin and Alexan Ayrapetyan and Igor Gitman},
  year    = {2024},
  journal = {arXiv preprint arXiv:2410.01560}
}
```
By accessing this model, you agree to the Llama 3.1 license terms and conditions, the acceptable use policy, and Meta's privacy policy.