Llama-3.3+(3.1v3.3)-70B-Dracarys2

All-around Model

Performance Metrics

| Metric | Value |
| --- | --- |
| Avg. Total Time | 42.26 s |
| Avg. TTFT (time to first token) | 22.60 s |
| Avg. Prefill TPS (prompt tokens/s) | 550.64 |
| Avg. Gen TPS (generated tokens/s) | 13.93 |
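
For reference, these metrics fit together as follows: prefill TPS covers prompt processing up to the first token, while generation TPS covers decoding from the first token to the last. A minimal sketch with illustrative numbers (not the measurement code behind this table):

```python
# Sketch of how the latency metrics above are typically derived from one
# request's timestamps. Token counts here are illustrative assumptions.
prompt_tokens = 12_000      # tokens processed during prefill (assumed)
generated_tokens = 275      # tokens produced during decoding (assumed)
ttft = 22.60                # seconds from request start to first token
total_time = 42.26          # seconds from request start to last token

prefill_tps = prompt_tokens / ttft                 # prompt tokens per second
gen_tps = generated_tokens / (total_time - ttft)   # generated tokens per second
print(f"prefill TPS: {prefill_tps:.2f}, gen TPS: {gen_tps:.2f}")
```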

Model Information

| Field | Value |
| --- | --- |
| Context Size | 32768 tokens |
| Quantization | r64 |
| Engine | aphrodite |
| Creation Method | LoRA Finetune |
| Model Type | Llama70B |
| Chat Template | Llama 3 |
| Reasoning | No |
| Vision | No |
| Parameters | 70B |
| Added At | 12/22/2024 |
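
Since the model is served with the Aphrodite engine, which exposes an OpenAI-compatible API, a deployment like this can typically be queried with the standard openai client. A minimal sketch; the base URL, port, API key, and served model name below are assumptions about this particular deployment:

```python
# Minimal sketch of querying an OpenAI-compatible endpoint such as the one
# Aphrodite serves. base_url, api_key, and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:2242/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="Llama-3.3+(3.1v3.3)-70B-Dracarys2",
    messages=[
        {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
    ],
    max_tokens=256,
    temperature=0.6,
)
print(response.choices[0].message.content)
```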


---
license: llama3
library_name: transformers
tags: []
---

Dracarys2-Llama-3.1-70B-Instruct

Built with Meta Llama 3

Introduction

We introduce the latest in the Smaug series: the Dracarys family of finetunes targeting improved coding performance across a variety of base models.

This variant is a finetune of meta-llama/Meta-Llama-3.1-70B-Instruct.

Compared to meta-llama/Meta-Llama-3.1-70B-Instruct, Dracarys has better LiveCodeBench scores (see evaluation results below).

Model Description

How to use

The prompt format is unchanged from Llama 3 70B Instruct (see the evaluation section for the prompts used with LiveCodeBench).
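
Concretely, the Llama 3 chat format wraps each turn in header and end-of-turn tokens. A small sketch of what the tokenizer's chat template renders (output shown schematically; the Llama 3.1 template may also inject date lines into the system block):

```python
# Render a simple system + user exchange with the model's chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("abacusai/Dracarys2-Llama-3.1-70B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Reverse a string in Python."},
]
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
# Roughly:
# <|begin_of_text|><|start_header_id|>system<|end_header_id|>
#
# You are a helpful coding assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>
#
# Reverse a string in Python.<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```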

Use with transformers

See the snippet below for usage with Transformers:

```python
import transformers
import torch

model_id = "abacusai/Dracarys2-Llama-3.1-70B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a data science coding assistant that generates Python code using Pandas and Numpy."},
    {"role": "user", "content": "Write code to select rows from the dataframe `df` having the maximum `temp` for each `city`"},
]

# Render the messages with the Llama 3 chat template.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop on the standard EOS token or the Llama 3 end-of-turn / end-of-text markers.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Strip the prompt from the returned text, leaving only the completion.
print(outputs[0]["generated_text"][len(prompt):])
```
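
For the data-science prompt above, a correct completion would look like the following pandas idiom (a reference answer for illustration, not a captured model output):

```python
# Keep, for each city, the row with the maximum temp. groupby(...).idxmax()
# returns the row index of the per-group maximum, which .loc then selects.
import pandas as pd

df = pd.DataFrame({
    "city": ["NYC", "NYC", "LA", "LA"],
    "temp": [85, 90, 75, 80],
})
result = df.loc[df.groupby("city")["temp"].idxmax()]
print(result)
```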

Evaluation Results

LiveCodeBench

| Model | Code Generation | Code Execution | Test Output Prediction |
| --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 33.44 | 48.26 | 52.10 |
| Meta-Llama-3.1-70B-Instruct | 32.23 | 48.768 | 41.40 |

Breakdown of LiveCodeBench CodeGeneration

| Model | Easy | Medium | Hard |
| --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 71.29 | 18.48 | 3.57 |
| Meta-Llama-3.1-70B-Instruct | 68.4 | 17.99 | 3.57 |

Breakdown of LiveCodeBench CodeExecution

| Model | COT | Non-COT |
| --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 75.55 | 48.26 |
| Meta-Llama-3.1-70B-Instruct | 70.14 | 48.768 |

Breakdown of LiveCodeBench TestOutputPrediction

| Model | Easy | Medium | Hard |
| --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 63.53 | 47.30 | 43.61 |
| Meta-Llama-3.1-70B-Instruct | 51.22 | 35.91 | 34.30 |

LiveBench (Aug update)

| Model | Global Average | Coding Average | Reasoning Average | Mathematics Average | Data Analysis Average | Language Average | IF Average |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Dracarys2-Llama-3.1-70B-Instruct | 47.8 | 36.3 | 47.3 | 38.9 | 46.1 | 41.5 | 76.6 |
| Meta-Llama-3.1-70B-Instruct | 45.1 | 30.7 | 35.3 | 37.0 | 48.4 | 42.1 | 77.2 |