Llama-3.3-70B-Thalassic-Delta

Creative Model

View on Hugging FaceBack to Models

Hourly Usage

Performance Metrics

Avg. Total Time

5.01s

Avg. TTFT

4.31s

Avg. Prefill TPS

2.55

Avg. Gen TPS

19.89

Model Information

Context Size

32768

Quantization

r64

Engine

aphrodite

Creation Method

Merge

Model Type

Llama70B

Chat Template

Llama 3

Reasoning

No

Vision

No

Parameters

70B

Added At

2/9/2025


base_model:

  • TheDrummer/Anubis-70B-v1
  • EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
  • Sao10K/70B-L3.3-Cirrus-x1
  • Sao10K/L3.1-70B-Hanami-x1
  • SicariusSicariiStuff/Negative_LLAMA_70B
  • deepseek-ai/DeepSeek-R1-Distill-Llama-70B library_name: transformers tags:
  • mergekit
  • merge license: llama3.3

I messed around with the the ingredients in the Thalassic series, essentially testing how much of an effect the base and pivot models had on the merge. In my opinion, this is the best of the Thalassic models.

merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the SCE merge method using SicariusSicariiStuff/Negative_LLAMA_70B as a base.

Models Merged

The following models were included in the merge:

Configuration

The following YAML configuration was used to produce this model:

models:
  # Pivot model
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  # Target models
  - model: Sao10K/70B-L3.3-Cirrus-x1
  - model: Sao10K/L3.1-70B-Hanami-x1
  - model: TheDrummer/Anubis-70B-v1
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
merge_method: sce
base_model: SicariusSicariiStuff/Negative_LLAMA_70B
parameters:
  select_topk: 1.0
dtype: bfloat16