Llama-3.3-70B-Wanton-Wolf

Creative Model

Performance Metrics

  • Avg. Total Time: 6.59s
  • Avg. TTFT: 5.56s
  • Avg. Prefill TPS: 6.48
  • Avg. Gen TPS: 24.13

Model Information

  • Context Size: 32768
  • Quantization: r64
  • Engine: aphrodite
  • Creation Method: Merge
  • Model Type: Llama70B
  • Chat Template: Llama 3
  • Reasoning: No
  • Vision: No
  • Parameters: 70B
  • Added At: 4/9/2025


Wanton-Wolf-70B

User Discretion Advised

A furry fine-tune based on L3.3-Cu-Mai-R1-70b, chosen for its exceptional features. *Tail swish*


✧ Quantized Formats


✧ Recommended Settings

  • Static Temperature: 1.0-1.05
  • Min P: 0.02
  • DRY Settings (optional):
    • Multiplier: 0.8
    • Base: 1.75
    • Length: 4
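
For OpenAI-compatible frontends and servers (the card lists aphrodite as the serving engine), these settings map onto per-request sampler fields roughly as sketched below. The endpoint URL, model name, and in particular the DRY field names (dry_multiplier, dry_base, dry_allowed_length) are assumptions, not taken from the card; check your server's documentation for the exact keys it accepts.

```python
# Minimal sketch: applying the recommended samplers through an OpenAI-compatible API.
# The endpoint, model name, and DRY field names are assumptions -- adjust to your server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:2242/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="Llama-3.3-70B-Wanton-Wolf",
    messages=[{"role": "user", "content": "Write a short scene set in a snowy forest."}],
    temperature=1.0,              # static temperature, 1.0-1.05
    extra_body={
        "min_p": 0.02,            # Min P
        # Optional DRY settings (field names assumed):
        "dry_multiplier": 0.8,
        "dry_base": 1.75,
        "dry_allowed_length": 4,
    },
)
print(response.choices[0].message.content)
```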

✧ Recommended Templates

The following templates are recommended from the original Cu-Mai model page; adjust if needed:

  • LLam@ception by @.konnect
  • LeCeption by @Steel - a completely revamped XML version of Llam@ception 1.5.2 with stepped thinking and reasoning

LeCeption Reasoning Configuration:

Start Reply With:

'<think> OK, as an objective, detached narrative analyst, let's think this through carefully:'

Reasoning Formatting (no spaces):

  • Prefix: '<think>'
  • Suffix: '</think>'
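
If your frontend does not handle reasoning blocks natively, the prefill-and-strip behaviour above can be reproduced by hand. The sketch below is a generic illustration rather than part of the card: it appends the "Start Reply With" string so the model continues inside the <think> block, then removes everything between the prefix and suffix before showing the reply.

```python
# Minimal sketch of the "Start Reply With" prefill plus reasoning stripping.
# Frontends such as SillyTavern handle this via their Reasoning Formatting settings.
import re

START_REPLY_WITH = (
    "<think> OK, as an objective, detached narrative analyst, "
    "let's think this through carefully:"
)

def build_prefilled_prompt(formatted_chat: str) -> str:
    # Append the prefill so generation continues inside the <think> block.
    return formatted_chat + START_REPLY_WITH

def strip_reasoning(generated: str) -> str:
    # Reassemble prefill + completion, then drop everything between the
    # '<think>' prefix and '</think>' suffix (no spaces), keeping only the reply.
    full = START_REPLY_WITH + generated
    return re.sub(r"<think>.*?</think>", "", full, flags=re.DOTALL).strip()
```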

✧ Credits

Model Author

Original Model Creator

  • @SteelSkull - Creator of the L3.3-Cu-Mai-R1-70b base model

Contributors ✨