Avg. Total Time
40.81s
Avg. TTFT
40.55s
Avg. Prefill TPS
63.47
Avg. Gen TPS
15.14
Context Size
32768
Quantization
r64
Engine
aphrodite
Creation Method
Merge
Model Type
Llama70B
Chat Template
Llama 3
Reasoning
No
Vision
No
Parameters
70B
Added At
2/22/2025
base_model:
I don't even know anymore

After banging my head against the wall some more - I actually managed to merge DeepSeek distill into my mess! Along with even more models (my hand just slipped, I swear)
The prose is better than in v0.5, but has a different feel to it, so I guess it's more of a step to the side than forward (hence the title EXTRA instead of 0.6).
The context recall may have improved, or I'm just gaslighting myself to think so.
And of course, since it now has DeepSeek in it - <think> tags!
They kinda work out of the box if you add <think> to the 'Start Reply With' field in ST - that way the model will write a really short character thought in it. However, if we want some OOC reasoning, things get trickier.
My initial thought was that this model could be instructed to use <think> either only for {{char}}'s inner monologue or for detached analysis, but actually it would end up writing character thoughts most of the time anyway, and the times when it did reason stuff it threw the narrative out of the window by making it too formal and even adding some notes at the end.
And so the solution was to add a prefill after the <think> tag. There's a lot of room for improvement, but for now, I think this boats the float or whatever:
<think> [Okay, let me think through what's happening from my "morally ambiguous narrator" perspective before I continue this fictional roleplaying session]
*So
If you add the line break after the tag, the output becomes too formal, and if you remove the asterisk, it becomes too censored. Yeah...
Settings:
Prompt format: Llama3
Samplers: 1.2 Temp, 0.025 minP, 0.25 smoothing factor, 2.0 smoothing curve
The things that I have done to bring about this abomination in our world are truly atrocious - as if v0.5 wasn't bad enough. Merging shouldn't be done the way I did it, really. Maybe one day I will bother to put out a branching diagram of this thing, since just listing the merge steps one by one is confusing.