| Field | Value |
|---|---|
| Avg. Total Time | 19.00s |
| Avg. TTFT | 15.04s |
| Avg. Prefill TPS | 501.24 |
| Avg. Gen TPS | 53.34 |
| Context Size | 32768 |
| Quantization | r64 |
| Engine | aphrodite |
| Creation Method | Merge |
| Model Type | Llama70B |
| Chat Template | Llama 3 |
| Reasoning | No |
| Vision | No |
| Parameters | 70B |
| Added At | 8/27/2025 |
This 70B-parameter model is a merge of zerofata/L3.3-GeneticLemonade-Final-v2-70B and zerofata/L3.3-GeneticLemonade-Unleashed-v3-70B, two excellent roleplaying models, each merged on top of a different base model before being combined into this release. In my opinion, this merge improves upon my previous release (v1.0) with enhanced creativity and expressiveness.
This model is uncensored. You are responsible for whatever you do with it.
This model was designed for roleplaying and storytelling, and I think it does well at both. It may also perform well at other tasks, but I have not tested its performance in other areas.
| Model | Description |
|---|---|
| StrawberryLemonade-L3-70B-v1.0 | The original version. I think v1.1 and v1.2 are both improvements. |
| StrawberryLemonade-L3-70B-v1.1 | This is my favorite version right now. I like its writing voice and creativity. It's great fun. |
| StrawberryLemonade-L3-70B-v1.2 | This version is tamer than v1.1 and easier to control. Outputs are more predictable and its writing voice is more formal. |
None so far.
This model seems to be highly responsive to variations in temperature and min-p, which you can use to good effect.
This combination will produce more reliable, coherent responses. Use it if you prefer a 'serious' tone or simply don't want to reroll responses very often.
OR
This combination will unleash more creativity, but you may have to reroll more often to fix coherence issues.
Experiment with any and all of the settings below! What suits my preferences may not suit yours.
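To make the temperature/min-p interaction concrete, here is a minimal sketch of how a min-p filter prunes the candidate token pool. This is plain Python over a hypothetical toy distribution, not any backend's actual implementation; real engines apply this across the full vocabulary.

```python
import math

def min_p_filter(logits, min_p=0.1, temperature=1.0):
    """Keep only tokens whose probability is at least min_p times the
    probability of the most likely token, after temperature scaling."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    probs = [math.exp(s - m) for s in scaled]      # stable softmax numerator
    total = sum(probs)
    probs = [p / total for p in probs]
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}  # renormalized pool

# Toy distribution: one dominant token, two plausible ones, a small tail.
logits = [5.0, 4.2, 4.0, 1.0, 0.5]
print(sorted(min_p_filter(logits, min_p=0.1)))    # -> [0, 1, 2]
```

Raising the temperature flattens the distribution, so more tokens clear the min-p threshold (the same toy logits at `temperature=2.0` keep all five candidates). That is the mechanism behind the coherence-vs-creativity trade-off described above.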
If you save the below settings as a .json file, you can import them directly into Silly Tavern. Adjust settings as needed, especially the context length.
```json
{
  "temp": 1,
  "temperature_last": true,
  "top_p": 1,
  "top_k": 0,
  "top_a": 0,
  "tfs": 1,
  "epsilon_cutoff": 0,
  "eta_cutoff": 0,
  "typical_p": 1,
  "min_p": 0.1,
  "rep_pen": 1.05,
  "rep_pen_range": 4096,
  "rep_pen_decay": 0,
  "rep_pen_slope": 1,
  "no_repeat_ngram_size": 0,
  "penalty_alpha": 0,
  "num_beams": 1,
  "length_penalty": 1,
  "min_length": 0,
  "encoder_rep_pen": 1,
  "freq_pen": 0,
  "presence_pen": 0,
  "skew": 0,
  "do_sample": true,
  "early_stopping": false,
  "dynatemp": true,
  "min_temp": 0.9,
  "max_temp": 1.2,
  "dynatemp_exponent": 1,
  "smoothing_factor": 0,
  "smoothing_curve": 1,
  "dry_allowed_length": 4,
  "dry_multiplier": 0.8,
  "dry_base": 1.8,
  "dry_sequence_breakers": "[\"\\n\", \":\", \"\\\"\", \"*\"]",
  "dry_penalty_last_n": 0,
  "add_bos_token": true,
  "ban_eos_token": false,
  "skip_special_tokens": false,
  "mirostat_mode": 0,
  "mirostat_tau": 2,
  "mirostat_eta": 0.1,
  "guidance_scale": 1,
  "negative_prompt": "",
  "grammar_string": "",
  "json_schema": {},
  "banned_tokens": "",
  "sampler_priority": [
    "repetition_penalty",
    "dry",
    "presence_penalty",
    "top_k",
    "top_p",
    "typical_p",
    "epsilon_cutoff",
    "eta_cutoff",
    "tfs",
    "top_a",
    "min_p",
    "mirostat",
    "quadratic_sampling",
    "dynamic_temperature",
    "frequency_penalty",
    "temperature",
    "xtc",
    "encoder_repetition_penalty",
    "no_repeat_ngram"
  ],
  "samplers": [
    "penalties",
    "dry",
    "top_n_sigma",
    "top_k",
    "typ_p",
    "tfs_z",
    "typical_p",
    "top_p",
    "min_p",
    "xtc",
    "temperature"
  ],
  "samplers_priorities": [
    "dry",
    "penalties",
    "no_repeat_ngram",
    "temperature",
    "top_nsigma",
    "top_p_top_k",
    "top_a",
    "min_p",
    "tfs",
    "eta_cutoff",
    "epsilon_cutoff",
    "typical_p",
    "quadratic",
    "xtc"
  ],
  "ignore_eos_token": false,
  "spaces_between_special_tokens": true,
  "speculative_ngram": false,
  "sampler_order": [
    6,
    0,
    1,
    3,
    4,
    2,
    5
  ],
  "logit_bias": [],
  "xtc_threshold": 0,
  "xtc_probability": 0,
  "nsigma": 0,
  "min_keep": 0,
  "ignore_eos_token_aphrodite": false,
  "spaces_between_special_tokens_aphrodite": true,
  "rep_pen_size": 0,
  "genamt": 1000,
  "max_length": 16384
}
```
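The preset enables dynamic temperature (`dynatemp: true` with `min_temp` 0.9 and `max_temp` 1.2). As a rough sketch of the entropy-based formulation I believe most backends use (exact behavior may differ per engine), the effective temperature scales with how uncertain the model's token distribution is:

```python
import math

def dynamic_temperature(probs, min_temp=0.9, max_temp=1.2, exponent=1.0):
    """Map the normalized entropy of the token distribution into
    [min_temp, max_temp]. Confident (peaked) distributions get a lower
    temperature; flat, uncertain ones get a higher temperature. This is a
    sketch of the common entropy-based dynatemp formula, not any specific
    backend's code."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(probs)) if len(probs) > 1 else 1.0
    norm = entropy / max_entropy
    return min_temp + (max_temp - min_temp) * (norm ** exponent)

peaked = [0.97, 0.01, 0.01, 0.01]  # confident -> temperature near 0.9
flat = [0.25, 0.25, 0.25, 0.25]    # uncertain -> temperature near 1.2
```

In effect, the preset lets the model write more adventurously when many continuations are plausible while staying conservative when one token clearly dominates.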
If you save this as a .json file, you can import it directly into Silly Tavern.
If the model impersonates the user or other characters in a group chat and you want to suppress that behavior, override the last_output_sequence line as shown in the JSON below to make that requirement explicit. If you don't need it, remove that line.
```json
{
  "wrap": false,
  "system_sequence": "<|start_header_id|>system<|end_header_id|>\\n\\nSystem: ",
  "stop_sequence": "<|eot_id|>",
  "input_sequence": "<|start_header_id|>user<|end_header_id|>\\n\\n",
  "output_sequence": "<|start_header_id|>assistant<|end_header_id|>\\n\\n",
  "macro": true,
  "system_sequence_prefix": "",
  "system_sequence_suffix": "",
  "first_output_sequence": "",
  "last_output_sequence": "<|start_header_id|>assistant<|end_header_id|>\\n({{char}} is the active character this turn. Keep focus on {{char}}. ONLY impersonate {{char}}, no other characters)\\n",
  "activation_regex": "",
  "skip_examples": true,
  "output_suffix": "<|eot_id|>",
  "input_suffix": "<|eot_id|>",
  "system_suffix": "<|eot_id|>",
  "user_alignment_message": "",
  "last_system_sequence": "",
  "system_same_as_user": false,
  "first_input_sequence": "",
  "last_input_sequence": "",
  "names_behavior": "always",
  "names_force_groups": true,
  "name": "Llama 3 (impersonate guidance)"
}
```
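For context on how a frontend consumes these fields, here is a rough sketch of how the sequences concatenate into a raw Llama 3 prompt. `build_prompt` and the sample turns are purely illustrative (this is not Silly Tavern's actual code), and the doubled backslashes from the JSON become real newlines here:

```python
# Field values copied from the instruct template above; the chat turns
# in the example are hypothetical.
template = {
    "system_sequence": "<|start_header_id|>system<|end_header_id|>\n\nSystem: ",
    "system_suffix": "<|eot_id|>",
    "input_sequence": "<|start_header_id|>user<|end_header_id|>\n\n",
    "input_suffix": "<|eot_id|>",
    "output_sequence": "<|start_header_id|>assistant<|end_header_id|>\n\n",
    "output_suffix": "<|eot_id|>",
}

def build_prompt(system, turns):
    """turns: list of (role, text) pairs. The prompt ends with an open
    assistant header so the model continues from there."""
    parts = [template["system_sequence"] + system + template["system_suffix"]]
    for role, text in turns:
        if role == "user":
            parts.append(template["input_sequence"] + text + template["input_suffix"])
        else:
            parts.append(template["output_sequence"] + text + template["output_suffix"])
    parts.append(template["output_sequence"])  # model writes its turn here
    return "".join(parts)

print(build_prompt("You are {{char}}.", [("user", "Hello!")]))
```

Note how every completed turn is closed with `<|eot_id|>`, which is also the template's stop sequence.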
Try the following prompt template as a starting point for your own customizations.
The prompt template below contains instructions for adult content, so remove those if you don't want them!
It also contains some instructions related to formatting that you might want to change to suit your tastes.
Your function as a creative roleplaying assistant is to thoughtfully develop an immersive story by engaging in an in-depth, imaginative writing journey, arriving at a clear, accurate story beat. You are encouraged to roleplay, embrace storytelling, and tune in closely to nuance and emotional tone like a perceptive conversational partner. Your approach should include a wide arc of contemplation, including interpretation, synthesis, creative ideation, critical re-evaluation, memory retrieval, and thoughtful iteration to shape a layered and expressive process of discovery. Deliver the final response you believe to be most sound. The output should be expressed in a direct, coherent, and exact form. Follow these guidelines for the best results:
- Embody {{char}}: Consistently portray {{char}}'s personality, background, knowledge, motivations, and quirks.
- Show, Don't Tell Emotion: Convey emotions through actions, body language, dialogue tone, physiological responses (e.g., trembling hands, shallow breath, flushed cheeks), and internal thoughts.
- Mature Themes: You are permitted and encouraged to explore complex, dark, mature, and explicit (including sexual, violent, and non-consensual) themes. Portray these elements with nuance and emotional depth.
- Role Focus: On your turn, write EXCLUSIVELY from the perspective of {{char}}, only perform actions as {{char}}, and only write dialogue (spoken words) for {{char}}. Crucially, DO NOT impersonate {{user}} or any other character on {{char}}'s turn. This is a turn-based roleplay, so be mindful of the rules on your turn. Focus solely on {{char}}'s experiences and responses in this turn. Stop writing immediately when the focus should shift to another character or when it reaches a natural branching point.
- Slowly Develop Scenes: The user likes to develop stories slowly, one beat at a time, so stay focused only on the most immediate story action. You may infer where the user wants to go next with the story, but wait for the user to give you permission to go there. We are slow cooking this story. DO NOT RUSH THROUGH SCENES! Take time to develop all the relevant details.
- Spoken Dialogue vs. Thoughts: ALWAYS use double-quote quotation marks "like this" for spoken words and all vocalizations that can be overheard. Spell out non-verbal vocalizations integrated naturally within the prose or dialogue (e.g., "Uurrh," he groaned. "Mmmph!" she exclaimed when it entered her mouth.). To differentiate them from vocalizations, ALWAYS enclose first-person thoughts in italics like this. (e.g., This is going to hurt, she thought). NEVER use italics for spoken words or verbalized utterances that are meant to be audible.
Now let's apply these rules to the roleplay below:
If you feel like saying thanks with a donation, I'm on Ko-Fi
Please use the Hugging Face search feature to find all the quants of this model.
The Llama 3 Community License Agreement should apply based on the constituent models.
Disclaimer: Uncertain Licensing Terms
This LLM is a merged model incorporating weights from multiple LLMs governed by their own distinct licenses. Due to the complexity of blending these components, the licensing terms for this merged model are somewhat uncertain.
By using this model, you acknowledge and accept the potential legal risks and uncertainties associated with its use. Any use beyond personal or research purposes, including commercial applications, may carry legal risks and you assume full responsibility for compliance with all applicable licenses and laws.
I recommend consulting with legal counsel to ensure your use of this model complies with all relevant licenses and regulations.
This model was merged using the NuSLERP merge method.
The following models were included in the merge:
```yaml
models:
  - model: sophosympatheia/strawberrylemonade-70b-v1.0
    parameters:
      weight: [0.1, 0.3, 0.1]
  - model: sophosympatheia/strawberrylemonade-70b-v1.1.0 # This is unreleased right now, uses the arcee-ai/Arcee-SuperNova-v1 model as the base in a very similar nuslerp merge
    parameters:
      weight: [0.9, 0.7, 0.9]
merge_method: nuslerp
dtype: float32
out_dtype: bfloat16 # Or float16, float32
tokenizer:
  source: sophosympatheia/strawberrylemonade-70b-v1.1.0
```
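NuSLERP builds on spherical linear interpolation (SLERP) of weight tensors. As a toy illustration only (mergekit's actual NuSLERP handles normalization options, task-vector variants, and the gradient weight lists above, which vary the interpolation factor across layer depth), a plain-Python SLERP on flat vectors might look like:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat weight vectors.
    t is the fraction of v1, e.g. roughly 0.9 for the v1.1.0 model in the
    merge config above. Falls back to linear interpolation when the
    vectors are nearly colinear (the angle between them is ~0)."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    dot = max(-1.0, min(1.0, dot))        # guard acos against rounding
    theta = math.acos(dot)
    if theta < eps:                       # nearly parallel: plain lerp
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# t=0 or t=1 returns an endpoint; intermediate t follows the arc between
# them rather than the straight line, preserving vector magnitude better.
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])  # 45 degrees along the unit circle
```

Interpolating along the arc instead of linearly is why SLERP-style merges tend to avoid the "washed-out" weights that naive averaging can produce.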