Avg. Total Time: 45.90s
Avg. TTFT: 21.08s
Avg. Prefill TPS: 11149.26
Avg. Gen TPS: 16.75
Context Size: 262144
Quantization: r64
Engine: vllm
Creation Method: LoRA
Model Type: Qwen35
Chat Template: Qwen3.5
Reasoning: Yes
Vision: Yes
Parameters: 27B
Added At: 4/4/2026
License: apache-2.0
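Since the listing above was benchmarked with vLLM at a 262144-token context, here is a minimal offline-inference sketch under those settings. The model ID is a placeholder (the card does not give the repository name), and the context length should be lowered to fit smaller GPUs.

```python
# Minimal vLLM sketch matching the listed engine and context size.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/qwen3.5-27b-writing",  # placeholder ID, not the real repo
    max_model_len=262144,                  # full listed context; reduce to fit your VRAM
)

params = SamplingParams(max_tokens=512)
out = llm.chat(
    [{"role": "user", "content": "Sketch the opening paragraph of a mystery novel."}],
    sampling_params=params,
)
print(out[0].outputs[0].text)
```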
A tentative second version. Hopefully, it's better.
A writing & roleplay finetune of Qwen3.5 27B. The primary emphasis is on writing quality, since it generalizes strongly across both domains.
The basic idea is to use a curriculum-learning setup to overcome the lack of high-quality roleplay data: first train on lower-quality roleplay data, then on higher-quality writing data. Starting from ConicCat/Qwen3.5-Antirep-27B, the model was trained on a roughly equal mixture of instruct / roleplay / writing data for three epochs, and then for eleven epochs on a smaller dataset of book chunks.
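A rough sketch of that two-stage curriculum, assuming TRL's SFTTrainer and pre-mixed JSONL files. The file names, LoRA rank, and hyperparameters are illustrative assumptions; only the base model, stage order, and epoch counts come from the description above.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

BASE = "ConicCat/Qwen3.5-Antirep-27B"
lora = LoraConfig(r=64, lora_alpha=64, target_modules="all-linear")  # rank is an assumption

# Stage 1: roughly equal instruct / roleplay / writing mixture, three epochs.
stage1 = load_dataset("json", data_files="stage1_mix.jsonl", split="train")  # hypothetical pre-mixed file
trainer = SFTTrainer(
    model=BASE,
    train_dataset=stage1,
    peft_config=lora,
    args=SFTConfig(output_dir="stage1", num_train_epochs=3),
)
trainer.train()

# Stage 2: continue the same adapter on a smaller set of book chunks, eleven epochs.
stage2 = load_dataset("json", data_files="book_chunks.jsonl", split="train")  # private, copyrighted set
trainer = SFTTrainer(
    model=trainer.model,  # reuse the stage-1 PEFT model
    train_dataset=stage2,
    args=SFTConfig(output_dir="stage2", num_train_epochs=11),
)
trainer.train()
trainer.save_model("final")
```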
Use a <think>\n\n</think>\n prefill or a <think>\n prefill; it should think less! The listed sampler values are 0.7 and 0.95, and 0.4-0.8 should work well. Expect roughly 100k context on 24 GB of VRAM, or 20-24k context with the Vulkan backend, although that is fairly tight and may require some fiddling with other open programs, etc. (A short generation sketch using the prefill follows the dataset list below.)

Datasets used:
ConicCat/AntiRep, to mitigate repetition.
internlm/Condor-SFT-20K, for instruct; even though instruct capabilities are not the primary focus, adding some instruct data helps mitigate forgetting and maintains general intelligence and instruction-following capabilities.
ConicCat/Gutenberg-SFT. A version of jondurbin's original Gutenberg DPO dataset reformatted for SFT, with some slight augmentation to address many of the samples being overly long.
ConicCat/MiniC2_V3.2. The venerable C2, with cleaned and reformatted system prompts, and all user / assistant turns replaced by V3.2.
A dataset of backtranslated books. Unfortunately, I am unable to release this set as all of the data is under copyright.
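To illustrate the empty-reasoning prefill recommended above, here is a minimal transformers sketch. The model ID is a placeholder, and mapping the card's listed 0.7 / 0.95 values to temperature and top_p is an assumption.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "your-org/qwen3.5-27b-writing"  # placeholder; the card does not name the repo
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, device_map="auto", torch_dtype="auto")

messages = [{"role": "user", "content": "Write an opening scene set in a rain-soaked harbor town."}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
prompt += "<think>\n\n</think>\n"  # empty reasoning prefill so the model skips straight to prose

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,  # assumed mapping of the card's listed sampler values
    top_p=0.95,
)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```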