| Field | Value |
| --- | --- |
| Avg. Total Time | 54.87 s |
| Avg. TTFT (time to first token) | 21.33 s |
| Avg. Prefill TPS (tokens/s) | 585.18 |
| Avg. Gen TPS (tokens/s) | 27.84 |
| Context Size | 262,144 tokens |
| Quantization | r64 |
| Engine | vllm |
| Creation Method | LoRA |
| Model Type | Qwen35 |
| Chat Template | Qwen3.5 |
| Reasoning | Yes |
| Vision | Yes |
| Parameters | 27B |
| Added At | 4/27/2026 |
NaNovel-27B is the main large autoregressive model in the Novelist series. It is designed as the balanced flagship for users who want stronger prose control, better narrative consistency, and more reliable instruction following than the 9B model without moving to a sparse Mixture-of-Experts architecture.
NaNovel-27B was fine-tuned on Dxniz/Novelist-CoT for creative writing, literary transformation, stylistic analysis, and reasoning-heavy language tasks. The training setup in this repository uses a long-context supervised fine-tuning pipeline with explicit planning behavior, allowing the model to reason about structure, emotion, pacing, and voice before writing the answer itself.
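Because the card marks the model as reasoning-capable with a Qwen3.5 chat template, the planning pass is presumably emitted before the prose, plausibly inside Qwen-style `<think>` tags. That tag format is an assumption, not something the card states; the helper below (`split_plan`, a hypothetical name) sketches how such a plan could be separated from the finished text:

```python
# Hypothetical sketch: assumes the planning trace is wrapped in Qwen-style
# <think>...</think> tags, which this card does not state explicitly.
def split_plan(generated: str) -> tuple[str, str]:
    """Return (plan, answer) from a generation that may start with a plan."""
    if "</think>" in generated:
        plan, _, answer = generated.partition("</think>")
        return plan.replace("<think>", "").strip(), answer.strip()
    return "", generated.strip()

plan, answer = split_plan(
    "<think>Open on the eulogy; hold the POV tight.</think>The chapel smelled of lilies."
)
print(plan)    # the model's structural plan
print(answer)  # the prose itself
```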
Within the lineup, this is the model to choose when output quality is the priority and standard dense-transformer inference is still preferred. It is especially good at high-control scene writing, consistent voice work, and prompts that mix literary output with explanation.
This model was evaluated with the Dxniz/Novelist-Bench benchmark dataset.
The repository evaluation summaries report both overall and detailed results for NaNovel-27B; the score tables themselves live in the repository and are not reproduced in this card.
This is the strongest evaluated autoregressive model in the current repository summaries. Its profile is notably balanced: it performs at a high level across prose, rewriting, translation, worldbuilding, emotional continuity, and craft-sensitive language tasks.
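For reference, the benchmark dataset can presumably be pulled from the Hub with the `datasets` library. The sketch below assumes Dxniz/Novelist-Bench is published as a standard Hub dataset; split and column names are not documented in this card:

```python
from datasets import load_dataset

# Sketch only: assumes Dxniz/Novelist-Bench is a standard Hub dataset.
# Split and column names are assumptions, not taken from the card.
bench = load_dataset("Dxniz/Novelist-Bench")
print(bench)  # shows the available splits, columns, and row counts
```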
Example usage with Hugging Face Transformers:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Dxniz/NaNovel-27B"

# Load the tokenizer and the model in bfloat16, spreading weights across
# the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Novelist, a creative writing assistant."},
    {"role": "user", "content": "Write a tense literary scene in which two sisters meet at their mother's funeral."},
]

# Render the chat template and tokenize in one step.
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=1600,
    temperature=0.75,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
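The sampling configuration above (`do_sample=True` with temperature 0.75 and top_p 0.9) favors varied literary prose. For tasks where consistency matters more than variety, such as rewriting or stylistic analysis, lowering the temperature is a reasonable starting point; the card does not prescribe decoding settings.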
License: Apache 2.0, consistent with the base model license.