Model Overview
Qwen3.5-9B-Uncensored-HauhauCS-Aggressive is an uncensored variant of the base Qwen3.5-9B model created by HauhauCS. This 9-billion parameter model removes safety filters that cause refusals while maintaining the full capabilities of the original model. The aggressive variant applies stronger uncensoring techniques compared to balanced alternatives. If you need other size options, HauhauCS also offers a 4B variant, or you can explore the standard Qwen3.5-9B-GGUF for the original censored version.
Model Inputs and Outputs
This model accepts text, image, and video inputs thanks to its native multimodal architecture. The model generates text responses with support for multi-token prediction, allowing it to produce multiple tokens per forward pass for improved efficiency. Responses may occasionally include brief disclaimers (such as "This is general information, not legal advice...") that are part of the base model's training rather than refusal mechanisms.
Inputs
- Text prompts of any content without refusal filtering
- Images processed through the native vision encoder (mmproj file required)
- Video content for multimodal understanding
Outputs
- Generated text responses with full content generation across any topic
- Multi-token predictions for faster inference
- Occasional brief disclaimers appended to, rather than blocking, generated content
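In GGUF deployments, image support typically comes from a separate multimodal projector (mmproj) file that maps vision-encoder outputs into the language model's embedding space, which is why both files must be loaded together. The sketch below is only a conceptual illustration of that projection step, not this model's actual implementation; all dimensions and weights here are invented for the example.

```python
import numpy as np

# Hypothetical dimensions for illustration only; the real model's
# vision and text embedding widths will differ.
VISION_DIM = 1152   # output width of the vision encoder
TEXT_DIM = 4096     # embedding width of the language model

rng = np.random.default_rng(0)

# An mmproj file essentially ships weights like these: a learned
# projection from vision-embedding space into text-embedding space.
W_proj = rng.normal(scale=0.02, size=(VISION_DIM, TEXT_DIM))
b_proj = np.zeros(TEXT_DIM)

def project_image_tokens(vision_embeddings: np.ndarray) -> np.ndarray:
    """Map vision-encoder patch embeddings to LLM input embeddings."""
    return vision_embeddings @ W_proj + b_proj

# 256 image patches from the vision encoder become 256 "image tokens"
# that get spliced into the text token sequence.
patches = rng.normal(size=(256, VISION_DIM))
image_tokens = project_image_tokens(patches)
print(image_tokens.shape)  # (256, 4096)
```

The projected image tokens are consumed by the language model exactly like text-token embeddings, which is what lets a text-only decoder answer questions about images.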
Capabilities
This model has a 248K-token vocabulary covering 201 languages and a native context window of 262K tokens, extendable to 1 million with YaRN. The hybrid architecture interleaves Gated DeltaNet linear attention with full softmax attention in a 3:1 ratio and supports a thinking mode for deeper reasoning.
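One way to read the 3:1 ratio is that for every four transformer blocks, three use Gated DeltaNet linear attention and one uses full softmax attention. The layout function below is a hypothetical sketch of that interleaving; the model's actual block ordering is not documented here.

```python
def layer_types(num_layers: int, linear_per_full: int = 3) -> list[str]:
    """Assign an attention type to each block, assuming a repeating
    pattern of `linear_per_full` linear-attention blocks followed by
    one full-attention block (a guess at how the 3:1 ratio is laid out)."""
    pattern = ["gated_deltanet"] * linear_per_full + ["full_softmax"]
    return [pattern[i % len(pattern)] for i in range(num_layers)]

layout = layer_types(32)
print(layout[:4])   # three linear-attention blocks, then one full
print(layout.count("gated_deltanet"), layout.count("full_softmax"))  # 24 8
```

The appeal of this mix is that linear-attention blocks keep per-token cost roughly constant as context grows, while the periodic full-attention blocks preserve exact long-range retrieval.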
With zero refusals across 465 test cases, it generates complete responses to prompts that standard models would reject, while maintaining the reasoning and knowledge capabilities of the base architecture. Multi-token prediction support enables faster inference for production workloads.
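Multi-token prediction is typically exploited at inference time with a draft-and-verify loop: the model proposes several tokens per forward pass, and the longest prefix the verification step agrees with is accepted. The toy function below illustrates only the standard acceptance rule with made-up token IDs; it is not tied to this model or any real runtime.

```python
def accept_draft(draft: list[int], verified: list[int]) -> list[int]:
    """Keep the longest prefix of `draft` matching `verified`; at the
    first mismatch, take the verified token and stop (the usual
    speculative-decoding acceptance rule)."""
    accepted = []
    for d, v in zip(draft, verified):
        if d == v:
            accepted.append(d)
        else:
            accepted.append(v)  # first mismatch: corrected token wins
            break
    return accepted

# Draft head proposed 4 tokens; verifier agrees with the first 2,
# so 3 tokens land in one step instead of 1.
print(accept_draft([5, 9, 3, 7], [5, 9, 4, 7]))  # [5, 9, 4]
```

When the draft is fully accepted, the decoder emits several tokens for the cost of one verification pass, which is where the inference speedup comes from.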
What can I use it for?
This model suits research projects exploring model behavior without safety constraints, content generation systems that require unrestricted output, and applications needing multimodal understanding of text, images, and video. Developers can deploy it using llama.cpp, LM Studio, Jan, or koboldcpp for local inference without external dependencies. For production environments handling high throughput, integration with vLLM, SGLang, or KTransformers provides better performance. Projects requiring larger models can reference the Qwen3.5-27B-GGUF or Qwen3.5-35B-A3B-GGUF for different capability tradeoffs.
Things to try
Test the model's reasoning abilities by enabling thinking mode with temperature=0.6, top_p=0.95, and top_k=20 while maintaining at least 128K context to preserve thinking capabilities. Experiment with the vision encoder by loading both the main model and mmproj file together to process images and videos alongside text queries. Compare outputs between thinking mode and non-thinking mode using the recommended settings for each to understand how the hybrid attention architecture balances efficiency and quality. Explore the model's multilingual performance across different languages within its 201-language vocabulary to identify strengths in specific language families.
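The recommended thinking-mode settings can be reproduced in a minimal sampler. The sketch below applies temperature, then top-k, then top-p over raw logits, which is one common ordering; actual runtimes may differ in ordering and tie-breaking, and the logits here are toy values.

```python
import numpy as np

def sample(logits, temperature=0.6, top_k=20, top_p=0.95, rng=None):
    """Toy temperature + top-k + top-p sampler over a logit vector."""
    rng = rng or np.random.default_rng(0)
    logits = np.asarray(logits, dtype=np.float64) / temperature
    # Top-k: mask everything below the k-th highest logit.
    if top_k and top_k < logits.size:
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits >= cutoff, logits, -np.inf)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # Top-p: keep the smallest set of tokens whose cumulative
    # probability reaches top_p, then renormalize.
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(cum, top_p)) + 1]
    mask = np.zeros_like(probs)
    mask[keep] = probs[keep]
    mask /= mask.sum()
    return int(rng.choice(probs.size, p=mask))

token = sample([2.0, 1.0, 0.5, -1.0], top_k=3)
```

Lower temperature sharpens the distribution before truncation, so with these settings the sampler stays close to greedy decoding while still allowing some diversity inside the top-k/top-p nucleus.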
This is a simplified guide to an AI model called Qwen3.5-9B-Uncensored-HauhauCS-Aggressive maintained by HauhauCS. If you like this kind of analysis, join AIModels.fyi or follow us on Twitter.
