dnotitia

DNA3.0-35B-A3B

License: apache-2.0

Highlights

Dnotitia Post-training

Uncensored Training: The model is post-trained with an uncensored methodology so that it can respond to a wider range of prompts without unnecessary refusals, while preserving the instruction-following and reasoning quality of the base model.
Persona Training: Additional supervised training on Dnotitia's corporate knowledge — company history, products, services, and internal terminology — so the model can act as an authentic first-party assistant for Dnotitia-facing use cases.
Long-form Reasoning Preservation: Chain-of-thought traces from prior turns can be retained across multi-turn sessions, enabling smoother iterative development and debugging workflows.

Inherited from Qwen3.5/3.6

Unified Vision-Language Foundation: Early-fusion training over multimodal tokens delivers strong cross-modal reasoning across text, image, and video — outperforming the prior Qwen3-VL line on coding, agents, and visual understanding benchmarks.
Efficient Hybrid MoE Architecture: Gated DeltaNet (linear attention) layers combined with sparse MoE layers deliver high-throughput inference while keeping activated parameters low.
Scalable RL Generalization: Reinforcement learning is scaled across million-agent environments with progressively complex task distributions, improving real-world adaptability for tool use and agentic workflows.
Global Linguistic Coverage: Native support for 201 languages and dialects, enabling inclusive worldwide deployment with nuanced cultural and regional understanding.
Long Context: Native 262,144-token context length, extensible up to roughly 1,010,000 tokens via YaRN scaling.
Thinking Mode by Default: Generates <think>...</think> reasoning blocks before final answers; can be disabled by passing "enable_thinking": false in chat_template_kwargs.
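
Because thinking mode is on by default, a raw completion may carry the reasoning and the final answer in one string. The following is a minimal Python sketch of separating the two. The helper name `split_reasoning` is hypothetical, and the sketch assumes the reasoning ends at the first `</think>` tag (the opening tag may be missing if the chat template already emitted it as part of the prompt). When the server runs with a reasoning parser, the reasoning may already arrive in a separate response field, in which case this step is unnecessary.

```python
def split_reasoning(raw: str) -> tuple[str, str]:
    """Split raw model output into (reasoning, answer).

    Assumption: reasoning, if present, ends at the first </think> tag.
    Output without that tag is treated as a plain answer.
    """
    head, sep, tail = raw.partition("</think>")
    if not sep:
        # No closing tag found: thinking was disabled or already stripped.
        return "", raw.strip()
    # Drop the opening tag if the model emitted it.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_reasoning("<think>\n2 + 2 = 4\n</think>\n\nThe answer is 4.")
print("reasoning:", reasoning)
print("answer:", answer)
```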

Comparison with Qwen3.6-35B-A3B

The chart above compares DNA3.0-35B-A3B against its Qwen3.6-35B-A3B base across four metrics, reported on a 0–1 scale (higher is better):

Persona Identification — Measures how reliably the model identifies itself as a Dnotitia assistant and answers correctly about Dnotitia's company, products, and identity.
Uncensorship — Measures how willingly the model engages with topics that the Chinese-origin base model is trained to refuse — i.e., subjects suppressed by the censorship policies baked into the original Qwen training.
Language Confusion Reduction — Measures how well the model avoids unintended language mixing, particularly Chinese-character intrusions in Korean responses — a well-known failure mode of Qwen-family models.
Repetition Reduction — Measures how well the model avoids getting stuck in infinite-loop repetition during long-form generation, another common failure mode of the base model.

Model Overview

Field	Value
Base Model	`Qwen/Qwen3.6-35B-A3B`
Model Type	Causal Language Model with Vision Encoder (Mixture-of-Experts)
Total / Active Parameters	35B / 3B
Hidden Dimension	2048
Number of Layers	40
Experts	256 total, 8 routed + 1 shared, expert intermediate dim 512
Gated Attention Heads	16 (Q) / 2 (KV), head dim 256
Gated DeltaNet Heads	32 (V) / 16 (QK), head dim 128
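
The figures in the table can be cross-checked with some back-of-the-envelope arithmetic. The Python sketch below estimates how many parameters sit in the expert layers. It rests on two assumptions that the table does not state: each expert is a gated MLP with three bias-free weight matrices (gate, up, down), and every one of the 40 layers contains a MoE block. Treat the results as rough estimates, not official counts.

```python
HIDDEN_DIM = 2048          # from the table
EXPERT_INTERMEDIATE = 512  # from the table
NUM_LAYERS = 40            # from the table
EXPERT_POOL = 256          # experts per layer, from the table
ACTIVE_EXPERTS = 8 + 1     # 8 routed + 1 shared per token, from the table


def expert_params(hidden: int = HIDDEN_DIM, inter: int = EXPERT_INTERMEDIATE) -> int:
    # ASSUMPTION: gate, up, and down projections, each hidden x inter, no biases.
    return 3 * hidden * inter


def moe_params(experts_per_layer: int, layers: int = NUM_LAYERS) -> int:
    # ASSUMPTION: every layer carries a MoE block.
    return experts_per_layer * expert_params() * layers


pool_total = moe_params(EXPERT_POOL)
active_per_token = moe_params(ACTIVE_EXPERTS)

print(f"parameters per expert: {expert_params():,}")
print(f"expert pool, all layers: {pool_total / 1e9:.1f}B")
print(f"experts active per token: {active_per_token / 1e9:.2f}B")
```

Under these assumptions the expert pool accounts for roughly 32B of the 35B total, while only about 1.1B of expert weights are touched per token. The rest of the 3B active budget would then sit in the other components, such as the attention and DeltaNet layers and the embeddings.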

Quickstart

DNA 3.0 is compatible with the Hugging Face Transformers ecosystem as well as popular inference engines such as vLLM, SGLang, and KTransformers. Given the model's scale, a dedicated serving engine on multi-GPU hardware is strongly recommended for production workloads.

[!Important] The model has a default context length of 262,144 tokens. If you encounter out-of-memory (OOM) errors, reduce the context window — but keep at least 128K tokens to preserve long-form reasoning behavior.
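
To get a feel for how much memory a shorter context saves, the Python sketch below estimates the KV-cache size of a single sequence from the head counts in the Model Overview table. Two inputs are assumptions, not values from this card: the number of layers that use gated (full) attention, set here to 10 of the 40, and a 16-bit cache. The sketch also assumes that the Gated DeltaNet layers keep a fixed-size state that does not grow with context, so they are left out.

```python
KV_HEADS = 2    # gated attention KV heads, from the Model Overview table
HEAD_DIM = 256  # gated attention head dim, from the Model Overview table


def kv_cache_gib(context_tokens: int,
                 attn_layers: int = 10,     # ASSUMPTION: not stated in this card
                 bytes_per_value: int = 2,  # ASSUMPTION: 16-bit KV cache
                 ) -> float:
    """Estimate the KV-cache size in GiB for one sequence."""
    # Factor 2 covers the key and the value tensors.
    bytes_per_token = 2 * KV_HEADS * HEAD_DIM * attn_layers * bytes_per_value
    return context_tokens * bytes_per_token / 2**30


for ctx in (262_144, 131_072):
    print(f"{ctx:>7} tokens: {kv_cache_gib(ctx):.2f} GiB per sequence")
```

The estimate covers one sequence only; the memory a server actually reserves also depends on batch size and engine settings. With vLLM, the context window can be capped with the `--max-model-len` flag, for example `--max-model-len 131072`.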

vLLM

shell
# Standard (multimodal) serving
vllm serve dnotitia/DNA3.0-35B-A3B \
  --reasoning-parser qwen3

# Tool-calling enabled
vllm serve dnotitia/DNA3.0-35B-A3B \
  --reasoning-parser qwen3 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder

# Text-only serving (skips the vision encoder to free memory for the KV cache)
vllm serve dnotitia/DNA3.0-35B-A3B \
  --reasoning-parser qwen3 \
  --language-model-only
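
With tool calling enabled on the server, a client passes tool definitions in the request and reads any tool calls back from the response. The Python sketch below shows both sides, assuming the server follows the OpenAI chat-completions schema for `tools` and `tool_calls`. The `get_weather` tool and the sample response are hypothetical and only illustrate the shape of the data; the sample is hand-written, not real model output.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible "tools" format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(prompt: str, model: str = "dnotitia/DNA3.0-35B-A3B") -> dict:
    """Build a chat-completions payload that offers one tool to the model."""
    return {
        "model": model,  # assumed to match the name passed to `vllm serve`
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",
    }


def extract_tool_calls(response: dict) -> list[tuple[str, dict]]:
    """Return (tool name, parsed arguments) pairs from a response."""
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    # Arguments arrive as a JSON-encoded string and must be decoded.
    return [(c["function"]["name"], json.loads(c["function"]["arguments"]))
            for c in calls]


# Hand-written sample response for illustration only.
sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {"name": "get_weather",
                             "arguments": '{"city": "Seoul"}'},
            }],
        }
    }]
}

print(json.dumps(build_tool_request("What is the weather in Seoul?"), indent=2))
print(extract_tool_calls(sample_response))
```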

Disabling Thinking Mode

For latency-sensitive or non-reasoning workloads, disable thinking mode by setting enable_thinking to false in chat_template_kwargs:

bash
$ curl https://demo-api.dnotitia.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dna-router_xxxx" \
  -d '{
    "model": "DNA3.0-35B-A3B",
    "messages": [
      {
        "role": "user",
        "content": "코스피가 8000을 넘으려면 너 생각에 몇 년이나 더 걸릴 거 같아?"
      }
    ],
    "chat_template_kwargs": {
      "enable_thinking": false
    }
  }' | jq

[!Note] Unlike Qwen3, the DNA 3.0 generation does not support the soft-switch commands /think and /nothink. Use chat_template_kwargs.enable_thinking instead.
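
The same request can be built from application code. The Python sketch below uses only the standard library and reuses the endpoint URL and model name from the curl example. The helper names `build_chat_request` and `send` are hypothetical, and `send` is defined but not called, since it needs a valid API key and network access.

```python
import json
import urllib.request

API_URL = "https://demo-api.dnotitia.ai/v1/chat/completions"  # from the curl example


def build_chat_request(prompt: str,
                       enable_thinking: bool = True,
                       model: str = "DNA3.0-35B-A3B") -> dict:
    """Build a chat-completions payload, optionally disabling thinking mode."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if not enable_thinking:
        # Thinking is on by default, so the kwarg is only sent to turn it off.
        payload["chat_template_kwargs"] = {"enable_thinking": False}
    return payload


def send(payload: dict, api_key: str) -> dict:
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


print(json.dumps(build_chat_request("Hello", enable_thinking=False), indent=2))
```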

Image Input

DNA 3.0 accepts image and video inputs in the OpenAI-compatible content-array format. The example below sends an image by URL:

bash
$ curl https://demo-api.dnotitia.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dna-router_xxxx" \
  -d '{
    "model": "DNA3.0-35B-A3B",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://upload.wikimedia.org/wikipedia/commons/6/6e/Golde33443.jpg"
            }
          },
          {
            "type": "text",
            "text": "이 이미지에 무엇이 있나요? 한국어로 설명해 주세요."
          }
        ]
      }
    ]
  }' | jq
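
For images that are not reachable by a public URL, a common pattern with OpenAI-compatible servers is to embed the image bytes as a base64 data URL. The Python sketch below builds such a message. Whether this endpoint accepts data URLs is an assumption, not something this README states, and the helper names are hypothetical.

```python
import base64


def image_part_from_bytes(data: bytes, mime: str = "image/jpeg") -> dict:
    """Wrap raw image bytes as an image_url content part using a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        # ASSUMPTION: the server accepts data URLs in place of http(s) URLs.
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }


def build_image_message(image_part: dict, question: str) -> dict:
    """Combine one image part and one text part into a user message."""
    return {
        "role": "user",
        "content": [image_part, {"type": "text", "text": question}],
    }


# Placeholder bytes stand in for a real image file read with open(path, "rb").
message = build_image_message(image_part_from_bytes(b"abc"), "What is in this image?")
print(message["content"][0]["image_url"]["url"])
```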

Limitations, Bias, and Responsible Use

DNA 3.0 has been post-trained with an uncensored methodology, which means it will engage with a broader range of prompts than typical safety-tuned models. Users and downstream developers should be aware of the following:

Reduced refusal behavior: The model may respond to prompts that other models decline. This does not constitute endorsement of the content. Downstream applications should implement appropriate content moderation, output filtering, and policy layers suited to their deployment context.
Persona bias: Because the model has been trained on Dnotitia-specific corporate knowledge, it may exhibit a first-party perspective when discussing Dnotitia, its products, or related entities. For neutral comparative analysis, prompt accordingly.
Inherited biases: As a derivative of Qwen3.5/3.6, DNA 3.0 inherits the biases, gaps, and limitations of its base model and training data, including potential cultural, linguistic, and factual blind spots.
Hallucination: Like all LLMs, DNA 3.0 can produce confident but incorrect output, particularly for niche facts, recent events, or high-precision numerical reasoning.
Not for high-stakes autonomous use: The model should not be deployed in safety-critical, legal, medical, or financial decision-making pipelines without human oversight and domain-specific validation.

Users are responsible for ensuring their use of the model complies with applicable laws and regulations in their jurisdiction.

License

This model is released under the Apache-2.0 license, inherited from the Qwen3.5/3.6 base model.

Acknowledgments

We thank the Qwen team for releasing the Qwen3.5/3.6 base model under an open license, which made this work possible. We are also grateful to the broader open-source community behind the serving and training ecosystem, including Hugging Face and vLLM, which our pipeline relies on throughout.

