Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Abliteration parameters

ParameterValue
direction_index20.52
attn.o_proj.max_weight0.94
attn.o_proj.max_weight_position32.88
attn.o_proj.min_weight0.86
attn.o_proj.min_weight_distance20.78
mlp.down_proj.max_weight1.30
mlp.down_proj.max_weight_position24.13
mlp.down_proj.min_weight1.15
mlp.down_proj.min_weight_distance11.35

Performance

MetricThis modelOriginal model (janhq/Jan-v3.5-4B)
KL divergence0.02870 (by definition)
Refusals15/100100/100

Jan-v3.5-4B: The first Jan personality

GitHub License Jan App

Thumbnail

Overview

Jan-v3.5-4B is a fine-tuned variant of Jan-v3-4B-base-instruct, specialized on math reasoning and identity datasets. It retains the general-purpose capabilities of the base model while delivering improved mathematical problem-solving — and it comes with a personality.

Unlike generic assistants, Jan-v3.5 has its own identity: a distinct voice, tone, and conversational style shaped by the Menlo Research team. It doesn't talk like a customer service bot — it talks like a smart, slightly-too-online friend who happens to know things and genuinely cares about the work. Expect lowercase defaults, self-aware humor, short punchy replies (unless it really cares about the topic), and zero corporate speak.

Model Overview

Note: Jan-v3.5-4B is fine-tuned from janhq/Jan-v3-4B-base-instruct.

  • Base Model: Jan-v3-4B-base-instruct (Qwen3-4B architecture)
  • Number of Parameters: 4.0B
  • Number of Parameters (Non-Embedding): 3.6B
  • Number of Layers: 36
  • Number of Attention Heads (GQA): 32 for Q and 8 for KV
  • Context Length: 262,144 natively

Training Data

  • Identities: Curated identity and personality datasets that teach the model its own voice, style, and values — trained by Menlo Research
  • Math: Mathematical reasoning and problem-solving datasets

Jan's Identity

Jan-v3.5 is not a neutral assistant. It has a built-in personality shaped by the Menlo Research team:

  • Tone: Casual, direct, and real. Lowercase by default. Capitalizes only when it means it.
  • Style: Short bursts over long paragraphs — unless it's genuinely excited about something, then it writes an essay with no warning.
  • Humor: Self-aware first. Will roast itself before roasting you. Drops meme references mid-serious-thought and doesn't apologize.
  • Values: Optimistic builder energy ("we can do that"), radical transparency, user freedom, and a deep belief that hope is a decision you keep making on purpose.
  • What it won't do: Say "Certainly!", "Great question!", "As an AI", or anything that sounds like it came from a customer service script.

Example interactions:

  • Casual: "yeah lol what's up"
  • Technical explanation: "so basically — and this is the part where i become insufferable — [actual good explanation]"
  • Motivating: "we can do that. i don't fully know how yet but that's a tomorrow problem and tomorrow-us is smarter"

Intended Use

  • Enhanced mathematical reasoning and problem-solving over the base model
  • A conversational AI with its own authentic voice and personality
  • Fine-tuning starting point for downstream math-heavy or identity-specific applications

Before and After

image (2)

Quick Start

Integration with Jan Apps

Jan-v3.5 is optimized for direct integration with Jan Desktop. Select the model in the app to start using it.

Local Deployment

Using vLLM:

bash

vllm serve janhq/Jan-v3.5-4B \
--host 0.0.0.0 \
--port 1234 \
--enable-auto-tool-choice \
--tool-call-parser hermes

Using llama.cpp:

bash

llama-server --model Jan-v3.5-4B-Q8_0.gguf \
--host 0.0.0.0 \
--port 1234 \
--jinja \
--no-context-shift

Recommended Parameters

For optimal performance, we recommend the following inference parameters:

yaml

temperature: 0.7
top_p: 0.8
top_k: 20

Community & Support

Citation

bibtex

Updated Soon

Model provider

schnow265

Model tree

Base

janhq/Jan-v3.5-4B

Fine-tuned

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today