ttrpg

mosslight-4b

Model Details

Model name: Mosslight 4B
Model ID: ttrpg/mosslight-4b
Base model: Qwen/Qwen3.5-4B
Derivative type: fine-tuned and merged full-weight release
Architecture: Qwen3_5ForConditionalGeneration
Model type: vision-language causal generation
Parameters: approximately 4B
Native context length: 262,144 tokens, as inherited from the base config
License: Apache 2.0, inherited from the base model
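
As a rough guide to hardware sizing, the parameter count above can be turned into an estimate of the memory needed for the weights alone. The sketch below assumes exactly 4e9 parameters (the card only says "approximately 4B") and ignores activations, the KV cache, and framework overhead, so treat the numbers as a lower bound rather than a measured footprint. The function name is illustrative.

```python
def estimate_weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the approximate size of the model weights alone, in GiB."""
    return num_params * bytes_per_param / (1024 ** 3)


# Assumption: "approximately 4B" is treated as exactly 4e9 parameters.
NUM_PARAMS = 4e9

# Bytes per parameter for a few common storage dtypes.
for dtype_name, width in [("float32", 4), ("bfloat16/float16", 2), ("int8", 1)]:
    size = estimate_weight_memory_gib(NUM_PARAMS, width)
    print(f"{dtype_name}: ~{size:.1f} GiB for weights only")
```

Long prompts near the 262,144-token context limit add KV-cache memory on top of this, which the estimate does not cover.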

Lineage

This model is a fine-tuned, merged derivative of Qwen3.5-4B from Alibaba Cloud/Qwen. The original Apache 2.0 license is preserved in LICENSE, and derivative attribution is documented in NOTICE.

Training and merge details should be completed before publishing a final public version.

Training Details

Base checkpoint: Qwen/Qwen3.5-4B
Fine-tuning method: TODO
Training data: TODO
Merge method: TODO
Output format: merged full weights in sharded Safetensors format
Post-training evaluation: TODO

Files

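config.json: model architecture and multimodal configuration.
model.safetensors-00001-of-00002.safetensors, model.safetensors-00002-of-00002.safetensors: merged model weights, split into two shards.
model.safetensors.index.json: index that maps each tensor name to the shard that stores it.
tokenizer.json, tokenizer_config.json, vocab.json, merges.txt: tokenizer vocabulary, merge rules, and settings.
chat_template.jinja: Jinja template used to render chat messages into a prompt.
preprocessor_config.json, video_preprocessor_config.json: image and video preprocessing settings.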
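
After downloading the repository, it can be worth checking that every shard named in the index is actually present. The following is a minimal sketch that relies only on the standard sharded-Safetensors index layout (a top-level `weight_map` from tensor name to shard filename). The helper name `missing_shards` and the local directory path are hypothetical.

```python
import json
from pathlib import Path


def missing_shards(model_dir: str,
                   index_name: str = "model.safetensors.index.json") -> list[str]:
    """Return shard filenames referenced by the index but absent from model_dir."""
    root = Path(model_dir)
    index = json.loads((root / index_name).read_text(encoding="utf-8"))
    # "weight_map" maps each tensor name to the shard file that stores it.
    referenced = set(index["weight_map"].values())
    return sorted(name for name in referenced if not (root / name).is_file())


if __name__ == "__main__":
    # Hypothetical local path; point this at your downloaded snapshot.
    snapshot = Path("./mosslight-4b")
    if snapshot.is_dir():
        print(missing_shards(str(snapshot)) or "all shards present")
```

An empty list means the index and the files on disk agree. It does not verify that the shard contents are intact.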

Usage

Install a Transformers build that supports Qwen3.5, then load the model using the standard Hugging Face APIs.

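```python
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "ttrpg/mosslight-4b"

# The processor bundles the tokenizer, chat template, and image/video preprocessors.
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    device_map="auto",   # place weights on the available device(s); requires accelerate
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    trust_remote_code=True,
)

# A single text-only user turn.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Briefly introduce yourself."},
        ],
    }
]

# Render the chat template and tokenize in one step.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens so the prompt is not echoed back.
prompt_length = inputs["input_ids"].shape[-1]
print(processor.decode(outputs[0][prompt_length:], skip_special_tokens=True))
```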
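
Because the model also accepts images, a visual question answering turn can be built the same way as the text-only turn above. The sketch below uses the content-part layout commonly used with Transformers multimodal chat templates (`{"type": "image", "url": ...}`). Whether this model's chat_template.jinja accepts the `url` key exactly as written is an assumption, and both the helper name and the example URL are hypothetical.

```python
def build_vqa_messages(image_url: str, question: str) -> list[dict]:
    """Build a single-turn chat containing one image and one text question."""
    return [
        {
            "role": "user",
            "content": [
                # Assumed image part format; check the model's chat template.
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


# Hypothetical placeholder URL; replace it with a real, reachable image.
messages = build_vqa_messages(
    "https://example.com/battle-map.png",
    "What terrain is shown on this map?",
)
```

The resulting `messages` list can be passed to `processor.apply_chat_template` with the same arguments as the text-only example.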

Serving

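Before serving this model, confirm that your framework supports the Qwen3_5ForConditionalGeneration architecture and loads the multimodal processor files (preprocessor_config.json, video_preprocessor_config.json, and chat_template.jinja).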

Example model identifier:

```bash
ttrpg/mosslight-4b
```
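
Part of that confirmation can be done locally before starting a server. The sketch below checks a downloaded snapshot for the processor files listed under Files and for the expected architecture string in config.json. It does not tell you whether a given serving framework supports the architecture; that still has to come from the framework's own documentation. The function name `preflight` and the choice of required files are assumptions made for illustration.

```python
import json
from pathlib import Path

EXPECTED_ARCHITECTURE = "Qwen3_5ForConditionalGeneration"

# Assumed minimum set of files a multimodal server would need from the snapshot.
REQUIRED_FILES = [
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "chat_template.jinja",
    "preprocessor_config.json",
    "video_preprocessor_config.json",
]


def preflight(model_dir: str) -> list[str]:
    """Return a list of human-readable problems found in a local snapshot."""
    root = Path(model_dir)
    problems = [f"missing file: {name}"
                for name in REQUIRED_FILES if not (root / name).is_file()]

    config_path = root / "config.json"
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        architectures = config.get("architectures", [])
        if EXPECTED_ARCHITECTURE not in architectures:
            problems.append(f"unexpected architectures: {architectures}")
    return problems
```

An empty result only means the snapshot looks complete; a short test request against the running server is still the real check.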

Intended Use

Mosslight 4B is intended for experimentation with compact multimodal assistant workflows, text generation, visual question answering, and local model serving.

Limitations

No independent benchmark results are published for this custom release yet.
Behavior and safety characteristics should be evaluated for your target use case before deployment.
This model inherits limitations from the Qwen3.5-4B base model and from the fine-tuning and merge process used for this release.
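
Since no benchmark results are published, a small smoke-test harness is one way to start evaluating behavior for a specific use case. The sketch below is not an evaluation method used for this release. `run_smoke_checks`, the stub generator, and the example checks are all hypothetical, and a real evaluation would wrap the model's actual generate call and use far more cases.

```python
from typing import Callable

Check = Callable[[str], bool]


def run_smoke_checks(generate_fn: Callable[[str], str],
                     cases: list[tuple[str, Check]]) -> float:
    """Return the fraction of cases whose generated output passes its check."""
    if not cases:
        raise ValueError("at least one case is required")
    passed = sum(1 for prompt, check in cases if check(generate_fn(prompt)))
    return passed / len(cases)


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same reply."""
    return "I am a placeholder reply."


# Each case pairs a prompt with a predicate over the model's output.
cases = [
    ("Briefly introduce yourself.", lambda out: len(out.strip()) > 0),
    ("Reply with one sentence.", lambda out: out.count(".") <= 1),
]

score = run_smoke_checks(fake_generate, cases)
print(f"pass rate: {score:.0%}")
```

Replacing `fake_generate` with a function that calls the loaded model lets the same cases be rerun after each fine-tuning or merge change.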

Attribution

Mosslight 4B is a fine-tuned, merged derivative based on Qwen3.5-4B. Please retain the Apache 2.0 license and attribution notices when redistributing this model or derivatives of it.

Model provider: ttrpg
Model tree: Qwen/Qwen3.5-4B (base), this model (fine-tuned)
Input modalities: video, text, image
Output modalities: text


