achuthc1298

qwen_llm_scs

Model details

Architecture: Qwen3_5ForConditionalGeneration (model_type: qwen3_5)
Base model: Qwen/Qwen3.6-27B (full VLM)
Adaptation: LoRA r=16, alpha=32, dropout 0.05, continued pre-training
LoRA targets: q_proj, k_proj, v_proj, o_proj, out_proj, gate_proj, up_proj, down_proj (language layers only — vision tower not touched)
Precision: BF16 (the FP8 base weights were dequantized at load time, then the LoRA adapter was merged)
Size: ~51 GB, 12 safetensors shards
Domain: spinal cord stimulation clinical and engineering literature
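
To make the "LoRA merged" step concrete, here is a minimal sketch of what merging does to a single weight matrix. It assumes the standard LoRA scaling of alpha / r (the card does not say whether a variant such as rsLoRA was used), and the toy shapes, the rank of 1, and the helper names `matmul` and `merge_lora` are illustrative only, not the author's training code.

```python
R, ALPHA = 16, 32          # values from the model card
SCALE = ALPHA / R          # standard LoRA scaling factor (assumed), 2.0 here


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_b, lora_a, scale):
    """Return W + scale * (B @ A), the merged weight."""
    delta = matmul(lora_b, lora_a)
    return [[wij + scale * dij for wij, dij in zip(wr, dr)] for wr, dr in zip(w, delta)]


# Toy shapes (assumption): W is 2x3, rank 1 instead of 16 to keep it readable.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
B = [[0.5], [1.0]]            # out_features x rank
A = [[1.0, 2.0, 0.0]]         # rank x in_features
x = [[1.0], [1.0], [1.0]]     # one input column vector

merged = merge_lora(W, B, A, SCALE)

# Unmerged forward pass: base output plus the scaled adapter output.
adapter_path = [[b + SCALE * d for b, d in zip(r1, r2)]
                for r1, r2 in zip(matmul(W, x), matmul(B, matmul(A, x)))]

print(merged)
print(matmul(merged, x) == adapter_path)
```

After merging, the adapter matrices are no longer needed at inference time, which is why the repository ships plain safetensors shards rather than a separate adapter.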

Usage

```python
import torch
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

repo = "achuthc1298/qwen_llm_scs"

processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    repo,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    attn_implementation="sdpa",
)
model.eval()

# Text-only
messages = [{"role": "user", "content": [
    {"type": "text", "text": "Summarize the principle of high-frequency SCS."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=400, do_sample=False)
# Decode only the tokens generated after the prompt
print(processor.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))

# Vision (figure from a paper)
img = Image.open("figure.png").convert("RGB")
messages = [{"role": "user", "content": [
    {"type": "image", "image": img},
    {"type": "text", "text": "Describe this figure."},
]}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=400, do_sample=False)
print(processor.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
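
The text-only and vision examples repeat the same template, generate, and decode steps, so they can be folded into two small helpers. This is only a sketch: `build_messages` and `generate_reply` are hypothetical names, not functions shipped with the repo, and `generate_reply` assumes `model` and `processor` were loaded as shown above.

```python
def build_messages(text, image=None):
    """Build a single-turn user message; `image` is optional (e.g. a PIL image)."""
    content = []
    if image is not None:
        content.append({"type": "image", "image": image})
    content.append({"type": "text", "text": text})
    return [{"role": "user", "content": content}]


def generate_reply(model, processor, messages, max_new_tokens=400):
    """Run greedy generation and decode only the newly generated tokens."""
    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    prompt_len = inputs["input_ids"].shape[1]
    return processor.decode(out[0, prompt_len:], skip_special_tokens=True)


# The message structure can be inspected without loading the model.
print(build_messages("Describe this figure.", image="<PIL image goes here>"))
```

With the model loaded, a call would look like `generate_reply(model, processor, build_messages("Describe this figure.", img))`.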

Hardware

Tested on 2× RTX A6000 (48 GB each) with device_map="auto" and per-GPU memory limits of 44 GiB. Total VRAM at inference ≈ 57 GB in BF16.
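
The per-GPU limit can be passed to `from_pretrained` through the `max_memory` argument, which is used when `device_map="auto"` places layers across devices. The sketch below assumes two visible GPUs and the 44 GiB cap quoted above; `build_max_memory`, `budget_gb`, and `load_with_limits` are hypothetical helper names, and the exact arguments used in the original test run are not stated in this card.

```python
def build_max_memory(num_gpus: int, gib_per_gpu: int) -> dict:
    """Map each GPU index to a cap string such as "44GiB"."""
    return {i: f"{gib_per_gpu}GiB" for i in range(num_gpus)}


def budget_gb(max_memory: dict) -> float:
    """Total budget in decimal GB (1 GiB = 2**30 bytes)."""
    total_gib = sum(int(v.removesuffix("GiB")) for v in max_memory.values())
    return total_gib * 2**30 / 1e9


def load_with_limits(repo: str = "achuthc1298/qwen_llm_scs"):
    """Load the model with per-GPU caps (downloads ~51 GB and needs two GPUs)."""
    import torch
    from transformers import AutoModelForImageTextToText

    return AutoModelForImageTextToText.from_pretrained(
        repo,
        dtype=torch.bfloat16,
        device_map="auto",
        max_memory=build_max_memory(2, 44),
    )


limits = build_max_memory(2, 44)
print(limits)
# The combined cap should exceed the ~57 GB observed at inference.
print(budget_gb(limits) > 57)
```

Capping each card below its physical 48 GB leaves headroom for activations and the KV cache, which grow with prompt length and `max_new_tokens`.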

Notes

The vision tower (model.visual.*) is identical to the base model — only the language layers received SCS-domain LoRA updates.
Loading uses the native qwen3_5 integration in recent transformers releases; no custom remote code is bundled.
The chat template is the standard Qwen3-VL template.
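
The claim that the vision tower is unchanged can be checked by comparing parameters under the `model.visual.` prefix between the base and merged checkpoints. The following is a sketch on toy data: `changed_keys` is a hypothetical helper, the parameter names and values in the toy dictionaries are made up, and a real comparison assumes both state dicts are loaded in the same dtype (for tensors, `torch.equal` could be passed as `equal`).

```python
def changed_keys(base_state, tuned_state, prefix, equal=lambda a, b: a == b):
    """Return parameter names under `prefix` whose values differ or are missing."""
    diffs = []
    for name, value in base_state.items():
        if not name.startswith(prefix):
            continue
        if name not in tuned_state or not equal(value, tuned_state[name]):
            diffs.append(name)
    return diffs


# Toy state dicts (made-up names and values) standing in for real checkpoints.
base = {"model.visual.patch_embed.weight": [1, 2], "model.language_model.w": [3, 4]}
tuned = {"model.visual.patch_embed.weight": [1, 2], "model.language_model.w": [3, 5]}

print(changed_keys(base, tuned, "model.visual."))
print(changed_keys(base, tuned, "model.language_model."))
```

An empty list for the `model.visual.` prefix would be consistent with the note above; differences are expected only under the language-model prefix.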

License

Inherits the Qwen license of the base model.
