Niraya666

Qwen3-4B-wmvlm-260204

Model Summary

WaferSAGE-SFT is fine-tuned from Qwen3-VL on synthetic wafer map VQA data. The model is designed to answer natural language questions about wafer map images, including defect type identification, spatial distribution analysis, morphology description, and root-cause hypothesis generation.

Item	Description
Project	WaferSAGE
Model Type	Vision-Language Model
Base Model	Qwen3-VL Instruct
Fine-tuning Method	LoRA-SFT
Task	Image-text-to-text / Visual Question Answering
Domain	Semiconductor wafer map defect analysis
Language	English

Intended Capabilities

The model can answer questions such as:

What type of defect pattern is visible on this wafer map?
Where are the defective dies located?
Is the defect concentrated near the center, edge, or a specific quadrant?
Does the wafer show a scratch-like, ring-like, clustered, or random pattern?
What process or equipment issue might be associated with this defect pattern?

Example Prompts

text
<image>
What type of defect pattern is visible on this wafer map?

text
<image>
Where are the defects located on the wafer?

text
<image>
Describe the morphology and spatial distribution of this wafer map defect.

text
<image>
What are the possible root-cause hypotheses for this defect pattern?
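
The prompts above can be paired with an image programmatically. The following is a minimal sketch, assuming the `<image>` placeholder corresponds to an image entry placed before the question in the chat-message format used in the Transformers example in this card. The helper name `build_messages` and the list `EXAMPLE_QUESTIONS` are hypothetical and introduced only for illustration.

```python
# Hypothetical helper (not part of the model repository): pair one wafer map
# image with one question in the chat-message format used by the Transformers
# "image-text-to-text" pipeline.
EXAMPLE_QUESTIONS = [
    "What type of defect pattern is visible on this wafer map?",
    "Where are the defects located on the wafer?",
    "Describe the morphology and spatial distribution of this wafer map defect.",
    "What are the possible root-cause hypotheses for this defect pattern?",
]


def build_messages(image_ref: str, question: str) -> list[dict]:
    """Return a single-turn conversation with the image placed before the text."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_ref},
                {"type": "text", "text": question},
            ],
        }
    ]


# One conversation per example question, all about the same (placeholder) image.
conversations = [
    build_messages("path_or_url_to_wafermap.png", q) for q in EXAMPLE_QUESTIONS
]
```

Each element of `conversations` can then be passed to the pipeline as one request.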

Training Data

This model was trained on the WaferSAGE wafer map VQA data, which is generated through a multi-stage synthetic data pipeline:

text
Wafer map images + labels
    ↓
Clustering-based cleaning and sampling
    ↓
VLM-generated defect descriptions
    ↓
Structured rubric extraction
    ↓
VQA synthesis
    ↓
LoRA supervised fine-tuning

The VQA data covers:

Defect type identification
Spatial distribution
Morphological description
Root-cause hypothesis
Consistency verification

Training Setup

The SFT models were trained with LoRA adaptation on Qwen3-VL.

Typical configuration:

Hyperparameter	Value
Fine-tuning method	LoRA
LoRA rank	16
LoRA alpha	16
LoRA dropout	0
Optimizer	AdamW 8-bit
Learning rate	2e-4
Scheduler	Linear
Epochs	1
Max context length	2048
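
To make the LoRA settings concrete, the sketch below works through two quantities they imply, assuming standard LoRA scaling (the low-rank update is multiplied by alpha / rank). The class `LoraSettings` and the 2560 x 2560 layer shape are hypothetical and chosen only for illustration; they are not taken from the training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    rank: int = 16        # "LoRA rank" from the table above
    alpha: int = 16       # "LoRA alpha" from the table above
    dropout: float = 0.0  # "LoRA dropout" from the table above

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the update B @ A by alpha / rank
        # (assumption: no rank-stabilized variant is used).
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (rank, d_in) and B has shape (d_out, rank).
        return self.rank * (d_in + d_out)


settings = LoraSettings()

# Hypothetical square projection layer, used only to show the arithmetic.
d_in = d_out = 2560
extra = settings.adapter_params(d_in, d_out)
full = d_in * d_out

print(settings.scaling)  # 1.0
print(extra, full)       # 81920 6553600
print(extra / full)      # 0.0125
```

With rank equal to alpha, the adapter update is applied at a scale of 1.0, and for a layer of this shape the adapter adds about 1.25% as many parameters as the frozen weight matrix.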

Usage with Transformers

python
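from transformers import pipeline

# Load the fine-tuned checkpoint as an image-text-to-text pipeline.
pipe = pipeline(
    "image-text-to-text",
    model="Niraya666/Qwen3-4B-wmvlm-260204"
)

# One user turn: the wafer map image followed by the question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "path_or_url_to_wafermap.png"},
            {"type": "text", "text": "What type of defect pattern is visible on this wafer map?"}
        ]
    }
]

# Generate up to 256 new tokens and print the raw pipeline output.
output = pipe(text=messages, max_new_tokens=256)
print(output)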
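
The raw pipeline output is a nested structure rather than a plain string. The sketch below shows one way to pull out just the model's answer, assuming the output is a list of dicts whose `generated_text` field is either a string or a list of chat messages ending with the assistant turn. That layout is an assumption that may differ across Transformers versions, and `extract_reply` is a hypothetical helper name.

```python
def extract_reply(pipeline_output) -> str:
    """Return the assistant's answer from a chat-style pipeline result.

    Assumes `pipeline_output` looks like [{"generated_text": ...}], where the
    value is either a plain string or a list of chat messages whose last entry
    is the assistant turn. Inspect the real output before relying on this.
    """
    generated = pipeline_output[0]["generated_text"]
    if isinstance(generated, str):
        return generated

    # Chat-style result: take the last message in the conversation.
    content = generated[-1]["content"]
    if isinstance(content, str):
        return content

    # Content given as a list of typed parts: join the text parts only.
    return "".join(
        part.get("text", "") for part in content if part.get("type") == "text"
    )


# Mocked output in the assumed layout, so the helper can be tried without a GPU.
mock_output = [
    {
        "generated_text": [
            {"role": "user", "content": [{"type": "text", "text": "What type of defect?"}]},
            {"role": "assistant", "content": "An edge-ring pattern."},
        ]
    }
]
print(extract_reply(mock_output))  # An edge-ring pattern.
```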

Usage with vLLM

bash
vllm serve "Niraya666/Qwen3-4B-wmvlm-260204"

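Then call the OpenAI-compatible endpoint, replacing the placeholder with an image URL the server can fetch (an HTTP(S) URL or a base64 data: URL; a bare local file path is not accepted):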

bash
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Niraya666/Qwen3-4B-wmvlm-260204",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Describe the wafer map defect pattern."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "path_or_url_to_wafermap.png"
            }
          }
        ]
      }
    ],
    "max_tokens": 256
  }'
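
For a wafer map stored on local disk, one option is to embed the image in the request as a base64 data URL. The sketch below builds the same request body as the curl call using only the Python standard library. The helper names (`bytes_to_data_url`, `file_to_data_url`, `build_payload`, `post_chat`) are hypothetical, and the endpoint address is the default from the command above.

```python
import base64
import json
import urllib.request
from pathlib import Path

MODEL_ID = "Niraya666/Qwen3-4B-wmvlm-260204"
ENDPOINT = "http://localhost:8000/v1/chat/completions"


def bytes_to_data_url(data: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def file_to_data_url(path: str, mime: str = "image/png") -> str:
    """Read a local image file and encode it as a data URL."""
    return bytes_to_data_url(Path(path).read_bytes(), mime)


def build_payload(image_url: str, question: str, max_tokens: int = 256) -> dict:
    """Build a chat-completions request body with one text part and one image part."""
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


def post_chat(payload: dict, endpoint: str = ENDPOINT) -> dict:
    """Send the payload to a running server and return the parsed JSON response."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build (but do not send) a request from in-memory bytes standing in for a PNG.
payload = build_payload(
    bytes_to_data_url(b"abc"), "Describe the wafer map defect pattern."
)
```

With a server running, `post_chat(build_payload(file_to_data_url("wafermap.png"), "..."))` would send the request; the file name here is a placeholder.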

Evaluation

The model was evaluated using WaferSAGE's rubric-based wafer map VQA benchmark.

Evaluation dimensions include:

Spatial understanding
Morphological understanding
Defect type recognition
Root-cause hypothesis quality
Hallucination avoidance

The evaluation combines:

Rule-based rubric matching
LLM-as-a-Judge scoring
Qualitative error analysis
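
As an illustration of the rule-based part, the sketch below scores a response by the fraction of rubric items it satisfies using simple keyword matching. This is not WaferSAGE's actual rubric format or scoring rule: the `RubricItem` structure, the keywords, and the sample response are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricItem:
    dimension: str             # e.g. "defect type", "spatial", "root cause"
    keywords: tuple[str, ...]  # the item is satisfied if any keyword appears


def rubric_score(response: str, rubric: list[RubricItem]) -> float:
    """Return the fraction of rubric items matched by the response text."""
    if not rubric:
        return 0.0
    text = response.lower()
    hits = sum(
        any(keyword.lower() in text for keyword in item.keywords)
        for item in rubric
    )
    return hits / len(rubric)


# Hypothetical rubric for a single wafer map.
rubric = [
    RubricItem("defect type", ("edge-ring", "edge ring")),
    RubricItem("spatial", ("periphery", "wafer edge")),
    RubricItem("root cause", ("etch", "edge bead")),
]

response = "The map shows an edge-ring pattern concentrated at the wafer periphery."
score = rubric_score(response, rubric)
print(score)  # two of the three items match
```

Keyword matching of this kind cannot judge whether a root-cause hypothesis is plausible, which is one reason to combine it with LLM-as-a-Judge scoring and qualitative review.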

Compared with the base VLM, the SFT model primarily improves domain-specific terminology, response format, and wafer-map-specific visual reasoning. For stronger performance, see the WaferSAGE RL models trained with GSPO and rubric-aligned rewards.

Limitations

This model is a domain-adapted VLM for wafer map understanding, but it has important limitations:

It should not be used as a standalone root-cause diagnosis system.
Root-cause outputs are hypotheses, not verified fab conclusions.
It may hallucinate defect locations or process causes when the image is ambiguous.
It may inherit biases from synthetic training data and teacher model outputs.
It was trained primarily on public wafer map style data and may not generalize to all fab-specific wafer map formats.
It does not use lot history, process metadata, tool/chamber records, metrology, or inline inspection data.

For production semiconductor engineering use, model outputs should be reviewed by qualified engineers and combined with process context.

Recommended Use

Recommended:

Research on industrial VLMs
Wafer map VQA experiments
Defect pattern description
Data generation and evaluation pipeline development
Local proof-of-concept systems for semiconductor AI

Not recommended:

Automated process control
Final root-cause diagnosis
Yield-impact decisions without expert review
Safety-critical or high-cost manufacturing decisions without validation

Related Resources

WaferSAGE paper: arXiv:2604.27629
WaferSAGE VQA dataset: Niraya666/wafermap-vqa-2602
Rubric-augmented datasets: Niraya666/wafermap-vqa-with-rubrics-2602, Niraya666/wafermap-vqa-with-rubrics-2602_v2
RL model: Niraya666/Qwen3-VL-4B-Instruct-WMVLM-RL-0213

Citation

If you use this model, please cite:

bibtex
@misc{xu2026wafersage,
  title        = {WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning},
  author       = {Ke Xu and Zhongyuan Lian},
  year         = {2026},
  eprint       = {2604.27629},
  archivePrefix= {arXiv},
  primaryClass = {cs.AI}
}

Model provider: Niraya666
Base model: unsloth/Qwen3-VL-4B-Instruct
Input modalities: Text, Image
Output modalities: Text
