License: apache-2.0
How it was obtained
The model was trained with supervised fine-tuning (SFT) using the TRL library. Training samples were selected from the MIMIC-CXR dataset, which contains frontal and lateral chest radiographs paired with free-text radiology reports.
Training details:
| Setting | Value |
| --- | --- |
| Base model | Qwen/Qwen3-VL-8B-Instruct |
| Training framework | TRL 0.26.2 (SFTTrainer) |
| Epochs | 2 |
| Total steps | 7,120 |
| Eval loss | 0.428 |
| Eval token accuracy | ~88% |
| Precision | bfloat16 |
| Hardware | 1× NVIDIA A100 |
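The reported figures support some quick arithmetic. The sketch below assumes the eval loss is a mean per-token cross-entropy in nats (the usual convention for causal-LM trainers, though the card does not say so) and derives the implied perplexity and the number of optimizer steps per epoch. The batch size is not reported, so `assumed_effective_batch_size` is a hypothetical placeholder, not a value from the training run.

```python
import math

# Values reported in the training table.
EVAL_LOSS = 0.428
TOTAL_STEPS = 7120
EPOCHS = 2


def perplexity(mean_token_loss: float) -> float:
    """Perplexity implied by a mean per-token cross-entropy given in nats."""
    return math.exp(mean_token_loss)


def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Optimizer steps taken in one pass over the training set."""
    return total_steps / epochs


def approx_train_samples(total_steps: int, epochs: int, effective_batch_size: int) -> int:
    """Rough training-set size; ignores a possibly partial final batch."""
    return int(steps_per_epoch(total_steps, epochs) * effective_batch_size)


# ASSUMPTION: the card does not report a batch size; 8 is only an illustration.
assumed_effective_batch_size = 8

print(f"perplexity: {perplexity(EVAL_LOSS):.3f}")
print(f"steps per epoch: {steps_per_epoch(TOTAL_STEPS, EPOCHS):.0f}")
print(
    "samples at the assumed batch size:",
    approx_train_samples(TOTAL_STEPS, EPOCHS, assumed_effective_batch_size),
)
```

Replacing the assumed batch size with the real per-device batch size times the gradient accumulation steps, if known, gives a better estimate of how many samples were drawn from MIMIC-CXR.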
Usage
```python
import torch
from qwen_vl_utils import process_vision_info
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration

MODEL_ID = "dmusingu/qwen3-vl-8b-mimic-cxr-sft"

# The base model is Qwen3-VL, so load the checkpoint with the Qwen3-VL class.
model = Qwen3VLForConditionalGeneration.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "<path_or_url_to_cxr>"},
            {"type": "text", "text": "Describe the findings in this chest X-ray."},
        ],
    }
]

# Render the chat prompt and load the image(s) referenced in the messages.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Drop the prompt tokens so only the newly generated report is decoded.
output = processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(output[0])
```
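Since MIMIC-CXR studies can include both a frontal and a lateral view, it may be convenient to build the `messages` list programmatically. The helper below is a minimal sketch: `build_cxr_messages` and the file names are hypothetical, and the card does not say whether the model was fine-tuned on multi-image prompts, so passing more than one view is an assumption to validate before relying on it.

```python
from typing import Dict, List

DEFAULT_PROMPT = "Describe the findings in this chest X-ray."


def build_cxr_messages(image_refs: List[str], prompt: str = DEFAULT_PROMPT) -> List[Dict]:
    """Build a single-turn user message with one image entry per view.

    `image_refs` holds local paths or URLs, in the order they should appear.
    """
    if not image_refs:
        raise ValueError("at least one image path or URL is required")
    # Images come first, followed by the text instruction.
    content = [{"type": "image", "image": ref} for ref in image_refs]
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]


# Hypothetical file names, for illustration only.
messages = build_cxr_messages(["frontal.png", "lateral.png"])
print(messages)
```

The resulting list has the same shape as the hand-written `messages` in the usage snippet, so it can be passed to `processor.apply_chat_template` and `process_vision_info` unchanged.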
Data access
MIMIC-CXR is a credentialed dataset. Access requires PhysioNet registration and completion of the required training at physionet.org/content/mimic-cxr.
Framework versions
- TRL: 0.26.2
- Transformers: 5.7.0
- PyTorch: 2.11.0
- Datasets: 4.8.4
- Tokenizers: 0.22.2
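To compare a local environment against the versions listed above, something like the following sketch could be used. It relies only on the standard-library `importlib.metadata`; the helper names are illustrative, and the mapping from the listed frameworks to PyPI distribution names (for example `torch` for PyTorch) is an assumption about how they were installed.

```python
from importlib import metadata

# Versions listed in the card, keyed by the assumed PyPI distribution name.
EXPECTED = {
    "trl": "0.26.2",
    "transformers": "5.7.0",
    "torch": "2.11.0",
    "datasets": "4.8.4",
    "tokenizers": "0.22.2",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # A local build tag after "+" (as in some PyTorch wheels) is ignored.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


for name, (wanted, found) in version_mismatches(EXPECTED).items():
    print(f"{name}: card lists {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean inference will fail; the list records the training environment, not a minimum requirement.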
Citation
If you use this model, please cite MIMIC-CXR:
```bibtex
@article{johnson2019mimic,
  title     = {MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports},
  author    = {Johnson, Alistair EW and Pollard, Tom J and Berkowitz, Seth J and others},
  journal   = {Scientific data},
  volume    = {6},
  number    = {1},
  pages     = {317},
  year      = {2019},
  publisher = {Nature Publishing Group}
}
```
Model provider: dmusingu
Base model: Qwen/Qwen3-VL-8B-Instruct (this model is a fine-tune of it)
Input modalities: Text, Image
Output modalities: Text