README
License: apache-2.0
Training Details
- Base model: Qwen/Qwen3.5-4B
- Run ID: 728676d4-e2b9-44b4-8958-af6cebeddc19
- Precision: bf16
- Max sequence length: 2048
- Training rows: 82
- Final training loss: 4.099260542127821
- Finished: 2026-06-03T08:44:10Z
The downloaded Tuned Tensor archive SHA-256 was:
text
6690a070de3888b61b91f351a3cd3c6efb361e9ae1d59aa84abfdf7976795b39
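The digest above can be checked against a local copy of the archive before it is unpacked. The following is a minimal sketch using only the Python standard library. The function names and the archive path are illustrative assumptions, not part of any Tuned Tensor tooling.

```python
import hashlib

# Digest quoted in this README for the downloaded archive.
EXPECTED_SHA256 = "6690a070de3888b61b91f351a3cd3c6efb361e9ae1d59aa84abfdf7976795b39"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected.strip().lower()


# Hypothetical usage; replace the path with wherever the archive was saved:
# print("OK" if matches_expected("path/to/archive") else "MISMATCH")
```

Reading in blocks keeps memory use flat even for multi-gigabyte checkpoints, which is why the file is not loaded in one call.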
Usage
Install recent versions of the local model dependencies first. The checkpoint metadata records that it was exported with Transformers 5.6.2.
bash
pip install 'transformers>=5.6.2' torch accelerate safetensors huggingface_hub
Download locally for the CLI:
bash
huggingface-cli download weijianzhg/youtube-summariser-qwen3.5-4b \
  --local-dir ./youtube-summariser-qwen3.5-4b
youtube-summariser 'https://youtu.be/VIDEO_ID' \
  --local-model ./youtube-summariser-qwen3.5-4b
For long videos, use the CLI's automatic long-video mode so full transcripts are chunked and synthesized instead of truncated:
bash
youtube-summariser 'https://youtu.be/VIDEO_ID' \
  --local-model ./youtube-summariser-qwen3.5-4b \
  --summary-strategy auto
Limitations
This is a small task-specific fine-tune trained on a small dataset. It is useful for testing local YouTube summarization behavior, but long videos should be summarized with a chunked/map-reduce workflow rather than a single prompt. Generated summaries may omit details or make mistakes; check important claims against the original transcript.
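To make the chunked/map-reduce idea concrete, here is a minimal sketch of one way such a workflow can be structured. It is not the CLI's implementation. `chunk_words`, `map_reduce_summary`, and the `summarize` callable are hypothetical names, and word counts stand in as a rough proxy for the model's token limit.

```python
from typing import Callable, List


def chunk_words(text: str, max_words: int, overlap: int = 0) -> List[str]:
    """Split text into windows of at most max_words words, sharing `overlap` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window reaches the end, so no trailing duplicate chunk is emitted.
        if start + max_words >= len(words):
            break
    return chunks


def map_reduce_summary(
    transcript: str,
    summarize: Callable[[str], str],  # assumption: wraps whatever model call you use
    max_words: int = 1000,
    overlap: int = 50,
) -> str:
    """Summarize each chunk (map), then summarize the joined partial summaries (reduce)."""
    chunks = chunk_words(transcript, max_words, overlap)
    if not chunks:
        return ""
    if len(chunks) == 1:
        # Short transcripts fit in one prompt; no reduce step is needed.
        return summarize(chunks[0])
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n\n".join(partials))
```

For very long transcripts the joined partial summaries may themselves exceed the context window. In that case the reduce step would need to be applied recursively, which this sketch does not do.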
Model provider: weijianzhg
Model tree
- Base: Qwen/Qwen3.5-4B
- Fine-tuned: this model
Modalities
- Input: Video, Text, Image
- Output: Text