README

License: apache-2.0

Training Details

  • Base model: Qwen/Qwen3.5-4B
  • Run ID: 728676d4-e2b9-44b4-8958-af6cebeddc19
  • Precision: bf16
  • Max sequence length: 2048
  • Training rows: 82
  • Final training loss: 4.099260542127821
  • Finished: 2026-06-03T08:44:10Z

SHA-256 of the downloaded Tuned Tensor archive:

```text
6690a070de3888b61b91f351a3cd3c6efb361e9ae1d59aa84abfdf7976795b39
```
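If you have the archive itself, you can compare its digest with the value above. The following is a minimal sketch, assuming GNU `sha256sum` is available (on macOS, `shasum -a 256` prints the same format). The helper name `verify_sha256` and the `ARCHIVE` variable are illustrative; the README does not name the archive file.

```bash
# Expected digest, copied from this README.
expected='6690a070de3888b61b91f351a3cd3c6efb361e9ae1d59aa84abfdf7976795b39'

# verify_sha256 FILE EXPECTED
# Succeeds only if the SHA-256 digest of FILE equals EXPECTED.
# (Hypothetical helper, not part of any tool mentioned in this README.)
verify_sha256() {
  local file=$1 want=$2 got
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  got=$(sha256sum "$file" | awk '{print $1}')
  [ "$got" = "$want" ]
}

# ARCHIVE is a placeholder: set it to the path of the file you downloaded.
# if verify_sha256 "$ARCHIVE" "$expected"; then
#   echo "checksum OK"
# else
#   echo "checksum MISMATCH"
# fi
```

A mismatch usually means the download was interrupted or the file is not the archive this README describes.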

Usage

First install recent versions of the dependencies needed to run the model locally. The checkpoint's exported metadata records Transformers 5.6.2, so use that version or newer.

```bash
pip install 'transformers>=5.6.2' torch accelerate safetensors huggingface_hub
```

Download the model to a local directory, then point the CLI at that directory:

```bash
huggingface-cli download weijianzhg/youtube-summariser-qwen3.5-4b \
  --local-dir ./youtube-summariser-qwen3.5-4b

youtube-summariser 'https://youtu.be/VIDEO_ID' \
  --local-model ./youtube-summariser-qwen3.5-4b
```

For long videos, use the CLI's automatic long-video mode so full transcripts are chunked and synthesized instead of truncated:

```bash
youtube-summariser 'https://youtu.be/VIDEO_ID' \
  --local-model ./youtube-summariser-qwen3.5-4b \
  --summary-strategy auto
```

Limitations

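This is a small task-specific fine-tune trained on only 82 rows, with a maximum sequence length of 2048. It is useful for testing local YouTube summarization behavior, but long videos should be summarized with a chunked (map-reduce) workflow, in which each transcript chunk is summarized separately and the partial summaries are then combined, rather than with a single prompt that would truncate the transcript. Generated summaries may omit details or contain mistakes; check important claims against the original transcript.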
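To make the chunked/map-reduce idea concrete, here is a toy sketch in plain shell. It is not the CLI's actual implementation, which this README does not describe. All three function names are hypothetical, and `summarise_text` is a stand-in that simply keeps the first few words of its input so the example runs without a model; a real workflow would call the model at that point.

```bash
# chunk_words N
# Reads text on stdin and writes it back out with N words per line,
# so that each output line is one chunk.
chunk_words() {
  awk -v n="$1" '
    { for (i = 1; i <= NF; i++) printf "%s%s", $i, (++c % n == 0 ? "\n" : " ") }
    END { if (c % n) print "" }'
}

# summarise_text K
# HYPOTHETICAL stand-in for a model call: keeps the first K words of stdin.
summarise_text() {
  awk -v k="$1" '
    { for (i = 1; i <= NF && c < k; i++) out = out (c++ ? " " : "") $i }
    END { print out }'
}

# map_reduce_summary WORDS_PER_CHUNK MAP_WORDS REDUCE_WORDS
# Map: summarise each chunk on its own.
# Reduce: summarise the concatenated partial summaries.
map_reduce_summary() {
  chunk_words "$1" |
    while IFS= read -r chunk; do
      printf '%s\n' "$chunk" | summarise_text "$2"
    done |
    summarise_text "$3"
}

# Example: 4-word chunks, 2-word partial summaries, 4-word final summary.
printf 'one two three four\nfive six seven\n' | map_reduce_summary 4 2 4
```

Because every chunk is summarised before the reduce step, words from the second chunk still reach the final output, which a single truncated prompt would have dropped.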

Model provider: weijianzhg

Model tree

  • Base: Qwen/Qwen3.5-4B
  • Fine-tuned: this model

Modalities

  • Input: Video, Text, Image
  • Output: Text
