BRZ911
Latent-VC-9B
License: apache-2.0
Related Resources
| Resource | Link |
|---|---|
| Code repository | https://github.com/BRZ911/Latent-VC |
| Training & evaluation data | https://huggingface.co/datasets/BRZ911/Latent-VC-Data |
| Model weights (this repo) | https://huggingface.co/BRZ911/Latent-VC-9B |
Model Details
- Architecture: `Qwen3_5ForConditionalGeneration` (Qwen3.5-VL backbone)
- Hidden size: 4096 | Layers: 32 (hybrid linear + full attention) | Vision depth: 27
- Dtype: bfloat16
- Parameters: ~9B
- Special capability: Latent Visual Compression (LVC) tokens for latent video reasoning (`lvc_temperature=0.07`, `loss_lvc_fct=cosine`)
- Training: SFT (Stage 1) → GRPO (Stage 2); this checkpoint corresponds to the GRPO `checkpoint-800`.
Files
| File | Description |
|---|---|
| model.safetensors | Model weights (~18.8 GB) |
| config.json | Model configuration |
| generation_config.json | Generation configuration |
| tokenizer.json, tokenizer_config.json | Tokenizer |
| processor_config.json, chat_template.jinja | Processor & chat template |
| eval_all_benchmarks.py | Multi-benchmark evaluation script |
| eval_all_benchmarks.sh | Evaluation launcher |
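As a quick sanity check, the listed file size is consistent with the stated dtype and parameter count. The sketch below assumes 1 GB = 10^9 bytes and that essentially all of model.safetensors is bfloat16 weights at 2 bytes each. Neither assumption is stated on this card.

```python
# Rough consistency check between file size, dtype, and parameter count.
# Assumptions (not from the model card): 1 GB = 1e9 bytes, and the file
# holds almost only bfloat16 tensors (the safetensors header is ignored).
BYTES_PER_BF16 = 2
file_size_gb = 18.8  # from the Files table

# GB divided by bytes-per-parameter gives billions of parameters.
approx_params_billion = file_size_gb / BYTES_PER_BF16
print(f"~{approx_params_billion:.1f}B parameters")
```

Under these assumptions the estimate lands at roughly 9.4B, in line with the "~9B" figure under Model Details.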
This model uses custom LVC components. For training and LVC reasoning inference, please use the code from the Latent-VC repository.
Usage
Download
```bash
pip install -U "huggingface_hub[cli]"
hf download BRZ911/Latent-VC-9B --local-dir ./Latent-VC-9B
```
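The same download can be done from Python with `snapshot_download` from `huggingface_hub`. This is a minimal sketch: the wrapper name `download_weights` is hypothetical, and its defaults simply mirror the CLI command above.

```python
def download_weights(repo_id="BRZ911/Latent-VC-9B", local_dir="./Latent-VC-9B"):
    """Fetch every file of the model repo into local_dir and return that path.

    `download_weights` is an illustrative helper, not part of the Latent-VC code.
    """
    # Imported lazily so that defining the helper does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Example call (downloads ~18.8 GB, so it is left commented out):
# path = download_weights()
```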
Evaluation
The included scripts evaluate the model on six video benchmarks (VideoMME, MVBench, TempCompass, VideoMMMU, VSIBench, MMVU). The evaluation data is hosted in the companion dataset BRZ911/Latent-VC-Data under Eval/ — download and extract it so that an Evaluation/ directory (with eval_<dataset>.json and the per-benchmark videos) is available, then run:
```bash
# MODEL_PATH : path to the downloaded weights (this repo)
# EVAL_DIR   : path to the extracted Evaluation/ directory (default: ./Evaluation)
# DATASETS   : comma-separated subset of
#              videomme,mvbench,tempcompass,videommmu,vsibench,mmvu
MODEL_PATH=./Latent-VC-9B \
EVAL_DIR=./Evaluation \
DATASETS=mmvu \
bash eval_all_benchmarks.sh
```
See the Latent-VC repository for the full inference/training environment and dependencies.
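Before launching a long run, it can help to confirm that the extracted Evaluation/ directory contains the annotation files the scripts expect. The sketch below is an illustrative pre-flight check, not part of the repository. It assumes that `<dataset>` in `eval_<dataset>.json` is the same lowercase key accepted by DATASETS and that the JSON files sit directly inside Evaluation/. Verify both against the actual archive.

```python
from pathlib import Path

# Keys accepted by DATASETS in the launcher, as listed above.
ALL_DATASETS = ["videomme", "mvbench", "tempcompass", "videommmu", "vsibench", "mmvu"]


def missing_annotation_files(eval_dir, datasets=ALL_DATASETS):
    """Return the eval_<dataset>.json names that are absent from eval_dir.

    Assumption: files are named eval_<key>.json and live at the top level
    of eval_dir (hypothetical layout, inferred from the description above).
    """
    root = Path(eval_dir)
    expected = [f"eval_{name}.json" for name in datasets]
    return [fname for fname in expected if not (root / fname).is_file()]
```

For example, `missing_annotation_files("./Evaluation", ["mmvu"])` returns an empty list when `./Evaluation/eval_mmvu.json` exists.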
Benchmarks
The model is evaluated on the following benchmarks (data in BRZ911/Latent-VC-Data):
| Benchmark | Samples |
|---|---|
| VideoMME | 2700 |
| MVBench | 4000 |
| TempCompass | 7540 |
| VideoMMMU | 900 |
| VSIBench | 5130 |
| MMVU | 625 |
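The benchmark sizes differ by more than an order of magnitude (625 vs. 7540 samples), so a sample-weighted average over all benchmarks can differ noticeably from an unweighted mean of per-benchmark scores. The short sketch below computes the total and each benchmark's share of it, using only the counts from the table.

```python
# Sample counts copied from the Benchmarks table.
samples = {
    "VideoMME": 2700,
    "MVBench": 4000,
    "TempCompass": 7540,
    "VideoMMMU": 900,
    "VSIBench": 5130,
    "MMVU": 625,
}

total = sum(samples.values())
# Fraction of all evaluation samples contributed by each benchmark.
shares = {name: count / total for name, count in samples.items()}

for name, count in samples.items():
    print(f"{name:12s} {count:5d} {shares[name]:6.1%}")
print(f"{'Total':12s} {total:5d}")
```

The six benchmarks add up to 20,895 samples in total.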
Citation
If you find this model or dataset useful, please cite the project:
```bibtex
@misc{latentvc,
  title  = {Latent-VC: Latent Visual Compression for Efficient Video Reasoning},
  author = {BRZ911},
  year   = {2025},
  url    = {https://github.com/BRZ911/Latent-VC}
}
```
Model provider
BRZ911
Modalities
- Input: Video, Text, Image
- Output: Text