andyc03/Qwen3.5-9B-attack-v2.1 API & Inference Endpoint | FriendliAI

Dedicated Endpoints

Run inference for this model on a single-tenant GPU with unmatched speed and reliability at scale.

Get help setting up custom Dedicated Endpoints.

Talk with our engineers to get a quote for discounted reserved GPU instances.

README

License: other

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-06
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 16
total_train_batch_size: 64
total_eval_batch_size: 4
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.95), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 0.03
num_epochs: 1.0
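
The batch-size figures in the list above are related by simple arithmetic, and the cosine scheduler determines how the learning rate changes over the run. The following is a minimal sketch of both, not the code used to train this model. The helper names effective_batch_size and cosine_lr are hypothetical, and the schedule is the textbook form (linear warmup, then cosine decay to zero). The total of 655 steps comes from the training results reported on this page. Reading the warmup value 0.03 as a fraction of total steps is an assumption, since it is listed under lr_scheduler_warmup_steps.

```python
import math

# Values copied from the hyperparameter list.
TRAIN_BATCH_SIZE = 1              # per-device batch size
NUM_DEVICES = 4
GRADIENT_ACCUMULATION_STEPS = 16
LEARNING_RATE = 2e-06
TOTAL_STEPS = 655                 # final step reported in the training results
WARMUP_FRACTION = 0.03            # assumption: 0.03 is a fraction of total steps


def effective_batch_size(per_device, devices, accumulation):
    """Samples consumed per optimizer step across all devices."""
    return per_device * devices * accumulation


def cosine_lr(step, total_steps, peak_lr, warmup_steps=0):
    """Textbook schedule: linear warmup to peak_lr, then cosine decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = effective_batch_size(
        TRAIN_BATCH_SIZE, NUM_DEVICES, GRADIENT_ACCUMULATION_STEPS
    )
    print("total_train_batch_size:", total)

    warmup = math.ceil(WARMUP_FRACTION * TOTAL_STEPS)
    for step in (0, warmup, TOTAL_STEPS // 2, TOTAL_STEPS):
        lr = cosine_lr(step, TOTAL_STEPS, LEARNING_RATE, warmup)
        print(f"step {step:>3}: lr = {lr:.3e}")
```

The first function reproduces the listed total_train_batch_size of 64 (1 per device, 4 devices, 16 accumulation steps). The schedule output is only illustrative, because the exact warmup length and decay floor used in training are not stated on this page.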

Training results

Training Loss	Epoch	Step	Validation Loss
1.0379	0.1529	100	1.0443
0.9633	0.3057	200	0.9929
0.9609	0.4586	300	0.9692
0.9421	0.6114	400	0.9554
0.9308	0.7643	500	0.9470
0.9287	0.9172	600	0.9441
0.9246	1.0	655	0.9440
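
One way to read the table above is to convert each validation loss to a perplexity and to track the change between evaluations. The sketch below does that with the rows copied verbatim. It assumes the reported losses are mean per-token cross-entropy in nats, which the page does not state, so the perplexity values are only meaningful under that assumption. The helper name perplexity is illustrative.

```python
import math

# Rows copied from the table: (training loss, epoch, step, validation loss).
ROWS = [
    (1.0379, 0.1529, 100, 1.0443),
    (0.9633, 0.3057, 200, 0.9929),
    (0.9609, 0.4586, 300, 0.9692),
    (0.9421, 0.6114, 400, 0.9554),
    (0.9308, 0.7643, 500, 0.9470),
    (0.9287, 0.9172, 600, 0.9441),
    (0.9246, 1.0, 655, 0.9440),
]


def perplexity(loss):
    """exp(loss); assumes loss is mean cross-entropy per token in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    previous_val = None
    for train_loss, epoch, step, val_loss in ROWS:
        # Change in validation loss since the previous evaluation.
        delta = "" if previous_val is None else f"{val_loss - previous_val:+.4f}"
        print(
            f"step {step:>3}  epoch {epoch:.4f}  "
            f"val loss {val_loss:.4f}  val ppl {perplexity(val_loss):.3f}  "
            f"change {delta}"
        )
        previous_val = val_loss
```

The validation loss falls at every evaluation, but the change between steps 600 and 655 is only 0.0001, so most of the improvement happens in the first half of the epoch.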

Framework versions

Transformers 5.5.3
Pytorch 2.11.0+cu129
Datasets 3.6.0
Tokenizers 0.22.2

Model provider

andyc03

Model tree

Base

this model

Modalities

Input

Video, Text, Image

Output

Text

Pricing

Dedicated Endpoints

Supported Functionality

Model APIs

Dedicated Endpoints

Container
