README
License: apache-2.0
Numbers
- Validation cosine similarity: 0.943
- Trained on 5,281 descriptions, LoRA r=32, 10 epochs
- Cross-validates with Anthropic AR at 84% agreement on AV-generated descriptions
How scoring works
- Feed the description through the model with the injection token
- A forward hook captures the hidden state at layer 20, at the injection position
- The score is the cosine similarity between the captured hidden state and the target activation
- Higher score means the description better captures what the model was computing
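The final scoring step can be sketched in plain Python. This is a minimal illustration only, not the model's actual code: the function names `cosine_similarity` and `score_description`, and the representation of the layer-20 hidden states as a list of per-position vectors, are assumptions made for the example. In practice the hidden states would come from the forward hook described above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def score_description(
    layer_hidden_states: Sequence[Sequence[float]],
    injection_pos: int,
    target_activation: Sequence[float],
) -> float:
    """Score a description from hidden states captured at one layer.

    layer_hidden_states: one vector per token position (hypothetical layout,
        standing in for what a forward hook at layer 20 would capture).
    injection_pos: index of the injection token in the sequence.
    target_activation: the activation the description is meant to explain.
    """
    captured = layer_hidden_states[injection_pos]
    return cosine_similarity(captured, target_activation)


if __name__ == "__main__":
    # Toy 2-dimensional hidden states for a 3-token sequence.
    hidden = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
    print(score_description(hidden, injection_pos=1, target_activation=[0.0, 5.0]))
```

Because cosine similarity ignores vector magnitude, the score reflects only the direction of the captured hidden state relative to the target activation.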
Model provider: anicka
Model tree
- Base: Qwen/Qwen2.5-7B-Instruct
- Adapter: this model
Modalities
- Input: Text
- Output: Text