DavidBShan
linkd-search-qwen3.6-35b-a3b-grpo100-lora
README
License: apache-2.0

linkd-search GRPO LoRA: grpo_000100 (lineage best)
Rank-64 LoRA adapter (alpha 32, all-linear) over Qwen/Qwen3.6-35B-A3B,
trained with the Freesolo SFT -> GRPO pipeline on the linkd-search task:
translating natural-language people-search requests into MongoDB find
filters for the Berkeley.profilematch collection.
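
To make the task concrete, here is a minimal sketch of what one request and its find filter might look like, plus a small check that the model output parses to a JSON object. The request, the field names (`title`, `location`, `education.school`), and the helper `parse_filter` are hypothetical illustrations; the real schema of the collection is not described in this README.

```python
import json

# Hypothetical request and model output. The field names are illustrative
# only and are NOT taken from the real collection schema.
request = "software engineers in San Francisco who studied at Berkeley"
model_output = """{
  "$and": [
    {"title": {"$regex": "software engineer", "$options": "i"}},
    {"location": "San Francisco"},
    {"education.school": "UC Berkeley"}
  ]
}"""


def parse_filter(text: str) -> dict:
    """Parse model output into a dict usable as a MongoDB find filter."""
    parsed = json.loads(text)
    if not isinstance(parsed, dict):
        raise ValueError("a find filter must be a JSON object")
    return parsed


query_filter = parse_filter(model_output)

# With pymongo and a live database, the filter would be passed to find().
# Assumption: "Berkeley.profilematch" means database "Berkeley",
# collection "profilematch".
#   client["Berkeley"]["profilematch"].find(query_filter)
```

Parsing before execution keeps malformed output from reaching the database, which seems a natural first gate for a score based on real query execution.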
- Tinker checkpoint: tinker://4173b059-c42e-53ef-8782-293086486655:train:0/sampler_weights/grpo_000100
- Lineage-best final_eval score: 0.6672 ("Executable Search Quality", 347 rows, real query execution + LLM judge)
- Renderer at train/eval time: qwen3_5 (thinking enabled; only the final non-thinking text is scored)
- Generation settings used for evaluation: temperature 0.0, max_tokens 2048
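
Because only the final non-thinking text is scored, a client reproducing the evaluation would need to drop the reasoning segment before using the output. The sketch below assumes the thinking block ends with a `</think>` tag, as in Qwen3-style chat output; the actual renderer may delimit it differently, and `final_text` is a hypothetical helper rather than part of any pipeline named here.

```python
# Sampling settings quoted in the list above.
GENERATION_SETTINGS = {"temperature": 0.0, "max_tokens": 2048}

# Assumed delimiter that closes the thinking segment; the renderer used
# at train/eval time may use a different marker.
THINK_CLOSE = "</think>"


def final_text(completion: str) -> str:
    """Return the text after the last closing think tag, stripped.

    If no closing tag is present, the whole completion is returned.
    """
    _, sep, tail = completion.rpartition(THINK_CLOSE)
    return (tail if sep else completion).strip()


raw = '<think>Map "engineers" to a title regex.</think>\n{"title": "engineer"}'
answer = final_text(raw)
```

Note that a completion cut off at `max_tokens` before the closing tag would be returned whole by this helper, so a stricter caller may prefer to treat a missing tag as a failure.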
Model provider: DavidBShan
Model tree: base Qwen/Qwen3.6-35B-A3B; adapter: this model
Modalities: input Video, Text, Image; output Text