License: other
Training Summary
- Base model: unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit
- Method: QLoRA / PEFT LoRA
- Training examples: 2,027 paper-exact SFT rows
- Raw source scale: generated from the larger SPY + QQQ options/GEX pipeline, including ~21.7M historical options rows
- Context length: 4096
- Epochs: 4
- Final train loss: 0.1771
- Output type: LoRA adapter, not a merged full model
Pattern Coverage
The SFT dataset is aligned to the GEX LLM paper framework, including:
- gamma_positioning
- stock_pinning
- 0dte_hedging
- persistent_positive
- persistent_negative
- transitional
- low_conviction
- transitional controls
- low-magnitude controls
- shuffled-window controls
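A minimal sketch of how a caller might check a predicted label against the coverage list above. The label strings come from that list; treating them as one flat set is an assumption, since the README does not say which labels are patterns and which are regimes. The helper names are hypothetical.

```python
# Label names are copied from the coverage list; the flat grouping is an assumption.
KNOWN_LABELS = frozenset({
    "gamma_positioning",
    "stock_pinning",
    "0dte_hedging",
    "persistent_positive",
    "persistent_negative",
    "transitional",
    "low_conviction",
})


def is_known_label(label: str) -> bool:
    """Return True if `label`, trimmed and lowercased, is in the covered set."""
    return label.strip().lower() in KNOWN_LABELS


def unknown_labels(labels):
    """Return the entries of `labels` that are not covered, in input order."""
    return [label for label in labels if not is_known_label(label)]
```

Rejecting unknown labels early keeps a downstream evaluation from silently counting free-text answers as a class.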
Data Layers
This distinction matters:
- Raw data: large historical options/GEX pipeline, approximately 21.7M option rows
- Training set: 2,027 paper-exact SFT examples
- Prior smoke eval: small 32-case schema/label check, not a robustness benchmark
Intended Use
Use this adapter for structured GEX pattern and 30-day regime classification experiments. It is best used with prompts that specify the output schema and provide numerical GEX features.
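As an illustration of a prompt that states the output schema and supplies numerical GEX features, here is a small sketch. The schema keys, the instruction wording, and any feature names passed in are hypothetical; the README does not publish the exact prompt format used in training, so prompts built this way may not match it.

```python
import json

# Hypothetical schema; the README does not specify the real field names.
EXAMPLE_SCHEMA = {"pattern": "string", "regime_30d": "string", "confidence": "number"}


def build_prompt(features: dict, schema: dict = EXAMPLE_SCHEMA) -> str:
    """Render numeric GEX features and the expected output schema as one prompt."""
    for name, value in features.items():
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"feature {name!r} must be numeric, got {type(value).__name__}")
    # Sort by name so the same features always produce the same prompt.
    feature_lines = "\n".join(f"- {name}: {value}" for name, value in sorted(features.items()))
    return (
        "Classify the GEX pattern and 30-day regime from the features below.\n"
        f"Features:\n{feature_lines}\n"
        "Respond with a single JSON object matching this schema:\n"
        f"{json.dumps(schema, indent=2)}"
    )
```

For example, `build_prompt({"net_gex": 1.5, "call_wall": 450})` returns a prompt listing both (hypothetical) features followed by the schema.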
Not Intended For
This model is not a trading-action model. It should not be used as financial advice, an autonomous trading signal, or a substitute for independent risk management.
Known Limitations
- The SFT data is paper-aligned and partly synthetic/structured, so the adapter may have learned schema discipline more strongly than real-world robustness.
- A larger held-out eval is still needed for false positives, false negatives, confusion matrix, adversarial prompts, ambiguous inputs, and no-pattern behavior.
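The held-out evaluation described above could start from something like the following sketch, which tallies a confusion matrix and per-label false positives and false negatives from aligned gold and predicted labels. The function names are hypothetical, and no results are implied; it only shows the bookkeeping.

```python
from collections import Counter


def confusion_counts(gold, predicted):
    """Count (gold, predicted) label pairs over two aligned sequences."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return Counter(zip(gold, predicted))


def per_label_errors(gold, predicted, label):
    """Return (false_positives, false_negatives) for a single label."""
    counts = confusion_counts(gold, predicted)
    # False positive: predicted `label` when the gold label was something else.
    fp = sum(n for (g, p), n in counts.items() if p == label and g != label)
    # False negative: gold was `label` but something else was predicted.
    fn = sum(n for (g, p), n in counts.items() if g == label and p != label)
    return fp, fn
```

Including an explicit no-pattern class in the gold labels (the name is left to the evaluator) lets the same tally cover the no-pattern behavior mentioned above.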
- Because this is a LoRA adapter, users need a compatible DeepSeek-R1-Distill-Qwen-14B / Qwen-family base loading path.
Loading Sketch
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit"
adapter = "dtarkenton/sprocket-gex-deepseek-r1-distill-qwen14b-lora-paper-exact-final"

tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(model, adapter)
```
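Once the adapter is loaded, generated text usually needs post-processing before it can be treated as structured output. The sketch below assumes two things the README does not confirm: that the model may emit a `<think>...</think>` reasoning block before its answer, as DeepSeek-R1 distills commonly do, and that the answer is a JSON object. The helper name `extract_json` is hypothetical.

```python
import json
import re

# Assumption: reasoning, if present, is wrapped in <think>...</think>.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def extract_json(generated: str):
    """Drop any <think> block, then return the first parseable JSON object, or None."""
    text = THINK_BLOCK.sub("", generated)
    decoder = json.JSONDecoder()
    # Try each opening brace in turn until one starts a valid JSON object.
    for match in re.finditer(r"\{", text):
        try:
            obj, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        return obj
    return None
```

Returning `None` instead of raising lets an evaluation loop count unparseable outputs as schema failures.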
Model provider: dtarkenton
Model tree: base unsloth/DeepSeek-R1-Distill-Qwen-14B-bnb-4bit; adapter: this model
Modalities: text input, text output