License: apache-2.0

Model details

| Field | Value |
|---|---|
| Base model | Qwen/Qwen3-30B-A3B (MoE, 30.5B total / ~3B active params, 128 experts) |
| Method | QLoRA (4-bit NF4), attention-only LoRA |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj |
| LoRA config | rank 32, alpha 64, dropout 0.05, rsLoRA |
| Trainable params | ~27M (0.09% of base) |
| Epochs / examples | 3 epochs / 995 train (53 val) |
| Final loss | train 0.67, eval 0.67 (closely tracked, no sign of overfitting) |
| Formats | LoRA adapter, GGUF q4_k_m, GGUF q3_k_m |
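
The trainable-parameter figure can be sanity-checked with a few lines of arithmetic. This is a rough sketch, not the training code: the architecture constants (hidden size 2048, 48 layers, 32 query heads, 4 key/value heads, head dimension 128) are assumptions about the base model's config and should be verified against its `config.json`. Only the rank, alpha, and target modules come from the table above.

```python
import math

# Assumed base-model dimensions (not stated in this card; check config.json).
HIDDEN = 2048
N_LAYERS = 48
N_HEADS = 32
N_KV_HEADS = 4
HEAD_DIM = 128

# From the model-details table.
RANK = 32
ALPHA = 64

# (in_features, out_features) of each attention projection that gets a LoRA pair.
shapes = {
    "q_proj": (HIDDEN, N_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, N_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, N_KV_HEADS * HEAD_DIM),
    "o_proj": (N_HEADS * HEAD_DIM, HIDDEN),
}

# A LoRA pair adds A (rank x in) and B (out x rank): rank * (in + out) parameters.
per_layer = sum(RANK * (n_in + n_out) for n_in, n_out in shapes.values())
total = per_layer * N_LAYERS

# rsLoRA scales the update by alpha / sqrt(rank) instead of alpha / rank.
rslora_scale = ALPHA / math.sqrt(RANK)

print(f"LoRA params: {total:,} ({100 * total / 30.5e9:.2f}% of 30.5B)")
print(f"rsLoRA scaling factor: {rslora_scale:.2f}")
```

Under these assumptions the count is about 26.7M, in line with the ~27M (0.09%) reported. At 4 bytes per parameter that is roughly 107 MB, close to the size of the adapter file, although the stored dtype is not stated in this card.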
Files / quant guide
| File | Size | Fits | Notes |
|---|---|---|---|
| gguf/qwen3-30b-finance-q4_k_m.gguf | ~18 GB | 24 GB+ VRAM (e.g. RTX 4090/5090) | best quality |
| gguf/qwen3-30b-finance-q3_k_m.gguf | ~14 GB | 16 GB VRAM (e.g. RTX 5080) fully on GPU | small quality drop |
| adapter_model.safetensors | 103 MB | apply to Qwen/Qwen3-30B-A3B | for Transformers/PEFT |

On a 16 GB card, use q3_k_m for full-GPU speed. The q4_k_m file exceeds 16 GB and would need partial CPU offload (--n-gpu-layers below the maximum), which is workable on this MoE but slower.
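
The choice between the two GGUF files can be written down as a small helper. This is an illustrative sketch only: `pick_quant` is a hypothetical name, the thresholds are copied from the "Fits" column above, and the fallback for cards below 16 GB (smallest file with partial offload) is an assumption that this card does not test.

```python
Q4 = "gguf/qwen3-30b-finance-q4_k_m.gguf"  # ~18 GB, best quality
Q3 = "gguf/qwen3-30b-finance-q3_k_m.gguf"  # ~14 GB, small quality drop


def pick_quant(vram_gb: float) -> tuple[str, bool]:
    """Return (gguf_path, fits_fully_on_gpu) for a given amount of VRAM."""
    if vram_gb >= 24:
        return Q4, True
    if vram_gb >= 16:
        return Q3, True
    # Assumption: below 16 GB, take the smaller file and offload some layers to CPU.
    return Q3, False


for vram in (24, 16, 12):
    path, full_gpu = pick_quant(vram)
    print(f"{vram} GB -> {path} (full GPU: {full_gpu})")
```

When the second element is `False`, expect to lower --n-gpu-layers until the model loads, at some cost in speed.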
Intended use
Direct, single-turn financial-analysis instruction following, e.g.:
- Assess an earnings beat/miss given EPS actual vs estimate
- Summarize a 10-K/10-Q MD&A excerpt (revenue/margin drivers, outlook)
- Identify and rank material risks from a risk-factors disclosure
- Evaluate a company's fundamentals (valuation, profitability, growth, health)
- Compare same-sector companies on relative strength
Prompt format (Alpaca-style)
The model was trained on the Alpaca instruction template and follows it reliably:
```markdown
Below is an instruction that describes a financial analysis task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
```
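
The template can be filled programmatically so that every request uses the same layout. A minimal sketch, assuming the sections are separated by blank lines as in the standard Alpaca layout (the exact whitespace used in training is not stated here). `TEMPLATE` and `build_prompt` are hypothetical names, and rejecting an empty input is a design choice of this sketch, not a requirement of the model.

```python
TEMPLATE = (
    "Below is an instruction that describes a financial analysis task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str, input_text: str) -> str:
    """Fill the Alpaca-style template; refuse to build a prompt without data."""
    if not input_text.strip():
        # The model has no retrieval, so a prompt with no data invites fabrication.
        raise ValueError("input_text must contain the data to analyze")
    return TEMPLATE.format(instruction=instruction.strip(), input=input_text.strip())


prompt = build_prompt(
    "Assess the quality and implications of the company's earnings.",
    "Company: Example Corp (EXMP)\nReported EPS: $2.10\nEstimated EPS: $1.85",
)
print(prompt)
```

The resulting string can be passed to llama.cpp via -p or tokenized directly with Transformers.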
How to use
Option A — GGUF with llama.cpp
```bash
./llama-cli -m gguf/qwen3-30b-finance-q4_k_m.gguf -n 512 --temp 0 -p "$(cat <<'EOF'
Below is an instruction that describes a financial analysis task. Write a response that appropriately completes the request.

### Instruction:
Assess the quality and implications of the company's earnings.

### Input:
Company: Example Corp (EXMP)
Reported EPS: $2.10
Estimated EPS: $1.85
Surprise: +13.5%

### Response:
EOF
)"
```
Option B — LoRA adapter with Transformers + PEFT
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

REPO = "joexie/Qwen3-30B-A3B-Finance"

# Load the base model in bf16, then attach the LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, REPO)
tok = AutoTokenizer.from_pretrained(REPO)

prompt = """Below is an instruction that describes a financial analysis task. Write a response that appropriately completes the request.

### Instruction:
Identify and prioritize the most material risks from this disclosure.

### Input:
<paste a risk-factors excerpt or financial data here>

### Response:
"""

inputs = tok(prompt, return_tensors="pt").to(model.device)
# Greedy decoding for reproducible output.
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
Always put the data to analyze in the ### Input: block. The model reasons over what you give it; it has no retrieval and will fabricate figures if asked for current market state.
Example output
Instruction: Based on the following MD&A excerpt from Honeywell's 10-Q (Q1 2026), identify the key business developments, financial trends, and management outlook.
Input (excerpt): "Net sales increased due to increased pricing... favorable foreign currency translation (Australian dollar, Chinese renminbi, Canadian dollar)... partially offset by lower sales from the divestiture of the PPE business..."
Model output (abridged):
1. Revenue & Margin Drivers
Positive Drivers:
- Pricing Power: Increased pricing and price adjustments to offset inflation — the ability to pass cost increases to customers.
- FX Favorability: A weakening U.S. dollar boosted international sales (AUD, CNY, CAD strength).
Negative Headwinds:
- Volume Declines: Lower sales volumes dragged on growth, suggesting softer underlying demand.
- Divestiture Impact: The PPE business divestiture structurally reduced the revenue base.
The model reliably returns this kind of structured, source-grounded analysis in the instruction format. Its measured advantage over the base is consistency and format adherence (it answered 10/10 held-out prompts vs the base's 4/10), not a dramatic gap in analytical depth — see EVALUATION.md.
Training data
~1,048 instruction/response pairs (post-filter) built from public sources:
- Yahoo Finance — fundamentals, quarterly earnings (EPS actual vs estimate), price history
- SEC EDGAR — 10-K / 10-Q filings; MD&A, risk-factors, and business sections extracted via a prose-scoring parser (skips tables-of-contents and boilerplate)
Composition: earnings 477, MD&A 268, risk 206, fundamentals 60, comparative 18, news 19. Analysis targets for filing-based tasks were synthetically generated and then filtered to remove refusals/meta-commentary.
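
The composition figures can be cross-checked against the train/validation split reported in the model details. A short sketch using only numbers from this card; the dictionary name is arbitrary.

```python
# Task counts as listed in the composition line above.
composition = {
    "earnings": 477,
    "MD&A": 268,
    "risk": 206,
    "fundamentals": 60,
    "comparative": 18,
    "news": 19,
}

total = sum(composition.values())
shares = {task: round(100 * n / total, 1) for task, n in composition.items()}

print(f"total pairs: {total}")  # should equal 995 train + 53 val
for task, pct in shares.items():
    print(f"{task:>12}: {pct:5.1f}%")
```

The counts sum to 1,048, matching 995 train plus 53 validation examples. Earnings, MD&A, and risk tasks together make up about 91% of the data, so the fundamentals, comparative, and news tasks rest on far fewer examples.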
Limitations and out-of-scope use
- Not an agent / no tool use. Training contained zero tool-call examples. The model does not reliably perform web search, function calling, or multi-step agentic workflows. In agentic harnesses it has been observed to hallucinate tool arguments and confabulate data. Do not use it as the backend for autonomous agents, chat assistants with tools, or scheduled report generators.
- No real-time data / will hallucinate market figures. It has no retrieval and will fabricate confident-sounding prices, percentages, and headlines if asked for current market state. Always supply the data to analyze in the prompt.
- Not investment advice. Outputs are illustrative analysis, may contain errors, and must not be used for trading or financial decisions.
- Inherits base limitations — knowledge cutoff and biases of Qwen3-30B-A3B.
- English only, tested on US-listed large caps.
License
Apache 2.0, inheriting the base model's license.
Model provider
xerus19573