README

License: apache-2.0

Model details

Base model: Qwen/Qwen3-30B-A3B (MoE, 30.5B total / ~3B active params, 128 experts)
Method: QLoRA (4-bit NF4), attention-only LoRA
LoRA target modules: q_proj, k_proj, v_proj, o_proj
LoRA config: rank 32, alpha 64, dropout 0.05, rsLoRA
Trainable params: ~27M (0.09% of base)
Epochs / examples: 3 epochs / 995 train (53 val)
Final loss: train 0.67, eval 0.67 (closely tracked; no overfitting)
Formats: LoRA adapter, GGUF q4_k_m, GGUF q3_k_m
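
The LoRA config above implies a different adapter scaling than plain LoRA. A minimal sketch of the arithmetic, assuming the usual definitions (plain LoRA scales the low-rank update by alpha / r, rsLoRA by alpha / sqrt(r)) and the rounded parameter counts quoted above; the variable names are illustrative and not taken from the training script:

```python
import math

rank, alpha = 32, 64  # LoRA config quoted in the model details

standard_scale = alpha / rank           # plain LoRA: alpha / r
rslora_scale = alpha / math.sqrt(rank)  # rsLoRA: alpha / sqrt(r)

# Trainable share, using the rounded figures from the model details.
trainable_params = 27e6
base_params = 30.5e9
trainable_pct = 100 * trainable_params / base_params

print(f"plain LoRA scale: {standard_scale:.2f}")   # 2.00
print(f"rsLoRA scale:     {rslora_scale:.2f}")     # 11.31
print(f"trainable share:  {trainable_pct:.2f}%")   # 0.09%
```

With rank 32, the rsLoRA factor is roughly 5.7 times larger than the plain LoRA factor for the same alpha.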

Files / quant guide

| File | Size | Fits | Notes |
| --- | --- | --- | --- |
| gguf/qwen3-30b-finance-q4_k_m.gguf | ~18 GB | 24 GB+ VRAM (e.g. RTX 4090/5090) | best quality |
| gguf/qwen3-30b-finance-q3_k_m.gguf | ~14 GB | 16 GB VRAM (e.g. RTX 5080), fully on GPU | small quality drop |
| adapter_model.safetensors | 103 MB | apply to Qwen/Qwen3-30B-A3B | for Transformers/PEFT |

On a 16 GB card, use q3_k_m to keep the whole model on the GPU. The q4_k_m file (~18 GB) exceeds 16 GB and needs partial CPU offload (--n-gpu-layers set below the model's layer count), which is workable on this MoE but slower.
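
The guidance above can be sketched as a small selection helper. This is an illustration only: `choose_gguf` is a hypothetical name, the 24 GB and 16 GB thresholds are taken from the quant guide, and the behavior below 16 GB (fall back to q3_k_m with partial offload) is an assumption rather than something the guide states.

```python
def choose_gguf(vram_gb: float) -> tuple[str, bool]:
    """Return (gguf_file, fits_fully_on_gpu) for a given amount of VRAM in GB."""
    if vram_gb >= 24:
        # Enough room for the ~18 GB q4_k_m file.
        return "gguf/qwen3-30b-finance-q4_k_m.gguf", True
    if vram_gb >= 16:
        # The ~14 GB q3_k_m file fits entirely on a 16 GB card.
        return "gguf/qwen3-30b-finance-q3_k_m.gguf", True
    # Assumption: below 16 GB, use the smaller file with partial CPU offload.
    return "gguf/qwen3-30b-finance-q3_k_m.gguf", False


for vram in (24, 16, 12):
    print(vram, choose_gguf(vram))
```

When the second element is False, pass a reduced --n-gpu-layers value to llama.cpp so the remaining layers run on the CPU.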

Intended use

Direct, single-turn financial-analysis instruction following, e.g.:

  • Assess an earnings beat/miss given EPS actual vs estimate
  • Summarize a 10-K/10-Q MD&A excerpt (revenue/margin drivers, outlook)
  • Identify and rank material risks from a risk-factors disclosure
  • Evaluate a company's fundamentals (valuation, profitability, growth, health)
  • Compare same-sector companies on relative strength

Prompt format (Alpaca-style)

The model was trained on the Alpaca instruction template and follows it reliably:

markdown

Below is an instruction that describes a financial analysis task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Response:
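
To avoid retyping the template, a small builder can fill it in. A minimal sketch, assuming the template above is used verbatim; `build_prompt` and `PREAMBLE` are hypothetical names, and rejecting an empty input block is a design choice made here, not something the model requires.

```python
PREAMBLE = (
    "Below is an instruction that describes a financial analysis task. "
    "Write a response that appropriately completes the request."
)


def build_prompt(instruction: str, input_text: str) -> str:
    """Fill the Alpaca-style template with an instruction and the data to analyze."""
    if not input_text.strip():
        # Assumption: the model is only useful when given data to reason over.
        raise ValueError("input_text is empty; supply the data to analyze")
    return (
        f"{PREAMBLE}\n"
        f"### Instruction:\n{instruction.strip()}\n"
        f"### Input:\n{input_text.strip()}\n"
        "### Response:\n"
    )


print(build_prompt(
    "Assess the quality and implications of the company's earnings.",
    "Company: Example Corp (EXMP)\nReported EPS: $2.10\nEstimated EPS: $1.85",
))
```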

How to use

Option A — GGUF with llama.cpp

bash

./llama-cli -m gguf/qwen3-30b-finance-q4_k_m.gguf -n 512 --temp 0 -p "$(cat <<'EOF'
Below is an instruction that describes a financial analysis task. Write a response that appropriately completes the request.
### Instruction:
Assess the quality and implications of the company's earnings.
### Input:
Company: Example Corp (EXMP)
Reported EPS: $2.10
Estimated EPS: $1.85
Surprise: +13.5%
### Response:
EOF
)"

Option B — LoRA adapter with Transformers + PEFT

python

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO = "joexie/Qwen3-30B-A3B-Finance"

# Load the base model in bf16, then attach the LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, REPO)
tok = AutoTokenizer.from_pretrained(REPO)

# Alpaca-style prompt: the data to analyze goes in the ### Input: block.
prompt = """Below is an instruction that describes a financial analysis task. Write a response that appropriately completes the request.
### Instruction:
Identify and prioritize the most material risks from this disclosure.
### Input:
<paste a risk-factors excerpt or financial data here>
### Response:
"""

inputs = tok(prompt, return_tensors="pt").to(model.device)
# Greedy decoding (do_sample=False) keeps the output deterministic.
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))

Always put the data to analyze in the ### Input: block. The model reasons over what you give it; it has no retrieval and will fabricate figures if asked for current market state.
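
For earnings tasks, the input block can be generated from raw figures so that the surprise percentage is computed rather than typed by hand. A minimal sketch, assuming surprise is defined as (reported - estimated) / |estimated|, a common convention that reproduces the +13.5% in the llama.cpp example; `earnings_input` is a hypothetical helper, not part of this repository.

```python
def earnings_input(company: str, ticker: str, reported: float, estimated: float) -> str:
    """Format an earnings input block in the field layout used in the examples."""
    if estimated == 0:
        raise ValueError("estimated EPS is zero; surprise percentage is undefined")
    # Assumed convention: surprise relative to the absolute estimate.
    surprise_pct = 100 * (reported - estimated) / abs(estimated)
    return "\n".join([
        f"Company: {company} ({ticker})",
        f"Reported EPS: ${reported:.2f}",
        f"Estimated EPS: ${estimated:.2f}",
        f"Surprise: {surprise_pct:+.1f}%",
    ])


print(earnings_input("Example Corp", "EXMP", 2.10, 1.85))
```

The printed block matches the input used in Option A and can be pasted directly under ### Input:.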

Example output

Instruction: Based on the following MD&A excerpt from Honeywell's 10-Q (Q1 2026), identify the key business developments, financial trends, and management outlook.

Input (excerpt): "Net sales increased due to increased pricing... favorable foreign currency translation (Australian dollar, Chinese renminbi, Canadian dollar)... partially offset by lower sales from the divestiture of the PPE business..."

Model output (abridged):

1. Revenue & Margin Drivers

Positive Drivers:

  • Pricing Power: Increased pricing and price adjustments to offset inflation — the ability to pass cost increases to customers.
  • FX Favorability: A weakening U.S. dollar boosted international sales (AUD, CNY, CAD strength).

Negative Headwinds:

  • Volume Declines: Lower sales volumes dragged on growth, suggesting softer underlying demand.
  • Divestiture Impact: The PPE business divestiture structurally reduced the revenue base.

The model reliably returns this kind of structured, source-grounded analysis in the instruction format. Its measured advantage over the base is consistency and format adherence (it answered 10/10 held-out prompts vs the base's 4/10), not a dramatic gap in analytical depth — see EVALUATION.md.

Training data

1,048 instruction/response pairs (post-filter; 995 train + 53 validation) built from public sources:

  • Yahoo Finance — fundamentals, quarterly earnings (EPS actual vs estimate), price history
  • SEC EDGAR — 10-K / 10-Q filings; MD&A, risk-factors, and business sections extracted via a prose-scoring parser (skips tables-of-contents and boilerplate)

Composition: earnings 477, MD&A 268, risk 206, fundamentals 60, comparative 18, news 19. Analysis targets for filing-based tasks were synthetically generated and then filtered to remove refusals/meta-commentary.
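
As a quick consistency check, the per-task counts above can be summed and compared with the train/validation split reported in the model details. A minimal sketch using only the figures quoted in this README; the dictionary keys are shorthand labels chosen here.

```python
composition = {
    "earnings": 477,
    "MD&A": 268,
    "risk": 206,
    "fundamentals": 60,
    "comparative": 18,
    "news": 19,
}

total = sum(composition.values())
train, val = 995, 53  # split reported in the model details

# The task counts and the train/val split should describe the same dataset.
assert total == train + val == 1048

# Print each task's share of the dataset, largest first.
for task, count in sorted(composition.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{task:<13}{count:>4}  {100 * count / total:5.1f}%")
```

The counts are heavily skewed toward earnings and MD&A tasks, which is worth keeping in mind when judging output quality on the smaller categories.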

Limitations and out-of-scope use

  • Not an agent / no tool use. Training contained zero tool-call examples. The model does not reliably perform web search, function calling, or multi-step agentic workflows. In agentic harnesses it has been observed to hallucinate tool arguments and confabulate data. Do not use it as the backend for autonomous agents, chat assistants with tools, or scheduled report generators.
  • No real-time data / will hallucinate market figures. It has no retrieval and will fabricate confident-sounding prices, percentages, and headlines if asked for current market state. Always supply the data to analyze in the prompt.
  • Not investment advice. Outputs are illustrative analysis, may contain errors, and must not be used for trading or financial decisions.
  • Inherits base limitations — knowledge cutoff and biases of Qwen3-30B-A3B.
  • English only, tested on US-listed large caps.

License

Apache 2.0, inheriting the base model's license.

Model provider

xerus19573

Model tree

Base

Qwen/Qwen3-30B-A3B

Adapter

this model

Modalities

Input

Text

Output

Text
