README

License: apache-2.0

Intended Use

Primary use: Automated extraction of actionable financial signals from raw news text for downstream quant workflows, portfolio monitoring, or news aggregators.

Input: Any English-language financial news article.

Output: A structured JSON object following the schema below.


Output Schema

json

{
  "description": "A brief summary of the news article.",
  "keywords": ["keyword1", "keyword2"],
  "insights": [
    {
      "ticker": "AAPL",
      "sentiment": "positive|negative|neutral",
      "sentiment_reasoning": "Detailed explanation of the sentiment for this ticker."
    }
  ]
}

The model is trained to output pure JSON: no markdown fences, no preamble.
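Downstream code usually needs to reject responses that deviate from this schema. The following is a minimal validation sketch using only the standard library. The helper name `parse_extraction` and its strictness rules are assumptions made for illustration and are not part of the model's tooling; only the field names and the three sentiment values come from the schema above.

```python
import json

# Allowed values, taken from the schema's "positive|negative|neutral" field.
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def parse_extraction(raw: str) -> dict:
    """Parse model output and check it against the documented schema.

    Hypothetical helper: raises ValueError on any deviation.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    if not isinstance(data.get("description"), str):
        raise ValueError("'description' must be a string")

    keywords = data.get("keywords")
    if not isinstance(keywords, list) or not all(isinstance(k, str) for k in keywords):
        raise ValueError("'keywords' must be a list of strings")

    insights = data.get("insights")
    if not isinstance(insights, list):
        raise ValueError("'insights' must be a list")
    for item in insights:
        if not isinstance(item, dict):
            raise ValueError("each insight must be a JSON object")
        if not isinstance(item.get("ticker"), str):
            raise ValueError("'ticker' must be a string")
        sentiment = item.get("sentiment")
        if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
            raise ValueError(f"unexpected sentiment: {sentiment!r}")
        if not isinstance(item.get("sentiment_reasoning"), str):
            raise ValueError("'sentiment_reasoning' must be a string")
    return data
```

Raising on the first violation keeps the sketch short; a production pipeline might instead collect all violations or retry generation.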


Quick Start

python

from unsloth import FastModel
from unsloth.chat_templates import get_chat_template

# Load the adapter in 4-bit with the same context length used for training.
model, tokenizer = FastModel.from_pretrained(
    model_name = "makiisthebes/gemma_4_lora_32b_model_financial",
    max_seq_length = 8192,
    load_in_4bit = True,
)
tokenizer = get_chat_template(tokenizer, chat_template="gemma-4-thinking")

article = "Your financial news article text here..."

# The prompt text is kept flush-left so the model sees it exactly as written.
prompt = f"""Please extract the relevant information from the provided news article in JSON format.
Provide the output in the following JSON structure:
{{
"description": "A brief summary of the news article.",
"keywords": ["disclosure", "bankruptcy", "lawsuit"],
"insights": [
{{
"ticker": "AAPL",
"sentiment": "positive||negative||neutral",
"sentiment_reasoning": "A detailed explanation of the sentiment analysis for the given ticker."
}}
]
}}
These details should be extracted based on the content of the news article provided below.
Please ensure that the output is in valid JSON format and adheres to the specified structure.
Do not include anything other than pure JSON in your response, including ```json or any explanatory text.
News Article: {article}"""

messages = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    use_cache=True,
    temperature=1.0, top_p=0.95, top_k=64,  # Gemma-4 recommended settings
)

# Decode only the newly generated tokens so the prompt is not echoed back.
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))

Training Details

| Parameter | Value |
| --- | --- |
| Base model | unsloth/gemma-4-E2B-it |
| Fine-tuning method | QLoRA (4-bit) |
| LoRA rank (r) | 8 |
| LoRA alpha | 8 |
| LoRA dropout | 0 |
| Trainable parameters | 61,214,720 / 31,334,301,232 (~0.20%) |
| Vision layers fine-tuned | No (text-only) |
| Context length | 8,192 tokens |
| Chat template | gemma-4-thinking |
| Training steps | 60 |
| Batch size (effective) | 4 (1 per device × 4 gradient accumulation) |
| Learning rate | 2e-4 |
| LR scheduler | Linear |
| Optimiser | AdamW 8-bit |
| Weight decay | 0.001 |
| Warmup steps | 5 |
| Hardware | NVIDIA GB10 (121.69 GB VRAM) |
| Training time | ~110 minutes |
| Peak training VRAM | ~30 GB |
| Loss (final step) | ~0.113 |
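The derived figures in this table can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in this card (the training-set size comes from the Dataset section) and assumes a single training device, so the effective batch is the per-device batch times the gradient accumulation steps. The variable names are illustrative.

```python
# Figures copied from the Training Details table and the Dataset section.
trainable_params = 61_214_720
total_params = 31_334_301_232
steps = 60
per_device_batch = 1
grad_accum = 4
train_samples = 87_610  # training set size after truncation filtering

# Assumption: one device, so no extra multiplier for data parallelism.
effective_batch = per_device_batch * grad_accum
samples_seen = steps * effective_batch
trainable_pct = 100 * trainable_params / total_params
epoch_pct = 100 * samples_seen / train_samples

print(f"effective batch size: {effective_batch}")     # 4
print(f"samples seen:         {samples_seen}")        # 240
print(f"trainable share:      {trainable_pct:.2f}%")  # 0.20%
print(f"share of one epoch:   {epoch_pct:.2f}%")      # 0.27%
```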

Response masking (train_on_responses_only) was applied: the loss is computed only on the assistant's JSON output, not on the user prompt, which is intended to improve JSON formatting accuracy.
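To illustrate the idea behind response masking, here is a toy sketch in plain Python. It is not Unsloth's implementation: the real helper finds the prompt/response boundary from the chat template, whereas this example assumes the prompt length is already known. The function name `mask_prompt_labels` is hypothetical. The value -100 is the label that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def mask_prompt_labels(input_ids: list[int], prompt_len: int) -> list[int]:
    """Return labels in which the first `prompt_len` tokens are excluded from the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must lie within the sequence")
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


# Toy token ids: four prompt tokens followed by three response tokens.
input_ids = [11, 12, 13, 14, 21, 22, 23]
labels = mask_prompt_labels(input_ids, prompt_len=4)
print(labels)  # [-100, -100, -100, -100, 21, 22, 23]
```

Only the three response positions keep their token ids, so only they contribute gradient.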


Dataset

makiisthebes/110kNewsArticlesSentiment

  • 109,524 financial news articles sourced from Investing.com
  • Each sample contains a raw article (input) paired with a structured JSON extraction (output)
  • Split: 80% train / 10% validation / 10% test (ordered, no shuffle — preserves temporal ordering of news)
  • Final training set after truncation filtering: 87,610 samples
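An ordered 80/10/10 split can be sketched as contiguous index ranges. The helper below is an assumption about how such a split might be computed (integer floor rounding at each boundary, rows already in chronological order); the card does not state the exact boundaries used.

```python
def ordered_split(n: int) -> tuple[range, range, range]:
    """Split indices 0..n-1 into contiguous 80/10/10 train/validation/test ranges."""
    train_end = n * 8 // 10  # floor of 80%
    val_end = n * 9 // 10    # floor of 90%
    return range(0, train_end), range(train_end, val_end), range(val_end, n)


train, val, test = ordered_split(109_524)
print(len(train), len(val), len(test))  # 87619 10952 10953
```

Under this rounding the train slice holds 87,619 articles, which would put the truncation filter at 9 dropped samples to reach the reported 87,610.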

Limitations

  • Short fine-tune: Only 60 training steps were used. At an effective batch size of 4 that is 240 samples, roughly 0.3% of one epoch over the 87,610-sample training set. This is a proof-of-concept checkpoint; longer training should improve structured output reliability.
  • Ticker coverage: Sentiment labels are derived from articles and may miss tickers not explicitly mentioned. Ticker symbols may occasionally be incorrect or hallucinated.
  • English only: Trained exclusively on English-language financial news.
  • Domain: Optimised for financial/market news (equities, indices, commodities, FX). Performance on unrelated domains is not guaranteed.
  • JSON compliance: While the model is trained to output pure JSON, very long articles approaching the 8,192 token context limit may cause truncated or malformed outputs.
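One way to guard against the truncation issue in the last point is to check the token budget before generating. The sketch below is an assumption about how a caller might do this: `fits_context` is a hypothetical helper, and the whitespace-based counter is only a stand-in for a count taken from the real tokenizer on the full chat-formatted prompt.

```python
from typing import Callable

MAX_SEQ_LENGTH = 8192  # context length from the model card
MAX_NEW_TOKENS = 512   # generation budget used in the Quick Start


def fits_context(
    text: str,
    count_tokens: Callable[[str], int],
    max_seq_length: int = MAX_SEQ_LENGTH,
    max_new_tokens: int = MAX_NEW_TOKENS,
) -> bool:
    """Return True if the prompt plus the generation budget fits in the context window."""
    return count_tokens(text) + max_new_tokens <= max_seq_length


def rough_count(text: str) -> int:
    """Stand-in counter for the demo; real token counts are usually higher."""
    return len(text.split())


short_article = "Shares rose after earnings beat expectations."
long_article = " ".join(["word"] * 8000)
print(fits_context(short_article, rough_count))  # True
print(fits_context(long_article, rough_count))   # False
```

Articles that fail the check could be shortened or summarised before extraction rather than sent as-is.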

Framework

Trained using Unsloth 2026.6.1 with:

  • transformers==5.5.0
  • trl (SFTTrainer + SFTConfig)
  • peft (LoRA via FastModel.get_peft_model)
  • datasets==4.3.0

License

The LoRA adapter weights follow the base model's Gemma Terms of Use. The training dataset is separately licensed — see makiisthebes/110kNewsArticlesSentiment for details.

markdown

---
A few things worth noting before you paste this on HuggingFace:
1. **Base model name discrepancy** — Your notebook filename says `31B` and the saved repo is named `gemma_4_lora_32b_model_financial`, but the actual model loaded in the code is `unsloth/gemma-4-E2B-it` (a much smaller variant). Worth double-checking which base model you actually ran the full training on, and update the card accordingly.
2. **`push_to_hub` typo** — Your save cell pushes to `makiisthebes/gemma_4_lora_32b_model_financial` for the model but `HF_ACCOUNT/gemma_4_lora_32b_model_financial` for the tokenizer — the tokenizer likely didn't upload unless you fixed that.
3. **60 steps caveat** — The card calls this out honestly. If you run a full epoch, bump those training stats.

Model provider: makiisthebes

Model tree

  • Base: unsloth/gemma-4-E2B-it
  • Adapter: this model

Modalities

  • Input: Video, Audio, Text, Image
  • Output: Text
