DiegoDomLarr

mistral-7b-breast-cancer-qlora

Why Fine-Tune a Model?

Large language models like Mistral 7B are trained on broad, internet-scale data. That makes them capable generalists — but generalists have limitations when applied to specialized domains.

Fine-tuning is the process of continuing the training of a pre-trained model on a smaller, domain-specific dataset. Instead of learning from scratch (which requires massive compute and data), we teach an already-capable model to speak the language of a specific field — in this case, breast cancer medicine.

The goal is not to make the model "smarter" in a general sense, but more:

Accurate — using the right clinical terminology and referencing real medical concepts
Consistent — answering breast cancer questions in the structured, informative style of medical Q&A
Relevant — focusing its generation on domain knowledge rather than generic preambles

Why QLoRA?

Training all 7 billion parameters requires dozens of GB of VRAM and days of compute — out of reach without expensive hardware. QLoRA (Quantized Low-Rank Adaptation) makes it accessible with two techniques:

4-bit quantization: model weights are stored as 4-bit NormalFloat (NF4) values instead of 32-bit floats, reducing VRAM from ~28 GB to ~5 GB with minimal quality loss.
LoRA adapters: instead of updating all 7B parameters, small trainable low-rank matrices (adapters) are injected into the attention projection layers. Well under 1% of the total parameters are trained (about 0.2% at rank 16 on the four projections listed under Model Details). The rest stay frozen.

The result: a meaningful domain fine-tune on a single T4 GPU (Google Colab free tier) in approximately 35 minutes.
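The memory and parameter figures above can be checked with a little arithmetic. The sketch below is an estimate, not the author's measurement: the layer dimensions and the total parameter count are assumptions taken from the public Mistral-7B configuration, and the memory function counts weights only.

```python
# Assumed Mistral-7B dimensions, taken from the public model config rather than this card.
HIDDEN = 4096          # hidden size
KV_DIM = 1024          # 8 key/value heads x 128 dims (grouped-query attention)
NUM_LAYERS = 32
TOTAL_PARAMS = 7.24e9  # approximate total parameter count

# (input width, output width) of each adapted attention projection
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def weight_memory_gb(n_params: float, bits: int) -> float:
    """Memory for the weights alone, ignoring activations and quantization constants."""
    return n_params * bits / 8 / 1e9


def lora_trainable_params(rank: int) -> int:
    """Each adapted layer adds two matrices: A (rank x d_in) and B (d_out x rank)."""
    per_block = sum(rank * (d_in + d_out) for d_in, d_out in TARGETS.values())
    return per_block * NUM_LAYERS


trainable = lora_trainable_params(16)
print(f"fp32 weights:  {weight_memory_gb(TOTAL_PARAMS, 32):.1f} GB")
print(f"4-bit weights: {weight_memory_gb(TOTAL_PARAMS, 4):.1f} GB")
print(f"LoRA params:   {trainable:,} ({trainable / TOTAL_PARAMS:.2%} of the model)")
```

Under these assumptions the adapters add roughly 13.6 million trainable parameters, and the 4-bit weights alone take about 3.6 GB. Activations, quantization constants, and the CUDA context come on top of that.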

Model Details

| Property | Value |
| --- | --- |
| Base model | `mistralai/Mistral-7B-Instruct-v0.2` |
| Fine-tuning method | QLoRA (4-bit quantization + LoRA) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj` |
| Training epochs | 3 |
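
These settings map onto a PEFT `LoraConfig`. The fragment below is a sketch of how that could look. The rank, alpha, and target modules come from the table, while `lora_dropout`, `bias`, and `task_type` are not stated in this card and are assumptions chosen as common values for causal-LM fine-tuning.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                     # LoRA rank, from the table above
    lora_alpha=32,            # LoRA alpha, from the table above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,        # assumption: not stated in this card
    bias="none",              # assumption: not stated in this card
    task_type="CAUSAL_LM",    # assumption: standard setting for a decoder-only model
)
```

With PEFT's default scaling, the adapter update is multiplied by `lora_alpha / r`, which is 2 for these values.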

Training Data

Trained on DiegoDomLarr/breast-cancer-qa — 1,061 breast cancer Q&A pairs from two sources:

| Source | Examples | Description |
| --- | --- | --- |
| PubMedQA (`pqa_labeled`) | 29 | Human-verified biomedical questions from PubMed abstracts |
| ChatDoctor-HealthCareMagic-100k | 1,032 | Real patient–doctor consultations filtered for breast cancer |
| Total | 1,061 | |

Filter keywords: breast cancer, breast carcinoma, BRCA1, BRCA2, HER2, tamoxifen, mastectomy, lumpectomy, mammogram, ductal carcinoma, lobular carcinoma, triple negative breast, aromatase inhibitor, trastuzumab
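
A keyword filter of this kind can be sketched in a few lines. The card does not say how the matching was done, so the case-insensitive substring match and the function name `mentions_breast_cancer` below are assumptions for illustration.

```python
# Keywords copied from the list above.
FILTER_KEYWORDS = [
    "breast cancer", "breast carcinoma", "BRCA1", "BRCA2", "HER2",
    "tamoxifen", "mastectomy", "lumpectomy", "mammogram",
    "ductal carcinoma", "lobular carcinoma", "triple negative breast",
    "aromatase inhibitor", "trastuzumab",
]


def mentions_breast_cancer(text: str) -> bool:
    """Return True if any keyword occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in FILTER_KEYWORDS)


samples = [
    "I started Tamoxifen last month and feel tired.",
    "I have had a headache for two days.",
]
kept = [s for s in samples if mentions_breast_cancer(s)]
print(kept)
```

Substring matching is permissive: it keeps any record that mentions a keyword in passing, which is one reason a filtered dataset like this benefits from a manual spot check.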

Dataset statistics:

Average question length: 541 characters
Average answer length: 621 characters

Training format — all examples were converted to Mistral's instruction template:

```markdown
<s>[INST] {question} [/INST] {answer} </s>
```
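
As an illustration, the conversion can be written as a small helper. The function name `format_example` and the whitespace stripping are assumptions, not taken from the training script.

```python
def format_example(question: str, answer: str) -> str:
    """Wrap one Q&A pair in the instruction template shown above."""
    return f"<s>[INST] {question.strip()} [/INST] {answer.strip()} </s>"


print(format_example("What is a mammogram?", "An X-ray image of the breast."))
```

If the tokenizer is configured to add its own beginning-of-sequence token, the literal `<s>` should be left out of the string to avoid a doubled token. Whether that applies depends on how the training code tokenized the text, which this card does not show.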

Evaluation

The fine-tuned model was compared against the base Mistral-7B-Instruct on the prompt:

"What are the main risk factors for breast cancer?"

Fine-tuned model response:

Several factors can increase the risk of developing breast cancer. Here are some of the most common risk factors:

Gender: Being female is the greatest risk factor for breast cancer.

Age: The risk of breast cancer increases as women get older. Most breast cancers are diagnosed in women over the age of 50.

Genetic Factors: Certain genetic mutations, such as those in the BRCA1 and BRCA2 genes, can significantly increase the risk. Women with a family history of breast cancer in first-degree relatives are also at higher risk.

Lifestyle Factors: A lack of physical activity, a diet high in saturated fat, being overweight or obese, and smoking all contribute to increased risk.

Base Mistral response:

Breast cancer is the most common cancer among women worldwide. Several risk factors can increase a woman's chance of developing breast cancer.

Age: The risk increases as women get older. Most breast cancers are diagnosed after age 50.

Genetic factors: A family history of breast cancer increases the risk. Inherited mutations in BRCA1 and BRCA2 significantly increase risk.

Hormonal factors: Extended exposure to estrogen and progesterone can increase risk. Factors include early menstruation, late menopause, and never having given birth.

ROUGE-L score (base vs fine-tuned): 0.4509

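A ROUGE-L of ~0.45 means the two answers overlap only moderately, so on this prompt the fine-tuned model produces a noticeably different answer from the base model. In the responses shown, the fine-tuned answer adds gender and lifestyle factors that the base answer does not list, while the base answer covers hormonal factors that the fine-tuned answer omits. Because the score compares the two models on a single prompt, it shows that the output has shifted. It does not measure which answer is better.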
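
For readers unfamiliar with the metric, ROUGE-L is an F-measure built on the longest common subsequence (LCS) of the two token sequences. The sketch below is a simplified pure-Python version that assumes lowercase whitespace tokenization. It is not the implementation behind the score above, so its numbers may differ slightly from those of the `evaluate` library.

```python
def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, using row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


base = "Age: The risk increases as women get older."
tuned = "Age: The risk of breast cancer increases as women get older."
print(round(rouge_l_f1(base, tuned), 4))
```

A score of 1.0 means the token sequences are identical, and a score of 0.0 means they share no tokens.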

How to Use

Requirements

```bash
pip install transformers peft bitsandbytes accelerate
```

Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model_id = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_id = "DiegoDomLarr/mistral-7b-breast-cancer-qlora"

# 4-bit NF4 quantization with double quantization; float16 compute suits a T4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token

# Load the quantized base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
model.config.use_cache = True  # reuse the key/value cache during generation

# The tokenizer prepends <s> itself, so the prompt starts at [INST].
prompt = "[INST] What are the side effects of tamoxifen? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=300,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# The decoded text contains the prompt followed by the generated answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
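
The decoded string includes the prompt as well as the answer. A small helper can keep only the generated part. The name `extract_answer` is hypothetical, and the sketch assumes the `[/INST]` marker survives decoding as plain text, which should be checked against the tokenizer version in use.

```python
def extract_answer(decoded: str, marker: str = "[/INST]") -> str:
    """Return the text after the last instruction marker, or the whole text if none is found."""
    _, found, tail = decoded.rpartition(marker)
    return tail.strip() if found else decoded.strip()


example = "[INST] What are the side effects of tamoxifen? [/INST] Placeholder answer text."
print(extract_answer(example))
```

An alternative that avoids string matching is to slice the generated token IDs past the prompt length before decoding.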

Limitations and Intended Use

This model is for educational and research purposes only. It is not a medical device and must not be used for clinical diagnosis, treatment decisions, or patient care.

Responses may contain inaccuracies or outdated medical information — always verify with a licensed healthcare professional.
The model was trained on ~1,000 examples, which is small by fine-tuning standards. It may hallucinate or generalize poorly on edge-case questions.
Coverage is limited to breast cancer topics represented in the training data.
This model has not been audited, validated, or certified for any medical use.

Tech Stack

| Library | Role |
| --- | --- |
| `transformers` | Load Mistral-7B and tokenizer |
| `peft` | Apply and load LoRA adapters |
| `bitsandbytes` | 4-bit quantization |
| `trl` | SFTTrainer for supervised fine-tuning |
| `datasets` | Load and process training data |
| `evaluate` | ROUGE-L evaluation |

About This Project

This model was built as an end-to-end learning project covering the full fine-tuning pipeline:

Curating and publishing a domain-specific dataset to HF Hub
Loading a 7B model in 4-bit on consumer hardware (Colab T4)
Applying LoRA adapters with PEFT
Training with SFTTrainer (TRL library)
Evaluating with ROUGE-L against the base model
Publishing the adapter and model card to HuggingFace Hub

Author: DiegoDomLarr
Dataset: DiegoDomLarr/breast-cancer-qa
Base model: mistralai/Mistral-7B-Instruct-v0.2

License

Apache 2.0 — same as the base model.

Model provider

DiegoDomLarr

Model tree

Base

mistralai/Mistral-7B-Instruct-v0.2

Adapter

this model

Modalities

Input

Text

Output

Text

