README

License: apache-2.0

Installation

bash

pip install torch transformers bitsandbytes accelerate
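Before loading the model, it can help to confirm that the packages from the install command are actually importable in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of any of these libraries.

```python
import importlib.util

# Packages named in the pip command above.
REQUIRED = ["torch", "transformers", "bitsandbytes", "accelerate"]

def missing_packages(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

The check only locates each package without importing it, so it runs quickly and does not need a GPU.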

Inference Example

python

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

MODEL_NAME = "ahammad115566/smeft-qwen-7b"
RESPONSE_PREFIX = "\n### Response:\n"

# Load the weights in 4-bit NF4 with double quantization; compute in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# The LoRA weights are already merged, so the checkpoint loads as a plain causal LM.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map={"": 0},  # place the whole model on GPU 0
    trust_remote_code=True,
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
if tokenizer.pad_token is None:
    # Fall back to the EOS token when no padding token is defined.
    tokenizer.pad_token = tokenizer.eos_token

# ================= Inference ==============
def build_prompt(instruction: str, context: str = "") -> str:
    # `context` is accepted but not used by this prompt template.
    parts = [f"\n### Instruction:\n{instruction}"]
    parts.append(RESPONSE_PREFIX)
    return "\n".join(parts)

@torch.inference_mode()
def ask(instruction: str, context: str = "") -> str:
    inputs = tokenizer(
        build_prompt(instruction, context),
        return_tensors="pt",
        add_special_tokens=True,
    ).to("cuda:0")
    # Greedy decoding with a mild repetition penalty.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )
    # Keep only the newly generated tokens by slicing off the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

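To see exactly what text reaches the tokenizer, the prompt template can be exercised on its own, without loading the model. The sketch below copies `RESPONSE_PREFIX` and `build_prompt` from the inference example; the sample question is only an illustration and is not taken from the training data.

```python
RESPONSE_PREFIX = "\n### Response:\n"

def build_prompt(instruction: str, context: str = "") -> str:
    # Same template as the inference example; `context` is not used by it.
    parts = [f"\n### Instruction:\n{instruction}"]
    parts.append(RESPONSE_PREFIX)
    return "\n".join(parts)

# Illustrative question (an assumption, chosen only to show the format).
prompt = build_prompt("Which operators modify the Higgs coupling to gluons?")
print(repr(prompt))
```

With the model loaded, passing the same string to `ask(...)` returns only the generated continuation, because the prompt tokens are sliced off before decoding.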
Training Details

| Parameter | Value |
| --- | --- |
| Base Model | Qwen2.5-7B-Instruct |
| Fine-tuning Method | LoRA (merged) |
| Inference Quantization | 4-bit NF4 (bitsandbytes) |
| Domain | Standard Model Effective Field Theory |
| Training Corpus | Curated SMEFT and HEP preprints |
| Task Format | Instruction-following scientific QA |
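
As a rough guide to what 4-bit NF4 loading implies for GPU memory, the arithmetic can be sketched as follows. Everything numeric here is an assumption rather than a measurement: a nominal 7 billion parameters (taken from the model name), a quantization block size of 64, and 8-bit double-quantized scales grouped in blocks of 256, as in QLoRA-style NF4. The sketch also assumes every weight is quantized. In practice embeddings and some layers typically stay in higher precision, and activations and the KV cache add overhead, so the result is best read as a lower bound.

```python
def nf4_bits_per_param(block=64, dq_block=256, double_quant=True):
    """Effective bits per weight: 4-bit codes plus quantization constants."""
    if double_quant:
        # One 8-bit scale per block, plus one fp32 constant per dq_block scales.
        overhead = 8 / block + 32 / (block * dq_block)
    else:
        # One fp32 scale per block.
        overhead = 32 / block
    return 4 + overhead

N_PARAMS = 7e9  # assumption: nominal size, not the exact parameter count

bits = nf4_bits_per_param()
weights_gb = N_PARAMS * bits / 8 / 1e9
print(f"{bits:.3f} bits per parameter, roughly {weights_gb:.2f} GB of weights")
```

Setting `double_quant=False` shows how much the second quantization level saves on the scale constants.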

Limitations

Known limitations:

  • The model may occasionally hallucinate operator identities.
  • The model is domain-locked by design and is not suitable for general-purpose tasks.
  • The training set contains 1,700 examples. Coverage of the SMEFT operator space may be uneven; rare operators or non-Warsaw bases may be answered less reliably.
  • Omitting the system prompt causes the model to behave like the base Qwen2.5-7B-Instruct.

Outputs should be independently verified against the primary literature.
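
One lightweight aid to that verification is to flag operator symbols in an answer that do not appear in a reference list you trust. The sketch below is an assumption-laden illustration, not part of the model or its tooling: the `O_<name>` spelling, the helper `unrecognized_operators`, and the short `KNOWN_OPERATORS` set are all hypothetical placeholders, and the reference set should be populated from the primary literature in whatever notation you use.

```python
import re

# Illustrative placeholder subset in one naming convention (an assumption);
# replace with a complete list taken from the primary literature.
KNOWN_OPERATORS = {"O_HG", "O_HW", "O_HB", "O_HWB", "O_HD"}

# Matches symbols such as O_HG; the O_<name> spelling is assumed.
OPERATOR_PATTERN = re.compile(r"\bO_[A-Za-z0-9]+\b")

def unrecognized_operators(text, known=KNOWN_OPERATORS):
    """Return operator-like symbols in `text` that are absent from `known`."""
    found = set(OPERATOR_PATTERN.findall(text))
    return sorted(found - set(known))

# Made-up sample text; O_HXYZ is a deliberately invalid name.
sample_answer = "The answer mentions O_HG and O_HXYZ."
print(unrecognized_operators(sample_answer))
```

A flagged symbol is only a prompt for manual checking. A symbol that passes the check can still be used incorrectly in the answer.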


Authors

Ahmed Hammad
High-Energy Physics Researcher
The High Energy Accelerator Research Organization (KEK)

Veronica Sanz
Professor of Theoretical Physics
University of Valencia

Citation

A technical paper describing the dataset construction and fine-tuning procedure is forthcoming.

Please cite the model as:

bibtex

@misc{hammad2026smeftqwen,
  author       = {Ahmed Hammad and Veronica Sanz},
  title        = {SMEFT-Qwen-7B: A Domain-Adapted Language Model for Standard Model Effective Field Theory},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/ahammad115566/smeft-qwen-7b}}
}

Model provider

ahammad115566

Model tree

Base

Qwen/Qwen2.5-7B-Instruct

Fine-tuned

this model

Modalities

Input

Text

Output

Text
