smeft-qwen-7b API & Inference Endpoint

Installation

bash
pip install torch transformers bitsandbytes accelerate
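Before loading the model, it can help to confirm that the packages from the install command are importable in the active environment. The snippet below is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util

# Packages named in the pip install command above.
REQUIRED = ["torch", "transformers", "bitsandbytes", "accelerate"]


def missing_packages(names):
    """Return the names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", missing if missing else "none")
```

An empty result only shows that the packages are installed; it does not check versions or GPU availability.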

Inference Example

python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

MODEL_NAME = "ahammad115566/smeft-qwen-7b"
RESPONSE_PREFIX = "\n### Response:\n"

# 4-bit NF4 quantization with bfloat16 compute and double quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# The LoRA weights are already merged, so the checkpoint loads as a plain
# causal LM and no peft import is needed.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map={"": 0},  # place the whole model on GPU 0
    trust_remote_code=True,
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
# Fall back to EOS for padding if the tokenizer defines no pad token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token


# ================= Inference ==============
def build_prompt(instruction: str, context: str = "") -> str:
    # Note: `context` is accepted but not used in the prompt template.
    parts = [f"\n### Instruction:\n{instruction}"]
    parts.append(RESPONSE_PREFIX)
    return "\n".join(parts)


@torch.inference_mode()
def ask(instruction: str, context: str = "") -> str:
    inputs = tokenizer(
        build_prompt(instruction, context),
        return_tensors="pt",
        add_special_tokens=True,
    ).to(model.device)
    # Greedy decoding with a mild repetition penalty.
    output_ids = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
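To make the prompt template concrete, the sketch below reproduces the string that `build_prompt` assembles from the inference example, without loading the model. It is a standalone illustration; the sample question is an assumption chosen for demonstration, and the single-argument signature is a simplification of the function above.

```python
RESPONSE_PREFIX = "\n### Response:\n"


def build_prompt(instruction: str) -> str:
    # Same template as the inference example: an instruction header
    # followed by the response prefix, joined with a newline.
    parts = [f"\n### Instruction:\n{instruction}", RESPONSE_PREFIX]
    return "\n".join(parts)


prompt = build_prompt("Define the Warsaw basis.")
print(repr(prompt))
# '\n### Instruction:\nDefine the Warsaw basis.\n\n### Response:\n'
```

Because the prompt ends with the response prefix, the model's continuation is the answer itself, which is why `ask` slices off the prompt tokens before decoding.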

Training Details

Parameter	Value
Base Model	Qwen3-8B
Fine-tuning Method	LoRA (merged)
Inference Quantization	4-bit NF4 (bitsandbytes)
Domain	Standard Model Effective Field Theory
Training Corpus	Curated SMEFT and HEP preprints
Task Format	Instruction-following scientific QA
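
As a rough guide to what 4-bit NF4 quantization means for GPU memory, the following back-of-the-envelope sketch compares weight storage at bfloat16 and at 4 bits. It assumes a nominal 8 billion parameters for the Qwen3-8B base and an overhead of about 0.127 bits per parameter for the quantization constants under double quantization (the figure reported in the QLoRA paper). Layers that are not quantized, activations, and the KV cache are ignored, so the 4-bit figure underestimates real usage.

```python
N_PARAMS = 8e9  # assumption: nominal parameter count for an 8B model


def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


bf16_gb = weight_memory_gb(N_PARAMS, 16)
# 4 bits per weight plus ~0.127 bits of quantization constants (assumed).
nf4_gb = weight_memory_gb(N_PARAMS, 4 + 0.127)

print(f"bfloat16 weights: ~{bf16_gb:.1f} GB")
print(f"NF4 weights:      ~{nf4_gb:.1f} GB")
```

Under these assumptions the quantized weights take roughly a quarter of the bfloat16 footprint.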

Limitations

Known limitations:

Hallucination. The model may occasionally hallucinate operator identities.
Domain-locked by design. The model is not suitable for general-purpose tasks.
Limited training data (2,500 examples). Coverage of the SMEFT operator space may be uneven; questions about rare operators or non-Warsaw bases may be answered less reliably.
System prompt dependence. Omitting the system prompt will cause the model to behave like the base Qwen3-8B.

Outputs should be independently verified against the primary literature.

Authors

Ahmed Hammad
Assistant Professor, Center for AI and Natural Sciences, KIAS, Seoul

Veronica Sanz
Professor of Theoretical Physics, University of Valencia

Citation

A technical paper describing the dataset construction and fine-tuning procedure is forthcoming.

Please cite the model as:

bibtex
@misc{hammad2026smeftqwen,
  author       = {Ahmed Hammad and Veronica Sanz},
  title        = {SMEFT-Qwen-7B: A Domain-Adapted Language Model for Standard Model Effective Field Theory},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/ahammad115566/smeft-qwen-7b}}
}

