License: apache-2.0
Model Summary
- Model name: qwen2.5-7b-instruct-adjuvant-extractor
- Base model: Qwen/Qwen2.5-7B-Instruct
- Fine-tuning method: LoRA adapter training, merged into full model weights
- Primary task: Evidence-linked adjuvant extraction from title+abstract text
Prompt Used for Inference
System prompt
text
You are a biomedical information extraction assistant.
User instruction template
text
Extract infectious-disease adjuvants from the text and provide evidence snippets.
Return ONLY valid JSON in this format:
[{"adjuvant": "<string>", "evidence": "<string>"}, ...]
Do not include any extra keys or explanation.
Input format
text
Title: <paper title>
Abstract: <paper abstract>
The model receives the user instruction template followed by the title/abstract text.
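As a minimal sketch of how the pieces above fit together, the helper below assembles the system prompt, the instruction template, and the title/abstract input into a chat message list. The prompt strings are copied from this card; the function name `build_messages` is an assumption introduced for illustration.

```python
SYS_PROMPT = "You are a biomedical information extraction assistant."

PROMPT_INSTRUCTION = (
    "Extract infectious-disease adjuvants from the text and provide evidence snippets.\n"
    "Return ONLY valid JSON in this format:\n"
    '[{"adjuvant": "<string>", "evidence": "<string>"}, ...]\n'
    "Do not include any extra keys or explanation."
)


def build_messages(title: str, abstract: str) -> list[dict[str, str]]:
    """Hypothetical helper: instruction template first, then the title/abstract text."""
    user_input = f"{PROMPT_INSTRUCTION}\n\nTitle: {title}\nAbstract: {abstract}"
    return [
        {"role": "system", "content": SYS_PROMPT},
        {"role": "user", "content": user_input},
    ]


messages = build_messages("Example title.", "Example abstract.")
print(messages[1]["content"])
```

The resulting list can be passed to `tokenizer.apply_chat_template`, as in the inference code in this card.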
Target Output Format
The model is prompted to return a JSON array of objects with exactly two keys:
json
[
  {"adjuvant": "<string>", "evidence": "<string>"}
]
Expected behavior:
- Return a JSON array (can be empty: []).
- Each object must contain:
  - adjuvant: normalized or near-normalized adjuvant name
  - evidence: supporting text snippet from the same input abstract
- No extra keys and no explanatory text outside JSON.
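The expected behavior above can be checked mechanically. The following is a sketch of a strict parser, assuming the raw model output is available as a string; `parse_extractions` is a hypothetical name, not part of the released code.

```python
import json


def parse_extractions(raw: str) -> list[dict[str, str]]:
    """Hypothetical strict parser: raises ValueError if the output breaks the format."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("top-level value must be a JSON array")
    for i, item in enumerate(data):
        # Exactly two keys are allowed: adjuvant and evidence.
        if not isinstance(item, dict) or set(item) != {"adjuvant", "evidence"}:
            raise ValueError(f"item {i} must have exactly the keys adjuvant and evidence")
        if not all(isinstance(item[k], str) for k in ("adjuvant", "evidence")):
            raise ValueError(f"item {i} must map both keys to strings")
    return data


print(parse_extractions('[{"adjuvant": "alum", "evidence": "formulated with alum"}]'))
print(parse_extractions("[]"))
```

Raising on any deviation keeps malformed outputs from silently entering a curated dataset; a caller can catch `ValueError` and route the abstract to manual review.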
Input/Output Example
Example Input
text
Title: Intranasal vaccination study using alum and MPLA adjuvants in a murine influenza model.
Abstract: Mice immunized with antigen formulated with alum showed increased IgG titers. A separate group receiving MPLA-adjuvanted vaccine demonstrated stronger IFN-gamma responses and reduced viral load after challenge.
Expected Output
json
[
  {
    "adjuvant": "alum",
    "evidence": "Mice immunized with antigen formulated with alum showed increased IgG titers."
  },
  {
    "adjuvant": "MPLA",
    "evidence": "A separate group receiving MPLA-adjuvanted vaccine demonstrated stronger IFN-gamma responses and reduced viral load after challenge."
  }
]
Notes on Output Validity
- Output must be valid JSON.
- Output must be a JSON array (use [] if no supported adjuvant is found).
- Each item should include only adjuvant and evidence.
- Evidence text should come from the provided input abstract.
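The last note, that evidence should come from the input abstract, can be approximated with a substring check. The sketch below is one possible approach, not the authors' evaluation method: `is_grounded` is a hypothetical helper, and the choice to ignore case and whitespace differences is an assumption. The second item in the sample data is deliberately made up to show an unsupported prediction being filtered out.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace runs and lowercase so line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(evidence: str, source_text: str) -> bool:
    """Hypothetical check: True if the evidence appears verbatim in the source."""
    needle = _normalize(evidence)
    return bool(needle) and needle in _normalize(source_text)


abstract = (
    "Mice immunized with antigen formulated with alum showed increased IgG titers. "
    "A separate group receiving MPLA-adjuvanted vaccine demonstrated stronger "
    "IFN-gamma responses and reduced viral load after challenge."
)
items = [
    {"adjuvant": "alum", "evidence": "formulated with alum showed increased IgG titers"},
    # Illustrative unsupported item; this sentence is not in the abstract.
    {"adjuvant": "CpG", "evidence": "CpG enhanced Th1 responses"},
]
grounded = [item for item in items if is_grounded(item["evidence"], abstract)]
print(grounded)
```

A verbatim match is a conservative filter: paraphrased evidence will be rejected, so items that fail the check are better sent to manual review than discarded outright.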
Working Inference Code (Validated)
python
import torch
import json
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "RehanaHasin/Qwen2.5-7B-Instruct-adjuvant-extractor"

SYS_PROMPT = "You are a biomedical information extraction assistant."
PROMPT_INSTRUCTION = (
    "Extract infectious-disease adjuvants from the text and provide evidence snippets.\n"
    "Return ONLY valid JSON in this format:\n"
    "[{\"adjuvant\": \"<string>\", \"evidence\": \"<string>\"}, ...]\n"
    "Do not include any extra keys or explanation."
)

title = "Protective immune response against Streptococcus pyogenes in mice after intranasal vaccination with the fibronectin-binding protein SfbI."
abstract = (
    "Despite the significant impact on human health caused by Streptococcus pyogenes, "
    "there is currently no vaccine available. Intranasal immunization of mice with either "
    "SfbI alone or coupled to cholera toxin B subunit (CTB) triggered efficient SfbI-specific responses."
)

# Instruction template first, then the title/abstract text.
user_input = f"{PROMPT_INSTRUCTION}\n\nTitle: {title}\nAbstract: {abstract}"

tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=True)
# Note: recent transformers releases accept dtype=; older ones expect torch_dtype= instead.
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    dtype=torch.float16,
    device_map="auto",
)
model.eval()

# Render the chat as a prompt string that ends with the assistant generation prefix.
chat = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": SYS_PROMPT},
        {"role": "user", "content": user_input},
    ],
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(chat, return_tensors="pt", truncation=True, max_length=1024)
# Move inputs to the device that holds the embedding layer (relevant with device_map="auto").
inputs = {k: v.to(model.get_input_embeddings().weight.device) for k, v in inputs.items()}

# Greedy decoding for deterministic output.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
prediction = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
).strip()

# json.loads raises if the model output is not valid JSON.
print(json.dumps(json.loads(prediction), indent=2, ensure_ascii=False))
Intended Use
This model is intended for research workflows in biomedical literature mining, especially:
- infectious disease vaccine literature curation
- vaccine adjuvant concept extraction
- evidence-linked information extraction for downstream manual review
This model is not intended for clinical decision-making.
Training Data and Split Context
The model was trained on a curated infectious disease adjuvant corpus derived from VIOLIN ecosystem resources.
- Corpus size used in workflow: 298 abstracts
- Fixed split framework used across models:
  - 256 train
  - 13 validation
  - 29 test
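As a quick arithmetic check on the split above, the three partitions sum to the stated corpus size of 298 abstracts, which works out to roughly 86% train, 4% validation, and 10% test. The variable names below are illustrative only.

```python
split = {"train": 256, "validation": 13, "test": 29}
total = sum(split.values())

# The partitions should account for every abstract in the corpus.
assert total == 298

for name, n in split.items():
    print(f"{name}: {n} abstracts ({n / total:.1%})")
```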
Training Configuration (Fixed Manuscript Setting)
- LoRA rank (r): 8
- Learning rate: 2e-4
- Epochs: 5
- Quantization during fine-tuning: 4-bit NF4 with double quantization
- Compute dtype: float16
- Per-device batch size and gradient accumulation were configured for stable updates across model families.
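For readers who want to reproduce a comparable setup, the stated values map onto parameter names used by `peft.LoraConfig`, `transformers.BitsAndBytesConfig`, and `transformers.TrainingArguments`. The sketch below only collects the values given in this card; whether the original training script used these exact classes is an assumption, and settings not stated here (LoRA alpha, dropout, target modules, batch size, gradient accumulation) are intentionally left out rather than guessed.

```python
# Values stated in this card, keyed by the matching library parameter names.
lora_kwargs = {"r": 8}  # for peft.LoraConfig

quant_kwargs = {  # for transformers.BitsAndBytesConfig
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
    "bnb_4bit_compute_dtype": "float16",  # pass torch.float16 when building the real config
}

trainer_kwargs = {"learning_rate": 2e-4, "num_train_epochs": 5}  # for transformers.TrainingArguments

# With torch, peft, and bitsandbytes installed, these would be used roughly as:
#   LoraConfig(task_type="CAUSAL_LM", **lora_kwargs)
#   BitsAndBytesConfig(**{**quant_kwargs, "bnb_4bit_compute_dtype": torch.float16})
#   TrainingArguments(output_dir="out", **trainer_kwargs)  # output_dir is a placeholder
for name, cfg in [("lora", lora_kwargs), ("quantization", quant_kwargs), ("trainer", trainer_kwargs)]:
    print(name, cfg)
```

The project repository linked under Reproducibility Resources remains the authoritative source for the full training configuration.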
Usage
python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo_id = "RehanaHasin/Qwen2.5-7B-Instruct-adjuvant-extractor"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
Prompting Recommendation
Use prompts that explicitly request structured JSON output containing only:
- adjuvant
- evidence
and restrict extra commentary to reduce parsing errors.
Limitations
- Evaluated on a focused infectious-disease adjuvant corpus; broader-domain generalization is not guaranteed.
- Performance depends on abstract quality and terminology variation.
- Structured output may still require post-processing and manual validation.
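One form of post-processing that the last limitation hints at is recovering the JSON array when the model wraps it in extra text. The following is a minimal sketch under that assumption; `extract_json_array` is a hypothetical helper, and the noisy string is a constructed example rather than real model output.

```python
import json


def extract_json_array(raw: str):
    """Hypothetical fallback: return the first JSON array found in raw, else None."""
    decoder = json.JSONDecoder()
    start = raw.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores any trailing text after it.
            value, _ = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        start = raw.find("[", start + 1)
    return None


noisy = (
    "Here is the result:\n"
    '[{"adjuvant": "CTB", "evidence": "coupled to cholera toxin B subunit (CTB)"}]\n'
    "Done."
)
print(extract_json_array(noisy))
print(extract_json_array("no adjuvant found"))
```

Returning `None` instead of an empty list distinguishes "nothing parseable" from a valid empty result, so unparseable outputs can be flagged for manual validation.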
Ethical and Safety Notes
- Outputs can contain extraction errors or unsupported predictions.
- Human review is required before downstream knowledge integration.
- Not for diagnosis, treatment, or direct patient-care decisions.
Reproducibility Resources
Code, notebooks, and workflow details are available at:
https://github.com/hurlab/Infectious-Disease-Adjuvant-LLM-Fine-tuning
Citation
If you use this model, please cite the associated manuscript and project repository.
Contact
For questions, please contact hasin.rehana@und.edu.
Model provider
RehanaHasin