License: apache-2.0
What it knows
The model was fine-tuned on project data from six EU framework programmes, FP4 through Horizon Europe:
| Programme | Period | Projects |
|---|---|---|
| FP4 | 1994–1998 | ~13K |
| FP5 | 1998–2002 | ~17K |
| FP6 | 2002–2006 | ~10K |
| FP7 | 2007–2013 | ~26K |
| Horizon 2020 | 2014–2020 | ~36K |
| Horizon Europe | 2021–2027 | ~24K |
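As a quick sanity check, the approximate per-programme counts can be summed and compared with the "126K+" figure quoted in the Dataset section. The sketch below only restates the table as a Python dict; the variable names are illustrative and not part of the model or dataset.

```python
# Programme -> (start year, end year, approximate project count in thousands),
# copied from the table above.
PROGRAMMES = {
    "FP4": (1994, 1998, 13),
    "FP5": (1998, 2002, 17),
    "FP6": (2002, 2006, 10),
    "FP7": (2007, 2013, 26),
    "Horizon 2020": (2014, 2020, 36),
    "Horizon Europe": (2021, 2027, 24),
}

# Sum the approximate counts across all six programmes.
total_k = sum(count for _, _, count in PROGRAMMES.values())
print(f"Approximate total: ~{total_k}K projects")
```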
It can answer questions about:
- 🔬 Project details: objectives, methodology, expected impacts
- 💰 Funding information: total costs, EC contributions, funding schemes
- 📅 Timelines: start/end dates, project duration
- 🏢 Organizations: coordinators, participants, country information
- 🧬 Scientific domains: EuroSciVoc classifications, research topics
- 📊 Programme-level statistics: funding distribution, topic clusters
- 🔄 Cross-programme comparisons: funding trends across FP4→Horizon Europe
Usage
With PEFT (recommended)
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "RCaz/Qwen2.5-7B-EU-Funding-Expert")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

messages = [
    {"role": "system", "content": "You are an expert assistant specializing in European Union research funding programmes."},
    {"role": "user", "content": "What were the main funding priorities under Horizon 2020?"},
]

# Render the conversation with the chat template and open the assistant turn.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# temperature only takes effect when sampling is enabled, so set do_sample explicitly.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
Example prompts
- "Tell me about the ITER project and its EU funding."
- "How much did the EU invest in quantum computing research under Horizon 2020?"
- "Compare the total budgets of FP7 and Horizon 2020."
- "Which organizations coordinated the most EU-funded AI projects?"
- "What scientific domains received the highest funding in Horizon Europe?"
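To try several of these prompts in one go, each question can be wrapped in the same system prompt used in the usage snippet. This is a minimal sketch; `build_messages` and the constant names are hypothetical helpers, not part of the model repository.

```python
# System prompt taken from the usage snippet above.
SYSTEM_PROMPT = (
    "You are an expert assistant specializing in European Union "
    "research funding programmes."
)

EXAMPLE_PROMPTS = [
    "Tell me about the ITER project and its EU funding.",
    "How much did the EU invest in quantum computing research under Horizon 2020?",
    "Compare the total budgets of FP7 and Horizon 2020.",
    "Which organizations coordinated the most EU-funded AI projects?",
    "What scientific domains received the highest funding in Horizon Europe?",
]


def build_messages(question: str) -> list[dict[str, str]]:
    """Return a two-turn chat (system + user) for a single question."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]


# One message list per example prompt.
batch = [build_messages(q) for q in EXAMPLE_PROMPTS]
print(len(batch), "conversations prepared")
```

Each entry of `batch` can then be passed to `tokenizer.apply_chat_template(...)` in place of `messages` in the usage snippet.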
Training details
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct (7.62B params) |
| Method | LoRA SFT (Supervised Fine-Tuning) |
| LoRA rank | 64 |
| LoRA alpha | 16 |
| LoRA target | all-linear layers |
| LoRA dropout | 0.05 |
| Trainable params | ~410M (5.4% of total) |
| Dataset | RCaz/eu-funding-cordis-qa |
| Training samples | 413,499 |
| Validation samples | 21,764 |
| Max sequence length | 2048 |
| Packing | BFD (Best-Fit Decreasing) |
| Avg tokens/sample | ~285 |
| Precision | bf16 |
| Optimizer | AdamW (fused) |
| Learning rate | 2e-4 (cosine schedule) |
| Batch size | 1 × 8 grad accum = 8 effective |
| Hardware | NVIDIA L4 (24GB) |
| Flash Attention | 2.0 |
| Gradient checkpointing | ✓ |
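The packing row refers to Best-Fit Decreasing, a bin-packing heuristic: samples are sorted from longest to shortest, and each one is placed into the packed sequence with the least remaining space that can still hold it. The sketch below is a generic, quadratic-time illustration of that idea with made-up token lengths; it is not the training library's actual implementation, and the function name `pack_bfd` and the choice to raise on over-length samples are assumptions.

```python
def pack_bfd(lengths: list[int], capacity: int = 2048) -> list[list[int]]:
    """Group sample indices into bins of at most `capacity` tokens (Best-Fit Decreasing)."""
    bins: list[dict] = []  # each bin tracks its free space and the sample indices it holds
    # Visit samples from longest to shortest.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > capacity:
            # Assumption: a real pipeline would likely truncate instead of failing.
            raise ValueError(f"sample {i} has {n} tokens, more than capacity {capacity}")
        # Best fit: the bin with the least free space that still fits the sample.
        best = None
        for b in bins:
            if b["free"] >= n and (best is None or b["free"] < best["free"]):
                best = b
        if best is None:
            # No existing bin fits, so open a new one.
            best = {"free": capacity, "items": []}
            bins.append(best)
        best["items"].append(i)
        best["free"] -= n
    return [b["items"] for b in bins]


# Hypothetical token lengths, for illustration only.
lengths = [1200, 900, 800, 500, 300, 285]
packed = pack_bfd(lengths, capacity=2048)
print(packed)  # [[0, 2], [1, 3, 4, 5]]
```

Here six samples fit into two packed sequences of 2,000 and 1,985 tokens, so little of the 2,048-token window is left as padding.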
Dataset
The training data was generated from official CORDIS CSV exports covering 126K+ EU-funded research projects. Eight types of Q&A conversations were created:
- Project overview — What is the project about?
- Funding details — How much funding did it receive?
- Timeline — When did it start/end?
- Organizations — Who coordinates/participates?
- Scientific domains — What fields does it cover?
- Topics — What EU topics/calls is it associated with?
- Programme-level — Statistics and trends within a programme
- Cross-programme — Comparisons across framework programmes
All conversations follow the ChatML format with a system prompt establishing EU funding expertise.
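For readers unfamiliar with ChatML, the sketch below shows how a conversation is laid out as plain text, with each turn wrapped in `<|im_start|>` and `<|im_end|>` markers. The `to_chatml` helper is hypothetical and the assistant answer is a placeholder; in practice the tokenizer's own chat template (`apply_chat_template`) is the authoritative renderer.

```python
def to_chatml(messages: list[dict[str, str]], add_generation_prompt: bool = False) -> str:
    """Render a list of {"role", "content"} dicts as ChatML text."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


conversation = [
    {"role": "system", "content": "You are an expert assistant specializing in European Union research funding programmes."},
    {"role": "user", "content": "When did the project start and end?"},
    {"role": "assistant", "content": "(placeholder: answer derived from the CORDIS record)"},
]

chatml_text = to_chatml(conversation)
print(chatml_text)
```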
Limitations
- Knowledge is based on CORDIS data snapshots and may not reflect the very latest project updates
- The model is specialized for EU funding — it may be less capable on general knowledge tasks compared to the base model
- Financial figures and project details are at best as accurate as the source CORDIS data; errors in CORDIS carry over to the model
- The model may occasionally hallucinate details for very specific project queries
Citation
If you use this model, please cite:
```bibtex
@misc{rcaz2026eufunding,
  title={Qwen2.5-7B-EU-Funding-Expert: A Fine-tuned LLM for European Research Funding},
  author={RCaz},
  year={2026},
  url={https://huggingface.co/RCaz/Qwen2.5-7B-EU-Funding-Expert}
}
```