License: apache-2.0
Overview
Large language models are often strong conversationalists but can struggle with:
- Multi-step planning
- Tool selection and invocation
- ReAct-style reasoning workflows
- Structured action generation
- Separating reasoning from final responses
GPT-OSS AgentBoi adapts GPT-OSS-20B toward these agent-oriented tasks while remaining trainable on consumer hardware through parameter-efficient fine-tuning.
Model Details
| Item | Value |
|---|---|
| Model Name | GPT-OSS AgentBoi |
| Author | shiv207 |
| Base Model | unsloth/gpt-oss-20b-unsloth-bnb-4bit |
| Training Method | LoRA |
| Framework | Unsloth |
| Dataset | Agent-FLAN (ReAct subset) |
| Primary Task | Agentic Tool Use |
| Language | English |
| License | Apache 2.0 |
Training Data
The model was fine-tuned using examples from the Agent-FLAN dataset, specifically the ReAct-style instruction trajectories.
These examples teach the model to:
- Break complex tasks into intermediate steps
- Decide when tool usage is appropriate
- Generate structured actions
- Follow action-observation loops
- Produce concise final responses
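To make the shape of such a trajectory concrete, here is a minimal sketch of one way to represent a thought/action/observation sequence and flatten it into text. The class names, field names, and the `Thought:` / `Action:` / `Observation:` labels are illustrative assumptions based on the general ReAct pattern; they are not taken from the Agent-FLAN data files or from this model's actual prompt format.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One thought -> action -> observation cycle (field names are illustrative)."""
    thought: str
    action: str
    action_input: str
    observation: str


@dataclass
class Trajectory:
    """A task, the intermediate steps taken, and the final answer."""
    task: str
    steps: list[Step] = field(default_factory=list)
    final_answer: str = ""


def render(traj: Trajectory) -> str:
    """Flatten a trajectory into a single ReAct-style text block."""
    lines = [f"Task: {traj.task}"]
    for step in traj.steps:
        lines.append(f"Thought: {step.thought}")
        lines.append(f"Action: {step.action}")
        lines.append(f"Action Input: {step.action_input}")
        lines.append(f"Observation: {step.observation}")
    lines.append(f"Final Answer: {traj.final_answer}")
    return "\n".join(lines)


# A made-up, single-step example (not a record from the dataset).
example = Trajectory(
    task="What is the capital of France?",
    steps=[
        Step(
            thought="I should look this up.",
            action="search",
            action_input="capital of France",
            observation="Paris is the capital of France.",
        )
    ],
    final_answer="Paris",
)
print(render(example))
```

Each `Step` corresponds to one pass through the action-observation loop, and the final answer is kept separate from the intermediate reasoning.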
Training Setup
Training was performed using:
- GPT-OSS-20B
- Unsloth
- TRL
- LoRA adapters
- Google Colab Tesla T4 GPU
The objective was to improve agentic behavior while keeping training accessible on limited hardware.
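A rough way to see why LoRA keeps training within reach of limited hardware is to count trainable parameters. For a weight matrix of shape `d_out x d_in`, a rank-`r` adapter trains two small matrices with `r * (d_in + d_out)` parameters in total, instead of all `d_in * d_out` weights. The dimensions and rank below are hypothetical values chosen for illustration; they are not the settings used to train this model.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in a rank-`rank` LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters updated when fine-tuning the full weight matrix."""
    return d_in * d_out


# Hypothetical layer size and rank, for illustration only.
d_in, d_out, rank = 4096, 4096, 16

adapter = lora_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
print(f"adapter: {adapter:,} params, full matrix: {full:,} params ({adapter / full:.2%})")
```

Under these assumed numbers the adapter trains well under 1% of the parameters in that one matrix, which is where the memory saving for gradients and optimizer state comes from.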
Intended Use
This model is intended for:
- AI agents
- Tool-calling systems
- Research assistants
- Retrieval-augmented generation workflows
- Multi-step planning tasks
- Agentic reasoning experiments
Potential applications include:
- Search agents
- Knowledge retrieval systems
- Function-calling assistants
- Research copilots
- Workflow automation agents
Example
User
```text
Search for the latest SpaceX launch and summarize it.
```
Expected Agent Behavior
- Analyze the request.
- Determine that external information is required.
- Generate a structured search action.
- Process retrieved information.
- Produce a concise final answer.
The fine-tuning objective is to increase consistency in these workflows compared to the base model.
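The steps listed above can be sketched as a small driver loop. In this sketch the model and the search tool are stubs (`fake_model`, `fake_search`), and the `Action:` / `Action Input:` / `Final Answer:` labels are an assumed ReAct-style text format rather than the documented output format of this model. A real integration would replace the stubs with a call to the loaded model and an actual search backend.

```python
import re

# Assumed ReAct-style labels; the model's real output format may differ.
ACTION_RE = re.compile(r"Action:\s*(?P<name>\w+)\s*\nAction Input:\s*(?P<arg>.+)")
FINAL_RE = re.compile(r"Final Answer:\s*(?P<answer>.+)", re.DOTALL)


def fake_model(transcript: str) -> str:
    """Stand-in for the fine-tuned model: returns a canned next step."""
    if "Observation:" not in transcript:
        return (
            "Thought: I need current information.\n"
            "Action: search\n"
            "Action Input: latest SpaceX launch"
        )
    return "Thought: I have what I need.\nFinal Answer: (summary of the observation)"


def fake_search(query: str) -> str:
    """Stand-in tool; a real agent would call a search API here."""
    return f"[placeholder results for: {query}]"


TOOLS = {"search": fake_search}


def run_agent(task: str, max_steps: int = 5) -> str:
    """Alternate model output and tool observations until a final answer appears."""
    transcript = f"Task: {task}"
    for _ in range(max_steps):
        output = fake_model(transcript)
        transcript += "\n" + output

        final = FINAL_RE.search(output)
        if final:
            return final.group("answer").strip()

        action = ACTION_RE.search(output)
        if action is None:
            raise ValueError("model produced neither an action nor a final answer")

        name = action.group("name")
        tool = TOOLS.get(name)
        if tool is None:
            observation = f"unknown tool: {name}"
        else:
            observation = tool(action.group("arg").strip())
        transcript += f"\nObservation: {observation}"
    raise RuntimeError("step limit reached without a final answer")


print(run_agent("Search for the latest SpaceX launch and summarize it."))
```

The step limit and the explicit error on unparseable output are design choices of this sketch: they keep a misbehaving model from looping indefinitely or silently skipping a tool call.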
Loading the Model
```python
from unsloth import FastLanguageModel

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model, tokenizer = FastLanguageModel.from_pretrained("shiv207/gpt_oss_AGENTBOI")
```
Limitations
- Evaluated primarily through qualitative testing.
- No formal benchmark suite was used.
- Training utilized only a subset of Agent-FLAN.
- Performance may vary on unseen tool schemas.
- Not optimized for general-purpose instruction tuning beyond agent-oriented tasks.
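Because behavior on unseen tool schemas is not guaranteed, it may help to check each generated action before executing it. The following is a minimal sketch of such a check, assuming actions arrive as JSON objects of the form `{"tool": ..., "arguments": {...}}`. That shape, the `TOOL_SCHEMAS` registry, and the tool names are hypothetical and are not part of this model's documented interface.

```python
import json

# Hypothetical registry: tool name -> required argument names and their types.
TOOL_SCHEMAS = {
    "search": {"query": str},
    "calculator": {"expression": str},
}


def validate_action(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the action looks usable."""
    try:
        action = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(action, dict):
        return ["action must be a JSON object"]

    name = action.get("tool")
    if not isinstance(name, str) or name not in TOOL_SCHEMAS:
        return [f"unknown tool: {name!r}"]

    args = action.get("arguments", {})
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    schema = TOOL_SCHEMAS[name]
    problems = []
    for key, expected in schema.items():
        if key not in args:
            problems.append(f"missing argument: {key}")
        elif not isinstance(args[key], expected):
            problems.append(f"argument {key} should be {expected.__name__}")
    for key in args:
        if key not in schema:
            problems.append(f"unexpected argument: {key}")
    return problems


print(validate_action('{"tool": "search", "arguments": {"query": "SpaceX launch"}}'))
print(validate_action('{"tool": "search", "arguments": {}}'))
```

Returning a list of problems rather than raising lets the caller feed the messages back to the model as an observation and ask it to retry.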
Acknowledgments
This project builds upon the work of:
- OpenAI for GPT-OSS and the Harmony conversation format.
- Unsloth for efficient GPT-OSS fine-tuning support.
- InternLM for the Agent-FLAN dataset.
Repository
Source code and training notebook:
GitHub: https://github.com/shiv207
Author
shiv207
If you find this project useful, feel free to open issues, share feedback, or build on top of it.