Overview
Large language models are often strong conversationalists but can struggle with:
- Multi-step planning
- Tool selection and invocation
- ReAct-style reasoning workflows
- Structured action generation
- Separating reasoning from final responses
GPT-OSS AgentBoi adapts GPT-OSS-20B toward these agent-oriented tasks while remaining trainable on consumer hardware through parameter-efficient fine-tuning.
Model Details
| Item | Value |
|---|---|
| Model Name | GPT-OSS AgentBoi |
| Author | shiv207 |
| Base Model | unsloth/gpt-oss-20b-unsloth-bnb-4bit |
| Training Method | LoRA |
| Framework | Unsloth |
| Dataset | Agent-FLAN (ReAct subset) |
| Primary Task | Agentic Tool Use |
| Language | English |
| License | Apache 2.0 |
Training Data
The model was fine-tuned using examples from the Agent-FLAN dataset, specifically the ReAct-style instruction trajectories.
These examples teach the model to:
- Break complex tasks into intermediate steps
- Decide when tool usage is appropriate
- Generate structured actions
- Follow action-observation loops
- Produce concise final responses
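To make the shape of such a trajectory concrete, here is a minimal sketch that splits a ReAct-style transcript into labeled steps. The labels (`Thought`, `Action`, `Action Input`, `Observation`, `Final Answer`) are an assumption borrowed from common ReAct prompts, and `parse_trajectory` is a hypothetical helper; the exact markers in Agent-FLAN may differ.

```python
import re
from dataclasses import dataclass

# Assumed labels for illustration; the real dataset may use different markers.
LABELS = ("Thought", "Action", "Action Input", "Observation", "Final Answer")


@dataclass
class Step:
    kind: str  # one of LABELS
    text: str  # content that follows the label


def parse_trajectory(raw: str) -> list[Step]:
    """Split a ReAct-style transcript into labeled steps."""
    # Try longer labels first so "Action Input" is not matched as "Action".
    alternatives = "|".join(sorted(map(re.escape, LABELS), key=len, reverse=True))
    pattern = re.compile(rf"^({alternatives}):[ \t]*", re.MULTILINE)
    matches = list(pattern.finditer(raw))
    steps = []
    for i, m in enumerate(matches):
        # A step's text runs until the next label, or to the end of the input.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw)
        steps.append(Step(kind=m.group(1), text=raw[m.end():end].strip()))
    return steps


example = """Thought: I need current launch data.
Action: search
Action Input: {"query": "latest SpaceX launch"}
Observation: (tool output would appear here)
Final Answer: (summary would appear here)"""

steps = parse_trajectory(example)
for step in steps:
    print(f"{step.kind}: {step.text}")
```

Keeping the steps as separate records makes it easy to check that a generated trajectory follows the action-observation order before it is used.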
Training Setup
Training was performed using:
- GPT-OSS-20B
- Unsloth
- TRL
- LoRA adapters
- Google Colab Tesla T4 GPU
The objective was to improve agentic behavior while keeping training accessible on limited hardware.
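A rough way to see why LoRA keeps training within reach of limited hardware: for a weight matrix of shape `d_out x d_in`, an adapter of rank `r` trains only the two low-rank factors, which hold `r * (d_out + d_in)` parameters. The sketch below works through that arithmetic. The layer shape and rank are hypothetical values chosen for illustration, not the settings used for this model.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the full weight matrix that LoRA leaves frozen."""
    return d_out * d_in


# Hypothetical projection layer and adapter rank (assumptions, not this model's config).
d_out, d_in, rank = 4096, 4096, 16

lora = lora_trainable_params(d_out, d_in, rank)
full = full_params(d_out, d_in)
ratio = lora / full

print(f"LoRA parameters: {lora}")       # 131072
print(f"Full parameters: {full}")       # 16777216
print(f"Trainable fraction: {ratio}")   # 0.0078125
```

Under these assumed numbers the adapter trains under 1% of the layer's weights, which is where the savings in gradient and optimizer memory come from.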
Intended Use
This model is intended for:
- AI agents
- Tool-calling systems
- Research assistants
- Retrieval-augmented generation workflows
- Multi-step planning tasks
- Agentic reasoning experiments
Potential applications include:
- Search agents
- Knowledge retrieval systems
- Function-calling assistants
- Research copilots
- Workflow automation agents
Example
User
Search for the latest SpaceX launch and summarize it.
Expected Agent Behavior
- Analyze the request.
- Determine that external information is required.
- Generate a structured search action.
- Process retrieved information.
- Produce a concise final answer.
The fine-tuning objective is to make the model follow this workflow more consistently than the base model does.
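The workflow above can be sketched as a small action-observation loop. This is an illustration only: `scripted_model` stands in for the language model, `fake_search` stands in for a real search tool, and the JSON action format (`tool` plus `args`) is an assumption rather than the format this model was trained on.

```python
import json
from typing import Callable


def fake_search(query: str) -> str:
    """Stand-in tool; a real agent would call a search API here."""
    return f"[stub results for: {query}]"


# Hypothetical tool registry: tool name -> callable taking keyword arguments.
TOOLS: dict[str, Callable[..., str]] = {"search": fake_search}


def scripted_model(history: list[str]) -> str:
    """Stand-in for the LLM: emits one action, then a final answer."""
    if not any(line.startswith("Observation:") for line in history):
        action = {"tool": "search", "args": {"query": "latest SpaceX launch"}}
        return "Action: " + json.dumps(action)
    return "Final Answer: summary based on " + history[-1].removeprefix("Observation: ")


def run_agent(user_request: str, model=scripted_model, max_steps: int = 5) -> str:
    """Alternate model output and tool observations until a final answer appears."""
    history = [f"User: {user_request}"]
    for _ in range(max_steps):
        reply = model(history)
        history.append(reply)
        if reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer:").strip()
        if reply.startswith("Action:"):
            action = json.loads(reply.removeprefix("Action:"))
            tool = TOOLS.get(action["tool"])
            if tool is None:
                observation = f"unknown tool {action['tool']!r}"
            else:
                observation = tool(**action["args"])
            history.append(f"Observation: {observation}")
    return "stopped: step limit reached"


answer = run_agent("Search for the latest SpaceX launch and summarize it.")
print(answer)
```

The step limit is a simple guard against a model that never produces a final answer. Swapping `scripted_model` for a function that calls the fine-tuned model would keep the rest of the loop unchanged.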
Loading the Model
    from unsloth import FastLanguageModel

    # Load the fine-tuned model and its tokenizer from the Hugging Face Hub
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="shiv207/gpt_oss_AGENTBOI",
    )
Limitations
- Evaluated primarily through qualitative testing.
- No formal benchmark suite was used.
- Training utilized only a subset of Agent-FLAN.
- Performance may vary on unseen tool schemas.
- Not optimized for general-purpose instruction following outside agent-oriented tasks.
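Because behavior on unseen tool schemas may vary, one defensive option is to validate each generated action before executing it. The sketch below assumes actions arrive as JSON with `tool` and `args` fields and uses a hypothetical `TOOL_SCHEMAS` registry; neither is part of this model's documented output format.

```python
import json

# Hypothetical registry: tool name -> required argument names and their types.
TOOL_SCHEMAS = {"search": {"query": str}}


def validate_action(raw: str) -> tuple[bool, str]:
    """Return (is_valid, reason) for a model-generated action string."""
    try:
        action = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"not valid JSON: {exc.msg}"
    if not isinstance(action, dict):
        return False, "action must be a JSON object"
    tool = action.get("tool")
    if not isinstance(tool, str) or tool not in TOOL_SCHEMAS:
        return False, f"unknown tool: {tool!r}"
    args = action.get("args", {})
    if not isinstance(args, dict):
        return False, "args must be a JSON object"
    # Every required argument must be present with the expected type.
    for name, expected in TOOL_SCHEMAS[tool].items():
        if not isinstance(args.get(name), expected):
            return False, f"argument {name!r} must be {expected.__name__}"
    return True, "ok"


print(validate_action('{"tool": "search", "args": {"query": "SpaceX"}}'))
print(validate_action('{"tool": "browse", "args": {}}'))
```

A rejected action can be fed back to the model as an observation so it has a chance to correct the call instead of failing silently.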
Acknowledgments
This project builds upon the work of:
- OpenAI for GPT-OSS and the Harmony conversation format.
- Unsloth for efficient GPT-OSS fine-tuning support.
- InternLM for the Agent-FLAN dataset.
Repository
Source code and training notebook:
GitHub: https://github.com/shiv207
Author
shiv207
If you find this project useful, feel free to open issues, share feedback, or build on top of it.