
License: apache-2.0

Overview

Large language models are often strong conversationalists but can struggle with:

  • Multi-step planning
  • Tool selection and invocation
  • ReAct-style reasoning workflows
  • Structured action generation
  • Separating reasoning from final responses

GPT-OSS AgentBoi adapts GPT-OSS-20B toward these agent-oriented tasks while remaining trainable on consumer hardware through parameter-efficient fine-tuning.

Model Details

| Item | Value |
| --- | --- |
| Model Name | GPT-OSS AgentBoi |
| Author | shiv207 |
| Base Model | unsloth/gpt-oss-20b-unsloth-bnb-4bit |
| Training Method | LoRA |
| Framework | Unsloth |
| Dataset | Agent-FLAN (ReAct subset) |
| Primary Task | Agentic Tool Use |
| Language | English |
| License | Apache 2.0 |

Training Data

The model was fine-tuned using examples from the Agent-FLAN dataset, specifically the ReAct-style instruction trajectories.

These examples teach the model to:

  • Break complex tasks into intermediate steps
  • Decide when tool usage is appropriate
  • Generate structured actions
  • Follow action-observation loops
  • Produce concise final responses
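To make the idea of "structured actions" concrete, here is a minimal sketch of parsing one ReAct-style step. The `Thought:` / `Action:` / `Action Input:` / `Final Answer:` labels are the generic ReAct convention and are an assumption here; the exact text format used in Agent-FLAN or emitted by this model may differ. `ReActStep` and `parse_step` are hypothetical names introduced for illustration.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReActStep:
    thought: str
    action: Optional[str] = None         # tool name, if the step calls a tool
    action_input: Optional[dict] = None  # JSON arguments for the tool
    final_answer: Optional[str] = None   # set when the step ends the loop


# Assumed labels: one "Label: value" pair per line.
_FIELD = re.compile(
    r"^(Thought|Action|Action Input|Final Answer):[ \t]*(.*)$", re.MULTILINE
)


def parse_step(text: str) -> ReActStep:
    """Split one model turn into its labelled fields."""
    fields = {name: value.strip() for name, value in _FIELD.findall(text)}
    raw_input = fields.get("Action Input")
    return ReActStep(
        thought=fields.get("Thought", ""),
        action=fields.get("Action"),
        action_input=json.loads(raw_input) if raw_input else None,
        final_answer=fields.get("Final Answer"),
    )


step = parse_step(
    "Thought: I need current launch data.\n"
    "Action: search\n"
    'Action Input: {"query": "latest SpaceX launch"}'
)
print(step.action, step.action_input)
```

Keeping the reasoning (`thought`) in a separate field from the action and the final answer mirrors the separation of reasoning from final responses described in the overview.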

Training Setup

Training was performed using:

  • GPT-OSS-20B (the 4-bit unsloth/gpt-oss-20b-unsloth-bnb-4bit checkpoint)
  • Unsloth
  • TRL
  • LoRA adapters
  • A Tesla T4 GPU on Google Colab

The objective was to improve agentic behavior while keeping training accessible on limited hardware.
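A rough illustration of why LoRA keeps training within reach of limited hardware: for a weight matrix of shape d_out × d_in, a rank-r adapter trains only r × (d_in + d_out) parameters instead of d_in × d_out. The dimensions and rank below are hypothetical and are not the actual GPT-OSS-20B layer shapes or the hyperparameters used for this model.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the dense weight matrix that LoRA leaves frozen."""
    return d_in * d_out


# Hypothetical layer size and rank, chosen only for the arithmetic.
d_in, d_out, rank = 4096, 4096, 16

lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
fraction = lora / full

print(f"LoRA params: {lora:,}")        # LoRA params: 131,072
print(f"Full params: {full:,}")        # Full params: 16,777,216
print(f"Trainable share: {fraction:.2%}")  # Trainable share: 0.78%
```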

Intended Use

This model is intended for:

  • AI agents
  • Tool-calling systems
  • Research assistants
  • Retrieval-augmented generation workflows
  • Multi-step planning tasks
  • Agentic reasoning experiments

Potential applications include:

  • Search agents
  • Knowledge retrieval systems
  • Function-calling assistants
  • Research copilots
  • Workflow automation agents

Example

User

Search for the latest SpaceX launch and summarize it.

Expected Agent Behavior

  1. Analyze the request.
  2. Determine that external information is required.
  3. Generate a structured search action.
  4. Process retrieved information.
  5. Produce a concise final answer.
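The five steps above can be sketched as a small action-observation loop. Everything here is a stand-in: `fake_model` imitates the model with fixed rules, `fake_search` returns a placeholder string instead of real search results, and the `Action: tool | argument` line format is an assumption made for brevity, not the model's actual output format.

```python
from typing import Callable, Dict, List


def fake_model(transcript: List[str]) -> str:
    """Stand-in for the fine-tuned model: search once, then answer."""
    if not any(line.startswith("Observation:") for line in transcript):
        return "Action: search | latest SpaceX launch"
    return "Final Answer: " + transcript[-1].removeprefix("Observation: ")


def fake_search(query: str) -> str:
    """Stand-in tool; a real agent would call a search API here."""
    return f"(stub result for '{query}')"


TOOLS: Dict[str, Callable[[str], str]] = {"search": fake_search}


def run_agent(
    request: str,
    model: Callable[[List[str]], str] = fake_model,
    max_steps: int = 4,
) -> str:
    transcript = [f"User: {request}"]
    for _ in range(max_steps):
        output = model(transcript)          # steps 1-3: analyze, decide, act
        transcript.append(output)
        if output.startswith("Final Answer:"):
            return output.removeprefix("Final Answer:").strip()  # step 5
        name, _, argument = output.removeprefix("Action:").partition("|")
        tool = TOOLS.get(name.strip())
        observation = (
            tool(argument.strip()) if tool else f"unknown tool {name.strip()!r}"
        )
        transcript.append(f"Observation: {observation}")  # step 4
    return "Stopped: step limit reached."


answer = run_agent("Search for the latest SpaceX launch and summarize it.")
print(answer)
```

The `max_steps` cap is a design choice worth keeping in any real loop: it bounds the cost of a model that never emits a final answer.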

The fine-tuning objective is to increase consistency in these workflows compared to the base model.

Loading the Model

    from unsloth import FastLanguageModel

    # Download the model and its tokenizer from the Hub.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="shiv207/gpt_oss_AGENTBOI",
    )

Limitations

  • Evaluated primarily through qualitative testing.
  • No formal benchmark suite was used.
  • Training used only a subset of Agent-FLAN.
  • Performance may vary on unseen tool schemas.
  • Not tuned for general-purpose instruction following beyond agent-oriented tasks.
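Because behavior on unseen tool schemas may vary, it can help to validate each generated action before executing it. The sketch below is one possible guard, not part of this project: `TOOL_SCHEMAS` and `validate_action` are hypothetical names, and the single `search` tool with a string `query` argument is an assumed example schema.

```python
from typing import Any, Dict, List

# Hypothetical registry: tool name -> {argument name: expected Python type}.
TOOL_SCHEMAS: Dict[str, Dict[str, type]] = {
    "search": {"query": str},
}


def validate_action(name: str, arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the action looks valid."""
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        return [f"unknown tool: {name}"]

    problems: List[str] = []
    for field, expected in schema.items():
        if field not in arguments:
            problems.append(f"missing argument: {field}")
        elif not isinstance(arguments[field], expected):
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}"
            )
    for field in arguments:
        if field not in schema:
            problems.append(f"unexpected argument: {field}")
    return problems


print(validate_action("search", {"query": "latest SpaceX launch"}))  # []
print(validate_action("search", {"q": "latest SpaceX launch"}))
```

Returning the problems as text, rather than raising, lets an agent loop feed them back to the model as an observation so it can retry with a corrected action.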

Acknowledgments

This project builds upon the work of:

  • OpenAI for GPT-OSS and the Harmony conversation format.
  • Unsloth for efficient GPT-OSS fine-tuning support.
  • InternLM for the Agent-FLAN dataset.

Repository

Source code and training notebook:

GitHub: https://github.com/shiv207

Author

shiv207

If you find this project useful, feel free to open issues, share feedback, or build on top of it.

Model provider: shiv207

Model tree: base unsloth/gpt-oss-20b-unsloth-bnb-4bit, adapter: this model

Modalities: text input, text output