README

License: mit

📊 Model Details

  • Model Name: TIGER-OM (SKT-OM)
  • Architecture: Mixture of Experts (MoE)
  • Total Parameters: 13B (fewer are active per token because MoE routing is sparse)
  • Base Models:
    • Primary Base: Shrijanagain/ST-X-0
    • Expert Integration: Mistral-7B
  • Format: Safetensors (safe and fast loading)
  • Precision / Quantization: FP16 / BF16 (original weights); a Q4_K_M GGUF quantization is available in a separate repo
  • Context Length: 8192 tokens
  • Training Hardware: AMD Developer Cloud GPUs ($100 developer credits)
  • Inference Optimized For: ROCm 7.0 + vLLM + AMD MI300X
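
As a rough sizing aid for the precision options listed above, the sketch below estimates the memory needed just to hold the weights. The 13B parameter count and the FP16 / BF16 formats come from this card. The flat 4 bits per weight used for the quantized row is a simplifying assumption (Q4_K_M mixes block formats and averages somewhat more than 4 bits per weight), and activations and the KV cache are not included.

```python
# Rough weight-memory estimate; excludes activations and the KV cache.
TOTAL_PARAMS = 13_000_000_000  # "13B" from the model details above


def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Return the size of the weights in GiB for a given storage width."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


# The 4-bit row is a nominal figure (assumption), not a measured GGUF size.
for label, bits in [("FP16 / BF16", 16), ("nominal 4-bit (assumption)", 4)]:
    print(f"{label}: {weight_memory_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

The estimate is a lower bound for planning purposes; real deployments need extra headroom for the runtime and the attention cache.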

🌟 Key Features

  • True MoE Architecture: Sparse activation for better efficiency and performance
  • Think Mode Reasoning: Advanced Chain-of-Thought, Planning, Self-Reflection & Verification
  • Dynamic Plugin System: Intelligent routing to Code, Math, Search, Data Analysis plugins
  • Agentic Capabilities: Full LangGraph multi-agent workflow
  • Advanced RAG Integration: SKT RAG + Query Rewriting + Multi-hop + Reranking
  • Stateful Memory: Persistent conversation context
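
To make the plugin routing idea concrete, here is a minimal sketch of how a structured tool call could be dispatched to a handler. This card does not document the real plugin interface, so everything here is hypothetical: the registry, the plugin names, the toy handlers, and the `{"plugin": ..., "input": ...}` call shape are illustrative assumptions only.

```python
from typing import Callable, Dict

# Hypothetical plugin registry: names and handlers are illustrative only.
PLUGINS: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that adds a handler to the registry under `name`."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        PLUGINS[name] = fn
        return fn
    return wrap


@register("math")
def math_plugin(expr: str) -> str:
    # Toy handler: adds whitespace-separated numbers, e.g. "2 3.5".
    return str(sum(float(tok) for tok in expr.split()))


@register("code")
def code_plugin(snippet: str) -> str:
    # Toy handler: reports the number of lines instead of executing anything.
    return f"{len(snippet.splitlines())} line(s) received"


def dispatch(tool_call: dict) -> str:
    """Route a tool call of the assumed form {"plugin": ..., "input": ...}."""
    handler = PLUGINS.get(tool_call.get("plugin", ""))
    if handler is None:
        return "no matching plugin; answer directly"
    return handler(tool_call.get("input", ""))


print(dispatch({"plugin": "math", "input": "2 3.5"}))
```

In this sketch the model is assumed to emit the tool call and the host application runs the handler; unknown plugin names fall back to a plain answer rather than raising.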

🏗️ Architecture Breakdown

TIGER-OM is built on a 13B MoE backbone:

  • Base: Shrijanagain/ST-X-0 (strong foundational model)
  • Experts: Mistral-7B, fine-tuned to serve as expert layers for specialized reasoning and tool use
  • Router Network: Learned gating mechanism for expert selection
  • Think Mode Layer: Custom system prompt + reasoning controller
  • Plugin Head: Tool calling & execution layer

This hybrid approach (ST-X-0 + Mistral-7B experts) is designed to combine strong reasoning, code understanding, and general capability with the efficiency of sparse MoE inference.
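
The router's role can be illustrated with a small top-k gating sketch. This is a generic MoE routing pattern, not the confirmed TIGER-OM implementation: the number of experts, the choice of k=2, and the scalar toy "experts" are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x: float,
                router_logits: Sequence[float],
                experts: Sequence[Callable[[float], float]],
                k: int = 2) -> float:
    """Combine the outputs of the k highest-scoring experts.

    k=2 is an assumption; the card does not state how many experts are active.
    """
    # Indices of the k largest router logits (ties keep their original order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Renormalise the gate weights over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    # Only the selected experts are evaluated: this is the sparse activation.
    return sum(w * experts[i](x) for w, i in zip(weights, top))


# Toy scalar "experts" standing in for feed-forward blocks.
experts = [lambda v: v + 1.0, lambda v: v * 2.0, lambda v: -v, lambda v: v ** 2]
print(moe_forward(3.0, router_logits=[0.1, 2.0, -1.0, 2.0], experts=experts))
```

Because only k experts run per input, compute per token scales with k rather than with the total number of experts, which is the efficiency argument made for MoE models.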


📁 Files in this Repo (Safetensors)

  • model-00001-of-0000X.safetensors → Main model weights
  • config.json
  • tokenizer.json / tokenizer_config.json
  • generation_config.json
  • special_tokens_map.json
  • model.safetensors.index.json

All weights are stored in the safetensors format, which avoids the arbitrary code execution risk of pickle-based checkpoints.
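
For sharded checkpoints, `model.safetensors.index.json` records which shard file holds each tensor under a `weight_map` key. The sketch below counts tensors per shard from such an index. The sample index is hypothetical and heavily shortened: its tensor names, shard count, and sizes are placeholders, not values taken from this repo.

```python
import json
from collections import Counter

# Hypothetical, heavily shortened index; real tensor names and shard counts
# depend on the checkpoint and are not taken from this repo.
SAMPLE_INDEX = """
{
  "metadata": {"total_size": 1024},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
    "lm_head.weight": "model-00002-of-00002.safetensors"
  }
}
"""


def tensors_per_shard(index_text: str) -> Counter:
    """Count how many tensors the index assigns to each shard file."""
    index = json.loads(index_text)
    return Counter(index["weight_map"].values())


counts = tensors_per_shard(SAMPLE_INDEX)
for shard, n in sorted(counts.items()):
    print(f"{shard}: {n} tensor(s)")
```

To inspect a real download, read the index file from disk and pass its text to `tensors_per_shard` in place of the sample string.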


🚀 How to Use (Safetensors)

python

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Shrijanagain/TIGER-OM"

# Load the tokenizer and the BF16 weights.
# device_map="auto" needs the accelerate package and spreads layers over available devices.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # executes custom code from the repo; review it first
)

# The prompt switches on Think Mode and carries the user query.
prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
User Query: Calculate training cost comparison and suggest best option..."""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 1024 new tokens with nucleus sampling.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
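
With an 8192-token context and `max_new_tokens=1024`, the prompt has to fit in the remaining 7168 tokens. The sketch below trims an over-long list of token IDs to that budget. Keeping the most recent tokens is an assumed policy, and `fit_prompt` is a hypothetical helper, not part of transformers.

```python
from typing import List

CONTEXT_LENGTH = 8192   # context length listed in the model details
MAX_NEW_TOKENS = 1024   # value passed to generate() in the example above


def fit_prompt(token_ids: List[int],
               context_length: int = CONTEXT_LENGTH,
               max_new_tokens: int = MAX_NEW_TOKENS) -> List[int]:
    """Keep the most recent tokens so prompt + generation fits the context."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    # Dropping from the left keeps the latest turns; this policy is an assumption.
    return token_ids[-budget:]


dummy_ids = list(range(10_000))  # stand-in for tokenizer output
trimmed = fit_prompt(dummy_ids)
print(len(trimmed))
```

Prompts that already fit are returned unchanged, so the helper can be applied unconditionally before generation.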

🛠️ Technologies & Stack

  • Base Models: Shrijanagain/ST-X-0 + Mistral-7B Experts
  • RAG: SKT RAG + AMD ADK Kit
  • Agents: LangGraph
  • Hardware: AMD MI300X + ROCm 7.0
  • Inference: vLLM (FP16) + transformers (Safetensors)
  • Training: AMD Developer Cloud
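
Since the stack lists vLLM with FP16, a serving-side counterpart to the transformers example might look like the sketch below. It uses the public `LLM` and `SamplingParams` classes from vLLM and mirrors the sampling settings used elsewhere in this card. Whether vLLM supports this checkpoint's architecture is not confirmed here, and `trust_remote_code=True` is an assumption carried over from the transformers example.

```python
MODEL_ID = "Shrijanagain/TIGER-OM"

# Mirrors the sampling settings of the transformers example in this card.
SAMPLING_KWARGS = {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 1024,
    "repetition_penalty": 1.1,
}


def generate_with_vllm(prompt: str) -> str:
    """Generate one completion with vLLM (requires vllm and a supported GPU)."""
    # Imported lazily so this module can be loaded without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model=MODEL_ID,
        dtype="float16",         # the stack section lists vLLM with FP16
        max_model_len=8192,      # context length from the model details
        trust_remote_code=True,  # assumption: mirrors the transformers example
    )
    params = SamplingParams(**SAMPLING_KWARGS)
    outputs = llm.generate([prompt], params)
    # One prompt in, one request out; take its first completion.
    return outputs[0].outputs[0].text
```

Calling `generate_with_vllm("...")` loads the model on every call, which is fine for a quick check; a long-running service would build the `LLM` object once and reuse it.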

⚡ Performance

  • Balances quality and efficiency through the MoE architecture
  • Strong performance on reasoning, tool use, code, and multi-step tasks
  • Lower per-token inference compute than dense 13B+ models, since only a subset of experts runs for each token

📌 Use Cases

  • Complex technical Q&A
  • Agentic workflows & tool calling
  • Research assistance
  • Code generation & debugging
  • Mathematical & logical reasoning
  • Comparative analysis
  • Data analysis with plugins

🏆 Hackathon

AMD Developer Hackathon 2026
Trained entirely on AMD Developer Cloud
Fully built in public with multiple technical updates.


📄 License

MIT License


Model provider: Shrijanagain

Model tree: mistralai/Mistral-7B-Instruct-v0.3 (base) → this model (fine-tuned)

Modalities: text input, text output
