Shrijanagain

TIGER-OM

📊 Model Details

Model Name: TIGER-OM (SKT-OM)
Architecture: Mixture of Experts (MoE)
Total Parameters: 13B (active parameters per token are lower because of MoE sparsity)
Base Models:
- Primary Base: Shrijanagain/ST-X-0
- Expert Integration: Mistral-7B
Format: Safetensors (safe and fast loading)
Quantization: FP16 / BF16 (original weights); a Q4_K_M GGUF build is available in a separate repo
Context Length: 8192 tokens
Training Hardware: AMD Developer Cloud GPUs (using $100 in developer credits)
Inference Optimized For: AMD MI300X with ROCm 7.0 and vLLM
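
The precision options listed above translate into a rough memory budget for the weights. The following is a minimal sketch, assuming the 13B total-parameter figure from this card and a ballpark of about 4.85 bits per weight for Q4_K_M (an approximation that is not stated in this card; real GGUF file sizes vary with the tensor mix). It ignores the KV cache and activation memory.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


TOTAL_PARAMS = 13e9  # total parameter count from the model card

# The Q4_K_M bits-per-weight value is an assumed ballpark, not a measured figure.
for name, bits in [("FP16 / BF16", 16.0), ("Q4_K_M (assumed ~4.85 bpw)", 4.85)]:
    print(f"{name}: ~{weight_memory_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

MoE sparsity lowers the compute per token, not the weight memory: every expert still has to be resident, so the estimate uses the total parameter count rather than the active one.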

🌟 Key Features

True MoE Architecture — Sparse activation for better efficiency and performance
Think Mode Reasoning — Advanced Chain-of-Thought, Planning, Self-Reflection & Verification
Dynamic Plugin System — Intelligent routing to Code, Math, Search, Data Analysis plugins
Agentic Capabilities — Full LangGraph multi-agent workflow
Advanced RAG Integration — SKT RAG + Query Rewriting + Multi-hop + Reranking
Stateful Memory — Persistent conversation context
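
This card does not document how the plugin system chooses a plugin. As a toy illustration of the idea only, the sketch below routes a query to one of the four plugin categories named above by counting keyword hits. The keyword table, the `route` function, and the default choice are all hypothetical and are not part of TIGER-OM.

```python
from typing import Dict, Tuple

# Hypothetical keyword table; the real routing logic is not described in this card.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "code": ("python", "function", "bug", "compile"),
    "math": ("calculate", "sum", "integral", "solve"),
    "search": ("latest", "news", "find"),
    "data": ("csv", "dataset", "plot"),
}


def route(query: str, default: str = "search") -> str:
    """Return the plugin name with the most keyword hits, or `default` if none match."""
    text = query.lower()
    scores = {name: sum(kw in text for kw in kws) for name, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


print(route("Calculate the sum of 2 and 3"))
print(route("Fix the bug in this Python function"))
```

A learned router would replace the keyword counts with model-produced scores, but the dispatch step (pick the highest-scoring plugin, fall back to a default) would look similar.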

🏗️ Architecture Breakdown

TIGER-OM is built on a 13B MoE backbone:

Base: Shrijanagain/ST-X-0 (strong foundational model)
Experts: Expert layers derived from Mistral-7B, fine-tuned for specialized reasoning and tool use
Router Network: Learned gating mechanism for expert selection
Think Mode Layer: Custom system prompt + reasoning controller
Plugin Head: Tool calling & execution layer

This hybrid approach (ST-X-0 + Mistral-7B experts) is intended to combine strong reasoning, code understanding, and general capability with the efficiency of sparse MoE inference.
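
To make the router's role concrete, here is a minimal sketch of top-k softmax gating, a common way to implement a learned gating mechanism in sparse MoE models. The card does not state the number of experts, the value of k, or the exact routing scheme, so the four toy experts, `k=2`, and the function names below are assumptions for illustration only.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(router_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Select the k highest-scoring experts and renormalize their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x: float, experts: Sequence[Callable[[float], float]],
                router_logits: Sequence[float], k: int = 2) -> float:
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_gate(router_logits, k))


# Toy experts standing in for expert feed-forward layers (hypothetical).
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]  # in a real model these come from the router network
print(top_k_gate(logits, k=2))
print(moe_forward(3.0, experts, logits, k=2))
```

Only the selected experts are evaluated for each token, which is where the gap between total and active parameters comes from.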

📁 Files in this Repo (Safetensors)

model-00001-of-0000X.safetensors → Main model weights
config.json
tokenizer.json / tokenizer_config.json
generation_config.json
special_tokens_map.json
model.safetensors.index.json

All weights are stored in the safetensors format, which avoids the arbitrary code execution risk of pickle-based checkpoints.
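
For sharded checkpoints, `model.safetensors.index.json` records which shard holds each tensor. The sketch below summarizes such an index, assuming the usual Hugging Face layout with a `weight_map` object and an optional `metadata.total_size` field. The tensor names, shard filenames, and size in the example are made up for illustration and do not describe this repo's actual contents.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Any, Dict


def load_index(path: str) -> Dict[str, Any]:
    """Read a safetensors index file from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_index(index: Dict[str, Any]) -> Dict[str, Any]:
    """Count how many tensors each shard file holds."""
    weight_map = index["weight_map"]  # tensor name -> shard filename
    return {
        "total_size_bytes": index.get("metadata", {}).get("total_size"),
        "tensors_per_shard": dict(Counter(weight_map.values())),
    }


# Hypothetical index contents, for illustration only.
example_index = {
    "metadata": {"total_size": 123456},
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
        "model.layers.0.mlp.gate.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    },
}
print(summarize_index(example_index))
```

With a local copy of the repo, `summarize_index(load_index("model.safetensors.index.json"))` would report the real shard layout.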

🚀 How to Use (Safetensors)

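```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Shrijanagain/TIGER-OM"

# trust_remote_code=True runs Python code shipped in the model repo; review it first.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # load the weights in BF16
    device_map="auto",           # spread layers across available devices (requires accelerate)
    trust_remote_code=True,
)

prompt = """You are SKT-OM, an advanced agentic AI with Think Mode enabled.
User Query: Calculate training cost comparison and suggest best option..."""

# Tokenize and move the tensors to the device that holds the model's first layers.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampled generation; prompt plus new tokens must fit in the 8192-token context.
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

# outputs[0] contains the prompt followed by the generated continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```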
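
Because the context length is 8192 tokens, a long prompt leaves less room for generation. The helper below is a small sketch for keeping a request inside that window, assuming the limit covers prompt and generated tokens together. The name `clamp_new_tokens` is hypothetical and is not part of transformers.

```python
CONTEXT_LENGTH = 8192  # context length from the model card


def clamp_new_tokens(prompt_tokens: int, requested_new_tokens: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens can be generated without exceeding the context window."""
    if prompt_tokens >= context_length:
        raise ValueError(
            f"Prompt has {prompt_tokens} tokens; it must be shorter than {context_length}."
        )
    return min(requested_new_tokens, context_length - prompt_tokens)


print(clamp_new_tokens(prompt_tokens=100, requested_new_tokens=1024))
print(clamp_new_tokens(prompt_tokens=8000, requested_new_tokens=1024))
```

In the snippet above, the prompt length is `inputs["input_ids"].shape[1]`, so the call would become `max_new_tokens=clamp_new_tokens(inputs["input_ids"].shape[1], 1024)`.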

🔗 Important Links

Live Demo: SKT-OM Space
GGUF Quantized (Q4_K_M): Shrijanagain/TIGER-GGUF
GitHub (RAG + ADK Code): SHRIJANAGAIN/SKT-AMD-FILES

🛠️ Technologies & Stack

Base Models: Shrijanagain/ST-X-0 + Mistral-7B Experts
RAG: SKT RAG + AMD ADK Kit
Agents: LangGraph
Hardware: AMD MI300X + ROCm 7.0
Inference: vLLM (FP16) + transformers (Safetensors)
Training: AMD Developer Cloud

⚡ Performance

Balances quality and efficiency through sparse MoE activation
Strong performance on reasoning, tool use, code, and multi-step tasks
Lower per-token inference cost than dense models with 13B or more parameters, since only a subset of experts runs for each token
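
The cost argument can be sketched with the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token (this ignores attention over the context). The card does not publish the expert count, the per-expert size, or how many experts run per token, so every number in the example split below is hypothetical and chosen only so that the total comes to 13B.

```python
def moe_active_params(shared_params: float, params_per_expert: float,
                      experts_per_token: int) -> float:
    """Parameters actually used for one token: shared layers plus the selected experts."""
    return shared_params + params_per_expert * experts_per_token


def forward_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward cost: about 2 FLOPs per active parameter per token."""
    return 2.0 * active_params


# Hypothetical split, NOT the real TIGER-OM configuration:
# 1B shared parameters + 8 experts of 1.5B each = 13B total, with 2 experts per token.
SHARED, PER_EXPERT, NUM_EXPERTS, TOP_K = 1e9, 1.5e9, 8, 2

total = SHARED + PER_EXPERT * NUM_EXPERTS
active = moe_active_params(SHARED, PER_EXPERT, TOP_K)
ratio = forward_flops_per_token(active) / forward_flops_per_token(total)
print(f"total={total:.2e}, active={active:.2e}, compute vs dense={ratio:.2f}")
```

Under these assumed numbers the sparse model does about a third of the per-token compute of a dense 13B model; the real ratio depends on the actual configuration.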

📌 Use Cases

Complex technical Q&A
Agentic workflows & tool calling
Research assistance
Code generation & debugging
Mathematical & logical reasoning
Comparative analysis
Data analysis with plugins

🏆 Hackathon

AMD Developer Hackathon 2026
Trained entirely on AMD Developer Cloud
Built fully in public, with multiple technical updates.

📄 License

MIT License

Available on FriendliAI

Model Details

Model Provider: Shrijanagain
Model Tree: mistralai/Mistral-7B-Instruct-v0.3 (base) → this model (fine-tuned)
Input Modalities: Text
Output Modalities: Text
Supported Functionality: Dedicated Endpoints, Container
