shaunak1234

qwen3-32b-telecom-expert

README

License: apache-2.0

🔑 Key Highlights

Base Model: Qwen/Qwen3-32B (33B parameters)
Method: LoRA (Low-Rank Adaptation) — r=64, alpha=128, all-linear targets
Training Data: 2000 multi-turn telecom conversations across 7 domains
Hardware: AMD Instinct MI300X (192GB HBM3) on ROCm 6.2
Training Time: ~3 hours
Trainable Parameters: 536M (1.6% of total)

📡 Domains Covered

Domain	Description
5G RAN	gNB configuration, beamforming, MIMO, cell planning
5G Core	AMF/SMF/UPF operations, network slicing, NRF management
Transport	MPLS, segment routing, fronthaul/backhaul optimization
Security	IPsec, SUPI/SUCI encryption, network access control
Automation	Ansible/Terraform for network, closed-loop operations
VoLTE/IMS	SIP call flows, QoS, VoNR migration
Cloud Native	CNF deployment, Kubernetes for telco, service mesh

🚀 Quick Start

Loading with PEFT

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the base model in bfloat16, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-32B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, "shaunak1234/qwen3-32b-telecom-expert")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B", trust_remote_code=True)

# Example prompt
messages = [
    {"role": "system", "content": "You are a senior 5G RAN engineer with expertise in network optimization."},
    {"role": "user", "content": "Our gNB is showing high RACH failure rate in a dense urban cell. What's your troubleshooting approach?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Enable sampling explicitly so that temperature and top_p take effect
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Using with vLLM (Merged)

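```python
# First merge the adapter for faster inference
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-32B", torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, "shaunak1234/qwen3-32b-telecom-expert")
merged = model.merge_and_unload()
merged.save_pretrained("qwen3-32b-telecom-merged")

# Save the tokenizer next to the merged weights so vLLM can load it from the same directory
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")
tokenizer.save_pretrained("qwen3-32b-telecom-merged")

# Then serve with vLLM
# vllm serve qwen3-32b-telecom-merged --dtype bfloat16
```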
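
Once the server is up, `vllm serve` exposes an OpenAI-compatible HTTP API. The following is a minimal client sketch using only the standard library. It assumes the vLLM default address `http://localhost:8000` and that the served model name equals the path passed to `vllm serve`. The helper names `build_chat_request` and `ask` are illustrative and not part of this repository.

```python
import json
import urllib.request

# Assumption: vLLM is listening on its default host and port
VLLM_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(user_prompt, model="qwen3-32b-telecom-merged", max_tokens=1024):
    """Build the JSON payload for an OpenAI-style chat completion request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a senior 5G RAN engineer with expertise in network optimization."},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.7,
        "top_p": 0.9,
    }


def ask(user_prompt, url=VLLM_URL):
    """POST the request to the server and return the assistant's reply text."""
    body = json.dumps(build_chat_request(user_prompt)).encode("utf-8")
    request = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


# Requires a running server, so the call is left commented out:
# print(ask("Our gNB is showing high RACH failure rate in a dense urban cell. What's your troubleshooting approach?"))
```

The sampling values mirror the PEFT example above. Adjust them, and the URL, to match your deployment.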

📊 Training Details

Dataset

Source: shaunak1234/telecom-agentic-dataset
Size: 2000 multi-turn conversations (12.8 MB)
Generation: Synthesized using Qwen3-32B via vLLM with domain-specific prompts
Format: ChatML (system/user/assistant turns)
Complexity: Mix of troubleshooting, configuration, architecture, and operational scenarios
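
For readers unfamiliar with ChatML, each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. The sketch below illustrates that layout only. The helper `to_chatml` and the sample conversation are hypothetical and not taken from the dataset. In practice, `tokenizer.apply_chat_template` produces the exact template used by Qwen models, which may add further tokens.

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as ChatML text."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    return "".join(parts)


# Hypothetical sample conversation, for illustration only
sample = [
    {"role": "system", "content": "You are a 5G Core engineer."},
    {"role": "user", "content": "How does the AMF select an SMF?"},
    {"role": "assistant", "content": "It queries the NRF for SMF instances that match the requested slice and DNN."},
]

chatml_text = to_chatml(sample)
print(chatml_text)
```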

Hyperparameters

Parameter	Value
LoRA rank (r)	64
LoRA alpha	128
LoRA dropout	0.05
Target modules	all-linear
Batch size	2
Gradient accumulation	16
Effective batch size	32
Learning rate	2e-4
LR scheduler	Cosine

Training Infrastructure

Component	Details
GPU	AMD Instinct MI300X (192GB HBM3)
Platform	AMD DevCloud
Software	PyTorch 2.5.1 + ROCm 6.2
Framework	Transformers 4.52.4, PEFT 0.19.1
Training speed	~58 seconds/step
Total training time	~3 hours
Cost	~$6 (at $2/hr)
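
The schedule and cost figures can be sanity-checked with simple arithmetic. The sketch below uses only numbers reported in this card and assumes that one "step" is one optimizer step on a single GPU. The epoch count is not stated anywhere in the card, so `implied_epochs` is an inference from the reported ~3 hours, not a confirmed setting.

```python
import math

num_conversations = 2000
per_device_batch = 2
grad_accum_steps = 16
seconds_per_step = 58   # reported as approximate
total_hours = 3         # reported as approximate
hourly_rate_usd = 2

effective_batch = per_device_batch * grad_accum_steps              # 32, matches the table
steps_per_epoch = math.ceil(num_conversations / effective_batch)   # 63
total_steps = total_hours * 3600 / seconds_per_step                # about 186
implied_epochs = total_steps / steps_per_epoch                     # just under 3
cost_usd = total_hours * hourly_rate_usd                           # 6

print(effective_batch, steps_per_epoch, round(total_steps), round(implied_epochs, 1), cost_usd)
```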

Training Metrics

Initial loss: 2.34
Trainable parameters: 536,870,912 (1.6% of 33.3B total)
Gradient flow: Verified on 896 LoRA parameter tensors
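
Both the trainable-parameter count and the tensor count can be reproduced from the LoRA settings. The sketch below assumes the layer dimensions of the published Qwen3-32B configuration (64 layers, hidden size 5120, 64 query heads and 8 key/value heads of dimension 128, MLP width 25600). Those values are not listed in this card, so treat them as assumptions.

```python
RANK = 64
NUM_LAYERS = 64        # assumption: from the Qwen3-32B config
HIDDEN = 5120          # assumption
Q_DIM = 64 * 128       # assumption: 64 query heads x head_dim 128
KV_DIM = 8 * 128       # assumption: 8 key/value heads x head_dim 128
MLP_DIM = 25600        # assumption

# (in_features, out_features) of every linear layer in one decoder block
linear_shapes = {
    "q_proj": (HIDDEN, Q_DIM),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (Q_DIM, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}


def lora_params(in_features, out_features, rank=RANK):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in linear_shapes.values())
trainable = per_layer * NUM_LAYERS
lora_tensors = NUM_LAYERS * len(linear_shapes) * 2  # one A and one B matrix per adapted layer

print(f"{trainable:,} trainable parameters in {lora_tensors} tensors")
```

Under these assumptions the result matches the 536,870,912 parameters and 896 tensors reported above.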

⚠️ Limitations

Fine-tuned on synthetic data generated by the base model — may reflect base model biases
Focused on telecom domain; general capabilities may be slightly reduced
Not trained for real-time network operations or safety-critical decisions
English only

📄 License

Apache 2.0 (following Qwen3-32B base model license)

🙏 Acknowledgments

Qwen Team for the excellent Qwen3-32B base model
AMD for MI300X GPU access via DevCloud
Hugging Face for PEFT, Transformers, and model hosting

Model Details

Model Provider

shaunak1234

Model Tree

Base

Qwen/Qwen3-32B

Adapter

this model

Input Modalities

Text

Output Modalities