Dedicated Endpoints
Run inference for this model on single-tenant GPUs, with speed and reliability that hold up at scale.
Get help setting up a custom Dedicated Endpoint.
Talk with our engineers to get a quote for discounted reserved GPU instances.
README
License: other
Quick start (Python)
bash
pip install transformers peft torch accelerate
python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = "Qwen/Qwen3.6-27B"
adapter = "canxp-ai/maplept-large-canada-legal-cpt-4d6666a7"

# Load the tokenizer and the base model in bfloat16, spread across available devices.
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="bfloat16", device_map="auto", trust_remote_code=True)

# Attach the adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter)

# Tokenize a prompt, generate up to 200 new tokens, and print the decoded text.
prompt = "Hello!"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(out[0], skip_special_tokens=True))
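The Training details list a context length of 8192, while the quick start fixes max_new_tokens at 200. If you want the prompt plus the generated text to stay within that training window, a small budgeting helper is one option. This is a minimal sketch, not part of the adapter's API: the function name and its defaults are hypothetical, and staying inside the training context is a conservative choice rather than a documented requirement.

```python
def max_new_tokens_for(prompt_len, context_length=8192, cap=200):
    """Return how many tokens may be generated without exceeding context_length.

    prompt_len     -- number of tokens in the prompt (e.g. inputs["input_ids"].shape[1])
    context_length -- assumed budget; 8192 is the training context length from the README
    cap            -- upper bound on generated tokens (200 mirrors the quick start)
    """
    if prompt_len < 0:
        raise ValueError("prompt_len must be non-negative")
    remaining = context_length - prompt_len
    # Never return a negative budget; 0 means the prompt already fills the window.
    return max(0, min(cap, remaining))


short_budget = max_new_tokens_for(50)      # plenty of room, limited by cap
tight_budget = max_new_tokens_for(8100)    # only 92 tokens left in the window
full_budget = max_new_tokens_for(9000)     # prompt already exceeds the window
```

With the quick start's variables, the result could be passed as max_new_tokens to model.generate in place of the fixed 200.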
CLI download
bash
pip install -U "huggingface_hub[cli]"
huggingface-cli download canxp-ai/maplept-large-canada-legal-cpt-4d6666a7 --local-dir ./maplept-large-canada-legal-cpt
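The same download can be scripted from Python with snapshot_download from the huggingface_hub package. The sketch below assumes huggingface_hub is installed and that the repository is accessible with your credentials; the wrapper name download_adapter is hypothetical and exists only for illustration. The import is deferred so that defining the function does not require the package or network access.

```python
REPO_ID = "canxp-ai/maplept-large-canada-legal-cpt-4d6666a7"
LOCAL_DIR = "./maplept-large-canada-legal-cpt"


def download_adapter(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download every file in the adapter repo and return the local folder path."""
    # Imported lazily: only needed when a download is actually requested.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling download_adapter() fetches the files into ./maplept-large-canada-legal-cpt, mirroring the CLI command above.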
Training details
- Base model: Qwen/Qwen3.6-27B
- Method: CPT
- Epochs: 2
- Context length: 8192
- Validation split: 0.1
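The README does not describe the data pipeline, so the following is only an illustration of what a context length of 8192 and a validation split of 0.1 commonly mean in practice: a tokenized corpus is cut into chunks of at most 8192 tokens, and roughly 10% of the chunks are held out for validation. The function names, the shuffling seed, and the stand-in corpus are all assumptions, not the author's method.

```python
import random


def chunk_tokens(token_ids, context_length=8192):
    """Cut one long token sequence into consecutive chunks of at most context_length tokens."""
    return [
        token_ids[i:i + context_length]
        for i in range(0, len(token_ids), context_length)
    ]


def train_val_split(examples, val_fraction=0.1, seed=0):
    """Shuffle examples deterministically and hold out val_fraction of them."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Stand-in corpus: integers playing the role of token ids (exactly 20 full chunks).
corpus = list(range(8192 * 20))
chunks = chunk_tokens(corpus)
train_chunks, val_chunks = train_val_split(chunks)
```

With 2 epochs, as listed above, each training chunk would be seen twice under this reading.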
This adapter inherits the upstream license of the base model. See LICENSE_NOTICE.txt in this repo for details.
Model provider
vamman2001
Model tree
Base
Qwen/Qwen3.6-27B
Adapter
this model
Modalities
Input
Video, Text, Image
Output
Text
Pricing
Dedicated Endpoints
Supported Functionality
Model APIs
Dedicated Endpoints
Container