
License: apache-2.0

Training configuration

| Item | Value |
| --- | --- |
| Model architecture | GPT-2 (n_layer=12, n_head=12, n_embd=768) |
| Parameters | 124,242,432 |
| Tokenizer | huggingface-course/code-search-net-tokenizer (BPE, vocab=50,000) |
| Context length | 128 tokens |
| Training set | huggingface-course/codeparrot-ds-train (16,702,061 length-128 chunks) |
| Validation set | huggingface-course/codeparrot-ds-valid |
| Optimizer | AdamW (β₁=0.9, β₂=0.999, weight_decay=0.1) |
| Learning rate | 5×10⁻⁴, cosine schedule, warmup 1,000 steps |
| Effective batch size | 256 (per_device_bs=64 × grad_accum=2 × world_size=2) |
| Precision | fp16 |
| Parallelism | DistributedDataParallel (DDP), NCCL/RCCL backend |
| Total steps | 65,243 (1 epoch) |
| Eval / save interval | every 5,000 steps |
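
The parameter count, effective batch size, and step count in the table can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the training code. It assumes the position-embedding table keeps GPT-2's default size of 1024 (even though training chunks are 128 tokens long), that the output head shares weights with the token embedding, and that the last partial batch is rounded up to a full step; none of these is stated in the model card.

```python
import math

# Values from the configuration table.
n_layer, n_embd, vocab_size = 12, 768, 50_000
per_device_bs, grad_accum, world_size = 64, 2, 2
n_train_chunks = 16_702_061

# Assumption: position table keeps the GPT-2 default size.
n_positions = 1024


def linear(n_in, n_out):
    """Weights plus biases of one dense layer."""
    return n_in * n_out + n_out


layer_norm = 2 * n_embd  # scale and shift vectors
block = (
    layer_norm
    + linear(n_embd, 3 * n_embd)  # fused query/key/value projection
    + linear(n_embd, n_embd)      # attention output projection
    + layer_norm
    + linear(n_embd, 4 * n_embd)  # MLP up-projection
    + linear(4 * n_embd, n_embd)  # MLP down-projection
)

# Token embedding + position embedding + transformer blocks + final layer norm.
# Assumption: the LM head is tied to the token embedding, so it adds nothing.
n_params = vocab_size * n_embd + n_positions * n_embd + n_layer * block + layer_norm

effective_batch = per_device_bs * grad_accum * world_size
# Assumption: a trailing partial batch still counts as one optimizer step.
steps_per_epoch = math.ceil(n_train_chunks / effective_batch)

print(n_params, effective_batch, steps_per_epoch)
```

Under these assumptions the three printed values agree with the table (124,242,432 parameters, batch size 256, 65,243 steps).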

Hardware environment

  • GPU: 2 × AMD Radeon Instinct MI50 (32 GB HBM2 each, gfx906)
  • Platform: PyTorch + ROCm, run in containers
  • Training time: about 19 hours
  • Average throughput: 159.8 samples/sec, ~1.41 sec/step

Loss and training dynamics

One record of training metrics and eval metrics is taken every 5,000 steps.

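| step | epoch | learning_rate | train_loss | grad_norm | eval_loss |
| --- | --- | --- | --- | --- | --- |
| 5,000 | 0.077 | 4.952×10⁻⁴ | 2.677 | 0.180 | 1.752 |
| 10,000 | 0.153 | 4.762×10⁻⁴ | 1.685 | 0.152 | 1.520 |
| 15,000 | 0.230 | 4.437×10⁻⁴ | 1.529 | 0.153 | 1.415 |
| 20,000 | 0.307 | 3.996×10⁻⁴ | 1.447 | 0.145 | 1.347 |
| 25,000 | 0.383 | 3.467×10⁻⁴ | 1.386 | 0.154 | 1.295 |
| 30,000 | 0.460 | 2.880×10⁻⁴ | 1.334 | 0.160 | 1.247 |
| 35,000 | 0.537 | 2.271×10⁻⁴ | 1.288 | 0.160 | 1.204 |
| 40,000 | 0.613 | 1.675×10⁻⁴ | 1.241 | 0.170 | 1.160 |
| 45,000 | 0.690 | 1.128×10⁻⁴ | 1.200 | 0.175 | 1.123 |
| 50,000 | 0.766 | 6.631×10⁻⁵ | 1.162 | 0.174 | 1.090 |
| 55,000 | 0.843 | 3.072×10⁻⁵ | 1.135 | 0.180 | 1.066 |
| 60,000 | 0.920 | 8.175×10⁻⁶ | 1.113 | 0.191 | 1.054 |
| 65,000 | 0.996 | 1.78×10⁻⁸ | 1.106 | 0.180 | 1.051 |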

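No additional epochs or hyperparameter search were run. In the second half of training, cosine decay drives the learning rate toward zero; the gradient norm stays in roughly the 0.15-0.19 range, with no sign of divergence or instability.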
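
The learning_rate column can be approximated with the common "linear warmup, then half-cosine decay" form. The function below is a minimal sketch under that assumption, not the scheduler code used for this run. `lr_at` is a hypothetical helper name, and the linear ramp from zero during warmup is assumed rather than stated in the model card.

```python
import math


def lr_at(step, peak_lr=5e-4, warmup_steps=1_000, total_steps=65_243):
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumes a linear ramp from 0 to peak_lr over the warmup steps,
    followed by a half-cosine decay to 0 at total_steps.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few of the logged steps for comparison with the table.
for step in (5_000, 30_000, 60_000):
    print(f"{step:>6}  {lr_at(step):.3e}")
```

With the peak rate, warmup length, and total step count from the configuration table, this form lands within rounding distance of the logged values in the first half of training; small differences late in the run may come from exactly which step the logger samples.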

Usage

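```python
from transformers import pipeline

# Load the checkpoint into a text-generation pipeline.
# device=0 selects the first GPU.
pipe = pipeline(
    "text-generation",
    model="Marcoson320/codeparrot-gpt2-mi50",
    device=0,
)

# Complete a short Python comment prompt with up to 64 new tokens.
print(pipe("# scatter plot of x, y\n", max_new_tokens=64)[0]["generated_text"])
```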

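Limitations

  • Context is capped at 128 tokens, so longer code passages cannot be handled.
  • The training data is weighted toward GitHub Python code that uses pandas / sklearn / matplotlib / seaborn; coverage of code from other domains is limited.
  • The model is small, so continuations tend to repeat; at inference time, setting repetition_penalty>1.0 or no_repeat_ngram_size can mitigate this.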
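
The repetition mitigation in the last item could look like the sketch below. The specific values (`repetition_penalty=1.2`, `no_repeat_ngram_size=3`) and the `generate` helper are illustrative assumptions, not settings recommended by the model author; both keyword arguments are standard generation options that the text-generation pipeline forwards to the model.

```python
# Illustrative decoding settings; the values are assumptions, tune as needed.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "repetition_penalty": 1.2,   # >1.0 down-weights tokens already generated
    "no_repeat_ngram_size": 3,   # forbid repeating any 3-token sequence
}


def generate(prompt, model_id="Marcoson320/codeparrot-gpt2-mi50", device=-1):
    """Hypothetical helper: complete a prompt with the settings above.

    device=-1 runs on CPU; pass 0 to use the first GPU.
    """
    # Imported here so the settings can be inspected without loading the library.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_id, device=device)
    return pipe(prompt, **GENERATION_KWARGS)[0]["generated_text"]


# Example call (downloads the checkpoint on first use):
# print(generate("# scatter plot of x, y\n"))
```

Because the context is limited to 128 tokens, the prompt and the new tokens together should stay within that budget.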

Model provider

Marcoson320


Modalities

Input

Text

Output

Text
