aungkomyint

tara10m-sft-v1-2k

README

License: apache-2.0

Model Details

  • Architecture: Llama-style decoder-only causal LM
  • Parameters: 10,390,784
  • Layers: 6
  • Hidden size: 256
  • Attention heads: 4
  • Vocabulary: 16,000 SentencePiece tokens
  • Context length: 1,024
  • Base checkpoint: Tara10M Colab base checkpoint
  • SFT data: 2,000 cleaned synthetic Burmese-English instruction examples
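The parameter total above can be sanity-checked with a little arithmetic. The sketch below counts parameters for a bias-free Llama-style decoder with full multi-head attention. Two values are assumptions that the card does not state: an MLP intermediate size of 1,024 and tied input/output embeddings. With those two assumptions the count matches the listed 10,390,784 exactly, which suggests (but does not prove) that the checkpoint is configured this way.

```python
def llama_param_count(vocab, hidden, layers, intermediate, tied_embeddings=True):
    """Count parameters of a bias-free Llama-style decoder with full multi-head attention."""
    embed = vocab * hidden            # token embedding matrix
    attn = 4 * hidden * hidden        # q, k, v, o projections
    mlp = 3 * hidden * intermediate   # gate, up, down projections
    norms = 2 * hidden                # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    final_norm = hidden               # RMSNorm before the output head
    lm_head = 0 if tied_embeddings else vocab * hidden
    return embed + layers * per_layer + final_norm + lm_head


# intermediate=1_024 and tied_embeddings=True are assumptions, not values from the card.
total = llama_param_count(vocab=16_000, hidden=256, layers=6, intermediate=1_024)
print(total)  # 10390784
```

Note that the number of attention heads does not change the count when every head has its own key and value projections; the embedding matrix alone accounts for 4,096,000 of the parameters.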

Intended Use

Suggested test areas:

  • short English-to-Burmese translation
  • short Burmese-to-English translation
  • simple Burmese rewriting
  • travel-phrasebook-style prompts

Prompt format:

Instruction: Translate this sentence to Burmese.
Input: I will go home tomorrow.
Response:
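A small helper can keep prompts consistent with the format above. This is a minimal sketch: the function name `build_prompt` is hypothetical, and the exact whitespace (single newlines, no trailing space after `Response:`) is an assumption based on how the example is displayed.

```python
def build_prompt(instruction: str, input_text: str) -> str:
    """Format an instruction and its input in the three-line layout shown in the card."""
    return "\n".join(
        [
            f"Instruction: {instruction.strip()}",
            f"Input: {input_text.strip()}",
            "Response:",  # the model is expected to continue after this label
        ]
    )


prompt = build_prompt("Translate this sentence to Burmese.", "I will go home tomorrow.")
print(prompt)
```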

Limitations

This model is very small (about 10.4M parameters). It still repeats phrases, drifts off topic, and produces wrong translations, so it should be used for experimentation only.

Known issues:

  • weak factual reliability
  • repeated phrases
  • mixed Burmese/English output
  • poor long-form generation
  • not suitable for safety-critical use
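When experimenting, it can help to flag outputs that show the repeated-phrase issue automatically. The sketch below is one possible heuristic, not part of the model or its training code: it counts character n-grams, because Burmese text is not reliably separated by spaces. The function names, the n-gram length, and the threshold are all illustrative assumptions.

```python
from collections import Counter


def max_ngram_repeats(text: str, n: int = 6) -> int:
    """Return how many times the most frequent character n-gram occurs in text."""
    if len(text) < n:
        return 0
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return max(counts.values())


def looks_repetitive(text: str, n: int = 6, threshold: int = 3) -> bool:
    """Flag text in which some character n-gram occurs at least `threshold` times."""
    return max_ngram_repeats(text, n) >= threshold


print(looks_repetitive("I will go home. I will go home. I will go home."))  # True
print(looks_repetitive("I will go home tomorrow."))  # False
```

The defaults are arbitrary starting points; longer outputs will naturally repeat short n-grams, so the threshold should be tuned on real samples.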

Training Summary

SFT examples: 2,000
Train examples: 1,800
Validation examples: 200
Max steps: 400
Learning rate: 2e-5
Best validation loss: 2.4228
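The figures above imply a 90/10 train/validation split. If the reported loss is the mean per-token cross-entropy in nats (an assumption; the card does not say how it was computed), it can also be converted to a perplexity:

```python
import math

sft_examples = 2_000
train_examples = 1_800
val_examples = 200
best_val_loss = 2.4228  # assumed to be mean token-level cross-entropy in nats

# The two splits should add up to the full SFT set.
assert train_examples + val_examples == sft_examples

val_fraction = val_examples / sft_examples
perplexity = math.exp(best_val_loss)

print(f"validation fraction: {val_fraction:.0%}")  # 10%
print(f"validation perplexity: {perplexity:.2f}")  # about 11.28
```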

Files

  • model.safetensors
  • config.json
  • tokenizer.model
  • tokenizer_config.json
  • special_tokens_map.json
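The files above are the usual layout for loading a checkpoint with Hugging Face `transformers`. The following is a hedged sketch rather than an official loading recipe: it assumes the files have been downloaded to a local directory, that `config.json` declares an architecture `transformers` recognizes (the card only says "Llama-style"), and that `transformers`, `torch`, and `sentencepiece` are installed. The helper names `missing_files` and `generate` are hypothetical.

```python
from pathlib import Path

EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
]


def missing_files(model_dir):
    """Return the expected files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def generate(model_dir, prompt, max_new_tokens=64):
    """Load the checkpoint from a local directory and complete a prompt."""
    # Imported lazily so the file check above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Keeping `max_new_tokens` small suits the short translation and phrasebook prompts the model is intended for, and the prompt plus output must stay within the 1,024-token context length.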

Model provider

aungkomyint


Modalities

  • Input: text
  • Output: text
