aungkomyint
tara10m-sft-v1-2k
License: apache-2.0
Model Details
- Architecture: Llama-style decoder-only causal LM
- Parameters: 10,390,784
- Layers: 6
- Hidden size: 256
- Attention heads: 4
- Vocabulary: 16,000 SentencePiece tokens
- Context length: 1,024 tokens
- Base checkpoint: Tara10M Colab base checkpoint
- SFT data: 2,000 cleaned synthetic Burmese-English instruction examples
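The listed parameter count can be reproduced from the dimensions above under a few assumptions the card does not state: an MLP intermediate size of 1,024, tied input and output embeddings, full multi-head attention, no bias terms, and norm layers that carry only a scale vector. A minimal sketch under those assumptions:

```python
def llama_param_count(vocab, hidden, layers, intermediate, tied_embeddings=True):
    """Count weights in a Llama-style decoder (no biases, norm scale only)."""
    embed = vocab * hidden                # token embedding matrix
    attn = 4 * hidden * hidden            # q, k, v, o projections
    mlp = 3 * hidden * intermediate       # gate, up, down projections
    norms = 2 * hidden                    # two norm scale vectors per layer
    per_layer = attn + mlp + norms
    total = embed + layers * per_layer + hidden  # plus the final norm
    if not tied_embeddings:
        total += vocab * hidden           # separate output projection
    return total


# intermediate=1024 and tied embeddings are assumptions, not taken from the card.
total = llama_param_count(vocab=16_000, hidden=256, layers=6, intermediate=1024)
print(f"{total:,}")
```

With these assumed values the total comes to 10,390,784, which matches the figure listed above. The config.json file remains the authoritative source for the actual settings.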
Intended Use
Best test areas:
- short English to Burmese translation
- short Burmese to English translation
- simple Burmese rewrite
- travel phrasebook style prompts
Prompt format:
Instruction: Translate this sentence to Burmese.
Input: I will go home tomorrow.
Response:
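A small helper can keep prompts consistent with the format above. This is a sketch, not the author's code: the function name `build_prompt` is hypothetical, the fields are assumed to be separated by single newlines, and dropping the Input line when it is empty is also an assumption.

```python
def build_prompt(instruction, input_text=""):
    """Assemble an Instruction/Input/Response prompt that ends at 'Response:'."""
    lines = [f"Instruction: {instruction.strip()}"]
    if input_text.strip():
        # Assumption: the Input line is omitted when there is no input text.
        lines.append(f"Input: {input_text.strip()}")
    lines.append("Response:")
    # Assumption: fields are joined with single newlines.
    return "\n".join(lines)


prompt = build_prompt(
    "Translate this sentence to Burmese.",
    "I will go home tomorrow.",
)
print(prompt)
```

The model is expected to continue the text after `Response:`, so the prompt deliberately ends there with no trailing content.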
Limitations
This model is very small and still repeats itself, drifts off topic, and produces incorrect translations. Use it for experimentation only.
Known issues:
- weak factual reliability
- repeated phrases
- mixed Burmese/English output
- poor long-form generation
- not suitable for safety-critical use
Training Summary
SFT examples: 2,000
Train examples: 1,800
Validation examples: 200
Max steps: 400
Learning rate: 2e-5
Best validation loss: 2.4228
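The numbers in the summary can be sanity-checked with a little arithmetic. The sketch below confirms that the train and validation counts add up to the SFT total, and converts the best validation loss to a perplexity. The conversion assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state.

```python
import math

sft_examples = 2_000
train_examples = 1_800
val_examples = 200

# The two splits should account for every SFT example.
assert train_examples + val_examples == sft_examples
val_fraction = val_examples / sft_examples

# Assumption: the loss is mean per-token cross-entropy measured in nats.
best_val_loss = 2.4228
perplexity = math.exp(best_val_loss)

print(f"validation fraction: {val_fraction:.0%}")
print(f"validation perplexity: {perplexity:.2f}")
```

Under that assumption the split is 90/10 and the validation perplexity is roughly 11.3.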
Files
- model.safetensors
- config.json
- tokenizer.model
- tokenizer_config.json
- special_tokens_map.json
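After downloading the model, it can help to confirm that all listed files are present before loading. The following is a minimal sketch using only the standard library. The helper name `missing_files` and the example directory path are hypothetical.

```python
from pathlib import Path

# File names as listed in the Files section.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "tokenizer.model",
    "tokenizer_config.json",
    "special_tokens_map.json",
]


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

For example, `missing_files("./tara10m-sft-v1-2k")` returns an empty list when the (hypothetical) local directory contains every file.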
Model provider
aungkomyint
Modalities
- Input: Text
- Output: Text