Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

Model Details

Model Description

This is a fine-tuned version of unsloth/Llama-3.1-8B using QLoRA (Quantized Low-Rank Adaptation) via Unsloth. The model is trained on Ichsan2895/alpaca-gpt4-indonesian dataset containing 49,969 Indonesian instruction-response pairs.

The model uses Llama 3.1 chat template with the system prompt: "Kamu adalah asisten AI yang membantu menjawab pertanyaan pengguna berdasarkan konteks yang diberikan."

  • Developed by: Threedotz
  • Model type: Language Model (Causal LM)
  • Language(s) (NLP): Indonesian (id)
  • License: llama3.1 license
  • Finetuned from model: unsloth/Llama-3.1-8B

Model Sources

Uses

Direct Use

Indonesian language tasks:

  • Question answering in Bahasa Indonesia
  • Instruction following
  • Text generation in Indonesian

Out-of-Scope Use

  • Non-Indonesian language tasks
  • Medical, legal, or financial advice without human oversight

How to Get Started with the Model

python

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch
# Load model
model_name = "threedotz/llama3.1-8b-qlora-alpaca-indonesian"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
# Prompt
prompt = "Apa itu machine learning?"
# Tokenize & generate
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
text_streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"],
max_new_tokens=256, streamer=text_streamer)

Training Details

Training Data

  • Dataset: Ichsan2895/alpaca-gpt4-indonesian
  • Size: 49,969 instruction-response pairs
  • Language: Indonesian (Bahasa Indonesia)
  • Format: Alpaca instruction format (input → output)
  • License: CC-BY-SA-4.0

Training Procedure

Preprocessing

Dataset formatted using Llama 3.1 chat template with system prompt.

Training Hyperparameters

HyperparameterValue
Training regime4-bit QLoRA (bf16/fp16)
Max steps800
Per device batch size1
Gradient accumulation steps4
Total batch size4
Learning rate2e-4
LR schedulerlinear
Warmup steps5
Max sequence length512
Optimizerpaged_adamw_8bit
Gradient checkpointingunsloth

QLoRA Configuration

ParameterValue
LoRA rank (r)16
LoRA alpha16
LoRA dropout0
Target modulesq_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Load in 4-bitTrue

Model Statistics

MetricValue
Total parameters8,072,204,288
Trainable parameters41,943,040
Trainable %0.52%

Technical Specifications

Model Architecture

  • Architecture: Llama 3.1 (Decoder-only Transformer)
  • Parameters: 8 billion
  • Quantization: 4-bit (QLoRA)
  • Max sequence length: 2048

Compute Infrastructure

  • GPU: Tesla T4 (Kaggle)
  • Training time: ~1h 17min 11s
  • Framework: Unsloth + TRL + Transformers

Citation

BibTeX:

bibtex

@misc{threedotz2024llama31indonesian,
author = {Threedotz},
title = {Llama 3.1 8B QLoRA Fine-tuned on Alpaca Indonesian},
year = {2024},
publisher = {HuggingFace},
url = {https://huggingface.co/threedotz/llama3.1-8b-qlora-alpaca-indonesian}
}

Model Card Contact

For questions, please contact Threedotz on HuggingFace.

Model provider

Threedotz

Model tree

Base

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today