SupraLabs

Supra-Mini-v6-1M

Model Config

Parameters: 1,410,688 (1M)
Architecture: Llama
Vocab size with custom BPE tokenizer: 4096
Hidden Size: 128
Intermediate Size: 256
Hidden Layers: 6
Attention Heads: 4
Key Value Heads: 2
Max Position Embeddings: 1024
Learning rate: 6e-4
Weight Decay: 0.1
Trained in bfloat16
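
The parameter total can be cross-checked from the listed dimensions. The following is a minimal sketch, assuming a standard Llama layout with tied input/output embeddings, no bias terms, and a head dimension of hidden size divided by attention heads; none of these three assumptions is stated in the config above, but together they reproduce the reported count exactly.

```python
# Parameter count for a Llama-style decoder, from the Model Config values.
# Assumptions (not stated in the card): tied embeddings, no biases,
# head_dim = hidden_size // num_heads.
vocab_size = 4096
hidden_size = 128
intermediate_size = 256
num_layers = 6
num_heads = 4
num_kv_heads = 2

head_dim = hidden_size // num_heads   # 32
kv_dim = num_kv_heads * head_dim      # 64 (grouped-query attention)

embedding = vocab_size * hidden_size  # shared with the output head if tied
attention = (
    hidden_size * hidden_size         # q_proj
    + 2 * hidden_size * kv_dim        # k_proj and v_proj
    + hidden_size * hidden_size       # o_proj
)
mlp = 3 * hidden_size * intermediate_size  # gate, up, and down projections
norms = 2 * hidden_size                    # two RMSNorm weight vectors per layer
per_layer = attention + mlp + norms

# Final term is the last RMSNorm before the output head.
total = embedding + num_layers * per_layer + hidden_size
print(total)  # 1410688
```

Under these assumptions the embedding table accounts for 524,288 of the parameters, which is more than a third of the model.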

Final Loss

The model reached a final cross-entropy loss of 3.79 on the training set.
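
To put the loss figure in context, it can be converted to a per-token perplexity and compared with the loss of a uniform guess over the vocabulary. This is a small sketch, assuming the reported value is a mean per-token cross-entropy in nats (the usual PyTorch convention); the card does not state the unit.

```python
import math

train_loss = 3.79   # reported final training loss (assumed to be in nats per token)
vocab_size = 4096   # from the Model Config

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(train_loss)

# A model that guesses uniformly over the vocabulary has loss ln(vocab_size).
uniform_loss = math.log(vocab_size)

print(f"perplexity: {perplexity:.1f}")
print(f"uniform-guess loss: {uniform_loss:.2f}")
```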

Benchmarks

All benchmarks were executed using lm_eval.

Task	Value	Random level
Arc_Easy ↑	0.3026	0.25 (25%)
Wikitext (byte PPL) ↓	3.0043	-
BLiMP ↑	0.6186	0.5 (50%)

For further benchmarks, see benchmarks.md in this repo's files list.
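
For readers who want to rerun the evaluation, the sketch below shows one possible way to call lm_eval from Python. The task names and default settings are assumptions: the card does not say which few-shot counts, batch sizes, or task variants were used, so the numbers produced may differ from the table above. The helper name run_benchmarks is hypothetical.

```python
# Assumed lm_eval task names for the three benchmarks in the table.
TASKS = ["arc_easy", "wikitext", "blimp"]

def run_benchmarks(model_id="SupraLabs/Supra-Mini-v6-1M", tasks=TASKS):
    """Evaluate a Hugging Face model with lm_eval and return per-task results."""
    # Imported lazily so this module can be loaded without lm_eval installed.
    import lm_eval

    output = lm_eval.simple_evaluate(
        model="hf",                            # Hugging Face transformers backend
        model_args=f"pretrained={model_id}",
        tasks=list(tasks),
    )
    return output["results"]                   # dict keyed by task name

if __name__ == "__main__":
    for task, metrics in run_benchmarks().items():
        print(task, metrics)
```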

Usage

To use the model, run the following code:

```python
import torch
from transformers import pipeline

print("Loading Supra Mini v6 1M model from Hugging Face...")
pipe = pipeline(
    "text-generation",
    model="SupraLabs/Supra-Mini-v6-1M",
    device_map="auto",  # requires the accelerate package
    # Half precision on GPU, full precision on CPU.
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)

def generate_text(prompt, max_new_tokens=150):
    """Sample a continuation; the returned string includes the prompt."""
    result = pipe(
        prompt,
        max_new_tokens=max_new_tokens,  # cap on newly generated tokens
        do_sample=True,
        temperature=0.5,
        top_k=25,
        top_p=0.9,
        repetition_penalty=1.2,
        pad_token_id=pipe.tokenizer.pad_token_id,
        eos_token_id=pipe.tokenizer.eos_token_id,
    )
    return result[0]["generated_text"]

test_prompt = "The importance of education is"
print(f"\nPrompt: {test_prompt}")
print("-" * 30)
print("\nOutput:\n" + generate_text(test_prompt))
```

Use cases

Educational research
Deployment, testing, or fine-tuning in edge environments
Or, more simply, for fun

Limitations

Cannot reason, chat, or code
Incoherent more often than not
Output is mostly not factual

Training guide

We trained Supra Mini v6 1M on a single NVIDIA RTX 5060 Ti 16GB in ~3 hours for 1 epoch. The full training code is in this repo: train_tokenizer.py trains the custom BPE tokenizer (vocab size of 4096), and train_model.py trains the model. The training data was the first 5 billion tokens of a mix of 70% Sample-10BT from FineWeb-Edu and 30% Cosmopedia-v2.
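
The figures in this paragraph imply a token budget and throughput that can be worked out directly. The sketch below assumes the 70/30 split applies to the 5 billion training tokens and treats "~3 hours" as exactly 3 hours, so the throughput is only a rough estimate.

```python
total_tokens = 5_000_000_000   # "first 5 billion tokens"
params = 1_410_688             # from the Model Config
train_hours = 3                # "~3 hours", taken as exact for this estimate

# Assumed interpretation: the 70/30 mix is measured in tokens.
fineweb_tokens = total_tokens * 70 // 100
cosmopedia_tokens = total_tokens - fineweb_tokens

tokens_per_param = total_tokens / params
tokens_per_second = total_tokens / (train_hours * 3600)

print(f"FineWeb-Edu tokens:   {fineweb_tokens:,}")
print(f"Cosmopedia-v2 tokens: {cosmopedia_tokens:,}")
print(f"tokens per parameter: {tokens_per_param:,.0f}")
print(f"tokens per second:    {tokens_per_second:,.0f}")
```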

Model provider: SupraLabs
Modalities: text input, text output


