License: apache-2.0
📊 Model Details
- Model Name: Nutral v1 Tiny
- Developer: Nebulixlabs
- Model Type: Causal Language Model
- Architecture: Llama (Custom Micro Configuration)
  - hidden_size: 128
  - intermediate_size: 348
  - num_hidden_layers: 4
  - num_attention_heads: 4
  - num_key_value_heads: 4
  - vocab_size: 2048
- Parameters: ~1.32 Million
- Context Length: 256 Tokens
- Formats Provided: Hugging Face PyTorch (.safetensors/.bin) & GGUF
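The parameter count can be sanity-checked from the configuration values listed above. The sketch below assumes a standard Llama layout (bias-free attention and MLP projections, two RMSNorm weights per layer, one final norm) and an output head that is not tied to the input embeddings; neither assumption is stated in this card. The helper name `llama_param_count` is hypothetical.

```python
def llama_param_count(hidden_size, intermediate_size, num_layers,
                      num_heads, num_kv_heads, vocab_size,
                      tie_embeddings=False):
    """Estimate the parameter count of a Llama-style decoder (no biases assumed)."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim

    embed = vocab_size * hidden_size                  # token embedding table
    attn = 2 * hidden_size * hidden_size              # q_proj and o_proj
    attn += 2 * hidden_size * kv_dim                  # k_proj and v_proj
    mlp = 3 * hidden_size * intermediate_size         # gate, up and down projections
    norms = 2 * hidden_size                           # two RMSNorm weights per layer
    per_layer = attn + mlp + norms

    final_norm = hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size  # assumption: untied
    return embed + num_layers * per_layer + final_norm + lm_head


total = llama_param_count(
    hidden_size=128, intermediate_size=348, num_layers=4,
    num_heads=4, num_kv_heads=4, vocab_size=2048,
)
print(f"{total:,} parameters")  # 1,322,112 parameters
```

Under these assumptions the estimate lands on the stated ~1.32 million. With tied embeddings the same formula gives 1,059,968, so the published figure is consistent with an untied output head.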
🎯 Intended Uses & Capabilities
Because Nutral-v1-Tiny has only about 1.3M parameters and a restricted 2048-token vocabulary, its capabilities are limited to the basics.
Primary Use Cases:
- Edge Device Testing: A dummy/baseline LLM to test deployment pipelines (e.g., llama.cpp) on hardware with extremely low RAM.
- Basic Text Generation: Next-word prediction for simple English sentences.
- Syntax Recognition: Demonstrating basic grammatical structures learned from educational data.
- Educational Purposes: A fast-training baseline to study Llama architecture behavior at a tiny scale.
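For the edge-device use case, a rough memory budget can be estimated from the figures in Model Details. This is a back-of-the-envelope sketch, not a measurement: it counts raw weights and a full-context key/value cache only, and ignores activations, runtime overhead, and file metadata. The helper names are hypothetical, and the head dimension of 32 is derived as hidden_size / num_attention_heads.

```python
N_PARAMS = 1_320_000        # "~1.32 Million" from Model Details (approximate)
NUM_LAYERS = 4
NUM_KV_HEADS = 4
HEAD_DIM = 128 // 4         # hidden_size / num_attention_heads
CONTEXT_LENGTH = 256


def weight_bytes(n_params, bytes_per_param):
    """Raw storage for the weights at a given numeric precision."""
    return n_params * bytes_per_param


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len, bytes_per_value):
    """Key and value tensors for every layer at full context length."""
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_value


fp32_weights = weight_bytes(N_PARAMS, 4)
fp16_weights = weight_bytes(N_PARAMS, 2)
fp16_kv = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, CONTEXT_LENGTH, 2)

print(f"FP32 weights: {fp32_weights / 1e6:.2f} MB")
print(f"FP16 weights: {fp16_weights / 1e6:.2f} MB")
print(f"FP16 KV cache at 256 tokens: {fp16_kv / 1e6:.2f} MB")
```

Even at FP32 the weights come to only a few megabytes, which is why the model suits pipeline smoke tests on very constrained hardware.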
Out-of-Scope Uses:
- Conversational AI or Chatbots.
- Logical reasoning, math, or coding tasks.
- Factual QA (the model is highly prone to hallucinations due to its size).
🏋️ Training Details
The model was trained from scratch using a fast-extraction pipeline and optimized hardware.
- Dataset: HuggingFaceFW/fineweb-edu (using the sample-10BT subset)
- Tokens Trained: 30 million
- Hardware: 2x NVIDIA T4 GPUs
- Optimizer: AdamW (optim="adamw_torch")
- Precision: FP16
- Hyperparameters:
  - Learning Rate: 6e-4
  - Weight Decay: 0.01
  - Batch Size: 16 (with gradient accumulation steps: 2)
  - Max Steps: 3700
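The stated token budget can be cross-checked against the hyperparameters. The sketch below assumes that every training sequence is packed to the full 256-token context and that the batch size of 16 is the total across both GPUs; the card confirms neither, so treat the result as a consistency check rather than a record of the actual run.

```python
BATCH_SIZE = 16            # assumption: total across both GPUs, not per device
GRAD_ACCUM_STEPS = 2
SEQ_LEN = 256              # assumption: sequences packed to the full context length
MAX_STEPS = 3700

# Sequences consumed per optimizer step, then tokens per step and in total.
sequences_per_step = BATCH_SIZE * GRAD_ACCUM_STEPS
tokens_per_step = sequences_per_step * SEQ_LEN
total_tokens = tokens_per_step * MAX_STEPS

print(f"{tokens_per_step:,} tokens per optimizer step")  # 8,192
print(f"{total_tokens:,} tokens over {MAX_STEPS} steps")  # 30,310,400
```

The result, about 30.3 million tokens, agrees with the stated 30 million. If the batch size of 16 were per device on two GPUs, the same arithmetic would give roughly twice that.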
🚀 How to Get Started
You can load the model using the standard transformers library or run the optimized .gguf file using llama.cpp.
1. Using Hugging Face Transformers
```python
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "Nebulixlabs/Nutral-v1-Tiny"

# Load Tokenizer and Model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

# Generate Text
prompt = "The solar system consists of"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
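Because the context length is only 256 tokens, the prompt and the generated text together should stay within that window. A minimal sketch of a guard for this follows; the helper `max_new_tokens_for` is hypothetical and not part of transformers. In the example above, the prompt length could be read from `inputs["input_ids"].shape[1]`.

```python
CONTEXT_LENGTH = 256  # from Model Details


def max_new_tokens_for(prompt_token_count, requested, context_length=CONTEXT_LENGTH):
    """Clamp a requested generation length so prompt + output fits the context."""
    if prompt_token_count >= context_length:
        raise ValueError(
            f"Prompt uses {prompt_token_count} tokens, which leaves no room "
            f"within the {context_length}-token context."
        )
    return min(requested, context_length - prompt_token_count)


# A short prompt leaves the requested budget untouched.
print(max_new_tokens_for(prompt_token_count=6, requested=30))    # 30
# A long prompt gets clamped to the remaining space.
print(max_new_tokens_for(prompt_token_count=250, requested=30))  # 6
```

The returned value can then be passed as `max_new_tokens` to `model.generate`.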