README

License: mit

📊 Training Performance & Metrics

Training completed its full run and the loss converged, with the gradient norm remaining stable throughout:

  • Total Training Steps: 20,000
  • Final Total Train Loss: 3.478
  • Final Step Loss: 2.988
  • Gradient Norm Stability: Stable at ~1.12
  • Training Status: Complete / Fully Converged
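One way to read the loss figures above is as perplexity. The sketch below assumes the reported losses are mean per-token cross-entropy values in nats, which the README does not state; if they use a different unit or reduction, the conversion does not apply. The helper name `perplexity` is illustrative only.

```python
import math


def perplexity(mean_ce_loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_ce_loss_nats)


# Loss values copied from the metrics list above.
final_total_train_loss = 3.478
final_step_loss = 2.988

# Assumption: both numbers are mean cross-entropy in nats per token.
print(f"Perplexity from total train loss: {perplexity(final_total_train_loss):.1f}")
print(f"Perplexity from final step loss:  {perplexity(final_step_loss):.1f}")
```

A lower perplexity means the model spreads its probability over fewer candidate tokens on average, so the final step loss corresponds to a noticeably tighter prediction than the total train loss.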

🚀 Quick Start & Usage

Load and run the model locally with the Hugging Face Transformers library (device_map="auto" additionally requires the accelerate package):

python

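import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "agentbyumer/mini-gemma"

# Load the tokenizer and the model weights in half precision;
# device_map="auto" places the model on the available device(s).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the model and tokenizer in a text-generation pipeline.
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "Your specialized prompt here"
outputs = generator(
    prompt,
    max_new_tokens=150,      # upper bound on the number of generated tokens
    do_sample=True,          # sample from the distribution instead of greedy decoding
    temperature=0.7,         # values below 1.0 make sampling less random
    return_full_text=False,  # return only the completion, not the prompt
)
print(outputs[0]["generated_text"])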
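The do_sample=True and temperature=0.7 settings used above control how the next token is drawn. The following is a minimal pure-Python sketch of temperature-scaled sampling, not the Transformers implementation: the function names and the toy logits are made up for illustration, and real generation operates on tensors over the full vocabulary.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index at random according to the scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical logits for a three-token vocabulary.
toy_logits = [2.0, 1.0, 0.1]

probs_default = softmax_with_temperature(toy_logits, temperature=1.0)
probs_cooler = softmax_with_temperature(toy_logits, temperature=0.7)

print("T=1.0:", [round(p, 3) for p in probs_default])
print("T=0.7:", [round(p, 3) for p in probs_cooler])
print("sampled index:", sample_token(toy_logits, temperature=0.7))
```

With a temperature below 1.0 the most likely token receives a larger share of the probability mass, so outputs become more focused; raising the temperature above 1.0 flattens the distribution and makes outputs more varied.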

📜 License

This project is licensed under the permissive MIT License. See the accompanying LICENSE file for full details.

Model provider: agentbyumer

Model tree

  • Base: google/gemma-2b
  • Fine-tuned: this model

Modalities

  • Input: Text
  • Output: Text
