License: mit
📊 Training Performance & Metrics
The model successfully converged over its training run with highly stable gradients:
- Total Training Steps: 20,000
- Final Total Train Loss: 3.478
- Final Step Loss: 2.988
- Gradient Norm Stability: Stable at ~1.12
- Training Status: Complete / Fully Converged
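To make the loss figures easier to interpret, they can be converted to perplexity. The sketch below assumes the reported losses are mean per-token cross-entropy values in nats, which the model card does not state explicitly; the `perplexity` helper is a hypothetical name used only for illustration.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss_nats)


# Values copied from the metrics listed above.
# Assumption: both are mean per-token cross-entropy losses in nats.
final_total_train_loss = 3.478
final_step_loss = 2.988

print(f"Perplexity at total train loss: {perplexity(final_total_train_loss):.1f}")
print(f"Perplexity at final step loss:  {perplexity(final_step_loss):.1f}")
```

Under that assumption, the two losses correspond to perplexities of roughly 32.4 and 19.8. If the losses were computed differently (for example, summed rather than averaged per token), this conversion would not apply.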
🚀 Quick Start & Usage
You can easily load and run this model locally using the Transformers library:
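```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "agentbyumer/mini-gemma"

# Load the tokenizer and the model weights in half precision.
# device_map="auto" relies on the accelerate package being installed.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Wrap the model and tokenizer in a text-generation pipeline.
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "Your specialized prompt here"
outputs = generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    return_full_text=False,  # return only the completion, not the prompt
)
print(outputs[0]["generated_text"])
```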
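The Quick Start call samples at temperature 0.7, so repeated runs can give different completions. As a minimal sketch, the helper below switches between those sampling settings and greedy decoding for reproducible output. The function names `generation_kwargs` and `generate` are hypothetical and not part of the model card or of Transformers; only the `pipeline` call and its keyword arguments come from the library.

```python
def generation_kwargs(deterministic: bool = False, max_new_tokens: int = 150) -> dict:
    """Build keyword arguments for a text-generation pipeline call."""
    kwargs = {"max_new_tokens": max_new_tokens, "return_full_text": False}
    if deterministic:
        # Greedy decoding: always pick the most likely next token.
        kwargs["do_sample"] = False
    else:
        # Same sampling settings as the Quick Start example.
        kwargs["do_sample"] = True
        kwargs["temperature"] = 0.7
    return kwargs


def generate(prompt: str, deterministic: bool = False) -> str:
    """Load the model and return one completion for the prompt."""
    # Imported here so generation_kwargs can be used without loading torch.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="agentbyumer/mini-gemma",
        torch_dtype=torch.float16,
        device_map="auto",
    )
    outputs = generator(prompt, **generation_kwargs(deterministic))
    return outputs[0]["generated_text"]
```

Calling `generate("Your specialized prompt here", deterministic=True)` would then use greedy decoding. Note that this sketch reloads the model on every call; for repeated use, build the pipeline once and reuse it.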
📜 License
This project is licensed under the permissive MIT License. See the accompanying LICENSE file for full details.
Model provider: agentbyumer
Base model: google/gemma-2b (this model is a fine-tune of it)
Modalities: text input, text output