Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
Overview
This repository contains a PEFT LoRA adapter for Hcompany/Holo-3.1-4B adapted for coding-oriented instruction following and Python problem solving. The adapter is intended to be loaded on top of the base model with PEFT-compatible tooling.
What Is Included
- LoRA adapter weights in
adapter_model.safetensors. - PEFT configuration in
adapter_config.json. - Tokenizer and chat template files copied for convenient loading.
- Evaluation and provenance artifacts from the release run.
Training And Evaluation Summary
The adapter was produced with supervised fine-tuning on curated coding instruction data, including targeted Python problem-solving examples, broader coding instruction examples, and small external coding-instruction samples. Evaluation used an 80-task held-out greedy decoding probe drawn from HumanEval-style and MBPP-style tasks.
Measured result on the held-out probe:
- Base model: 24 / 80 tasks passed.
- Adapter model: 31 / 80 tasks passed.
- Relative lift over the measured base result: 29.17%.
These numbers are a compact functional probe, not a complete benchmark suite.
Intended Use
Use this adapter for coding assistance experiments, Python function synthesis, small algorithmic tasks, and research on lightweight coding adaptation. Load it with PEFT on top of Hcompany/Holo-3.1-4B.
Known Limitations
- The evaluation probe is small and focused on short Python tasks.
- The adapter may still fail hidden edge cases, multi-file tasks, long-context repository work, and non-Python languages.
- Outputs should be tested before use in production or security-sensitive environments.
- The adapter inherits limitations and licensing terms from the base model and training data sources.
File List
adapter_model.safetensors: LoRA adapter weights.adapter_config.json: PEFT adapter configuration.tokenizer.json,tokenizer_config.json,chat_template.jinja: tokenizer/chat assets.release_summary.json: run summary and measured evaluation counts.dataset_selection.json: high-level dataset selection record.eval_before_after_full_code.csv: per-task before/after evaluation table.trainer_log_history.json: trainer log history.
Reproducibility And Provenance
The release artifacts include dataset selection, trainer history, and before/after evaluation outputs to support auditability. The adapter was trained as a parameter-efficient LoRA continuation of the public base model and is distributed separately from the base weights.
Evidence files
Run evidence for this release is stored in the repository under evidence/:
evidence/holo_4b_latest_out_v3_holo-3-1-4b-coding-qlora-30pct-v1_release_release_summary.jsonevidence/holo_4b_repair38_out_holo-3-1-4b-repair38-v1_release_release_summary.json
These files are compact local/Kaggle run artifacts used to document training, evaluation, merge, or quantization evidence for this model family.
Model provider
josephmayo
Model tree
Base
Hcompany/Holo-3.1-4B
Adapter
this model
Modalities
Input
Video, Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information