License: mit

Model Details

| Property | Value |
| --- | --- |
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning method | QLoRA (4-bit NF4 + LoRA adapters) |
| LoRA rank / alpha | r=16 / α=32 |
| Training epochs | 3 |
| Effective batch size | 16 (2 × 8 gradient accumulation steps) |
| Learning rate | 2e-4 (cosine schedule, 3% warmup) |
| Max sequence length | 2048 tokens |
| Training hardware | Kaggle T4 × 2 (via Unsloth) |
| Training framework | Unsloth + HuggingFace TRL SFTTrainer |
| Precision | FP16 |
| Language | French (fr) |
| License | MIT |
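
The batch size and learning-rate entries above can be checked with a little arithmetic. The following is a minimal sketch, assuming linear warmup followed by cosine decay to zero (a common convention; the card does not state which variant was used). `TOTAL_STEPS` is a hypothetical value chosen for illustration, since the real step count depends on the dataset size.

```python
import math

# Values taken from the Model Details table.
MICRO_BATCH = 2          # examples per forward/backward pass
GRAD_ACCUM_STEPS = 8     # passes accumulated before each optimizer update
PEAK_LR = 2e-4
WARMUP_RATIO = 0.03

# Hypothetical: the actual number of optimizer steps is not given in the card.
TOTAL_STEPS = 1000


def effective_batch_size(micro_batch: int, grad_accum_steps: int) -> int:
    """Number of examples contributing to each optimizer update."""
    return micro_batch * grad_accum_steps


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at an optimizer step (assumed linear warmup, cosine decay to 0)."""
    warmup_steps = round(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(effective_batch_size(MICRO_BATCH, GRAD_ACCUM_STEPS))
    for step in (0, 15, 30, 500, 1000):
        print(step, lr_at(step))
```

Under these assumptions, the rate rises linearly during the first 3% of steps, peaks at 2e-4, and then decays along a half cosine.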

Training Dataset

The SFT dataset (togolm/togolm-corpus-v1) consists of instruction-response pairs generated from the TogoLM corpus, a curated collection of documents scraped from official Togolese sources and one local news site:

| Source | Domain |
| --- | --- |
| jo.gouv.tg | Journal Officiel: laws and decrees |
| presidence.gouv.tg | Presidency: presidential acts and speeches |
| assemblee-nationale.tg | National Assembly: parliamentary texts |
| inseed.tg | National Statistics Institute: economic and demographic data |
| service-public.gouv.tg | Public services directory |
| finances.gouv.tg / education.gouv.tg / agriculture.gouv.tg | Ministries |
| icilome.com | Local news and analysis |

Q&A pairs were generated using Gemini 2.5 Flash and formatted in the Alpaca instruction template.
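
To illustrate what "formatted in the Alpaca instruction template" could look like in practice, here is a minimal sketch that turns one Q&A record into a single training string. The field names `instruction` and `response`, the helper `format_example`, and the choice to append an end-of-sequence token are assumptions for illustration; the card does not describe the dataset schema or the exact preprocessing.

```python
# Template text copied from the Prompt Format section of this card,
# with a slot added for the reference answer.
ALPACA_TEMPLATE = (
    "Below is an instruction about Togo. "
    "Write a response that answers it accurately.\n"
    "### Instruction:\n"
    "{instruction}\n"
    "### Response:\n"
    "{response}"
)


def format_example(record: dict, eos_token: str = "</s>") -> str:
    """Render one Q&A record as a training string.

    `record` is assumed to have "instruction" and "response" keys
    (hypothetical names). The EOS token is appended so the model can
    learn where an answer ends; "</s>" is an assumed default here.
    """
    text = ALPACA_TEMPLATE.format(
        instruction=record["instruction"].strip(),
        response=record["response"].strip(),
    )
    return text + eos_token


if __name__ == "__main__":
    sample = {
        "instruction": "Quelle est la capitale du Togo ?",
        "response": "Lomé.",
    }
    print(format_example(sample))
```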


Usage

Load with Unsloth (recommended)

python

from unsloth import FastLanguageModel

# Load the fine-tuned model with 4-bit quantization
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="togolm/togolm-7b-instruct-v1",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

# Prompt in the same template used for fine-tuning
prompt = """Below is an instruction about Togo. Write a response that answers it accurately.
### Instruction:
Quel est le taux d'imposition sur les sociétés au Togo ?
### Response:
"""
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
# Greedy decoding (do_sample=False) gives deterministic output
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# The decoded text contains the prompt followed by the generated answer
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Load with standard Transformers + PEFT

python

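import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization, matching the QLoRA training setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto",
)
# Attach the LoRA adapter to the quantized base model
model = PeftModel.from_pretrained(base, "togolm/togolm-7b-instruct-v1")
tokenizer = AutoTokenizer.from_pretrained("togolm/togolm-7b-instruct-v1")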

Prompt Format

The model was fine-tuned with the Alpaca instruction template:

markdown

Below is an instruction about Togo. Write a response that answers it accurately.
### Instruction:
{your question about Togo}
### Response:
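
Because decoding the full output sequence returns the prompt together with the answer, callers usually want only the text after the `### Response:` marker. The helper below is a minimal sketch of that post-processing; `extract_response` is a hypothetical name, and cutting the answer at a following `### Instruction:` marker is an assumption about how to handle a model that keeps generating past its answer.

```python
RESPONSE_MARKER = "### Response:"
INSTRUCTION_MARKER = "### Instruction:"


def extract_response(generated_text: str) -> str:
    """Return the answer portion of a decoded generation.

    Takes the text after the first '### Response:' marker and drops
    anything from a subsequent '### Instruction:' marker onward.
    If no response marker is found, the whole text is returned stripped.
    """
    _, marker, tail = generated_text.partition(RESPONSE_MARKER)
    if not marker:
        return generated_text.strip()
    answer, _, _ = tail.partition(INSTRUCTION_MARKER)
    return answer.strip()


if __name__ == "__main__":
    decoded = (
        "Below is an instruction about Togo. "
        "Write a response that answers it accurately.\n"
        "### Instruction:\n"
        "{your question about Togo}\n"
        "### Response:\n"
        "{model answer}\n"
    )
    print(extract_response(decoded))
```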

Intended Use

  • Answering questions about Togolese law, administration, statistics, and public services in French
  • Retrieval-augmented generation (RAG) combined with the TogoLM corpus
  • Research on low-resource African languages and francophone AI
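
The retrieval-augmented use case listed above can be sketched as follows. This is a toy illustration only: the word-overlap retriever stands in for whatever retrieval engine is actually used, the passages are placeholders rather than real corpus text, and placing retrieved context inside the instruction field is an assumption, since the card does not say how context was presented during fine-tuning.

```python
import re

PROMPT_HEADER = (
    "Below is an instruction about Togo. "
    "Write a response that answers it accurately.\n"
)


def tokenize(text: str) -> set:
    """Lowercased set of word tokens (accented letters are kept)."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, passages: list, k: int = 2) -> list:
    """Rank passages by word overlap with the question (toy retriever)."""
    q_tokens = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q_tokens & tokenize(p)), reverse=True)
    return ranked[:k]


def build_rag_prompt(question: str, passages: list) -> str:
    """Insert retrieved passages into the instruction slot of the template."""
    context = "\n".join(f"- {p}" for p in passages)
    instruction = f"Contexte :\n{context}\n\nQuestion : {question}"
    return f"{PROMPT_HEADER}### Instruction:\n{instruction}\n### Response:\n"


if __name__ == "__main__":
    # Placeholder passages, not real corpus content.
    corpus = [
        "Document sur la fiscalité des sociétés",
        "Document sur l'éducation primaire",
        "Document sur l'agriculture",
    ]
    question = "Quelle fiscalité pour les sociétés ?"
    print(build_rag_prompt(question, retrieve(question, corpus, k=1)))
```

The resulting string can be passed to the tokenizer in place of a plain prompt.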

Out-of-Scope Use

  • General-purpose chat or tasks unrelated to Togo
  • Legal or medical advice — always verify with official Togolese sources
  • Languages other than French (coverage is limited)

Project

This model is part of TogoLM, the first open-source AI infrastructure layer focused on Togo. The project covers corpus collection, a RAG engine, a fine-tuned LLM, and a public REST API.


Citation

bibtex

@misc{togolm2026,
  author = {Kougbada, Omar Farouk},
  title = {TogoLM: Open-Source AI Infrastructure for Togo},
  year = {2026},
  howpublished = {\url{https://huggingface.co/togolm/togolm-7b-instruct-v1}},
}

Model provider: togolm

Model tree: base mistralai/Mistral-7B-Instruct-v0.3, adapter: this model

Modalities: text input, text output
