License: apache-2.0
Model Details
| Property | Value |
|---|---|
| Language | Wolaytta (wal) |
| Language Family | Afro-Asiatic |
| Architecture | Whisper Tiny (39M parameters) |
| Base Model | openai/whisper-tiny |
| Training Data | WAXAL corpus (conversational spontaneous speech) |
| Test WER | 42.6% |
| Test CER | 14.3% |
| License | apache-2.0 |
Intended Use
This model is intended for automatic speech recognition of Wolaytta conversational speech. It was evaluated on the WAXAL test set (spontaneous, image-prompted speech) and partially on FLEURS (read speech). It is suitable for research and low-resource ASR applications. It is not recommended for high-stakes production use without further validation.
Training Data
Fine-tuned on the WAXAL corpus, a large-scale dataset of transcribed, image-prompted spontaneous speech across 19 African languages recorded in participants' natural environments. The Wolaytta training split contains conversational speech across diverse speakers. Data is released under CC-BY 4.0.
Usage
```python
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="waxal-benchmarking/whisper-tiny-waxal-wal",
)
result = asr("audio.wav")
print(result["text"])
```
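To transcribe several recordings, the same pipeline call can be wrapped in a small helper. The sketch below is only an illustration: `load_asr` and `transcribe_files` are hypothetical names, not part of the model or of Transformers, and the helper accepts any callable that returns a dict with a `"text"` key, as the pipeline in the snippet above does.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional

# Model id taken from the usage snippet in this README.
MODEL_ID = "waxal-benchmarking/whisper-tiny-waxal-wal"


def load_asr() -> Callable:
    """Build the ASR pipeline; transformers is imported lazily on first use."""
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=MODEL_ID)


def transcribe_files(
    paths: Iterable[str], asr: Optional[Callable] = None
) -> Dict[str, str]:
    """Transcribe each audio file and return {file name: transcript}.

    Hypothetical helper: `asr` may be any callable that maps a file path
    to a dict with a "text" key. If omitted, the pipeline is loaded.
    """
    if asr is None:
        asr = load_asr()
    transcripts: Dict[str, str] = {}
    for path in paths:
        output = asr(str(path))
        transcripts[Path(path).name] = output["text"].strip()
    return transcripts
```

Calling `transcribe_files(["clip1.wav", "clip2.wav"])` loads the model once and reuses it for every file, which avoids rebuilding the pipeline per recording.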
Test Set Performance (WAXAL Benchmark)
Evaluated on the filtered WAXAL test set (duration >= 1.5s, speech rate >= 4 WPS).
| Metric | Score |
|---|---|
| WER | 42.6% |
| CER | 14.3% |
Full benchmark results across all 19 languages and 6 models are reported in the WAXAL ASR Benchmark paper (citation below).
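The test-set filter stated above (duration >= 1.5s, speech rate >= 4 WPS) can be expressed as a simple predicate. This is a sketch under assumptions, not the benchmark's actual filtering code: it assumes WPS means words per second, computed as the number of whitespace-separated words in the reference transcript divided by the clip duration. `speech_rate_wps` and `keep_sample` are hypothetical names.

```python
def speech_rate_wps(transcript: str, duration_s: float) -> float:
    """Words per second, counting whitespace-separated tokens (an assumption)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return len(transcript.split()) / duration_s


def keep_sample(
    transcript: str,
    duration_s: float,
    min_duration_s: float = 1.5,  # threshold quoted in this README
    min_wps: float = 4.0,         # threshold quoted in this README
) -> bool:
    """Return True if a sample passes both thresholds as stated in the text."""
    if duration_s < min_duration_s:
        return False
    return speech_rate_wps(transcript, duration_s) >= min_wps
```

A sample is kept only when both conditions hold, so a clip that is long enough but below the speech-rate threshold is still dropped.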
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 30
- mixed_precision_training: Native AMP
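The `lr_scheduler_type: linear` setting with 500 warmup steps corresponds to a learning rate that ramps linearly from 0 to the peak of 0.0001 and then decays linearly to 0. A minimal sketch of that shape follows, written in plain Python rather than taken from the training code. The function name `linear_warmup_lr` is hypothetical, and the default `total_steps=39_210` is an estimate: it assumes about 1,307 steps per epoch (step 500 corresponds to epoch 0.3826 in the training results) over the configured 30 epochs.

```python
def linear_warmup_lr(
    step: int,
    base_lr: float = 1e-4,       # learning_rate from the list above
    warmup_steps: int = 500,     # lr_scheduler_warmup_steps from the list above
    total_steps: int = 39_210,   # ASSUMPTION: ~1,307 steps/epoch x 30 epochs
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < warmup_steps:
        # Warmup: ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)
```

Under these assumptions the rate reaches its peak at step 500, the first logged checkpoint, and is still close to the peak at step 4500, the last row of the training results.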
Training results
| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|---|---|---|---|---|---|
| 0.7548 | 0.3826 | 500 | 0.6793 | 0.4732 | 0.1273 |
| 0.6271 | 0.7651 | 1000 | 0.5768 | 0.4129 | 0.1037 |
| 0.5424 | 1.1477 | 1500 | 0.5427 | 0.4031 | 0.1011 |
| 0.5400 | 1.5302 | 2000 | 0.5197 | 0.3889 | 0.0973 |
| 0.5305 | 1.9128 | 2500 | 0.5066 | 0.3722 | 0.0867 |
| 0.4498 | 2.2953 | 3000 | 0.4947 | 0.3688 | 0.0837 |
| 0.4533 | 2.6779 | 3500 | 0.4963 | 0.3681 | 0.0820 |
| 0.3506 | 3.0604 | 4000 | 0.4955 | 0.3792 | 0.0886 |
| 0.3761 | 3.4430 | 4500 | 0.4972 | 0.3689 | 0.0850 |
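The WER and CER columns are reported as fractions (0.3689 is 36.89%). Both metrics are conventionally defined as an edit distance divided by the reference length, over words for WER and over characters for CER. The sketch below shows that definition for a single utterance. It is not the evaluation code used for this model: text normalization, the treatment of spaces in CER, and corpus-level aggregation are all assumptions left out here, and `edit_distance`, `wer`, and `cer` are illustrative names.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference item
                    curr[j - 1] + 1,     # insertion of a hypothesis item
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for one utterance, as a fraction of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for one utterance (spaces counted as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

A corpus-level score is usually the total number of errors divided by the total reference length, rather than the mean of per-utterance rates, so averaging the outputs of `wer` over a test set may not reproduce a reported figure.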
Framework versions
- Transformers 5.0.0
- Pytorch 2.10.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.2
Citation
```bibtex
@article{waxalnet2026,
  title  = {The WAXAL ASR Benchmark: Fine-Tuned Edge Models Across 19 African Languages},
  author = {Olufemi, Victor Tolulope and Babatunde, Oreoluwa and Njema, Ramsey and
            Gbotemi, Bolarinwa and Yen, Wanchi Lucia and Uzodinma, John and
            Ajayi, Sunday and Williams, Oluwademilade and Moshood, Kausar and
            Anyaele, Innocent Elendu and Arefaine, Akebert Tesfahunegn and
            Hunzwi, Candace and Daniel, Wongel Dawit and Namuganga, Emmilly Immaculate and
            Kadima, Cleophas and Bahizire, Athanase Biluge and Ranaivoson, Onitsiky and
            Aaron, Emmanuel and Ladislaus, Nicholaus Dismas and Muhammed, Idris and
            Simenya, Jonathan Enoch and Koome, Martin and Endaylalu, Matewos Tegete and
            Adeyemo, Peter Ifeoluwa and Birindwa, Hondi Prisca and Eze-Mbey, Ukachi Agnes and
            Oduro-Yeboah, Yacoba and Aremu, Toluwani and Adjovi, Pericles and
            Ngueajio, Mikel K and Mitra, Prasenjit},
  year   = {2026},
  note   = {Preprint coming soon}
}
```
Authors
Victor Tolulope Olufemi · Oreoluwa Babatunde · Ramsey Njema · Bolarinwa Gbotemi · Wanchi Lucia Yen · John Uzodinma · Sunday Ajayi · Oluwademilade Williams · Kausar Moshood · Innocent Elendu Anyaele · Akebert Tesfahunegn Arefaine · Candace Hunzwi · Wongel Dawit Daniel · Emmilly Immaculate Namuganga · Cleophas Kadima · Athanase Biluge Bahizire · Onitsiky Ranaivoson · Emmanuel Aaron · Nicholaus Dismas Ladislaus · Idris Muhammed · Jonathan Enoch Simenya · Martin Koome · Matewos Tegete Endaylalu · Peter Ifeoluwa Adeyemo · Hondi Prisca Birindwa · Ukachi Agnes Eze-Mbey · Yacoba Oduro-Yeboah · Toluwani Aremu · Pericles Adjovi · Mikel K Ngueajio · Prasenjit Mitra
Acknowledgements
We thank the following contributors for their language expertise and native-speaker evaluation support: Ajara Oyinloye, Abubakari Sadic Mohammed, Hafiz Adjei, Aliga Norah Lele, Marie-Louise B. Ndamuso, and Odong Diana.
This work was supported by Lynguallabs (compute, researchers & storage), Open Token (compute resources), and CMU Africa (researchers & native speakers).