waxal-benchmarking

whisper-small-waxal-aka

Model Details

Language	Akan (`aka`)
Language Family	Niger-Congo
Architecture	Whisper Small (244M parameters)
Base Model	openai/whisper-small
Training Data	WAXAL corpus (conversational spontaneous speech)
Test WER	31.7%
Test CER	11.3%
License	apache-2.0

Intended Use

This model is intended for automatic speech recognition of Akan conversational speech. It was evaluated on the WAXAL test set (spontaneous, image-prompted speech) and partially on FLEURS (read speech). It is suitable for research and low-resource ASR applications. It is not recommended for high-stakes production use without further validation.

Training Data

Fine-tuned on the WAXAL corpus, a large-scale dataset of transcribed, image-prompted spontaneous speech across 19 African languages recorded in participants' natural environments. The Akan training split contains conversational speech across diverse speakers. Data is released under CC-BY 4.0.

Usage

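```python
from transformers import pipeline

# Load the fine-tuned Akan checkpoint as a speech-recognition pipeline
asr = pipeline("automatic-speech-recognition",
               model="waxal-benchmarking/whisper-small-waxal-aka")
# Transcribe a local audio file; the result is a dict with a "text" field
result = asr("audio.wav")
print(result["text"])
```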
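
Before transcribing, it can help to check what a recording actually contains. The sketch below is illustrative and not part of the original card: `wav_info` and `transcribe` are hypothetical helper names, and it assumes the input is an uncompressed PCM WAV file readable by Python's standard `wave` module. Whisper's feature extractor works on 16 kHz audio, so a file at another rate depends on the pipeline's decoding step to resample it.

```python
import wave

WHISPER_SAMPLE_RATE = 16_000  # Whisper's feature extractor works on 16 kHz audio


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnchannels(), wav.getnframes() / rate


def transcribe(path, model_id="waxal-benchmarking/whisper-small-waxal-aka"):
    """Report basic file facts, then run the same pipeline call as the usage snippet."""
    # Imported lazily so wav_info() can be used without transformers installed.
    from transformers import pipeline

    rate, channels, seconds = wav_info(path)
    print(f"{path}: {rate} Hz, {channels} channel(s), {seconds:.2f} s")
    if rate != WHISPER_SAMPLE_RATE:
        print("note: file is not 16 kHz; resampling is left to the pipeline")
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(path)["text"]
```

Calling `transcribe("audio.wav")` would print the file summary and return the transcript string; only `wav_info` runs without downloading the model.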

Test Set Performance (WAXAL Benchmark)

Evaluated on the filtered WAXAL test set (duration >= 1.5s, speech rate >= 4 WPS).

Metric	Score
WER	31.7%
CER	11.3%
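
The filter described above (duration >= 1.5s, speech rate >= 4 WPS) could be expressed roughly as follows. This is a sketch under stated assumptions, not the benchmark's actual filtering code: "WPS" is read here as words per second, words are counted by whitespace splitting, and `Utterance` is a hypothetical record type rather than the WAXAL schema. The thresholds and comparison directions are taken literally from the sentence above.

```python
from dataclasses import dataclass

MIN_DURATION_S = 1.5   # "duration >= 1.5s" from the card
MIN_SPEECH_RATE = 4.0  # "speech rate >= 4 WPS"; words per second is an assumed reading


@dataclass
class Utterance:
    """Hypothetical record: a transcript and its audio duration in seconds."""
    transcript: str
    duration_s: float


def speech_rate(utt):
    """Whitespace-separated words per second of audio."""
    return len(utt.transcript.split()) / utt.duration_s


def keep(utt):
    """True if the utterance passes both thresholds.

    The duration check runs first, so zero-length clips are rejected
    before the speech-rate division is attempted.
    """
    return utt.duration_s >= MIN_DURATION_S and speech_rate(utt) >= MIN_SPEECH_RATE


def filter_test_set(utterances):
    """Keep only the utterances that satisfy the stated thresholds."""
    return [u for u in utterances if keep(u)]
```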

Full benchmark results across all 19 languages and 6 models are reported in the WAXAL ASR Benchmark paper (citation below).
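
For readers unfamiliar with the two metrics, WER and CER are both edit-distance error rates: the number of substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length in words or characters. The following is a minimal pure-Python sketch of that definition. It is not the scoring code used for the numbers above; the card does not state what text normalization was applied, and counting spaces as characters in CER is an assumption made here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # deletion of a reference token
                curr[j - 1] + 1,          # insertion of a hypothesis token
                prev[j - 1] + (r != h),   # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Corpus-level rate: total edits divided by total reference tokens."""
    errors = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return errors / total


def wer(refs, hyps):
    """Word error rate over whitespace-separated words."""
    return error_rate(refs, hyps, str.split)


def cer(refs, hyps):
    """Character error rate; spaces count as characters in this sketch."""
    return error_rate(refs, hyps, list)
```

With the placeholder pair `wer(["the cat sat down"], ["the cat sit"])`, one substitution and one deletion against four reference words give 0.5.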

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 30
mixed_precision_training: Native AMP
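
Two of the values above follow from the others, and a small sketch may make that concrete. The total batch size of 32 is the per-device batch size (16) times the gradient accumulation steps (2), which implies a single device. The `linear` scheduler with 500 warmup steps ramps the learning rate from 0 to 0.0001 and then decays it linearly to 0 at the last step. The code below illustrates that shape; it is not the training script. `TOTAL_STEPS` is an assumption: the card does not state it, and 9,480 is inferred from roughly 316 steps per epoch (the step/epoch ratio in the training results) times 30 epochs.

```python
BASE_LR = 1e-4          # learning_rate
WARMUP_STEPS = 500      # lr_scheduler_warmup_steps
PER_DEVICE_BATCH = 16   # train_batch_size
GRAD_ACCUM = 2          # gradient_accumulation_steps
TOTAL_STEPS = 9_480     # assumed: ~316 steps/epoch * 30 epochs, not stated in the card


def effective_batch_size(per_device, grad_accum, num_devices=1):
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM))
    for step in (0, 250, 500, 5_000, TOTAL_STEPS):
        print(step, linear_lr(step))
```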

Training results

Training Loss	Epoch	Step	Validation Loss	WER	CER
0.8006	1.5823	500	0.4498	0.3172	0.1125
0.4065	3.1646	1000	0.4193	0.3014	0.1060
0.3270	4.7468	1500
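
The two complete rows above are enough for a little arithmetic about the run. The sketch below is an illustration only: it derives the number of optimizer steps per epoch from the step/epoch ratio, a rough upper bound on the training-set size (assuming the total batch size of 32 from the hyperparameters and a single device), and the relative WER change between the two evaluations. The third row is incomplete in the source and is left out.

```python
# (step, epoch, validation WER) for the two complete rows of the table
ROWS = [
    (500, 1.5823, 0.3172),
    (1000, 3.1646, 0.3014),
]
TOTAL_BATCH = 32  # total_train_batch_size from the hyperparameters


def steps_per_epoch(step, epoch):
    """Optimizer steps per epoch, rounded to the nearest whole step."""
    return round(step / epoch)


def relative_reduction(before, after):
    """Fractional improvement going from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    spe = steps_per_epoch(ROWS[0][0], ROWS[0][1])
    print("steps per epoch:", spe)
    # Upper bound: the final batch of an epoch may be only partly filled.
    print("training examples (at most):", spe * TOTAL_BATCH)
    print("relative WER reduction, step 500 to 1000:",
          f"{relative_reduction(ROWS[0][2], ROWS[1][2]):.1%}")
```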

Framework versions

Transformers 5.0.0
PyTorch 2.10.0+cu128
Datasets 4.0.0
Tokenizers 0.22.2

Citation

```bibtex
@article{waxalnet2026,
  title  = {The WAXAL ASR Benchmark: Fine-Tuned Edge Models Across 19 African Languages},
  author = {Olufemi, Victor Tolulope and Babatunde, Oreoluwa and Njema, Ramsey and
             Gbotemi, Bolarinwa and Yen, Wanchi Lucia and Uzodinma, John and
             Ajayi, Sunday and Williams, Oluwademilade and Moshood, Kausar and
             Anyaele, Innocent Elendu and Arefaine, Akebert Tesfahunegn and
             Hunzwi, Candace and Daniel, Wongel Dawit and Namuganga, Emmilly Immaculate and
             Kadima, Cleophas and Bahizire, Athanase Biluge and Ranaivoson, Onitsiky and
             Aaron, Emmanuel and Ladislaus, Nicholaus Dismas and Muhammed, Idris and
             Simenya, Jonathan Enoch and Koome, Martin and Endaylalu, Matewos Tegete and
             Adeyemo, Peter Ifeoluwa and Birindwa, Hondi Prisca and Eze-Mbey, Ukachi Agnes and
             Oduro-Yeboah, Yacoba and Aremu, Toluwani and Adjovi, Pericles and
             Ngueajio, Mikel K and Mitra, Prasenjit},
  year   = {2026},
  note   = {arXiv preprint arXiv:2606.02375}
}
```

Authors

Victor Tolulope Olufemi · Oreoluwa Babatunde · Ramsey Njema · Bolarinwa Gbotemi · Wanchi Lucia Yen · John Uzodinma · Sunday Ajayi · Oluwademilade Williams · Kausar Moshood · Innocent Elendu Anyaele · Akebert Tesfahunegn Arefaine · Candace Hunzwi · Wongel Dawit Daniel · Emmilly Immaculate Namuganga · Cleophas Kadima · Athanase Biluge Bahizire · Onitsiky Ranaivoson · Emmanuel Aaron · Nicholaus Dismas Ladislaus · Idris Muhammed · Jonathan Enoch Simenya · Martin Koome · Matewos Tegete Endaylalu · Peter Ifeoluwa Adeyemo · Hondi Prisca Birindwa · Ukachi Agnes Eze-Mbey · Yacoba Oduro-Yeboah · Toluwani Aremu · Pericles Adjovi · Mikel K Ngueajio · Prasenjit Mitra

Acknowledgements

We thank the following contributors for their language expertise and native-speaker evaluation support: Ajara Oyinloye, Abubakari Sadic Mohammed, Hafiz Adjei, Aliga Norah Lele, Marie-Louise B. Ndamuso, and Odong Diana.

This work was supported by Lynguallabs (compute, researchers & storage), Open Token (compute resources), and CMU Africa (researchers & native speakers).
