🇲🇱 Bambara ASR Leaderboard
This leaderboard tracks and evaluates speech recognition models for the Bambara language. Models are ranked based on Word Error Rate (WER), Character Error Rate (CER), and a combined score.
Current Models Performance
Current Best Model: test_1
- WER: 22.64%
- CER: 10.94%
- Combined Score: 19.22%
Models are ranked by the selected metric; lower is better.
Rank | Model | WER | CER | Combined Score | Timestamp | WER (%) | CER (%) | Combined Score (%) |
1 | test_1 | 0.2264 | 0.1094 | 0.1922 | 2025-03-15 10:30:45 | 22.64 | 10.94 | 19.22 |
Understanding ASR Metrics
Word Error Rate (WER)
WER measures how accurately the ASR system recognizes whole words:
- Lower values indicate better performance
- Calculated as: (Substitutions + Insertions + Deletions) / Total Words in the Reference
- A WER of 0% means perfect transcription
- A WER of 20% means roughly one error for every five words in the reference
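The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the leaderboard's own implementation: the function names `edit_distance` and `wer` are made up for this example, words are split on whitespace, and the sample sentences are placeholders.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions + insertions + deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# One substituted word out of five reference words.
print(wer("one two three four five", "one two tree four five"))  # 0.2
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.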
Character Error Rate (CER)
CER measures accuracy at the character level:
- More fine-grained than WER
- Better at capturing partial word matches
- Particularly useful for agglutinative languages like Bambara
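To illustrate why CER is more forgiving of near misses, here is a small sketch that applies the same edit-distance idea to characters instead of words. The function names are hypothetical, and counting spaces as characters is an assumption; the leaderboard may treat whitespace differently.

```python
def char_edit_distance(ref, hyp):
    """Levenshtein distance over the characters of two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return char_edit_distance(reference, hypothesis) / len(reference)


# A single wrong letter: the whole word counts as wrong for WER,
# but only 1 of 6 characters counts as wrong for CER.
print(cer("bamako", "bamaka"))
```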
Combined Score
- Weighted average: 70% WER + 30% CER
- Provides a balanced evaluation of model performance
- Used as the primary ranking metric
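As a worked example of the 70/30 weighting, the sketch below combines a WER and a CER into a single score. The function and constant names are illustrative, and the input values are made up. How the leaderboard aggregates scores across samples is not stated here, so figures computed this way may differ slightly from the published ones.

```python
WER_WEIGHT = 0.7  # weights taken from the description above
CER_WEIGHT = 0.3


def combined_score(wer, cer):
    """Weighted average of WER and CER; lower is better."""
    return WER_WEIGHT * wer + CER_WEIGHT * cer


# Hypothetical model with 30% WER and 10% CER:
# 0.7 * 0.30 + 0.3 * 0.10 = 0.21 + 0.03 = 0.24
print(round(combined_score(0.30, 0.10), 4))  # 0.24
```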
Submit a new model for evaluation
Upload a CSV file with the following format:
- Must contain exactly two columns: 'id' and 'text'
- The 'id' column should match the reference dataset IDs
- The 'text' column should contain your model's transcriptions
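Before uploading, it can help to check a file against these rules locally. The following is a rough sketch using only the Python standard library; `validate_submission` is a hypothetical helper, and the exact checks the leaderboard performs (for example how it handles duplicate or unknown IDs) are assumptions.

```python
import csv
import io
from collections import Counter


def validate_submission(csv_text, reference_ids):
    """Return a list of problems found; an empty list means the file looks valid."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    if sorted(header) != ["id", "text"]:
        return [f"expected exactly the columns 'id' and 'text', got {header}"]

    counts = Counter(row["id"] for row in reader)
    seen = set(counts)
    expected = set(reference_ids)

    problems = []
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    if duplicates:
        problems.append(f"duplicate ids: {duplicates}")
    missing = sorted(expected - seen)
    if missing:
        problems.append(f"ids missing from submission: {missing}")
    unknown = sorted(seen - expected)
    if unknown:
        problems.append(f"ids not in reference dataset: {unknown}")
    return problems


# Placeholder IDs and transcriptions for illustration only.
sample = "id,text\nutt_1,first transcription\nutt_2,second transcription\n"
print(validate_submission(sample, {"utt_1", "utt_2"}))  # []
```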
Give your model a descriptive name so it can be identified on the leaderboard.
Updated Leaderboard
Rank | Model | WER | CER | Combined Score | Timestamp | WER (%) | CER (%) | Combined Score (%) |
1 | test_1 | 0.2264 | 0.1094 | 0.1922 | 2025-03-15 10:30:45 | 22.64 | 10.94 | 19.22 |
2 | test_2 | 0.3264 | 0.1094 | 0.1922 | 2025-03-15 10:30:45 | 32.64 | 10.94 | 19.22 |
About the Benchmark Dataset
This leaderboard uses the sudoping01/bambara-speech-recognition-benchmark dataset:
- Contains diverse Bambara speech samples
- Includes various speakers, accents, and dialects
- Covers different speech styles and recording conditions
- Transcribed and validated
How to Generate Predictions
To submit results to this leaderboard:
- Download the audio files from the benchmark dataset
- Run your ASR model on the audio files
- Generate a CSV file with 'id' and 'text' columns
- Submit your results using the form in the "Submit New Results" tab
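The middle two steps can be sketched as a small helper that runs a model over the samples and writes the CSV. This is an outline under stated assumptions: `write_predictions` and `transcribe` are hypothetical names, `samples` is assumed to be an iterable of (id, audio) pairs, and loading the benchmark dataset is left out because its column names are not given here.

```python
import csv


def write_predictions(samples, transcribe, path):
    """Write one 'id,text' row per sample and return the number of rows written.

    samples    -- iterable of (sample_id, audio) pairs (assumed shape)
    transcribe -- callable that maps audio to a transcription string (your ASR model)
    path       -- output CSV file path
    """
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "text"])  # header required by the submission format
        for sample_id, audio in samples:
            writer.writerow([sample_id, transcribe(audio)])
            count += 1
    return count
```

Writing with `encoding="utf-8"` keeps Bambara letters such as ɛ, ɔ, ɲ, and ŋ intact, and the `csv` module quotes any transcription that happens to contain a comma.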
Evaluation Guidelines
- Text is normalized (lowercase, punctuation removed) before metrics calculation
- Extreme outliers are capped to prevent skewing results
- All submissions are validated for format and completeness
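The normalization step might look roughly like the sketch below. It is an approximation, not the leaderboard's actual code: the function name `normalize` is illustrative, punctuation is identified by Unicode category, and collapsing repeated whitespace is an added assumption. The input string is only an example.

```python
import unicodedata


def normalize(text):
    """Lowercase, drop punctuation characters, and collapse whitespace."""
    lowered = text.lower()
    # Unicode categories starting with "P" cover punctuation (commas, quotes, hyphens, ...).
    kept = "".join(ch for ch in lowered
                   if not unicodedata.category(ch).startswith("P"))
    return " ".join(kept.split())


print(normalize("I ni ce, Bamako!"))  # i ni ce bamako
```

Applying the same function to both the reference and the submitted text ensures that differences in casing or punctuation are not counted as recognition errors.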
NB: This work is a collaboration between MALIBA-AI, RobotsMali AI4D-LAB, and Djelia.
About MALIBA-AI
MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation
"No Malian Language Left Behind"
This leaderboard is maintained by the MALIBA-AI initiative to track progress in Bambara speech recognition technology. For more information, visit MALIBA-AI on Hugging Face.