Acceptance of synthetic speech in South African languages: A comparative study of Afrikaans, isiZulu, and Sepedi in healthcare contexts
DOI:
https://doi.org/10.55492/dhasa.v5i02.6721
Keywords:
Text-to-Speech, Synthetic Speech Evaluation, Trust in AI Voices, Sociolinguistic Perception, Perceptual Speech Quality
Abstract
While text-to-speech technologies have made significant advances in recent years, questions remain about how synthesised speech is accepted in culturally and linguistically diverse settings such as South Africa. This study explores how South Africans perceive synthetic speech in comparison to human-recorded speech across three official languages: Afrikaans, isiZulu, and Sepedi, with healthcare as the application context. Using a blind and randomised listening test, 65 participants rated audio prompts across four acceptance metrics: trust, knowledgeability, likability, and relatability. Statistical analysis using the Wilcoxon signed-rank test revealed no significant difference between natural and synthesised speech perception among Afrikaans speakers. However, low participation rates prevented meaningful analysis of speech perception for isiZulu and Sepedi speakers. When combining data from all participants, a medium effect size favouring natural speech was observed, though this difference was not statistically significant. These findings suggest that synthetic speech adapted from natural recordings may be suitable for certain applications in South Africa, though larger and more linguistically representative samples are needed to confirm these results.
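As an illustration of the kind of paired comparison described in the abstract, the following is a minimal sketch of a Wilcoxon signed-rank test with a normal approximation and the effect size r = |Z| / sqrt(n). It is not the authors' analysis code. The function name `wilcoxon_signed_rank`, the choice of n as the number of non-zero differences, the omission of tie and continuity corrections, and the rating values are all assumptions made for illustration; the ratings are invented and are not data from the study.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Paired Wilcoxon signed-rank test using a normal approximation.

    Returns (w_plus, w_minus, z, r), where r = |z| / sqrt(n) and n is the
    number of non-zero paired differences (one common convention).
    """
    # Paired differences; pairs with no difference are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Sum the ranks of positive and negative differences separately.
    w_plus = sum(rk for rk, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(rk for rk, d in zip(ranks, diffs) if d < 0)

    # Normal approximation, without tie or continuity correction.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    r = abs(z) / math.sqrt(n)
    return w_plus, w_minus, z, r


# Hypothetical ratings from the same eight listeners (invented values).
natural = [4, 5, 3, 4, 5, 4, 3, 5]
synthetic = [4, 4, 3, 5, 3, 4, 2, 4]

w_plus, w_minus, z, r = wilcoxon_signed_rank(natural, synthetic)
print(f"W+ = {w_plus}, W- = {w_minus}, z = {z:.3f}, r = {r:.3f}")
```

With samples this small, an exact p-value is generally preferable to the normal approximation, so in practice a statistics library would be the safer choice. By Cohen's commonly cited guidelines, r near 0.3 is read as a medium effect and r near 0.5 as a large one.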
License
Copyright (c) 2025 Johannes Abraham Louw, Ilana Wilken

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.