Minimal Acoustic Markers for Age Prediction in Human Voice: A Machine Learning Approach

Naazeri, Hiva (2025) Minimal Acoustic Markers for Age Prediction in Human Voice: A Machine Learning Approach. Master thesis, Voice Technology (VT).

Abstract

Aging affects the human voice in systematic and measurable ways due to physiological changes in the vocal tract, respiratory system, and laryngeal structures. This thesis investigates the extent to which vocal characteristics can be used to predict a speaker’s age group using a minimal, interpretable set of biologically motivated acoustic features. Leveraging a curated subset of the Mozilla Common Voice dataset, we extracted features such as fundamental frequency (F0), formant frequencies, jitter, shimmer, spectral tilt, speech rate, and mel-frequency cepstral coefficients (MFCCs) to train machine learning models for age group classification. We developed a reproducible audio processing and feature extraction pipeline using open-source tools and evaluated several models, with Random Forests demonstrating the best performance, achieving up to 62% accuracy across five broad age groups. Feature importance analysis revealed that vocal perturbation measures (jitter and shimmer), spectral features, and speech rate were among the most informative for predicting speaker age. Despite limited accuracy in underrepresented age groups (e.g., 50s and 60s), the results suggest that interpretable acoustic biomarkers capture meaningful age-related vocal changes. This work provides a baseline for age prediction from voice with practical implications in human-computer interaction, speaker profiling, and health monitoring. Limitations include class imbalance, reliance on self-reported age labels, and language-specific data. Future research should explore data augmentation, continuous age prediction via regression, expanded feature sets, and cross-linguistic generalizability. Clinical extensions include using vocal biomarkers for early detection of age-related diseases and neurodegenerative disorders, offering a promising, non-invasive diagnostic avenue. 
Keywords: Voice Aging, Acoustic Biomarkers, Age Prediction, Speech Processing, Machine Learning, Jitter, Shimmer, Spectral Features, Random Forest, Vocal Health Monitoring
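The perturbation measures named in the abstract can be illustrated with a minimal sketch. It assumes the common "local" definitions of jitter and shimmer (mean absolute difference between consecutive glottal cycles, divided by the mean value); the thesis may use a different variant or an existing toolkit. The function name `local_perturbation` and the sample values are hypothetical and are not taken from the thesis.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by the mean value.

    Applied to glottal cycle periods this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer (assumed definitions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical per-cycle measurements for one voiced segment (illustrative only).
periods_ms = [8.0, 8.2, 7.9, 8.1, 8.0]
peak_amplitudes = [0.50, 0.52, 0.49, 0.51, 0.50]

jitter = local_perturbation(periods_ms)        # relative cycle-to-cycle period variation
shimmer = local_perturbation(peak_amplitudes)  # relative cycle-to-cycle amplitude variation
mean_f0_hz = 1000.0 / mean(periods_ms)         # mean F0 derived from the mean period in ms

print(f"jitter={jitter:.4f} shimmer={shimmer:.4f} F0={mean_f0_hz:.1f} Hz")
```

In a pipeline like the one described, values such as these would be computed per recording and collected with the other features into one row of the classifier's input table.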

Item Type:	Thesis (Master)
Name supervisor:	Coler, M.L.
Date Deposited:	16 Jun 2025 11:10
Last Modified:	16 Jun 2025 11:10
URI:	https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/660