Abstract
Impairments in speech are among the first cues of many neurological disorders. The traditional method of diagnosing speech disorders such as dysarthria involves a perceptual evaluation by a trained speech therapist. However, this approach is difficult to apply consistently because of the subjective nature of the task. As prosodic impairments are among the earliest cues of dysarthria, the current study presents an automatic method of assessing dysarthria across a range of severity levels using prosody-based measures. We extract prosodic measures related to pitch, speech rate, and rhythm from speakers with dysarthria and healthy controls in English and Korean datasets, even though the two languages differ in their prosodic characteristics. These prosody-based measures are then used as inputs to random forest, support vector machine, and neural network classifiers to automatically assess the severity level of dysarthria. Compared to baseline MFCC features, including prosody-based features yields relative accuracy improvements of 18.13% and 11.22% for the English and Korean datasets, respectively. Most of the improvement comes from better classification of mild dysarthric utterances: recall improves from 42.42% to 83.34% for English speakers with mild dysarthria and from 36.73% to 80.00% for Korean speakers with mild dysarthria.
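As a rough illustration of the classification setup summarized above, the sketch below uses scikit-learn to compare the three classifier families on prosody-based feature vectors. It is a minimal example under assumed conditions: the feature matrix, severity labels, and hyperparameters are placeholders and are not taken from the study.

```python
# Minimal sketch (assumed setup, not the authors' exact pipeline):
# prosody-based feature vectors (pitch, speech rate, rhythm measures)
# are fed to random forest, SVM, and neural network classifiers
# to predict dysarthria severity levels.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data: one row per utterance, columns are prosodic measures
# (e.g. mean F0, F0 range, speaking rate, pause ratio, rhythm metrics).
X = np.random.rand(200, 12)             # hypothetical feature matrix
y = np.random.randint(0, 4, size=200)   # hypothetical severity labels (0 = healthy ... 3 = severe)

classifiers = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0),
    ),
}

# Cross-validated accuracy for each classifier on the prosodic features.
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```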