ABSTRACT
Subglottal Impedance-Based Inverse Filtering (IBIF) allows for the continuous, non-invasive estimation of glottal airflow from a surface accelerometer placed over the anterior neck skin below the larynx. It has been shown to be advantageous for the ambulatory monitoring of vocal function, specifically in the use of high-order statistics to understand long-term vocal behavior. However, during long-term ambulatory recordings over several days, conditions may drift from the laboratory environment where the IBIF parameters were initially estimated due to sensor positioning, skin attachment, or temperature, among other factors. Observation uncertainties and model mismatch may result in significant deviations in the glottal airflow estimates; unfortunately, they are very difficult to quantify in ambulatory conditions due to the lack of a reference signal. To address this issue, we propose a Kalman filter implementation of the IBIF filter, which allows for both estimating the model uncertainty and adapting the airflow estimates to correct for signal deviations. One-way analysis of variance (ANOVA) results from laboratory experiments using the Rainbow Passage indicate an improvement using the modified Kalman filter on amplitude-based measures for phonotraumatic vocal hyperfunction (PVH) subjects compared with the standard IBIF; the latter shows a statistically significant difference (p-value = 0.02, F = 4.1) with respect to a reference glottal volume velocity signal estimated from a single notch filter, used as ground truth in this work. In contrast, maximum flow declination rates from subjects with vocal phonotrauma exhibit a small but statistically significant difference between the ground-truth signal and the modified Kalman filter when using one-way ANOVA (p-value = 0.04, F = 3.3). Other measures did not have significant differences with either the modified Kalman filter or IBIF compared to ground truth, with the exception of H1-H2, whose performance deteriorates for both methods.
Overall, both methods (modified Kalman filter and IBIF) yield similar glottal airflow measures, with the modified Kalman filter offering improved amplitude estimation. Moreover, deviations of the Kalman filter output from the IBIF airflow might suggest a better representation of some fine details in the ground-truth glottal airflow signal. Other applications may benefit more from the adaptation offered by the modified Kalman filter implementation.
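To illustrate how a Kalman filter yields both an estimate and a running measure of its uncertainty, the following minimal sketch tracks a constant level observed in noise. It is not the IBIF state-space model proposed in this work: the random-walk state model, the noise variances `q` and `r`, the synthetic data, and the function name `scalar_kalman` are all assumptions chosen for illustration.

```python
import random


def scalar_kalman(observations, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Scalar random-walk Kalman filter (illustrative assumption, not IBIF).

    State model:       x[k] = x[k-1] + w,  w ~ N(0, q)
    Observation model: y[k] = x[k] + v,    v ~ N(0, r)
    Returns the filtered estimates and their posterior variances.
    """
    x, p = x0, p0
    estimates, variances = [], []
    for y in observations:
        # Predict: the state is carried over, the uncertainty grows by q.
        p = p + q
        # Update: blend prediction and observation using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (y - x)
        p = (1.0 - gain) * p
        estimates.append(x)
        variances.append(p)
    return estimates, variances


# Synthetic example: a constant level of 1.0 observed with Gaussian noise.
random.seed(0)
truth = 1.0
noisy = [truth + random.gauss(0.0, 0.5) for _ in range(500)]
est, var = scalar_kalman(noisy)
```

The posterior variance `var` shrinks as observations accumulate, which is the kind of uncertainty estimate that is otherwise hard to obtain without a reference signal.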
ABSTRACT
The ambulatory assessment of vocal function can be significantly enhanced by having access to physiologically based features that describe underlying pathophysiological mechanisms in individuals with voice disorders. This type of enhancement can improve methods for the prevention, diagnosis, and treatment of behaviorally based voice disorders. Unfortunately, the direct measurement of important vocal features such as subglottal pressure, vocal fold collision pressure, and laryngeal muscle activation is impractical in laboratory and ambulatory settings. In this study, we introduce a method to estimate these features during phonation from a neck-surface vibration signal through a framework that integrates a physiologically relevant model of voice production and machine learning tools. The signal from a neck-surface accelerometer is first processed using subglottal impedance-based inverse filtering to yield an estimate of the unsteady glottal airflow. Seven aerodynamic and acoustic features are extracted from the neck-surface accelerometer signal and an optional microphone signal. A neural network architecture is selected to provide a mapping between the seven input features and subglottal pressure, vocal fold collision pressure, and cricothyroid and thyroarytenoid muscle activation. This non-linear mapping is trained solely with 13,000 Monte Carlo simulations of a voice production model that utilizes a symmetric triangular body-cover model of the vocal folds. The performance of the method was compared against laboratory data from synchronous recordings of oral airflow, intraoral pressure, microphone, and neck-surface vibration in 79 vocally healthy female participants uttering consecutive /pæ/ syllable strings at comfortable, loud, and soft levels.
The mean absolute error and root-mean-square error for estimating the mean subglottal pressure were 191 Pa (1.95 cm H2O) and 243 Pa (2.48 cm H2O), respectively, which are comparable with previous studies but with the key advantage of not requiring subject-specific training and yielding more output measures. The validation of vocal fold collision pressure and laryngeal muscle activation was performed with synthetic values as reference. These initial results provide valuable insight for further vocal fold model refinement and constitute a proof of concept that the proposed machine learning method is a feasible option for providing physiologically relevant measures for laboratory and ambulatory assessment of vocal function.
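The error metrics and unit conversion quoted above can be sketched as follows. The conversion factor is the conventional value 1 cm H2O = 98.0665 Pa; the function names and the sample pressure values are hypothetical and not taken from the study.

```python
import math

PA_PER_CMH2O = 98.0665  # conventional centimeter of water, in pascals


def mae(estimates, references):
    """Mean absolute error between paired estimates and reference values."""
    return sum(abs(e - r) for e, r in zip(estimates, references)) / len(estimates)


def rmse(estimates, references):
    """Root-mean-square error between paired estimates and reference values."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


def pa_to_cmh2o(pressure_pa):
    """Convert a pressure from pascals to cm H2O."""
    return pressure_pa / PA_PER_CMH2O


# Hypothetical subglottal pressure values in Pa, for illustration only.
estimated_pa = [800.0, 650.0, 1200.0]
reference_pa = [700.0, 700.0, 1000.0]
mae_pa = mae(estimated_pa, reference_pa)
rmse_pa = rmse(estimated_pa, reference_pa)
```

Applying `pa_to_cmh2o` to the reported 191 Pa and 243 Pa reproduces the 1.95 and 2.48 cm H2O figures after rounding to two decimals.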
ABSTRACT
Phonotraumatic vocal hyperfunction (PVH) is associated with chronic misuse and/or abuse of voice that can result in lesions such as vocal fold nodules. The clinical aerodynamic assessment of vocal function has been recently shown to differentiate between patients with PVH and healthy controls and to provide meaningful insight into pathophysiological mechanisms associated with these disorders. However, current clinical assessment of PVH is incomplete because it cannot objectively identify the type and extent of detrimental phonatory function associated with PVH during daily voice use. The current study sought to address this issue by incorporating, for the first time in a comprehensive ambulatory assessment, glottal airflow parameters estimated from a neck-mounted accelerometer and recorded to a smartphone-based voice monitor. We tested this approach on 48 patients with vocal fold nodules and 48 matched healthy-control subjects who each wore the voice monitor for a week. Seven glottal airflow features were estimated every 50 ms using an impedance-based inverse filtering scheme, and seven high-order summary statistics of each feature were computed every 5 minutes over voiced segments. Based on univariate hypothesis testing, eight glottal airflow summary statistics were found to be statistically different between patient and healthy-control groups. L1-regularized logistic regression for a supervised classification task yielded a mean (standard deviation) area under the ROC curve of 0.82 (0.25) and an accuracy of 0.83 (0.14). These results outperform state-of-the-art classification for the same task and provide a new avenue to improve the assessment and treatment of hyperfunctional voice disorders.
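The framing scheme described above (one feature value every 50 ms, summary statistics every 5 minutes over voiced segments) implies 6000 frames per window. The sketch below shows one way such windowed statistics could be computed. The abstract does not name the seven statistics, so the four used here (mean, standard deviation, skewness, kurtosis) are an assumption, as are the function names and the `min_voiced` threshold.

```python
import math

FRAME_S = 0.05   # one feature value every 50 ms (from the text)
WINDOW_S = 300   # one set of summary statistics every 5 minutes (from the text)
FRAMES_PER_WINDOW = round(WINDOW_S / FRAME_S)  # 6000 frames


def summarize(values):
    """Population moments of a list of feature values (assumed statistics)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }


def windowed_stats(feature, voiced, frames_per_window=FRAMES_PER_WINDOW,
                   min_voiced=2):
    """Summarize voiced frames in consecutive windows.

    `feature` holds one value per frame and `voiced` holds matching booleans.
    Windows with fewer than `min_voiced` voiced frames yield None.
    """
    results = []
    for start in range(0, len(feature), frames_per_window):
        stop = start + frames_per_window
        kept = [f for f, v in zip(feature[start:stop], voiced[start:stop]) if v]
        results.append(summarize(kept) if len(kept) >= min_voiced else None)
    return results


# Tiny illustration with 4-frame windows instead of 6000-frame windows.
demo = windowed_stats([1, 2, 3, 4, 10, 10],
                      [True, True, True, True, False, False],
                      frames_per_window=4)
```

In `demo`, the first window is summarized from its four voiced frames and the second window is `None` because it contains no voiced frames.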