1.
JMIR Res Protoc ; 12: e51912, 2023 Oct 23.
Article in English | MEDLINE | ID: mdl-37870890

ABSTRACT

BACKGROUND: Providing psychotherapy, particularly for youth, is a pressing challenge in the health care system. Traditional methods are resource-intensive, and there is a need for objective benchmarks to guide therapeutic interventions. Automated emotion detection from speech, using artificial intelligence, is an emerging approach to these challenges. Speech carries vital information about emotional states, which can be used to improve mental health care services, especially when the person is suffering.
OBJECTIVE: This study aims to develop and evaluate automated methods for detecting the intensity of emotions (anger, fear, sadness, and happiness) in audio recordings of patients' speech. We also demonstrate the viability of deploying the models. Our model was validated in a previous publication by Alemu et al with a limited number of voice samples; this follow-up study used significantly more voice samples to validate the previous model.
METHODS: We used audio recordings of patients, specifically children with high adverse childhood experience (ACE) scores (average ACE score of 5 or higher, placing them at the highest risk for chronic disease and social or emotional problems; for comparison, only 1 in 6 people have a score of 4 or above). Structured voice samples were collected by having each patient read a fixed script. In total, 4 highly trained therapists classified audio segments, scoring the intensity level of each of the 4 emotions. We experimented with various preprocessing methods, including denoising, voice-activity detection, and diarization, and explored various model architectures, including convolutional neural networks (CNNs) and transformers. We trained emotion-specific transformer-based models and a generalized CNN-based model to predict emotion intensities.
RESULTS: The emotion-specific transformer-based models achieved a test-set precision and recall of 86% and 79%, respectively, for binary emotional intensity classification (high or low). In contrast, the CNN-based model, generalized to predict the intensity of the 4 different emotions, achieved a test-set precision and recall of 83% for each.
CONCLUSIONS: Automated emotion detection from patients' speech using artificial intelligence models is feasible and achieves a high level of accuracy. The transformer-based models performed better at emotion-specific detection, while the CNN-based model showed promise in generalized emotion detection. These models can serve as valuable decision-support tools for pediatricians and mental health providers to triage youth to appropriate levels of mental health care services.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/51912.
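
As a concrete illustration of the transformer-based approach described above, here is a minimal sketch of a binary (high/low) emotion-intensity classifier built on a pretrained speech encoder. The checkpoint name, mean-pooling, and linear head are illustrative assumptions; the study does not publish its exact architecture.

```python
# Minimal sketch: pretrained speech transformer + linear head for
# high/low emotion-intensity classification. The checkpoint and head
# design are assumptions, not the authors' model.
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class EmotionIntensityClassifier(nn.Module):
    def __init__(self, checkpoint="facebook/wav2vec2-base"):
        super().__init__()
        self.encoder = Wav2Vec2Model.from_pretrained(checkpoint)
        self.head = nn.Linear(self.encoder.config.hidden_size, 2)  # low vs. high

    def forward(self, waveform):  # waveform: (batch, samples) at 16 kHz
        hidden = self.encoder(waveform).last_hidden_state  # (batch, frames, dim)
        pooled = hidden.mean(dim=1)  # average over time frames
        return self.head(pooled)     # (batch, 2) logits

model = EmotionIntensityClassifier()
logits = model(torch.randn(1, 16000))  # one second of dummy audio
print(logits.softmax(-1))              # P(low), P(high)
```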

2.
JMIR Res Protoc ; 12: e46970, 2023 Jun 23.
Article in English | MEDLINE | ID: mdl-37351936

ABSTRACT

BACKGROUND: Even before the onset of the COVID-19 pandemic, children and adolescents were experiencing a mental health crisis, partly due to a lack of quality mental health services. The rate of suicide among Black youth has increased by 80%. By 2025, the health care system will be short 225,000 therapists, further exacerbating the current crisis. It is therefore of utmost importance for schools, youth mental health services, and pediatric medical providers to integrate innovations in digital mental health so that problems can be identified proactively and rapidly, enabling effective collaboration with other health care providers. Such approaches can help identify robust, reproducible, and generalizable predictors and digital biomarkers of treatment response in psychiatry. Among the many digital innovations currently pursued to identify biomarkers for psychiatric diseases, as part of the macrolevel digital health transformation, speech stands out as an attractive candidate: it is affordable, noninvasive, and nonintrusive.
OBJECTIVE: This protocol aims to develop speech-emotion recognition algorithms, leveraging artificial intelligence and machine learning, that can establish a link between trauma, stress, and voice types (including disruptions in speech-based characteristics) and detect clinically relevant emotional distress and functional impairments in children and adolescents.
METHODS: Informed by theoretical foundations (the Theory of Psychological Trauma Biomarkers and Archetypal Voice Categories), we developed our methodology to focus on 5 emotions: anger, happiness, fear, neutral, and sadness. Participants will be recruited from 2 local mental health centers that serve urban youths. Speech samples, along with responses to the Symptom and Functioning Severity Scale, Patient Health Questionnaire 9, and Adverse Childhood Experiences scales, will be collected using an Android mobile app. Our model development pipeline is informed by Gaussian mixture models (GMMs), recurrent neural networks, and long short-term memory (LSTM) networks.
RESULTS: We tested our model with a public data set. The GMM with 128 clusters showed evenly distributed accuracy across all 5 emotions. Using utterance-level features, the GMM achieved an overall accuracy of 79.15%, while frame selection increased accuracy to 85.35%. This demonstrates that the GMM is a robust model for classifying all 5 emotions and that emotion frame selection enhances accuracy, which is significant for scientific evaluation. Recruitment and data collection for the study were initiated in August 2021 and are currently underway. The study results are expected to be available and published in 2024.
CONCLUSIONS: This study contributes to the literature by addressing the need for speech-focused digital health tools to detect clinically relevant emotional distress and functional impairments in children and adolescents. The preliminary results show that our algorithm has the potential to improve outcomes, and the findings will contribute to the broader digital health transformation.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/46970.
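
The GMM result above can be made concrete with a short sketch: one Gaussian mixture model per emotion is fit on per-frame acoustic features, and an utterance is assigned to the emotion whose model gives the highest total log-likelihood. The MFCC features and diagonal covariances are assumptions; only the 128-component count comes from the abstract.

```python
# Sketch of GMM-based speech-emotion classification: one GMM per emotion
# over MFCC frames, prediction by maximum summed log-likelihood.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

EMOTIONS = ["anger", "happiness", "fear", "neutral", "sadness"]

def mfcc_frames(path, sr=16000, n_mfcc=13):
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T  # (frames, n_mfcc)

def train(emotion_to_paths, n_components=128):
    models = {}
    for emo, paths in emotion_to_paths.items():
        X = np.vstack([mfcc_frames(p) for p in paths])
        models[emo] = GaussianMixture(n_components, covariance_type="diag").fit(X)
    return models

def predict(models, path):
    X = mfcc_frames(path)
    # Sum per-frame log-likelihoods under each emotion's GMM.
    return max(models, key=lambda emo: models[emo].score_samples(X).sum())
```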

3.
J Prev Alzheimers Dis ; 10(2): 314-321, 2023.
Article in English | MEDLINE | ID: mdl-36946458

ABSTRACT

BACKGROUND: Speech impairments are an early feature of Alzheimer's disease (AD); consequently, analysing speech performance is a promising new digital biomarker for AD screening. Future clinical AD trials of disease-modifying drugs will require a shift to very early identification of individuals at risk of dementia. Digital markers of language and speech, eventually in combination with advanced machine learning, may therefore offer a method for screening at-risk populations at the earliest stages of AD. To this end, we developed a screening battery consisting of speech-based neurocognitive tests. The automated test performs remote primary screening using a simple telephone.
OBJECTIVES: PROSPECT-AD aims to validate speech biomarkers for identifying individuals with early signs of AD and to monitor their longitudinal course through access to well-phenotyped cohorts.
DESIGN: PROSPECT-AD leverages ongoing cohorts such as EPAD (UK), DESCRIBE and DELCODE (Germany), BioFINDER Primary Care (Sweden), and Beta-AARC (Spain) by adding the collection of speech data over the telephone to existing longitudinal follow-ups. Participants at risk of dementia are recruited from existing parent cohorts across Europe to form an AD 'probability spectrum', i.e., individuals ranging from low to high risk of developing AD dementia. Characterizing each research participant's cognition, biomarker status, and genetic and environmental risk factors over time, combined with audio recordings of speech samples, will provide a well-phenotyped population for comparing novel speech markers with current gold-standard biomarkers and cognitive scores.
PARTICIPANTS: N = 1000 participants aged 50 or older, with a Clinical Dementia Rating (CDR) scale score of 0 or 0.5, will be included in total. The study protocol is planned to run for 12 to 18 months, depending on the site.
MEASUREMENTS: The speech protocol includes the following neurocognitive tests, administered remotely: Word List [memory function], Verbal Fluency [executive functions], and spontaneous free speech [psychological and/or behavioral symptoms]. Speech features at the linguistic and paralinguistic levels will be extracted from the recordings and compared with data from CSF and blood biomarkers, neuroimaging, neuropsychological evaluations, genetic profiles, and family history. The primary candidate speech marker will be a combination of the features most significant against the biomarker reference measures. Machine learning and computational techniques will be employed to identify the speech biomarkers that could best serve as early indicators of AD pathology. Furthermore, based on the analysis of speech performance, models will be trained to predict cognitive decline and disease progression across the AD continuum.
CONCLUSION: The outcome of PROSPECT-AD may support AD drug development research as well as primary or tertiary prevention of dementia by providing a validated remote tool for identifying individuals at risk of dementia and monitoring them over time, either in a screening context or in clinical trials.
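
As a toy illustration of the marker-selection step sketched above (ranking candidate speech features against a reference biomarker), consider the following. The column names, the Spearman correlation choice, and the rank-by-p-value heuristic are all hypothetical placeholders, not PROSPECT-AD's specified analysis.

```python
# Hypothetical sketch: rank candidate speech features by strength of
# association with a reference biomarker measure.
import pandas as pd
from scipy.stats import spearmanr

def rank_speech_features(df, reference="csf_biomarker", candidates=None):
    # Placeholder convention: speech features share a "speech_" prefix.
    candidates = candidates or [c for c in df.columns if c.startswith("speech_")]
    rows = []
    for feat in candidates:
        rho, p = spearmanr(df[feat], df[reference], nan_policy="omit")
        rows.append({"feature": feat, "rho": rho, "p": p})
    return pd.DataFrame(rows).sort_values("p")  # most significant first
```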


Subject(s)
Alzheimer Disease, Cognitive Dysfunction, Humans, Alzheimer Disease/psychology, Biomarkers, Cognitive Dysfunction/psychology, Memory, Speech
4.
Digit Biomark ; 6(3): 107-116, 2022.
Article in English | MEDLINE | ID: mdl-36466952

ABSTRACT

Introduction: Progressive cognitive decline is the cardinal behavioral symptom of most dementia-causing diseases, such as Alzheimer's disease. As most well-established measures of cognition might not fit tomorrow's decentralized remote clinical trials, digital cognitive assessments will gain importance. We present the evaluation of a novel digital speech biomarker for cognition (SB-C) following the Digital Medicine Society's V3 framework: verification, analytical validation, and clinical validation.
Methods: Evaluation was done in two independent clinical samples: the Dutch DeepSpA dataset (N = 69 subjective cognitive impairment [SCI], N = 52 mild cognitive impairment [MCI], and N = 13 dementia) and the Scottish SPeAk dataset (N = 25 healthy controls). For validation, two anchor scores were used: the Mini-Mental State Examination (MMSE) and the Clinical Dementia Rating (CDR) scale.
Results: Verification: the SB-C could be reliably extracted for both languages using an automatic speech processing pipeline. Analytical validation: in both languages, the SB-C was strongly correlated with MMSE scores. Clinical validation: the SB-C differed significantly between clinical groups (including MCI and dementia), was strongly correlated with the CDR, and could track clinically meaningful decline.
Conclusion: Our results suggest that the ki:e SB-C is an objective, scalable, and reliable indicator of cognitive decline, fit for purpose as a remote assessment in clinical early-dementia trials.
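
A minimal sketch of the two statistical checks this validation design implies: correlation of the biomarker with the MMSE anchor (analytical validation) and a nonparametric test across clinical groups (clinical validation). The function names and test choices are illustrative assumptions, not the authors' analysis code.

```python
# Sketch of V3-style validation checks, assuming arrays of per-subject
# SB-C scores. Test choices are assumptions for illustration.
from scipy.stats import pearsonr, kruskal

def analytical_validation(sb_c, mmse):
    r, p = pearsonr(sb_c, mmse)
    return r, p  # a strong correlation supports the marker tracking cognition

def clinical_validation(groups):
    # groups: dict mapping label ("SCI", "MCI", "dementia") -> SB-C scores
    h, p = kruskal(*groups.values())
    return h, p  # a significant p suggests the marker separates clinical groups
```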

5.
Cancer Med ; 10(11): 3822-3835, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33938165

ABSTRACT

The promise of speech disorders as biomarkers in clinical examination has been identified across a broad spectrum of neurodegenerative diseases. However, to the best of our knowledge, no validated acoustic marker with established discriminative and evaluative properties has yet been developed for oral tongue cancers. Here we cross-sectionally collected a screening dataset that included acoustic parameters extracted from 3 sustained vowels (/ɑ/, /i/, /u/) and binary perceptual outcomes from 12 consonant-vowel syllables. Within this dataset, we used a support vector machine with a linear kernel to identify the formant centralization ratio (FCR) as a dominant predictor of the different perceptual outcomes across gender and syllable. The Acoustic analysis, Perceptual evaluation and Quality of Life assessment (APeQoL) was used to validate the FCR in 33 patients with primary resectable oral tongue cancers. Measurements were taken before (pre-op) and 4 to 6 weeks after (post-op) surgery. The speech handicap index (SHI), a speech-specific questionnaire, was also administered at these time points. Pre-op correlation analysis within the APeQoL revealed overall consistency and a strong correlation between FCR and SHI scores. FCRs also increased significantly with increasing T classification pre-operatively, especially for women. Longitudinally, the main effects of T classification and the extent of resection, and their interaction effects with time (pre-op vs. post-op), on FCRs were all significant. For pre-operative FCR, after merging the two datasets, a cut-off value of 0.970 produced an area under the curve (AUC) of 0.861 (95% confidence interval: 0.785-0.938) for T3-4 patients. In sum, this study found that the FCR is an acoustic marker with the potential to detect disease and related speech dysfunction in oral tongue cancers. These preliminary findings need to be replicated in longitudinal studies and/or larger cohorts.
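
For illustration, a short sketch of computing an FCR from vowel formants and applying the reported 0.970 cut-off follows. The FCR definition used here is the commonly cited form from the dysarthria literature (Sapir et al.); the paper's exact formant-extraction pipeline is not reproduced, and the example formant values are made up.

```python
# Sketch: formant centralization ratio (FCR) from /ɑ/, /i/, /u/ formants,
# plus the reported 0.970 pre-op cut-off for flagging T3-4 risk.
def fcr(f1_a, f2_a, f1_i, f2_i, f1_u, f2_u):
    """FCR per the commonly used definition; formant inputs in Hz."""
    return (f2_u + f2_a + f1_i + f1_u) / (f2_i + f1_a)

def flag_t34_risk(value, cutoff=0.970):
    # Higher FCR = more centralized vowels = more impaired articulation.
    return value >= cutoff

# Hypothetical formant values for a healthy speaker:
v = fcr(f1_a=800, f2_a=1300, f1_i=300, f2_i=2300, f1_u=350, f2_u=900)
print(v, flag_t34_risk(v))  # ~0.92, below the cut-off
```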


Subject(s)
Articulation Disorders/physiopathology, Data Mining, Tongue Neoplasms/physiopathology, Adult, Aged, Analysis of Variance, Area Under Curve, Articulation Disorders/diagnosis, China, Cross-Sectional Studies, Female, Humans, Male, Middle Aged, Quality of Life, Sex Factors, Speech Production Measurement/methods, Support Vector Machine, Tongue/surgery, Tongue Neoplasms/diagnosis, Tongue Neoplasms/pathology, Tongue Neoplasms/surgery