ABSTRACT
Hope is a vital coping mechanism that enables individuals to confront life's challenges effectively. This study proposes a technique employing Natural Language Processing (NLP) tools, namely Linguistic Inquiry and Word Count (LIWC), the NRC emotion lexicon, and vaderSentiment, to analyze social media posts and extract psycholinguistic, emotional, and sentiment features from a hope speech dataset. The findings reveal distinct cognitive, emotional, and communicative characteristics, as well as psycholinguistic dimensions, emotions, and sentiments, associated with the different types of hope shared on social media. Furthermore, the study investigates the potential of leveraging these features to classify the types of hope using machine learning algorithms. Notably, models such as LightGBM and CatBoost demonstrate impressive performance, surpassing traditional methods and competing effectively with deep learning techniques. We employed hyperparameter tuning to optimize the models' parameters and compared their performance under both default and tuned settings. The results highlight the efficiency gains achieved through hyperparameter tuning for these models.
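As a rough illustration of such a pipeline, the following Python sketch derives VADER sentiment scores and feeds them to a LightGBM classifier. The texts, labels, feature set, and hyperparameters are placeholders rather than the study's actual configuration; LIWC is proprietary and the NRC-lexicon step is omitted here.

```python
# Hypothetical sketch: sentiment features from posts feeding a LightGBM classifier.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import lightgbm as lgb
import numpy as np

texts = ["I believe things will get better soon", "No hope left for this place"]
labels = [1, 0]  # placeholder hope-type labels

analyzer = SentimentIntensityAnalyzer()

def features(text):
    s = analyzer.polarity_scores(text)  # neg/neu/pos/compound sentiment scores
    return [s["neg"], s["neu"], s["pos"], s["compound"], len(text.split())]

X = np.array([features(t) for t in texts])
clf = lgb.LGBMClassifier(n_estimators=200, learning_rate=0.05)  # tunable values
clf.fit(X, labels)
print(clf.predict(X))
```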
Subject(s)
Emotions, Natural Language Processing, Psycholinguistics, Social Media, Speech, Humans, Emotions/physiology, Psycholinguistics/methods, Hope, Machine Learning, Algorithms, Deep Learning

ABSTRACT
PURPOSE: To describe the effects of subthalamic nucleus deep brain stimulation (STN-DBS) on the speech of Spanish-speaking Parkinson's disease (PD) patients during the first year of treatment. METHODS: The speech measures (SMs) of nine Colombian idiopathic PD patients (four females and five males; age = 63 ± 7 years; PD duration = 10 ± 7 years; UPDRS-III = 57 ± 6; H&Y = 2 ± 0.3) were maximum phonation time, acoustic voice measures, speech rate, speech intelligibility, and oral diadochokinesis rates. They were studied in the OFF and ON medication states before surgery and every three months during the first year after STN-DBS surgery. Praat software and healthy native listeners' ratings were used for speech analysis. Statistical analysis tested for significant differences in the SMs during follow-up (Friedman test) and between medication states (Wilcoxon paired test). In addition, a pre-surgery variation interval (PSVI) of reference was calculated for every participant and SM to allow an individual analysis of post-surgery variation. RESULTS: No significant post-surgery or medication-state-related differences in the SMs were found at the group level. Individually, however, based on the PSVIs, the SMs showed no variation, inconsistent variation, or consistent variation during post-surgery follow-up, in different combinations depending on the medication state. CONCLUSION: As a group, participants did not share any post-surgery pattern of change in the SMs. Instead, based on the PSVIs, the SMs varied differently in every participant, which suggests that in Spanish-speaking PD patients, the effects of STN-DBS on speech during the first year of treatment may be highly variable.
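A minimal sketch of the group-level statistics described above, with placeholder data standing in for one speech measure recorded for nine patients at five time points (pre-surgery plus quarterly follow-ups):

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(0)
sm = rng.random((9, 5))  # rows: patients; columns: visits

# Friedman test: does the measure change across follow-up visits?
stat_f, p_followup = friedmanchisquare(*[sm[:, i] for i in range(sm.shape[1])])

# Wilcoxon paired test: OFF vs ON medication at a given visit (placeholders).
off_state, on_state = rng.random(9), rng.random(9)
stat_w, p_medication = wilcoxon(off_state, on_state)
print(p_followup, p_medication)
```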
Subject(s)
Deep Brain Stimulation, Parkinson Disease, Subthalamic Nucleus, Humans, Parkinson Disease/therapy, Parkinson Disease/physiopathology, Male, Female, Middle Aged, Aged, Speech Intelligibility/physiology, Language, Speech Disorders/etiology, Speech Disorders/therapy, Speech/physiology, Speech Production Measurement, Treatment Outcome

ABSTRACT
Speech emotion recognition is key to many fields, including human-computer interaction, healthcare, and intelligent assistance. While acoustic features extracted from human speech are essential for this task, not all of them contribute effectively to emotion recognition, so successful models require reduced feature sets. This work investigated whether splitting the features into two subsets based on their distribution, and then applying commonly used feature reduction methods, would impact accuracy. Filter reduction was performed with the Kruskal-Wallis test, followed by principal component analysis (PCA) and independent component analysis (ICA). A set of features was examined to determine whether the indiscriminate use of parametric feature reduction techniques affects emotion recognition accuracy. For this investigation, data from three databases (Berlin EmoDB, SAVEE, and RAVDESS) were organized into subsets according to their distribution before applying both PCA and ICA. The results showed a reduction from 6373 features to 170 for the Berlin EmoDB database with an accuracy of 84.3%; a final size of 130 features for SAVEE, with a corresponding accuracy of 75.4%; and 150 features for RAVDESS, with an accuracy of 59.9%.
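A sketch of this kind of reduction pipeline, assuming random placeholder data: a Kruskal-Wallis filter keeps features that differ across emotion classes, and PCA or ICA is then applied to the survivors. The shapes, alpha level, and component counts are illustrative, not the paper's settings.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 300))      # 200 utterances x 300 acoustic features
y = rng.integers(0, 4, size=200)     # 4 emotion classes

def kruskal_filter(X, y, alpha=0.05):
    keep = []
    for j in range(X.shape[1]):
        groups = [X[y == c, j] for c in np.unique(y)]
        _, p = kruskal(*groups)      # does feature j differ across emotions?
        if p < alpha:
            keep.append(j)
    return np.array(keep)

idx = kruskal_filter(X, y)
n_comp = max(1, len(idx) // 2)       # keep the component count feasible
X_pca = PCA(n_components=n_comp).fit_transform(X[:, idx])
X_ica = FastICA(n_components=n_comp, random_state=0).fit_transform(X[:, idx])
print(len(idx), X_pca.shape, X_ica.shape)
```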
Subject(s)
Emotions, Principal Component Analysis, Speech, Humans, Emotions/physiology, Speech/physiology, Factual Databases, Algorithms, Automated Pattern Recognition/methods

ABSTRACT
Emotion recognition through speech is a technique employed in various scenarios of Human-Computer Interaction (HCI). Existing approaches have achieved significant results; however, limitations persist, most notably the quantity and diversity of data required when deep learning techniques are used. The lack of a standard for feature selection leads to continuous development and experimentation, and choosing and designing an appropriate network architecture constitutes a further challenge. This study addresses the challenge of recognizing emotions in the human voice using deep learning techniques, proposing a comprehensive approach that develops preprocessing and feature selection stages and constructs a dataset, EmoDSc, by combining several available databases. The synergy between spectral features and spectrogram images is investigated. Independently, the weighted accuracy obtained using only spectral features was 89%, while using only spectrogram images it reached 90%. These results, although surpassing previous research, highlight the strengths and limitations of each representation operating in isolation. Based on this exploration, a neural network architecture composed of a CNN1D, a CNN2D, and an MLP that fuses spectral features and spectrogram images is proposed. The model, supported by the unified EmoDSc dataset, demonstrates a remarkable accuracy of 96%.
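A rough Keras sketch of this style of fusion architecture: a 1-D convolutional branch for spectral feature vectors, a 2-D convolutional branch for spectrogram images, and an MLP head on the concatenated embeddings. All layer sizes, input shapes, and the class count are assumptions, not the authors' published configuration.

```python
from tensorflow import keras
from tensorflow.keras import layers

spec_in = keras.Input(shape=(128, 1))             # spectral feature sequence
x1 = layers.Conv1D(64, 5, activation="relu")(spec_in)
x1 = layers.GlobalMaxPooling1D()(x1)

img_in = keras.Input(shape=(64, 64, 1))           # spectrogram image
x2 = layers.Conv2D(32, 3, activation="relu")(img_in)
x2 = layers.MaxPooling2D()(x2)
x2 = layers.Flatten()(x2)

merged = layers.concatenate([x1, x2])             # feature fusion
h = layers.Dense(128, activation="relu")(merged)  # MLP head
out = layers.Dense(7, activation="softmax")(h)    # e.g., 7 emotion classes

model = keras.Model([spec_in, img_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```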
Subject(s)
Deep Learning, Emotions, Neural Networks (Computer), Humans, Emotions/physiology, Speech/physiology, Factual Databases, Algorithms, Automated Pattern Recognition/methods

ABSTRACT
Introduction: this study maps the literature on speech-language pathology in business settings to identify the most researched and the least explored themes in dissertations and theses in the area. Objective: to analyze the Brazilian scientific production defended between 2002 and 2022, considering production level, publication year, institution of defense, geographical location, research methodology, first descriptor, research location, thematic focus, total sample size, and knowledge areas. Method: the review was conducted using data from the Brazilian Digital Library of Theses and Dissertations on May 5, 2023, using the terms "Speech-Therapy" and "Company" to retrieve theses and dissertations from 2002 to 2022 according to the variables described above. Data were analyzed descriptively. Results: among the 30 entries retrieved, 24 (80.0%) were dissertations, most of them defended in 2007 (6; 20.0%). The majority of the studies came from the Southeast region (20; 66.7%), represented by the Pontifícia Universidade Católica de São Paulo (PUC-SP; 10; 33.3%); observational research predominated (22; 73.3%), business companies were the most frequent research sites (20; 66.7%), and "worker's health" was the most used descriptor (3; 10.0%). The knowledge area (CNPq) that produced the most studies was Health Sciences (25; 83.3%) through the subarea of Speech-Language Pathology (20; 66.7%), with Audiology the most researched theme (16; 53.3%). Conclusion: Audiology accounted for the largest number of records (16; 53.3%). Research in the Voice field (7; 23.3%) addresses topics related to vocal quality, communication, and expressiveness but does not address leadership. The findings suggest a need for further research and professional engagement, considering that human communication is the object of study and practice of speech-language pathology.
Subject(s)
Organizations, Academic Dissertations as Topic, Speech-Language Pathology and Audiology, Speech, Voice, Brazil, Bibliometrics, Communication, Leadership

ABSTRACT
OBJECTIVE: This study aimed to compare the influence of four different maxillary removable orthodontic retainers on speech. MATERIAL AND METHODS: Eligibility criteria were: subjects aged 20-40 years with acceptable occlusion, native speakers of Portuguese. The volunteers (n=21) were divided into four groups randomized with a 1:1:1:1 allocation ratio. The four groups used, in random order, the four types of retainers full-time for 21 days each, with a 7-day washout period. The removable maxillary retainers were: conventional wraparound, wraparound with an anterior hole, U-shaped wraparound, and thermoplastic retainer. Three volunteers were excluded. The final sample comprised 18 subjects (11 male; 7 female) with a mean age of 27.08 years (SD=4.65). Speech was evaluated from recordings of vocal excerpts made before, immediately after, and 21 days after the installation of each retainer, with auditory-perceptual analysis and acoustic analysis of the formant frequencies F1 and F2 of the vowels. Repeated measures ANOVA and Friedman with Tukey tests were used for statistical comparison. RESULTS: Speech changes increased immediately after installation of the conventional wraparound and thermoplastic retainers and decreased after 21 days, but not to normal levels; this increase was statistically significant only for the wraparound with an anterior hole and the thermoplastic retainer. Formant frequencies of vowels were altered at the initial time point, and the changes remained for the conventional, U-shaped, and thermoplastic appliances after three weeks. CONCLUSIONS: The thermoplastic retainer was more harmful to speech than the wraparound appliances. The conventional and U-shaped retainers interfered less with speech. The three-week period was not sufficient for speech adaptation.
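For context, F1/F2 extraction of the kind described can be scripted with Praat through the parselmouth package; a hedged sketch follows, with the file name and measurement time as placeholders (the study's exact Praat settings are not given in the abstract).

```python
import parselmouth

snd = parselmouth.Sound("vowel_recording.wav")   # placeholder file
formants = snd.to_formant_burg()                 # Burg formant analysis
t = snd.duration / 2                             # mid-point of the token
f1 = formants.get_value_at_time(1, t)            # first formant (Hz)
f2 = formants.get_value_at_time(2, t)            # second formant (Hz)
print(f1, f2)
```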
Subject(s)
Cross-Over Studies, Orthodontic Retainers, Humans, Female, Male, Adult, Orthodontic Appliance Design, Young Adult, Speech/physiology

ABSTRACT
Diagnostic tests for Parkinsonism based on speech samples have shown promising results. Although abnormal auditory feedback integration during speech production and impaired rhythmic organization of speech are known in Parkinsonism, these aspects have not been incorporated into diagnostic tests. This study aimed to identify Parkinsonism using a novel speech behavioral test that involved rhythmically repeating syllables under different auditory feedback conditions. The study included 30 individuals with Parkinson's disease (PD) and 30 healthy subjects. Participants were asked to rhythmically repeat the PA-TA-KA syllable sequence, both whispering and speaking aloud under various listening conditions. The results showed that individuals with PD had difficulties in whispering and articulating under altered auditory feedback conditions, exhibited delayed speech onset, and demonstrated inconsistent rhythmic structure across trials compared to controls. These parameters were then fed into a supervised machine-learning algorithm to differentiate between the two groups. The algorithm achieved an accuracy of 85.4%, a sensitivity of 86.5%, and a specificity of 84.3%. This pilot study highlights the potential of the proposed behavioral paradigm as an objective and accessible (both in cost and time) test for identifying individuals with Parkinson's disease.
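An illustrative sketch of the classification step: per-participant rhythm and feedback features (X) with PD/control labels (y), a supervised classifier, and sensitivity/specificity read off the confusion matrix. The data, classifier choice, and feature count are placeholders, not the study's pipeline.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 6))   # 60 participants x 6 speech-timing features
y = np.repeat([0, 1], 30)      # 0 = control, 1 = PD

y_pred = cross_val_predict(SVC(kernel="rbf"), X, y, cv=5)
tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
print("accuracy:", (tp + tn) / len(y))
print("sensitivity:", tp / (tp + fn))   # PD cases correctly identified
print("specificity:", tn / (tn + fp))  # controls correctly identified
```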
Subject(s)
Sensory Feedback, Parkinson Disease, Speech, Humans, Female, Male, Aged, Parkinson Disease/physiopathology, Parkinson Disease/diagnosis, Middle Aged, Speech/physiology, Sensory Feedback/physiology, Pilot Projects, Parkinsonian Disorders/physiopathology, Case-Control Studies

ABSTRACT
Pauses in speech are indicators of cognitive effort during language production and have been examined to inform theories of lexical, grammatical and discourse processing in healthy speakers and individuals with aphasia (IWA). Studies of pauses have commonly focused on their location and duration in relation to grammatical properties such as word class or phrase complexity. However, recent studies of speech output in aphasia have revealed that utterances of IWA are characterised by stronger collocations, i.e., combinations of words that are often used together. We investigated the effects of collocation strength and lexical frequency on pause duration in comic strip narrations of IWA and non-brain-damaged (NBD) individuals with part of speech (PoS; content and function words) as covariate. Both groups showed a decrease in pause duration within more strongly collocated bigrams and before more frequent content words, with stronger effects in IWA. These results are consistent with frameworks which propose that strong collocations are more likely to be processed as holistic, perhaps even word-like, units. Usage-based approaches prove valuable in explaining patterns of preservation and impairment in aphasic language production.
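One common way to operationalize collocation strength is an association measure over bigrams, such as pointwise mutual information; the abstract does not specify the authors' exact measure. A small NLTK sketch with a placeholder transcript:

```python
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

transcript = ("and then he goes up the ladder and then he falls down "
              "and then the dog barks and the boy falls down")
tokens = transcript.split()                     # simple whitespace tokenizer

finder = BigramCollocationFinder.from_words(tokens)
pmi_scores = finder.score_ngrams(BigramAssocMeasures.pmi)  # [((w1, w2), score)]
for bigram, score in pmi_scores[:5]:
    print(bigram, round(score, 2))              # higher = more strongly collocated
```

Pause durations can then be compared within high- versus low-scoring bigrams, as in the analysis described above.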
Subject(s)
Aphasia, Speech, Humans, Aphasia/physiopathology, Speech/physiology, Male, Female, Middle Aged, Aged, Adult, Language

ABSTRACT
THIS ARTICLE USES WORDS OR LANGUAGE THAT IS CONSIDERED PROFANE, VULGAR, OR OFFENSIVE BY SOME READERS. Hate speech detection in online social networks is a multidimensional problem, dependent on language and cultural factors. Most supervised learning resources for this task, such as labeled datasets and Natural Language Processing (NLP) tools, have been specifically tailored for English. However, a large portion of web users around the world speak different languages, creating an important need for efficient multilingual hate speech detection approaches. In particular, such approaches should be able to leverage the limited cross-lingual resources currently available in their learning process. Cross-lingual transfer in this task has been difficult to achieve successfully. Therefore, we propose a simple yet effective method to approach this problem. To our knowledge, ours is the first attempt to create a multilingual embedding model specific to this problem. We validate the effectiveness of our approach by performing an extensive comparative evaluation against several well-known general-purpose language models that, unlike ours, have been trained on massive amounts of data. We focus on a zero-shot cross-lingual evaluation scenario in which we classify hate speech in one language without having access to any labeled data. Despite its simplicity, our embeddings outperform more complex models in most of the experimental settings we tested. In addition, we provide further evidence of the effectiveness of our approach through an ad hoc qualitative exploratory analysis, which captures how hate speech is displayed in different languages. This analysis allows us to find new cross-lingual relations between words in the hate-speech domain. Overall, our findings indicate common patterns in how hate speech is expressed across languages and show that our proposed model can capture such relationships effectively.
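A hedged sketch of the zero-shot cross-lingual protocol: train a classifier on labeled English data represented in a shared embedding space, then evaluate on another language with no labels. The embedding here is a crude character n-gram stand-in; the paper trains task-specific multilingual embeddings, which this placeholder does not reproduce.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import LogisticRegression

# Placeholder "shared space": character n-gram hashing (not the paper's model).
embed = HashingVectorizer(analyzer="char_wb", ngram_range=(2, 4),
                          n_features=2**12).transform

en_texts = ["i hate those people", "have a nice day", "you are all vermin"]
en_labels = [1, 0, 1]                           # 1 = hate, 0 = not hate
es_texts = ["odio a esa gente", "que tengas un buen dia"]  # unlabeled target

clf = LogisticRegression(max_iter=1000).fit(embed(en_texts), en_labels)
print(clf.predict(embed(es_texts)))             # zero-shot predictions in Spanish
```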
Subject(s)
Multilingualism, Natural Language Processing, Humans, Speech/physiology, Language, Hate

ABSTRACT
Dementia can disrupt how people experience and describe events as well as their own role in them. Alzheimer's disease (AD) compromises the processing of entities expressed by nouns, while behavioral variant frontotemporal dementia (bvFTD) entails a depersonalized perspective with increased third-person references. Yet, no study has examined whether these patterns can be captured in connected speech via natural language processing tools. To tackle such gaps, we asked 96 participants (32 AD patients, 32 bvFTD patients, 32 healthy controls, HCs) to narrate a typical day of their lives and calculated the proportion of nouns, verbs, and first- or third-person markers (via part-of-speech and morphological tagging). We also extracted objective properties (frequency, phonological neighborhood, length, semantic variability) from each content word. In our main study (with 21 AD patients, 21 bvFTD patients, and 21 HCs), we used inferential statistics and machine learning for group-level and subject-level discrimination. The above linguistic features were correlated with patients' scores in tests of general cognitive status and executive functions. We found that, compared with HCs, (i) AD (but not bvFTD) patients produced significantly fewer nouns, (ii) bvFTD (but not AD) patients used significantly more third-person markers, and (iii) both patient groups produced more frequent words. Machine learning analyses showed that these features identified individuals with AD and bvFTD (AUC = 0.71). A generalizability test, with a model trained on the entire main study sample and tested on hold-out samples (11 AD patients, 11 bvFTD patients, 11 HCs), showed even better performance, with AUCs of 0.76 and 0.83 for AD and bvFTD, respectively. No linguistic feature was significantly correlated with cognitive test scores in either patient group. These results suggest that specific cognitive traits of each disorder can be captured automatically in connected speech, favoring interpretability for enhanced syndrome characterization, diagnosis, and monitoring.
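A sketch of the tagging-based features described above, using spaCy's part-of-speech and morphological tags; the paper's exact toolchain and feature definitions may differ.

```python
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def narrative_features(text):
    doc = nlp(text)
    tokens = [t for t in doc if t.is_alpha]
    n = max(len(tokens), 1)
    return {
        "noun_prop": sum(t.pos_ == "NOUN" for t in tokens) / n,
        "verb_prop": sum(t.pos_ == "VERB" for t in tokens) / n,
        # third-person markers, e.g., "he walks", "they talk"
        "third_person_prop": sum("3" in t.morph.get("Person") for t in tokens) / n,
    }

print(narrative_features("He wakes up, they bring him coffee, and he reads."))
```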
Subject(s)
Alzheimer Disease, Frontotemporal Dementia, Speech, Humans, Frontotemporal Dementia/psychology, Frontotemporal Dementia/diagnosis, Alzheimer Disease/diagnosis, Alzheimer Disease/psychology, Female, Male, Aged, Middle Aged, Case-Control Studies, Biomarkers, Natural Language Processing, Machine Learning, Neuropsychological Tests, Executive Function/physiology

ABSTRACT
Speech can be defined as the human ability to communicate through a sequence of vocal sounds. Consequently, speech requires an emitter (the speaker) capable of generating the acoustic signal and a receiver (the listener) able to successfully decode the sounds produced by the emitter (i.e., the acoustic signal). Time plays a central role at both ends of this interaction. On the one hand, speech production requires precise and rapid coordination, typically within the order of milliseconds, of the upper vocal tract articulators (i.e., tongue, jaw, lips, and velum), their composite movements, and the activation of the vocal folds. On the other hand, the generated acoustic signal unfolds in time, carrying information at different timescales. This information must be parsed and integrated by the receiver for the correct transmission of meaning. This chapter describes the temporal patterns that characterize the speech signal and reviews research that explores the neural mechanisms underlying the generation of these patterns and the role they play in speech comprehension.
Subject(s)
Speech, Humans, Speech/physiology, Speech Perception/physiology, Speech Acoustics, Periodicity

ABSTRACT
Semantic verbal fluency (SVF) impairment is present in several neurological disorders. Although activation in SVF-related areas has been reported, how these regions are connected and what functional roles they play in the network remain under debate. We assessed SVF static and dynamic functional connectivity (FC) and effective connectivity in healthy participants using functional magnetic resonance imaging. We observed activation in the inferior frontal (IFG), middle temporal (pMTG), and angular gyri (AG), the anterior cingulate (AC), the insular cortex, and regions of the superior, middle, and medial frontal gyri (SFG, MFG, MidFG). Our static FC analysis showed a highly interconnected task and resting-state network. Increased connectivity of the AC with the pMTG and AG was observed for the task. The dynamic FC analysis provided circuits with connections similarly modulated across time, and regions related to category identification, language comprehension, word selection and recovery, word generation, inhibition of speaking, speech planning, and articulatory planning of orofacial movements. Finally, the effective connectivity analysis provided the network that best explained our data, starting at the AG and going to the pMTG, from which there was a division between the ventral and dorsal streams. The SFG and MFG regions were connected and modulated by the MidFG, while the inferior regions formed the ventral stream. We therefore successfully assessed the SVF network, exploring regions associated with the entire process, from category identification to word generation. The methodological approach can be helpful for further investigation of the SVF network in neurological disorders.
Subject(s)
Brain Mapping, Brain, Magnetic Resonance Imaging, Neural Pathways, Semantics, Humans, Male, Female, Magnetic Resonance Imaging/methods, Adult, Brain Mapping/methods, Neural Pathways/physiology, Neural Pathways/diagnostic imaging, Young Adult, Brain/physiology, Brain/diagnostic imaging, Verbal Behavior/physiology, Speech/physiology, Nerve Net/physiology, Nerve Net/diagnostic imaging

ABSTRACT
PURPOSE: To present the content and response-process validity evidence of the Speaking in Public Coping Scale (ECOFAP). METHODS: A methodological study to develop and validate the instrument. It followed an instrument development method with theoretical, empirical, and analytical procedures, based on the validity criteria of the Standards for Educational and Psychological Testing (SEPT). Content validity evidence was obtained in two stages: 1) conceptual definition of the construct, based on theoretical precepts of speaking in public and the Motivational Theory of Coping (MTC); 2) development of items and response keys, structuring of the instrument, assessment by a committee of 10 specialists, restructuring of scale items, and development of the ECOFAP pilot version. Item representativeness was analyzed through the item-level content validity index. The response process was conducted in a single stage with a convenience sample of 30 people, with and without difficulty speaking in public, recruited on the campus of a Brazilian university and belonging to various social and professional strata. In this process, the respondents' verbal and nonverbal reactions were analyzed qualitatively. RESULTS: The initial version of ECOFAP, consisting of 46 items, was evaluated by the judges and later reformulated, resulting in a second version with 60 items. This second version was again submitted for expert analysis, and the content validity index per item was calculated; 18 items were excluded, resulting in a third version with 42 items. Validity evidence based on response processes for the 42-item version was gathered from a sample of 30 individuals, resulting in the rewriting of one item and the inclusion of six more, yielding the 48-item ECOFAP pilot version. CONCLUSION: The ECOFAP pilot version has semantically and syntactically well-structured items representing strategies for coping with speaking in public.
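A worked sketch of the item-level content validity index (I-CVI) used for such decisions: the proportion of expert judges rating an item as relevant (e.g., 3 or 4 on a 4-point scale). The ratings below are illustrative, and the 0.78 cutoff is a commonly cited convention for panels of six or more judges, not a value taken from this study.

```python
ratings = {  # item -> ratings from 10 judges on a 1-4 relevance scale
    "item_01": [4, 4, 3, 4, 3, 4, 4, 3, 4, 4],
    "item_02": [2, 3, 1, 2, 3, 2, 2, 1, 3, 2],
}
for item, r in ratings.items():
    i_cvi = sum(score >= 3 for score in r) / len(r)
    print(item, i_cvi, "keep" if i_cvi >= 0.78 else "revise/exclude")
```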
Subject(s)
Psychological Adaptation, Psychometrics, Humans, Reproducibility of Results, Male, Female, Brazil, Surveys and Questionnaires, Psychometrics/standards, Adult, Young Adult, Middle Aged, Speech

ABSTRACT
PURPOSE: To assess the influence of listener experience, measurement scales, and the type of speech task on the auditory-perceptual evaluation of the overall severity (OS) of voice deviation and the predominant type of voice (rough, breathy, or strained). METHODS: 22 listeners, divided into four groups, participated in the study: speech-language pathologists specialized in voice (SLP-V), SLPs not specialized in voice (SLP-NV), graduate students with auditory-perceptual analysis training (GS-T), and graduate students without auditory-perceptual analysis training (GS-U). The subjects rated the OS of voice deviation and the predominant type of voice of 44 voices using a visual analog scale (VAS) and a numerical scale (the "G" score from GRBAS) across six speech tasks: sustained vowels /a/ and /ɛ/, sentences, number counting, running speech, and all five previous tasks together. RESULTS: Sentences obtained the best interrater reliability in each group, using both the VAS and GRBAS. The SLP-NV group demonstrated the best interrater reliability for OS judgment across the speech tasks using either scale. Sustained vowels (/a/ and /ɛ/) and running speech obtained the best interrater reliability among the listener groups for judging the predominant vocal quality, and the GS-T group achieved the best interrater reliability in that judgment. CONCLUSION: Experience in the auditory-perceptual judgment of voice, the type of training received, and the type of speech task influence the reliability of the auditory-perceptual evaluation of vocal quality.
Subject(s)
Dysphonia, Speech Perception, Humans, Speech, Reproducibility of Results, Speech Production Measurement, Observer Variation, Voice Quality, Speech Acoustics

ABSTRACT
Stuttering, affecting approximately 1% of the global population, is a complex speech disorder that significantly impacts individuals' quality of life. Prior studies using electromyography (EMG) to examine orofacial muscle activity in stuttering have presented mixed results, highlighting the variability of neuromuscular responses during stuttering episodes. Fifty-five participants who stutter and 30 individuals who do not, aged between 18 and 40 years, took part in the study. EMG signals from five facial and cervical muscles were recorded during speech tasks and analyzed for mean amplitude and frequency activity in the 5-15 Hz range to identify significant differences. In the 5-15 Hz frequency range, a higher average amplitude was observed in the zygomaticus major muscle while participants were stuttering (p < 0.05). Additionally, when the overall EMG signal amplitude was assessed, a higher average amplitude was observed in samples obtained from disfluencies of participants who do not stutter, particularly in the depressor anguli oris muscle (p < 0.05). Significant differences in muscle activity were observed between the two groups, particularly in the depressor anguli oris and zygomaticus major muscles. These results suggest that the underlying neuromuscular mechanisms of stuttering might involve subtle aspects of timing and coordination in muscle activation. These findings may therefore contribute to the field of biosensors by providing valuable perspectives on neuromuscular mechanisms and the relevance of electromyography in stuttering research. Further research in this area has the potential to advance the development of biosensor technology for language-related applications and therapeutic interventions in stuttering.
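A sketch of the 5-15 Hz amplitude analysis described above: band-pass filter the EMG, rectify, and take the mean amplitude per trial. The sampling rate, filter order, and data are placeholders.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 1000  # Hz, assumed sampling rate
b, a = butter(4, [5, 15], btype="bandpass", fs=fs)

def mean_amplitude(emg):
    filtered = filtfilt(b, a, emg)    # zero-phase 5-15 Hz band-pass
    return np.mean(np.abs(filtered))  # rectified mean amplitude

rng = np.random.default_rng(0)
emg_trial = rng.normal(size=5 * fs)   # placeholder 5-second recording
print(mean_amplitude(emg_trial))
```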
Subject(s)
Electromyography, Facial Muscles, Speech, Stuttering, Humans, Electromyography/methods, Male, Adult, Female, Stuttering/physiopathology, Speech/physiology, Facial Muscles/physiology, Facial Muscles/physiopathology, Biomechanical Phenomena/physiology, Young Adult, Adolescent, Muscle Contraction/physiology

ABSTRACT
Objective: In recent years, electroencephalogram (EEG)-based brain-computer interfaces (BCIs) applied to inner speech classification have gathered attention for their potential to provide a communication channel for individuals with speech disabilities. However, existing methodologies for this task fall short of achieving acceptable accuracy for real-life implementation. This paper explores the use of inter-trial coherence (ITC) as a feature extraction technique to enhance inner speech classification accuracy in EEG-based BCIs. Approach: The proposed methodology employs ITC within a complex Morlet time-frequency representation. The study involves a dataset comprising EEG recordings of four different words for ten subjects, with three recording sessions per subject. The extracted features are classified using k-nearest neighbors (kNN) and a support vector machine (SVM). Main results: The average classification accuracy achieved with the proposed methodology is 56.08% for kNN and 59.55% for SVM, comparable or superior to previous work. The exploration of inter-trial phase coherence as a feature extraction technique proves promising for enhancing accuracy in inner speech classification within EEG-based BCIs. Significance: This study contributes to the advancement of EEG-based BCIs for inner speech classification by introducing a feature extraction methodology based on ITC. The results, on par with or superior to previous work, highlight the potential of this approach for improving the accuracy of BCI systems and lay the groundwork for further research toward inner speech decoding.
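A minimal sketch of ITC computation: convolve each trial with a complex Morlet wavelet, keep only the phase, and average the unit phase vectors across trials. ITC near 1 means phases align across trials; near 0, phases are random. Sampling rate, frequency, and data are placeholders.

```python
import numpy as np

def morlet(fs, freq, n_cycles=7):
    t = np.arange(-1, 1, 1 / fs)
    sigma = n_cycles / (2 * np.pi * freq)
    return np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma**2))

def itc(trials, fs, freq):
    # trials: (n_trials, n_samples) single-channel EEG
    w = morlet(fs, freq)
    analytic = np.array([np.convolve(tr, w, mode="same") for tr in trials])
    phases = analytic / np.abs(analytic)   # unit phase vectors
    return np.abs(phases.mean(axis=0))     # ITC as a function of time

rng = np.random.default_rng(0)
trials = rng.normal(size=(30, 512))        # placeholder: 30 trials at 256 Hz
print(itc(trials, fs=256, freq=10).shape)
```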
Subject(s)
Brain-Computer Interfaces, Electroencephalography, Speech, Humans, Electroencephalography/methods, Electroencephalography/classification, Male, Speech/physiology, Female, Adult, Support Vector Machine, Young Adult, Reproducibility of Results, Algorithms

ABSTRACT
PURPOSE: To seek evidence of validity and reliability for the Compressed Speech Test with Figures. METHODS: The study was subdivided into three stages: construct validation, criterion validation, and reliability. All participants were aged between 6 years 0 months and 8 years 11 months. For the construct stage, the Compressed Speech with Figures test and the gold-standard Adapted Compressed Speech test were applied to children with typical phonological development. For the criterion analysis, Compressed Speech with Figures was applied in two groups, with typical (G1) and atypical (G2) phonological development. Finally, the application protocols were analyzed by two speech-language pathologists with experience in central auditory processing to obtain an interrater reliability analysis. RESULTS: The correlation test indicated an almost perfect construct relationship (rho = 0.843 for the right ear and 0.823 for the left ear). In the criterion analysis, both groups presented satisfactory results (G1 = 99.6 to 100%; G2 = 96 to 96.5%). The reliability analysis demonstrated that the protocol is easy to analyze, as both professionals gave unanimous responses. CONCLUSION: It was possible to obtain evidence of validity and reliability for the Compressed Speech with Figures instrument. The construct analysis showed that the instrument measures the same variable as the gold-standard test, with an almost perfect correlation. In the criterion analysis, both groups performed similarly, indicating that the instrument does not seem to differentiate populations with and without mild phonological disorder. The interrater reliability analysis demonstrated that the protocol is easy to analyze and score.
Subject(s)
Phonological Disorder, Speech, Child, Humans, Speech/physiology, Reproducibility of Results, Speech Production Measurement, Phonetics

ABSTRACT
This paper presents a unified model for combining beamforming and blind source separation (BSS). The validity of the model's assumptions is confirmed by accurately recovering target speech information in noise using Oracle information. Using real static human-robot interaction (HRI) data, the proposed combination of BSS with the minimum-variance distortionless response (MVDR) beamformer provides a greater signal-to-noise ratio (SNR) than previous parallel and cascade systems that combine BSS and beamforming. In the dynamic, difficult-to-model HRI environment, the system provides an SNR gain 2.8 dB greater than that obtained with the cascade combination, where the parallel combination is infeasible.
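For reference, the MVDR beamformer minimizes output power subject to a distortionless response toward the target: w = R⁻¹d / (dᴴR⁻¹d), with R the noise-plus-interference covariance and d the steering vector. A sketch with placeholder values (the paper's unified BSS combination is not reproduced here):

```python
import numpy as np

def mvdr_weights(R, d):
    Rinv_d = np.linalg.solve(R, d)        # R^{-1} d without an explicit inverse
    return Rinv_d / (d.conj().T @ Rinv_d)

n_mics = 4
R = np.eye(n_mics, dtype=complex)         # placeholder covariance
d = np.ones(n_mics, dtype=complex)        # placeholder steering vector
w = mvdr_weights(R, d)

rng = np.random.default_rng(0)
x = rng.normal(size=n_mics) + 1j * rng.normal(size=n_mics)
y = w.conj().T @ x                        # beamformer output for one frequency bin
print(y)
```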
Subject(s)
Robotics, Humans, Signal-to-Noise Ratio, Speech

ABSTRACT
When engaged in a conversation, one receives auditory information from the other speaker's speech but also from one's own speech. However, this information is processed differently, an effect known as Speech-Induced Suppression (SIS). Here, we studied the brain's representation of the acoustic properties of speech in natural unscripted dialogues, using electroencephalography (EEG) and high-quality speech recordings from both participants. Using encoding techniques, we reproduced a broad range of previous findings on listening to another's speech and achieved even better performance when predicting the EEG signal in this complex scenario. Furthermore, we found no response when participants listened to their own speech, across different acoustic features (spectrogram, envelope, etc.) and frequency bands, evidencing a strong SIS effect. The present work shows that this mechanism is present, and even stronger, during natural dialogues. Moreover, the methodology presented here opens the possibility of a deeper understanding of the related mechanisms in a wider range of contexts.
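A sketch of a linear encoding (temporal response function) model in the spirit of the analysis above: predict an EEG channel from time-lagged copies of a speech feature (here the envelope) with ridge regression. Sampling rate, lag window, data, and train/test split are placeholders.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lagged(x, n_lags):
    # Design matrix of past samples of x: shape (n_samples, n_lags).
    X = np.zeros((len(x), n_lags))
    for k in range(n_lags):
        X[k:, k] = x[: len(x) - k]
    return X

fs = 128                                   # Hz, assumed EEG sampling rate
rng = np.random.default_rng(0)
envelope = rng.random(60 * fs)             # placeholder 60 s speech envelope
eeg = rng.normal(size=60 * fs)             # placeholder single EEG channel

X = lagged(envelope, n_lags=int(0.4 * fs))         # lags spanning 0-400 ms
model = Ridge(alpha=1.0).fit(X[: 50 * fs], eeg[: 50 * fs])
pred = model.predict(X[50 * fs :])
print(np.corrcoef(pred, eeg[50 * fs :])[0, 1])     # encoding accuracy (r)
```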
Subject(s)
Electroencephalography, Speech, Humans, Speech/physiology, Acoustic Stimulation/methods, Electroencephalography/methods, Brain, Brain Mapping/methods

ABSTRACT
BACKGROUND: Palatal lengthening is becoming a first-line treatment choice for cleft patients with velopharyngeal insufficiency (VPI). As cleft palate-related surgical outcomes are age dependent, speech outcomes may be similarly affected by patient age at the time of treatment. The primary goal of this study is to determine whether there are age-related differences in speech outcomes when double opposing buccinator myomucosal flaps are used as part of a palatal lengthening protocol, and whether any such differences preclude the use of this technique in specific age groups. METHODS: A retrospective study was performed on consecutive nonsyndromic patients with VPI who underwent treatment using double opposing buccinator myomucosal flaps at our hospital between 2014 and 2021. Patients who completed the 15-month follow-up were stratified by age: group A, 2 to 7 years (n = 14); group B, 8 to 18 years (n = 23); and group C, older than 18 years (n = 25). Standardized perceptual speech evaluations and nasopharyngoscopy were performed. Hypernasality, soft palate mobility, and lateral palatal wall mobility were assessed preoperatively and at 15 months postoperatively. Complications were also recorded. The χ2 test was used for statistical comparison. RESULTS: All of the age-stratified patient groups showed significant improvement in hypernasality, soft palate mobility, and lateral wall mobility (P < 0.01), with no statistically significant differences between the age groups. Overall speech success was achieved in 69.4% of patients: 78.6% in group A, 78.3% in group B, and 56% in group C, with no statistically significant differences in speech success between the age groups (P > 0.05). CONCLUSIONS: Regardless of age, palatal lengthening via double opposing buccinator myomucosal flaps similarly improves speech outcomes.
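An illustrative chi-square test on the reported speech-success rates by age group, with success/failure counts reconstructed arithmetically from the percentages and group sizes above (11/14 = 78.6%, 18/23 = 78.3%, 14/25 = 56%):

```python
from scipy.stats import chi2_contingency

#               success, failure
table = [[11, 3],    # group A
         [18, 5],    # group B
         [14, 11]]   # group C
chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p)   # p > 0.05 would match the reported lack of group differences
```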