TClustVID: A Novel Machine Learning Classification Model to Investigate Topics and Sentiment in COVID-19 Tweets

Md. Shahriare Satu; Md. Imran Khan; Mufti Mahmud; Shahadat Uddin; Matthew A Summers; Julian M. W. Quinn; Mohammad Ali Moni

This article is a preprint

Preprints are preliminary research reports that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior and should not be reported in the media as established information.

Preprints posted online allow authors to receive feedback quickly, and the entire scientific community can independently evaluate the work and respond appropriately. These comments are posted alongside the preprints so that anyone can read them, and they serve as a post-publication review.

Affiliation

Md. Shahriare Satu; Faculty Member, Department of Management Information Systems, Noakhali Science and Technology University
Md. Imran Khan; Gono Bishwabidylay
Mufti Mahmud; Dept. of Computing & Technology, Nottingham Trent University
Shahadat Uddin; The University of Sydney
Matthew A Summers; Garvan Institute of Medical Research
Julian M. W. Quinn; Garvan Institute of Medical Research
Mohammad Ali Moni; University of New South Wales

Preprint in English | PREPRINT-MEDRXIV | ID: ppmedrxiv-20167973

Journal article
A published journal article is available and is probably based on this preprint, as identified by automated similarity matching. Human confirmation is still pending.
ABSTRACT

COVID-19, caused by SARS-CoV-2, varies greatly in severity but can present with serious respiratory symptoms and vascular and other complications, particularly in older adults. The disease can be spread by both symptomatic and asymptomatic infected individuals, uncertainty remains over key aspects of its infectivity, no effective remedy yet exists, and the disease is causing severe economic effects globally. For these reasons, COVID-19 is the subject of intense and widespread discussion on social media platforms including Facebook and Twitter. These public forums substantially influence public opinion in some cases and can exacerbate panic and the spread of misinformation during the crisis. Thus, this work aimed to design an intelligent clustering-based classification and topic extraction model (named TClustVID) that analyzes COVID-19-related public tweets to extract significant sentiments with high accuracy. We gathered COVID-19 Twitter datasets from the IEEE Dataport repository and employed a range of data preprocessing methods to clean the raw data, then applied tokenization and produced a word-to-index dictionary. Thereafter, different classifiers were applied to the Twitter datasets, which enabled comparison of the performance of traditional and TClustVID classification methods. TClustVID showed higher performance than the traditional classifiers, as determined by clustering criteria. Finally, we extracted significant topic clusters from TClustVID, split them into positive, neutral and negative clusters, and applied latent Dirichlet allocation to extract popular COVID-19 topics. This approach identified common prevailing public opinions and concerns related to COVID-19, as well as attitudes to infection prevention strategies held by people from different countries concerning the current pandemic situation.
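
The abstract does not spell out the cleaning rules or the dictionary format, so the following is only a minimal sketch of what the tokenization and word-to-index step could look like. The function names (`clean_tweet`, `tokenize`, `build_word_index`, `encode`), the specific cleaning rules (dropping URLs and @mentions, keeping hashtag words), the sample tweets, and the choice to start indices at 1 are all assumptions made for illustration, not the authors' implementation.

```python
import re

def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and the '#' sign (assumed rules)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    return text.replace("#", " ")              # keep the hashtag word, drop '#'

def tokenize(text):
    """Split a cleaned tweet into alphanumeric tokens, keeping hyphenated terms whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", clean_tweet(text))

def build_word_index(tweets):
    """Map each distinct token to an integer, in order of first appearance.

    Indices start at 1 so that 0 stays free for padding (an assumption).
    """
    word_index = {}
    for tweet in tweets:
        for token in tokenize(tweet):
            if token not in word_index:
                word_index[token] = len(word_index) + 1
    return word_index

def encode(tweet, word_index):
    """Convert a tweet into a list of indices, skipping unknown tokens."""
    return [word_index[t] for t in tokenize(tweet) if t in word_index]

# Hypothetical sample tweets, not taken from the IEEE Dataport datasets.
tweets = [
    "Stay home and stay safe! #COVID-19 https://example.com",
    "@user COVID-19 vaccines give hope",
]
word_index = build_word_index(tweets)
encoded = [encode(t, word_index) for t in tweets]
print(word_index)
print(encoded)
```

Integer sequences of this kind are a common input format for downstream classifiers and topic models; the clustering-based classification and latent Dirichlet allocation stages described in the abstract are not reproduced here.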

License

CC BY-NC-ND

Full text: 1 | Collection: 09-preprints | Database: PREPRINT-MEDRXIV | Study type: Prognostic_studies | Language: En | Year: 2020 | Document type: Preprint
