A Multi-Label Text Classifier at Publication Level Based on "PubMedBERT + TextRNN" for Cancer Literature.
Stud Health Technol Inform; 316: 374-375, 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39176755
ABSTRACT
The volume of data in the cancer field is growing rapidly, and fine-grained classification is in high demand, especially for interdisciplinary and collaborative research. There is thus a need for a higher-resolution multi-label classifier that reduces the burden of screening articles for clinical relevance. This research trains a scalable multi-label classifier that classifies cancer research literature directly at the publication level. First, a corpus was divided into a training set and a testing set at a ratio of 7:3. Second, we compared the performance of classifiers built with "PubMedBERT + TextRNN" and "BioBERT + TextRNN" on ICRP CT. Finally, the classifier was obtained from the best-performing combination, "PubMedBERT + TextRNN", with P = 0.952014, R = 0.936696, and F1 = 0.931664. The quantitative comparisons demonstrate that the resulting classifier is suitable for high-resolution classification of cancer literature at the publication level, supporting accurate retrieval and academic statistics.
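The split and the reported metrics can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not state how the corpus was shuffled or how P, R and F1 were averaged across labels, so the seeded shuffle and the micro-averaging below are assumptions, and the function names and the toy labels are hypothetical.

```python
import random


def split_corpus(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them into (train, test).

    train_fraction=0.7 mirrors a 7:3 split; the seeded shuffle is an
    assumption made for reproducibility, not a detail from the abstract.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def micro_prf(true_labels, pred_labels):
    """Micro-averaged precision, recall and F1 for multi-label output.

    Each element of true_labels / pred_labels is the set of labels
    attached to one publication. Micro-averaging is an assumption.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data with hypothetical cancer-type labels.
train, test = split_corpus(range(10))
truth = [{"breast", "lung"}, {"colon"}, {"lung"}]
pred = [{"breast"}, {"colon", "lung"}, {"lung"}]
scores = micro_prf(truth, pred)  # (0.75, 0.75, 0.75)
```

With micro-averaging, F1 is the harmonic mean of the pooled P and R and so always lies between them. Under macro-averaging, where F1 is computed per label and then averaged, the overall F1 can fall below both the averaged P and the averaged R.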
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Neoplasms
Limits: Humans
Language: En
Journal: Stud Health Technol Inform
Journal subject: MEDICAL INFORMATICS / HEALTH SERVICES RESEARCH
Year: 2024
Document type: Article
Country of affiliation: China
Country of publication: Netherlands