Exploiting Unlabeled Texts with Clustering-based Instance Selection for Medical Relation Classification.
AMIA Annu Symp Proc; 2017: 1060-1069, 2017.
Article | En | MEDLINE | ID: mdl-29854174
Classifying relations between pairs of medical concepts in clinical texts is a crucial task to acquire empirical evidence relevant to patient care. Due to limited labeled data and extremely unbalanced class distributions, medical relation classification systems struggle to achieve good performance on less common relation types, which capture valuable information that is important to identify. Our research aims to improve relation classification using weakly supervised learning. We present two clustering-based instance selection methods that acquire a diverse and balanced set of additional training instances from unlabeled data. The first method selects one representative instance from each cluster containing only unlabeled data. The second method selects a counterpart for each training instance using clusters containing both labeled and unlabeled data. These new instance selection methods for weakly supervised learning achieve substantial recall gains for the minority relation classes compared to supervised learning, while yielding comparable performance on the majority relation classes.
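The two selection methods described in the abstract can be illustrated with a minimal sketch. The abstract does not state the clustering algorithm, the feature representation, or how a "representative" or "counterpart" is chosen, so everything below is an assumption made for illustration: cluster assignments are taken as given, the representative is the member closest to the cluster centroid, the counterpart is the nearest unlabeled member of the same cluster, and all function names, relation labels, and toy vectors are hypothetical. How the selected instances are then labeled and added to training is not shown.

```python
from collections import defaultdict
from math import dist


def group_by_cluster(cluster_ids):
    """Map each cluster id to the list of instance indices assigned to it."""
    clusters = defaultdict(list)
    for index, cluster_id in enumerate(cluster_ids):
        clusters[cluster_id].append(index)
    return clusters


def centroid(vectors, members):
    """Component-wise mean of the member vectors, returned as a tuple."""
    dims = len(vectors[members[0]])
    return tuple(sum(vectors[i][d] for i in members) / len(members) for d in range(dims))


def select_representatives(vectors, cluster_ids, labels):
    """First method (sketch): one instance from each cluster with only unlabeled data.

    Assumption: the representative is the member closest to the centroid.
    """
    selected = []
    for members in group_by_cluster(cluster_ids).values():
        if all(labels[i] is None for i in members):
            center = centroid(vectors, members)
            selected.append(min(members, key=lambda i: dist(vectors[i], center)))
    return sorted(selected)


def select_counterparts(vectors, cluster_ids, labels):
    """Second method (sketch): one unlabeled counterpart per training instance.

    Assumption: the counterpart is the nearest unlabeled member of the same
    cluster; clusters without both labeled and unlabeled members are skipped.
    """
    pairs = {}
    for members in group_by_cluster(cluster_ids).values():
        labeled = [i for i in members if labels[i] is not None]
        unlabeled = [i for i in members if labels[i] is None]
        if not labeled or not unlabeled:
            continue
        for i in labeled:
            pairs[i] = min(unlabeled, key=lambda j: dist(vectors[i], vectors[j]))
    return pairs


# Hypothetical toy data: 2-D feature vectors, precomputed cluster ids, and
# relation labels (None marks an unlabeled instance).
vectors = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0),
           (5.0, 5.0), (5.1, 5.0), (6.0, 5.0), (6.1, 5.0)]
cluster_ids = [0, 0, 0, 1, 1, 1, 1]
labels = [None, None, None, "relation_A", None, None, "relation_B"]

print(select_representatives(vectors, cluster_ids, labels))  # [1]
print(select_counterparts(vectors, cluster_ids, labels))     # {3: 4, 6: 5}
```

In this toy example, cluster 0 contains only unlabeled instances, so it contributes a single representative, while cluster 1 mixes labeled and unlabeled instances, so each labeled instance is paired with its nearest unlabeled neighbor. Picking one instance per cluster or per training instance is what keeps the added set diverse and tied to the existing class distribution rather than dominated by the majority class.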
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Information Storage and Retrieval / Electronic Health Records / Supervised Machine Learning
Type of study: Prognostic_studies
Limits: Humans
Language: En
Journal: AMIA Annu Symp Proc
Journal subject: MEDICAL INFORMATICS
Year: 2017
Document type: Article
Affiliation country: United States
Country of publication: United States