Results 1 - 3 of 3
1.
Article in English | MEDLINE | ID: mdl-31879734

ABSTRACT

In this study, we examined a deep learning method for de-identifying clinical notes at UF Health in a cross-institute setting. We developed deep learning models using the 2014 i2b2/UTHealth corpus and evaluated their performance on clinical notes collected from UF Health. We compared four pre-trained word embeddings: two from the general domain and two from the clinical domain. We also explored linguistic features (i.e., word shape and part-of-speech) to further improve de-identification performance. The experimental results show that the performance of deep learning models trained on the i2b2/UTHealth corpus dropped significantly (strict and relaxed F1 scores fell from 0.9547 and 0.9646 to 0.8360 and 0.8870, respectively) when applied to a corpus from a different institution (UF Health). Adding linguistic features, namely word shape and part-of-speech, further improved de-identification performance in the cross-institute setting (strict and relaxed F1 scores of 0.8527 and 0.9052).
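
The abstract names word shape as a linguistic feature but does not define it. As a minimal sketch, assuming the common convention of mapping uppercase letters to `X`, lowercase letters to `x`, and digits to `d`, the feature could be computed as follows. The function name `word_shape` and the `collapse` option are hypothetical and are not taken from the study.

```python
import re


def word_shape(token: str, collapse: bool = False) -> str:
    """Return a coarse shape string for a token (assumed convention, not the paper's).

    Uppercase letters become 'X', lowercase letters 'x', digits 'd';
    every other character (punctuation, symbols) is kept unchanged.
    """
    symbols = []
    for ch in token:
        if ch.isupper():
            symbols.append("X")
        elif ch.islower():
            symbols.append("x")
        elif ch.isdigit():
            symbols.append("d")
        else:
            symbols.append(ch)
    shape = "".join(symbols)
    if collapse:
        # Collapse each run of identical symbols into a single symbol.
        shape = re.sub(r"(.)\1+", r"\1", shape)
    return shape


print(word_shape("Gainesville"))             # Xxxxxxxxxxx
print(word_shape("12/05/2019"))              # dd/dd/dddd
print(word_shape("12/05/2019", collapse=True))  # d/d/d
```

Shapes like these let a tagger recognize date-like or name-like tokens it has never seen, which is one plausible reason such features help when notes come from a different institution.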

2.
BMC Med Inform Decis Mak ; 19(Suppl 5): 232, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31801524

ABSTRACT

BACKGROUND: De-identification is a critical technology for facilitating the use of unstructured clinical text while protecting patient privacy and confidentiality. The clinical natural language processing (NLP) community has invested great effort in developing methods and corpora for de-identification of clinical notes. These annotated corpora are valuable resources for developing automated systems to de-identify clinical text at local hospitals. However, existing studies have often used training and test data collected from the same institution, and few have explored automated de-identification in cross-institute settings. The goal of this study is to examine deep learning-based de-identification methods in a cross-institute setting, identify the bottlenecks, and provide potential solutions.
METHODS: We created a de-identification corpus from a total of 500 clinical notes from University of Florida (UF) Health, developed deep learning-based de-identification models using the 2014 i2b2/UTHealth corpus, and evaluated their performance on the UF corpus. We compared five word embeddings trained on general English text, clinical text, and biomedical literature, explored lexical and linguistic features, and compared two strategies for customizing the deep learning models using UF notes and resources.
RESULTS: Word embeddings pre-trained on a general English corpus achieved better performance than embeddings trained on de-identified clinical text and biomedical literature. The performance of deep learning models trained using only the i2b2 corpus dropped significantly (strict and relaxed F1 scores fell from 0.9547 and 0.9646 to 0.8568 and 0.8958, respectively) when applied to the corpus annotated at UF Health. Linguistic features could further improve de-identification performance in cross-institute settings. After the models were customized using UF notes and resources, the best model achieved strict and relaxed F1 scores of 0.9288 and 0.9584, respectively.
CONCLUSIONS: De-identification models need to be customized using local clinical text and other resources when they are applied in cross-institute settings. Fine-tuning is a potential way to reuse pre-trained parameters and reduce the training time needed to customize deep learning-based de-identification models trained on a clinical corpus from a different institution.
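
The abstract reports strict and relaxed F1 scores without stating the matching criteria. The sketch below shows one way the two scores could differ, assuming that a strict match requires identical offsets and entity type, and that a relaxed match requires only a character overlap with the same type. Both criteria, the `Span` layout, and the function names are assumptions for illustration; the study's actual evaluation script may use different rules.

```python
from typing import List, Tuple

# (start offset, end offset (exclusive), entity type); hypothetical layout
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """True when two spans share a type and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def f1_score(gold: List[Span], pred: List[Span], relaxed: bool = False) -> float:
    """Entity-level F1 under the assumed strict or relaxed matching rule."""
    if relaxed:
        # A span counts as correct if it overlaps any same-type span on the other side.
        hits_pred = sum(any(_overlaps(p, g) for g in gold) for p in pred)
        hits_gold = sum(any(_overlaps(g, p) for p in pred) for g in gold)
    else:
        # A span counts as correct only on an exact (start, end, type) match.
        gold_set, pred_set = set(gold), set(pred)
        hits_pred = sum(p in gold_set for p in pred)
        hits_gold = sum(g in pred_set for g in gold)
    precision = hits_pred / len(pred) if pred else 0.0
    recall = hits_gold / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy annotations: the predicted DATE span starts two characters late.
gold = [(0, 10, "NAME"), (20, 30, "DATE")]
pred = [(0, 10, "NAME"), (22, 30, "DATE")]
print(f1_score(gold, pred))                # 0.5
print(f1_score(gold, pred, relaxed=True))  # 1.0
```

Under these assumptions, a boundary error costs a full entity in the strict score but nothing in the relaxed score, which is consistent with the relaxed figures in the abstract being higher than the strict ones.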


Subject(s)
Data Anonymization, Deep Learning, Confidentiality, Electronic Health Records, Humans, Linguistics, Natural Language Processing
3.
Sci Rep ; 6: 38959, 2016 12 13.
Article in English | MEDLINE | ID: mdl-27958343

ABSTRACT

Capsid assembly and stability of hepatitis B virus (HBV) core protein (HBc) particles depend on balanced electrostatic interactions between encapsidated nucleic acids and an arginine-rich domain (ARD) of HBc in the capsid interior. Arginine-deficient ARD mutants preferentially encapsidated spliced viral RNA and shorter DNA, a phenotype that can be fully or partially rescued, in a dose-dependent manner, by reducing the negative charges from acidic residues or from serine phosphorylation of HBc. Similarly, empty capsids without RNA encapsidation can be generated by ARD hyper-phosphorylation in insect cells, bacteria, and human hepatocytes. De-phosphorylation of empty capsids by phosphatase induced capsid disassembly. Empty capsids can convert into RNA-containing capsids when HBc serine de-phosphorylation is increased. In an HBV replicon system, we observed a reciprocal relationship between viral and non-viral RNA encapsidation, suggesting that both non-viral RNA and serine phosphorylation could serve as a charge-balance buffer in maintaining electrostatic homeostasis. In addition, by comparing biochemical assay results between a replicon and a non-replicon system, we observed a correlation between HBc de-phosphorylation and viral replication. Balanced electrostatic interactions may be important to other icosahedral particles in nature.


Subject(s)
Capsid/metabolism, Viral DNA/metabolism, Hepatitis B Virus/metabolism, Phosphoserine/metabolism, Viral RNA/metabolism, Amino Acid Substitution, Tumor Cell Line, Viral DNA/genetics, Hepatitis B Virus/genetics, Homeostasis, Humans, Missense Mutation, Viral RNA/genetics, Static Electricity