Application of Bayesian probabilistic linkage model in birth and death data linking / 上海预防医学
Shanghai Journal of Preventive Medicine
; (12): 98-103, 2024.
Article
in Zh
| WPRIM
| ID: wpr-1012662
Responsible library:
WPRO
ABSTRACT
Objective: To elucidate the principles and methods of the Bayesian probabilistic linkage model, and to demonstrate the effect of applying the model to link birth and death data.
Methods: Data on 199 025 infants born in 2017 and 1 512 infants who died in 2017 and 2018 were collected from the Shanghai birth and death registration system. After cleaning, the data were divided into monthly blocks and fully linked. The Jaro-Winkler algorithm and Euclidean distance were used to measure the similarity of the fields being matched. A Bayesian probabilistic linkage model was constructed, and the linkage performance was evaluated using a confusion matrix.
Results: The Bayesian probabilistic linkage model linked the infant birth and death data effectively, revealing that 36.71% of infants who died in Shanghai were born outside the city and that the probability of infant death was 2.6‰. The confusion matrix of the test set showed a recall of 0.86, a precision of 0.76, and an F-score of 0.81.
Conclusion: In practical application, Bayesian probabilistic linkage demonstrates good model performance, enabling the establishment of birth-death cohorts that more accurately reflect the true level of infant mortality. Using this technique to integrate data from different departments can effectively improve research efficiency in the field of public health.
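The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a simplified setting in which each compared field either agrees or disagrees (the study itself uses Jaro-Winkler and Euclidean similarity scores) and in which fields are conditionally independent given match status. The function names, the three example fields, the `m`/`u` probabilities, and the prior are all hypothetical. The sketch also checks that the reported precision and recall are consistent with the reported F-score.

```python
from math import prod


def f_score(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def match_posterior(agreements, m_probs, u_probs, prior):
    """Posterior probability that a candidate record pair is a true match.

    agreements: one boolean per compared field (True = field agrees)
    m_probs:    P(field agrees | true match), hypothetical values
    u_probs:    P(field agrees | non-match), hypothetical values
    prior:      P(match) before any fields are compared
    Assumes fields are conditionally independent given match status.
    """
    like_match = prod(m if a else 1 - m for a, m in zip(agreements, m_probs))
    like_non = prod(u if a else 1 - u for a, u in zip(agreements, u_probs))
    numerator = prior * like_match
    return numerator / (numerator + (1 - prior) * like_non)


# Precision and recall as reported for the test set in the abstract.
reported_f = f_score(precision=0.76, recall=0.86)

# Hypothetical parameters for three fields, e.g. name, birth date, sex.
m = [0.95, 0.98, 0.99]
u = [0.01, 0.03, 0.50]
prior = 0.001  # hypothetical share of true matches among candidate pairs

p_all_agree = match_posterior([True, True, True], m, u, prior)
p_name_only = match_posterior([True, False, False], m, u, prior)

print(f"F-score from reported precision/recall: {reported_f:.2f}")
print(f"Posterior when all fields agree: {p_all_agree:.3f}")
print(f"Posterior when only the name agrees: {p_name_only:.6f}")
```

In a setting like the one described, pairs would only be compared within the same monthly block, and a pair would be declared a link when its posterior exceeds a chosen threshold; both the threshold and the parameter values above are illustrative assumptions.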
Full text:
1
Database:
WPRIM
Language:
Zh
Journal:
Shanghai Journal of Preventive Medicine
Year:
2024
Document type:
Article