Utilizing big data without domain knowledge impacts public health decision-making.
Zhang, Miao; Rahman, Salman; Mhasawade, Vishwali; Chunara, Rumi.
Affiliation
  • Zhang M; Department of Computer Science and Engineering, Tandon School of Engineering, Brooklyn, NY 11201.
  • Rahman S; Department of Computer Science and Engineering, Tandon School of Engineering, Brooklyn, NY 11201.
  • Mhasawade V; Department of Computer Science and Engineering, Tandon School of Engineering, Brooklyn, NY 11201.
  • Chunara R; Department of Computer Science and Engineering, Tandon School of Engineering, Brooklyn, NY 11201.
Proc Natl Acad Sci U S A; 121(39): e2402387121, 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39288180
ABSTRACT
New data sources and AI methods for extracting information are increasingly abundant and relevant to decision-making across societal applications. A notable example is street view imagery, available in over 100 countries, and purported to inform built environment interventions (e.g., adding sidewalks) for community health outcomes. However, biases can arise when decision-making does not account for data robustness or relies on spurious correlations. To investigate this risk, we analyzed 2.02 million Google Street View (GSV) images alongside health, demographic, and socioeconomic data from New York City. Findings demonstrate robustness challenges; built environment characteristics inferred from GSV labels at the intracity level often do not align with ground truth. Moreover, as average individual-level behavior of physical inactivity significantly mediates the impact of built environment features by census tract, intervention on features measured by GSV would be misestimated without proper model specification and consideration of this mediation mechanism. Using a causal framework accounting for these mediators, we determined that intervening by improving 10% of samples in the two lowest tertiles of physical inactivity would lead to a 4.17 (95% CI 3.84-4.55) or 17.2 (95% CI 14.4-21.3) times greater decrease in the prevalence of obesity or diabetes, respectively, compared to the same proportional intervention on the number of crosswalks by census tract. This study highlights critical issues of robustness and model specification in using emergent data sources, showing the data may not measure what is intended, and ignoring mediators can result in biased intervention effect estimates.
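The mediation argument in the abstract can be illustrated with a minimal sketch. This is not the authors' causal framework or data: it is a generic linear product-of-coefficients decomposition on synthetic data, and the variable roles (x as a built environment feature, m as physical inactivity, y as a health outcome), the coefficients, and the helper names are all assumptions chosen for illustration.

```python
import random

def slope(x, y):
    """Slope from least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def slopes_two_predictors(x, m, y):
    """Slopes on x and on m from regressing y on both (with intercept)."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    smm = sum((b - mm) ** 2 for b in m)
    sxm = sum((a - mx) * (b - mm) for a, b in zip(x, m))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    smy = sum((b - mm) * (c - my) for b, c in zip(m, y))
    det = sxx * smm - sxm ** 2
    # Solve the 2x2 normal equations for the centered variables.
    return (sxy * smm - sxm * smy) / det, (sxx * smy - sxm * sxy) / det

def simulate(n=5000, seed=0, a=-0.6, b=0.9, direct=-0.1):
    """Synthetic data only; a, b, and direct are hypothetical coefficients."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]                 # feature (assumed)
    m = [a * xi + rng.gauss(0, 1) for xi in x]              # mediator (assumed)
    y = [direct * xi + b * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y

x, m, y = simulate()
total = slope(x, y)                                # effect of x ignoring m
a_hat = slope(x, m)                                # x -> m path
direct_hat, b_hat = slopes_two_predictors(x, m, y) # x -> y and m -> y paths
indirect_hat = a_hat * b_hat                       # effect routed through m

print(f"total={total:.3f} direct={direct_hat:.3f} indirect={indirect_hat:.3f}")
```

In this toy setup most of the association between x and y runs through m, so an analysis that attributes the total effect to x alone would overstate what changing x directly can achieve. In ordinary least squares the total slope equals the direct slope plus the indirect product on the same sample, which is the decomposition the script prints.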

Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Public Health / Decision Making / Big Data
Limits: Female / Humans / Male
Country/Region as subject: North America
Language: English
Journal: Proc Natl Acad Sci U S A
Year: 2024
Document type: Article
Country of publication: United States
