ABSTRACT
This study assessed the impact of organic loading rate (OLR) on methane (CH4) production in the anaerobic co-digestion (AcoD) of sugarcane vinasse and molasses (SVM) (1:1 ratio) in a thermophilic anaerobic fluidized bed reactor (AFBR). The OLR ranged from 5 to 27.5 kg COD·m⁻³·d⁻¹, with a fixed hydraulic retention time (HRT) of 24 h. Organic matter removal varied from 56 to 84%, peaking at an OLR of 5 kg COD·m⁻³·d⁻¹. The maximum CH4 yield (MY) (272.6 mL CH4·g⁻¹ CODrem) occurred at an OLR of 7.5 kg COD·m⁻³·d⁻¹, whereas the highest CH4 production rate (MPR) (4.0 L CH4·L⁻¹·d⁻¹) and energy potential (E.P.) (250.5 kJ·d⁻¹) were observed at an OLR of 20 kg COD·m⁻³·d⁻¹. The AFBR remained stable across all OLRs applied. At 22.5 kg COD·m⁻³·d⁻¹, a decrease in MY indicated an imbalance in methanogenesis and the accumulation of inhibitory organic compounds. OLR also shaped the microbial populations: Firmicutes and Thermotogota together constituted 43.9% at 7.5 kg COD·m⁻³·d⁻¹, and Firmicutes dominated (52.7%) at 27.5 kg COD·m⁻³·d⁻¹. Methanosarcina (38.9%) and hydrogenotrophic Methanothermobacter (37.6%) were the prevalent archaea at 7.5 and 27.5 kg COD·m⁻³·d⁻¹, respectively. This study therefore demonstrates that the organic loading rate significantly influences methane production efficiency and microbial community stability during the anaerobic co-digestion of sugarcane vinasse and molasses, indicating that optimized conditions can maximize energy yield while maintaining methanogenic balance.
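The reported energy potential can, in principle, be cross-checked from the methane production rate, the reactor working volume, and the lower heating value of methane. The sketch below only illustrates that arithmetic; the working volume and heating value are assumptions chosen for illustration (the abstract does not report them), not values taken from the study.

```python
# Hypothetical cross-check of the reported energy potential (E.P.) from the
# methane production rate (MPR). The reactor working volume and the CH4 lower
# heating value are assumptions, not values reported in the abstract.
MPR = 4.0          # L CH4 per L of reactor per day (reported at OLR = 20 kg COD m-3 d-1)
V_reactor = 1.75   # L, assumed working volume of a bench-scale AFBR
LHV_CH4 = 35.8     # kJ per L of CH4, approximate lower heating value

energy_potential = MPR * V_reactor * LHV_CH4   # kJ per day
print(f"Estimated energy potential: {energy_potential:.1f} kJ/d")
```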
ABSTRACT
Background: COVID-19 dynamics are driven by a complex interplay of factors including population behaviour, new variants, vaccination and immunity from prior infections. We quantify drivers of SARS-CoV-2 transmission in the Dominican Republic, an upper-middle income country of 10.8 million people. We then assess the impact of the vaccination campaign implemented in February 2021, primarily using CoronaVac, in saving lives and averting hospitalisations. Methods: We fit an age-structured, multi-variant transmission dynamic model to reported deaths, hospital bed occupancy, and seroprevalence data until December 2021, and simulate epidemic trajectories under different counterfactual scenarios. Findings: We estimate that vaccination averted 7210 hospital admissions (95% credible interval, CrI: 6830-7600), 2180 intensive care unit admissions (95% CrI: 2080-2280) and 766 deaths (95% CrI: 694-859) in the first 6 months of the campaign. If no vaccination had occurred, we estimate that an additional decrease of 10-20% in population mobility would have been required to maintain equivalent death and hospitalisation outcomes. We also found that early vaccination with CoronaVac was preferable to delayed vaccination using a product with higher efficacy. Interpretation: SARS-CoV-2 transmission dynamics in the Dominican Republic were driven by a substantial accumulation of immunity during the first two years of the pandemic but, despite this, vaccination was essential in enabling a return to pre-pandemic mobility levels without considerable additional morbidity and mortality. Funding: Medical Research Council, Wellcome Trust, Royal Society, US CDC and Australian National Health and Medical Research Council.
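The counterfactual logic described above (simulating the same epidemic with and without vaccination and differencing the outcomes) can be illustrated with a deliberately minimal compartmental sketch. Everything below is a toy assumption: a discrete-time SIR-style model with a leaky vaccine, illustrative parameters, and no age structure or variants, so it is not the fitted model from the study.

```python
import numpy as np

# Minimal discrete-time SIR-style sketch of a counterfactual comparison:
# the same epidemic is run with and without a vaccination campaign, and
# "deaths averted" is the difference. All parameters (beta, gamma, IFR,
# vaccination rate, efficacy, population) are illustrative assumptions.
def simulate(pop=10_800_000, days=180, beta=0.25, gamma=1 / 7, ifr=0.005,
             daily_vax=0.0, vax_efficacy=0.65, seed_infected=1000):
    S, I, R, V, deaths = pop - seed_infected, float(seed_infected), 0.0, 0.0, 0.0
    for _ in range(days):
        new_inf = beta * I * S / pop           # susceptibles infected today
        new_rec = gamma * I                    # infections resolving today
        new_vax = min(daily_vax * pop, S)      # doses given to susceptibles
        S += -new_inf - new_vax * vax_efficacy
        V += new_vax * vax_efficacy            # effectively protected
        I += new_inf - new_rec
        R += new_rec
        deaths += ifr * new_rec
    return deaths

deaths_with_vax = simulate(daily_vax=0.004)    # campaign scenario
deaths_no_vax = simulate(daily_vax=0.0)        # counterfactual: no vaccination
print(f"Illustrative deaths averted: {deaths_no_vax - deaths_with_vax:,.0f}")
```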
ABSTRACT
PURPOSE: Parametric regression models have been the main statistical method for identifying average treatment effects. Causal machine learning models have shown promising results in estimating heterogeneous treatment effects. Here we aimed to compare causal random forest (CRF) and linear regression modelling (LRM) for estimating the effects of organisational factors on ICU efficiency. METHODS: A retrospective analysis of 277,459 patients admitted to 128 Brazilian and Uruguayan ICUs over three years. ICU efficiency was assessed using the average standardised efficiency ratio (ASER), measured as the average of the standardised mortality ratio (SMR) and the standardised resource use (SRU) according to the SAPS-3 score. Using a causal inference framework, we estimated and compared the conditional average treatment effect (CATE) of seven common structural and organisational factors on ICU efficiency using LRM with interaction terms and CRF. RESULTS: Hospital mortality was 14%; median ICU and hospital lengths of stay were 2 and 7 days, respectively. The overall median SMR was 0.97 [IQR: 0.76-1.21], median SRU was 1.06 [IQR: 0.79-1.30], and median ASER was 0.99 [IQR: 0.82-1.21]. Both CRF and LRM showed that the average number of nurses per ten beds was independently associated with ICU efficiency (CATE [95% CI]: -0.13 [-0.24, -0.01] and -0.09 [-0.17, -0.01], respectively). Finally, CRF identified some specific ICUs with a significant CATE for exposures that did not show a significant average effect. CONCLUSION: Overall, the two methods were comparable in identifying organisational factors significantly associated with a CATE on ICU efficiency. CRF, however, identified specific ICUs with significant effects even when the average effect was nonsignificant. This can assist healthcare managers in further in-depth evaluation of process interventions to improve ICU efficiency.
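A minimal sketch of the LRM-with-interaction approach to a conditional treatment effect is shown below. The data, variable names (a binary "high staffing" exposure, a severity covariate, an ASER-like outcome) and coefficients are simulated assumptions for illustration only, not the study's variables or results.

```python
import numpy as np
import statsmodels.api as sm

# Sketch of a linear regression with an interaction term used to read off a
# conditional average treatment effect (CATE). Data are simulated; the
# exposure, covariate and outcome names are hypothetical stand-ins.
rng = np.random.default_rng(0)
n = 500
exposure = rng.binomial(1, 0.5, n)      # e.g., "high" nurse staffing (binary)
severity = rng.normal(0, 1, n)          # a unit-level covariate
aser = 1.0 - 0.10 * exposure - 0.05 * exposure * severity + rng.normal(0, 0.2, n)

X = sm.add_constant(np.column_stack([exposure, severity, exposure * severity]))
fit = sm.OLS(aser, X).fit()

# CATE(severity) = beta_exposure + beta_interaction * severity, per unit
cate = fit.params[1] + fit.params[3] * severity
print("Average of unit-level CATEs:", cate.mean().round(3))
```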
Subject(s)
Hospital Mortality, Intensive Care Units, Humans, Intensive Care Units/organization & administration, Retrospective Studies, Linear Models, Female, Male, Brazil, Length of Stay/statistics & numerical data, Organizational Efficiency, Middle Aged, Machine Learning, Uruguay, Aged, Adult, Random Forest
ABSTRACT
Background: The mechanisms through which acculturation influences the onset of cognitive impairment and dementia are not well understood, especially among older Hispanics. Objective: To investigate whether inflammation and psycho-behavioral factors mediate the relationship between acculturation and incident dementia among older Mexican Americans. Methods: We analyzed the Sacramento Area Latino Study on Aging (SALSA; 1998-2007), a longitudinal study (N = 1,194) with 10 years of follow-up, and used g-computation for mediation analysis with pooled logistic regression to evaluate whether acculturation (assessed by the Revised Acculturation Rating Scale for Mexican Americans [ARSMA-II]) affected dementia or cognitive impairment but not dementia (CIND) through inflammation (i.e., interleukin 6 [IL-6], tumor necrosis factor-α [TNF-α], high-sensitivity C-reactive protein [hs-CRP]), smoking, alcohol consumption, and depressive symptoms. The potential mediators were assessed at baseline. Results: The 10-year average adjusted risk ratio (aRR) for the effect of high U.S. acculturation on dementia/CIND was 0.66 (95% CI: 0.36, 1.30). The indirect effects were: IL-6 (aRR = 0.98, 95% CI: 0.88, 1.05); TNF-α (aRR = 0.99, 95% CI: 0.93, 1.05); hs-CRP (aRR = 1.21, 95% CI: 0.84, 1.95); current smoking (aRR = 0.97, 95% CI: 0.84, 1.16); daily/weekly alcohol consumption (aRR = 1.00, 95% CI: 0.96, 1.05); and depressive symptom score (aRR = 1.03, 95% CI: 0.95, 1.26). Hs-CRP yielded a proportion mediated of -26%, suggesting that hs-CRP could suppress the potential effect of high U.S. acculturation. The other factors explored resulted in little to no mediation. Conclusions: The effect of acculturation on time to incident dementia/CIND varied over time. Our study suggests that inflammation could suppress the effect of high U.S. acculturation on dementia risk.
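The g-computation mediation logic can be sketched in a few lines: fit a mediator model and an outcome model, then contrast standardized risks with the exposure and mediator set to their values under different exposure regimes. The sketch below is a simplified, single-mediator illustration on simulated data; it is not the study's pooled logistic regression over 10 years of follow-up, and all variable names and coefficients are assumptions.

```python
import numpy as np
import statsmodels.api as sm

# Minimal g-computation sketch for one mediator (an inflammatory marker) on a
# binary outcome. Data and model forms are simulated and deliberately simple.
rng = np.random.default_rng(1)
n = 1000
accult = rng.binomial(1, 0.5, n)                        # high U.S. acculturation
crp = 1.0 - 0.3 * accult + rng.normal(0, 1, n)          # mediator
p_dem = 1 / (1 + np.exp(-(-2 + 0.2 * crp - 0.3 * accult)))
dementia = rng.binomial(1, p_dem)

med_fit = sm.OLS(crp, sm.add_constant(accult)).fit()                      # mediator model
out_fit = sm.Logit(dementia,
                   sm.add_constant(np.column_stack([accult, crp]))).fit(disp=0)  # outcome model

def risk(a, m):
    X = np.column_stack([np.ones(n), np.full(n, a), m])
    return out_fit.predict(X).mean()

m_a1 = med_fit.predict(np.column_stack([np.ones(n), np.ones(n)]))   # mediator under exposure
m_a0 = med_fit.predict(np.column_stack([np.ones(n), np.zeros(n)]))  # mediator under no exposure

total_rr = risk(1, m_a1) / risk(0, m_a0)
indirect_rr = risk(1, m_a1) / risk(1, m_a0)              # effect through the mediator
print(f"Total RR ~ {total_rr:.2f}, indirect (mediated) RR ~ {indirect_rr:.2f}")
```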
Subject(s)
Acculturation, Dementia, Inflammation, Mexican Americans, Humans, Dementia/ethnology, Dementia/epidemiology, Dementia/psychology, Mexican Americans/psychology, Mexican Americans/statistics & numerical data, Male, Female, Aged, Inflammation/blood, Inflammation/ethnology, Inflammation/psychology, Longitudinal Studies, Aged, 80 and over, Incidence, Risk Factors, C-Reactive Protein/metabolism, Depression/ethnology, Depression/psychology, Depression/epidemiology, Interleukin-6/blood
ABSTRACT
Background: We aimed to determine the effectiveness of switching to bictegravir in maintaining an undetectable viral load (<50 copies/mL) among people with HIV (PWH) as compared with continuing dolutegravir-, efavirenz-, or raltegravir-based antiretroviral therapy, using nationwide observational data from Mexico. Methods: We emulated 3 target trials comparing switching to bictegravir vs continuing with dolutegravir, efavirenz, or raltegravir. Eligibility criteria were PWH aged ≥16 years with a viral load <50 copies/mL and at least 3 months of current antiretroviral therapy (dolutegravir, efavirenz, or raltegravir) between July 2019 and September 2021. Weekly target trials were emulated during the study period, and individuals were included in every emulation if they continued to be eligible. The main outcome was the probability of an undetectable viral load at 3 months, which was estimated via an adjusted logistic regression model. Estimated probabilities were compared via differences, and 95% CIs were calculated via bootstrap. Outcomes were also ascertained at 12 months, and sensitivity analyses were performed to test our analytic choices. Results: We analyzed data from 3,028,619 PWH (63,581 unique individuals). The probability of an undetectable viral load at 3 months was 2.9% (95% CI, 1.9%-3.8%), 1.3% (95% CI, 0.9%-1.6%), and 1.2% (95% CI, 0.8%-1.7%) higher when switching to bictegravir vs continuing with dolutegravir, efavirenz, and raltegravir, respectively. Similar results were observed at 12 months and in other sensitivity analyses. Conclusions: Our findings suggest that switching to bictegravir could be more effective in maintaining viral suppression than continuing with dolutegravir, efavirenz, or raltegravir.
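The core estimation step of one emulated trial (an adjusted logistic model, standardization to obtain a difference in outcome probabilities, and a bootstrap confidence interval) can be sketched as below. The data, covariates, and sample size are simulated assumptions; the actual analysis pooled weekly emulations and additional adjustment variables.

```python
import numpy as np
import statsmodels.api as sm

# Simplified sketch of one emulated target trial: adjusted logistic regression
# for undetectable viral load at 3 months, standardised over the full sample,
# with a bootstrap CI for the "switch vs continue" difference. Data simulated.
rng = np.random.default_rng(2)
n = 2000
switch = rng.binomial(1, 0.3, n)
age = rng.normal(40, 10, n)
p = 1 / (1 + np.exp(-(2.0 + 0.3 * switch - 0.01 * age)))
undetectable = rng.binomial(1, p)

def risk_difference(idx):
    X = np.column_stack([np.ones(len(idx)), switch[idx], age[idx]])
    fit = sm.Logit(undetectable[idx], X).fit(disp=0)
    X1 = X.copy(); X1[:, 1] = 1          # everyone switches
    X0 = X.copy(); X0[:, 1] = 0          # everyone continues
    return fit.predict(X1).mean() - fit.predict(X0).mean()

est = risk_difference(np.arange(n))
boot = [risk_difference(rng.integers(0, n, n)) for _ in range(200)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"Risk difference: {est:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```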
ABSTRACT
Many techniques have been proposed to model space-varying observation processes with a nonstationary spatial covariance structure and/or anisotropy, usually within a geostatistical framework. Nevertheless, there is increasing interest in point process applications, and methodologies that take nonstationarity into account are welcome. In this sense, this work proposes an extension of a class of spatial Cox processes using spatial deformation. The proposed method enables the deformation behavior to be data-driven, through a multivariate latent Gaussian process. Inference leads to intractable posterior distributions that are approximated via MCMC. The convergence of algorithms based on Metropolis-Hastings steps proved to be slow, and the computational efficiency of the Bayesian updating scheme was improved by adopting Hamiltonian Monte Carlo (HMC) methods. Our proposal was also compared against an alternative anisotropic formulation. Studies based on synthetic data provided empirical evidence of the benefit of accommodating nonstationarity through our anisotropic structure. A real data application was conducted on the spatial spread of the Spodoptera frugiperda pest in a corn-producing agricultural area in southern Brazil. Once again, the proposed method demonstrated its benefit over the alternatives.
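For readers unfamiliar with the HMC updating scheme mentioned above, a generic single HMC step (leapfrog integration plus a Metropolis accept/reject correction) is sketched below on a toy two-dimensional Gaussian target. This is only the generic sampler; it is not the paper's latent deformation model or its specific implementation.

```python
import numpy as np

# Generic Hamiltonian Monte Carlo step: leapfrog integrator + accept/reject.
# The target is a toy bivariate standard normal, chosen only for illustration.
def log_post(q):
    return -0.5 * np.sum(q ** 2)

def grad_log_post(q):
    return -q

def hmc_step(q, eps=0.1, n_leapfrog=20, rng=np.random.default_rng()):
    p = rng.normal(size=q.shape)                    # sample momentum
    q_new, p_new = q.copy(), p.copy()
    p_new += 0.5 * eps * grad_log_post(q_new)       # half step for momentum
    for _ in range(n_leapfrog - 1):
        q_new += eps * p_new                        # full step for position
        p_new += eps * grad_log_post(q_new)         # full step for momentum
    q_new += eps * p_new
    p_new += 0.5 * eps * grad_log_post(q_new)       # final half step
    # Metropolis correction on the joint (position, momentum) energy
    log_accept = (log_post(q_new) - 0.5 * np.sum(p_new ** 2)) \
                 - (log_post(q) - 0.5 * np.sum(p ** 2))
    return q_new if np.log(rng.uniform()) < log_accept else q

q, samples = np.zeros(2), []
for _ in range(1000):
    q = hmc_step(q)
    samples.append(q.copy())
print("Posterior mean estimate:", np.mean(samples, axis=0).round(2))
```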
ABSTRACT
This paper introduces a new latent variable probabilistic framework for representing spectral data of high spatial and spectral dimensionality, such as hyperspectral images. We use a generative Bayesian model to represent the image formation process and provide interpretable and efficient inference and learning methods. Surprisingly, our approach can be implemented with simple tools and does not require extensive training data, detailed pixel-by-pixel labeling, or significant computational resources. Numerous experiments with simulated data and real benchmark scenarios show encouraging image classification performance. These results validate the unique ability of our framework to discriminate complex hyperspectral images, irrespective of the presence of highly discriminative spectral signatures.
ABSTRACT
Function and structure are strongly coupled in obligate oligomers such as triosephosphate isomerase (TIM). In animals and fungi, TIM monomers are inactive and unstable. Previously, we used ancestral sequence reconstruction to study TIM evolution and found that, before these lineages diverged, the last opisthokont common ancestor of TIM (LOCATIM) was an obligate oligomer resembling extant TIMs. Notably, calorimetric evidence indicated that ancestral TIM monomers are more structured than extant ones. To further increase confidence in the function, structure, and stability of LOCATIM, in this work we applied two different inference methodologies, together with the worst plausible case scenario for each, to infer four sequences of this ancestor and test the robustness of their physicochemical properties. The extensive biophysical characterization of the four reconstructed LOCATIM sequences showed very similar hydrodynamic and spectroscopic properties, as well as ligand-binding energetics and catalytic parameters. Their 3D structures were also conserved. Although differences were observed in melting temperature, all LOCATIMs showed reversible urea-induced unfolding transitions, and for those that reached equilibrium, high conformational stability was estimated (ΔGTot = 40.6-46.2 kcal/mol). The stability of the inactive monomeric intermediates was also high (ΔGunf = 12.6-18.4 kcal/mol), resembling some protozoan TIMs rather than the unstable monomer observed in extant opisthokonts. A comparative analysis of the 3D structures of ancestral and extant TIMs shows a correlation between the higher stability of the ancestral monomers and the presence of several hydrogen bonds located in the "bottom" part of the barrel.
Subject(s)
Triose-Phosphate Isomerase, Triose-Phosphate Isomerase/chemistry, Triose-Phosphate Isomerase/genetics, Triose-Phosphate Isomerase/metabolism, Animals, Evolution, Molecular, Protein Multimerization, Models, Molecular, Enzyme Stability
ABSTRACT
Although genome-wide association studies (GWAS) have been widely used to understand the genetic architecture of complex quantitative traits, interpreting their results in terms of the biological processes that determine those traits has been difficult or even lacking, because of the variability in responses to the tests of hypotheses within a trait, species, and breed or cross, and the lack of follow-up studies. It is therefore essential to employ appropriate statistical tests that point to the causal genes responsible for the relevant fraction of the genetic variability observed. We briefly review the main theoretical aspects of the two schools of causal inference (Rubin's Causal Model, RCM, and Pearl's causal inference, PCI). RCM approaches hypothesis testing from a randomization perspective by considering a wider space of observation, i.e. the "potential outcomes", rather than the narrower space that results from defining "treatment" effects after observing the data. Next, we discuss the assumptions involved in meeting the requirements of randomization for RCM with observational data (non-designed experiments), with special emphasis on the Stable Unit Treatment Value Assumption (SUTVA). Owing to the presence of "confounders" (i.e. systematic fixed effects, permanent environmental effects, interactions among genes, etc.), causal average treatment effects are viewed through the familiar lens of normal linear (or mixed) models. To overcome the difficulties of association analyses, a test of causal effects is introduced that uses independent predicted residual breeding values from animal models of genetic evaluation, thereby avoiding the effects of population structure and confounders. An independent section discusses whether the additive effects defined at the "gene" level by R. A. Fisher and popularized in D. S. Falconer's textbook of quantitative genetics can be termed causal under either RCM or PCI.
ABSTRACT
A central challenge in hypothesis testing (HT) lies in determining the optimal balance between Type I (false positive) and Type II (non-detection or false negative) error probabilities. Analyzing these errors' exponential rate of convergence, known as error exponents, provides crucial insights into system performance. Error exponents offer a lens through which we can understand how operational restrictions, such as resource constraints and impairments in communications, affect the accuracy of distributed inference in networked systems. This survey presents a comprehensive review of key results in HT, from the foundational Stein's Lemma to recent advancements in distributed HT, all unified through the framework of error exponents. We explore asymptotic and non-asymptotic results, highlighting their implications for designing robust and efficient networked systems, such as event detection through lossy wireless sensor monitoring networks, collective perception-based object detection in vehicular environments, and clock synchronization in distributed environments, among others. We show that understanding the role of error exponents provides a valuable tool for optimizing decision-making and improving the reliability of networked systems.
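As a concrete reference point for the error-exponent viewpoint discussed above, the following is a standard statement of the Chernoff-Stein lemma in conventional notation (a textbook result cited by such surveys, reproduced here for orientation rather than taken from this particular paper):

```latex
% Chernoff--Stein lemma (standard form): for i.i.d. observations and a fixed
% type-I error constraint, the best achievable type-II error decays
% exponentially at a rate equal to the KL divergence between the hypotheses.
\[
  \text{For } H_0: X_i \sim P_0 \ \text{vs.}\ H_1: X_i \sim P_1,\qquad
  \lim_{n \to \infty} -\frac{1}{n}\log \beta_n(\alpha) \;=\; D(P_0 \,\|\, P_1),
\]
\[
  \text{where } \beta_n(\alpha) \text{ is the minimum type-II error over tests with type-I error at most } \alpha\in(0,1),
  \qquad D(P_0\,\|\,P_1) = \sum_x P_0(x)\,\log\frac{P_0(x)}{P_1(x)}.
\]
```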
ABSTRACT
Modality is an important consideration in statistical modelling, and parametric models are an efficient choice when a real data set exhibits trimodality. In this paper, we propose a new class of trimodal probability distributions, that is, probability distributions that have up to three modes. Trimodality is achieved by applying a suitable transformation to the density function of certain continuous probability distributions. We first obtain preliminary results for an arbitrary density function g(x) and then focus on the Gaussian case, studying the trimodal Gaussian model in more depth. The transformation is applied to the Gaussian (normal) distribution to produce a trimodal normal distribution; the tractability of the normal density and the properties of the resulting trimodal normal distribution are the main reasons for this choice. Moreover, existing distributions need such extensions to model data sets with a trimodal form efficiently. Once the new density function has been proposed, its parameters must be estimated; the computational steps are carried out in Mathematica 12.0, which provides the necessary optimization tools and modelling techniques. Bootstrapped versions of real data sets are used to demonstrate the modelling ability of the proposed distribution when the data exhibit trimodality.
ABSTRACT
We consider unsupervised classification by means of a latent multinomial variable that categorizes a scalar response into one of the L components of a mixture model incorporating scalar and functional covariates. This process can be thought of as a hierarchical model, with the first level modelling a scalar response according to a mixture of parametric distributions and the second level modelling the mixture probabilities by means of a generalized linear model with functional and scalar covariates. The traditional approach of treating functional covariates as vectors not only suffers from the curse of dimensionality, since functional covariates can be measured at very small intervals and lead to a highly parametrized model, but also fails to take into account the nature of the data. We use basis expansions to reduce the dimensionality and a Bayesian approach for estimating the parameters while providing predictions of the latent classification vector. The method is motivated by two data examples that are not easily handled by existing methods: the first concerns identifying placebo responders in a clinical trial (normal mixture model), and the other concerns predicting illness in milking cows (zero-inflated Poisson mixture model).
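The dimension-reduction step described above (replacing each observed curve by a few basis coefficients that then enter the regression for the mixture probabilities) can be sketched generically. The snippet below uses a simple Fourier basis and least-squares projection on simulated curves; the basis type and size are arbitrary assumptions, not the paper's choices.

```python
import numpy as np

# Minimal sketch of a basis expansion for a functional covariate: each curve
# observed on a fine grid is projected onto a small Fourier basis, and the
# resulting coefficients replace the raw curve as predictors. Data simulated.
rng = np.random.default_rng(3)
t = np.linspace(0, 1, 200)                     # observation grid
n_curves, n_basis = 50, 5

def fourier_basis(t, K):
    cols = [np.ones_like(t)]
    for k in range(1, K // 2 + 1):
        cols += [np.sin(2 * np.pi * k * t), np.cos(2 * np.pi * k * t)]
    return np.column_stack(cols)[:, :K]

B = fourier_basis(t, n_basis)                  # (200, 5) basis matrix
curves = (rng.normal(size=(n_curves, 1)) * np.sin(2 * np.pi * t)
          + rng.normal(scale=0.1, size=(n_curves, t.size)))

# Least-squares projection: a few coefficients summarise each curve
coefs, *_ = np.linalg.lstsq(B, curves.T, rcond=None)
X_functional = coefs.T                          # (50, 5) design columns
print("Reduced design matrix shape:", X_functional.shape)
```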
ABSTRACT
Countries within the tropics face ongoing challenges in completing or updating their national forest inventories (NFIs), which are critical for estimating aboveground biomass (AGB) and for forest-related greenhouse gas (GHG) accounting. While previous studies have explored the integration of map information with local reference data to fill data gaps, limited attention has been given to the specific challenges presented by the clustered plot designs frequently employed by NFIs when combined with remote sensing-based biomass map units. This research addresses these complexities through four country case studies (Peru, Guyana, Tanzania, Mozambique), encompassing a variety of NFI characteristics within a range of AGB densities. We assess the potential of the European Space Agency's Climate Change Initiative (CCI) global biomass maps to increase the precision of (sub)national AGB estimates, comparing a baseline approach using NFI field-based data with a model-assisted scenario incorporating a locally calibrated CCI biomass map as auxiliary information. The original CCI biomass maps systematically underestimate AGB in three of the four countries at both the country and stratum level, with particularly weak agreement at finer map resolution. However, after calibration with country-specific NFI data, stratum- and country-level AGB estimates from the model-assisted scenario align well with those obtained solely from field-based data and with official country reports. Introducing maps as a source of auxiliary information moderately increased the precision of stratum- and country-level AGB estimates, offering greater confidence in estimating AGB for GHG reporting purposes. Considering the challenges tropical countries face in implementing their NFIs, it is sensible to explore the potential benefits of biomass maps for climate change reporting mechanisms across biomes. While country-specific NFI design assumptions guided our model-assisted inference strategies, this study also uncovers transferable insights from the application of global biomass maps with NFI data, providing valuable lessons for the climate research and policy communities.
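The model-assisted idea referred to above can be illustrated with the simplest difference-type estimator: calibrate the map against field plots, take the mean of calibrated map values over all map units, and correct it with the mean residual on the sampled plots. The sketch below uses simulated numbers and simple random sampling; it ignores the stratified, clustered NFI designs that the study actually had to accommodate.

```python
import numpy as np

# Minimal model-assisted (difference estimator) sketch: calibrate map AGB
# against field-plot AGB with a linear model, then combine the population mean
# of calibrated map values with the mean residual on the sampled plots.
rng = np.random.default_rng(4)
N, n = 100_000, 400                                     # map units, sampled plots
true_agb = rng.gamma(shape=2.0, scale=60.0, size=N)     # Mg/ha, synthetic "truth"
map_agb = 0.7 * true_agb + rng.normal(0, 20, N)         # systematically low map

sample = rng.choice(N, size=n, replace=False)
y, x = true_agb[sample], map_agb[sample]

# Calibration: field AGB ~ map AGB, fitted on the sample
slope, intercept = np.polyfit(x, y, 1)
calibrated = intercept + slope * map_agb

# Model-assisted estimate = mean calibrated map value + mean sample residual
mean_ma = calibrated.mean() + (y - calibrated[sample]).mean()
print(f"Field-only mean: {y.mean():.1f}  Model-assisted mean: {mean_ma:.1f} Mg/ha")
```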
Subject(s)
Biomass, Climate Change, Environmental Monitoring, Environmental Monitoring/methods, Forests, Tanzania, Tropical Climate, Mozambique, Guyana, Greenhouse Gases/analysis
ABSTRACT
A priority of nutrition science is to identify dietary determinants of health and disease to inform effective public health policies, guidelines, and clinical interventions. Yet, conflicting findings in synthesizing evidence from randomized trials and observational studies have contributed to confusion and uncertainty. Often, heterogeneity can be explained by the fact that seemingly similar bodies of evidence are asking very different questions. Improving the alignment within and between research domains begins with investigators clearly defining their diet and disease questions; however, nutritional exposures are complex and often require a greater degree of specificity. First, dietary data are compositional, meaning a change in a food may imply a compensatory change of other foods. Second, dietary data are multidimensional; that is, the primary components (ie, foods) comprise subcomponents (eg, nutrients), and subcomponents can be present in multiple primary components. Third, because diet is a lifelong exposure, the composition of a study population's background diet has implications for the interpretation of the exposure and the transportability of effect estimates. Collectively clarifying these key aspects of inherently complex dietary exposures when conducting research will facilitate appropriate evidence synthesis, improve certainty of evidence, and improve the ability of these efforts to inform policy and decision-making.
Subject(s)
Diet, Nutritional Sciences, Humans, Diet/standards, Research Design
ABSTRACT
Leptospirosis is a global zoonotic disease caused by spirochete bacteria of the genus Leptospira. The disease exhibits a notable incidence in tropical and developing countries, and in Colombia, environmental, economic, social, and cultural conditions favor disease transmission, directly impacting both mortality and morbidity rates. Our objective was to estimate the pooled lagged effect of runoff on leptospirosis cases in Colombia. We included the 20 Colombian municipalities with the highest number of leptospirosis cases. Monthly cases of leptospirosis, confirmed by laboratory tests and spanning from 2007 to 2022, were obtained from the National Public Health Surveillance System. Additionally, we collected monthly runoff and atmospheric and oceanic data from remote sensors. Multidimensional poverty index values for each municipality were sourced from the Terridata repository. We employed causal inference and distributed lag nonlinear models to estimate the lagged effect of runoff on leptospirosis cases, and municipality-specific estimates were combined through meta-analysis to derive a single estimate for all municipalities under study. The pooled results for the 20 municipalities suggest a lagged effect of runoff on leptospirosis over the 0-2 and 0-3 month lag windows when runoff is below 120 g/m2. No effect was identified for the other lag windows examined (0-1, 0-4, 0-5, and 0-6 months) or for higher runoff values. Incorporating the multidimensional poverty index into the meta-analysis of runoff contributed to the models for the 0-3 and 0-4 month lag windows.
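A heavily simplified version of the lag modelling described above is sketched below: monthly case counts are regressed on runoff at lags 0-3 months with a Poisson GLM. This is a linear-in-lags approximation of the distributed lag nonlinear (cross-basis) models actually used in the study, and the series is simulated.

```python
import numpy as np
import statsmodels.api as sm

# Simplified distributed-lag sketch: monthly leptospirosis counts regressed on
# runoff at lags 0-3 months with a Poisson GLM. Data are synthetic.
rng = np.random.default_rng(5)
months = 192                                     # roughly 2007-2022
runoff = rng.gamma(2.0, 40.0, months)            # g/m2, synthetic series

max_lag = 3
lagged = np.column_stack([np.roll(runoff, L) for L in range(max_lag + 1)])
lagged = lagged[max_lag:]                        # drop months without full lag history
rate = np.exp(-1.0 + 0.004 * lagged[:, 1] + 0.003 * lagged[:, 2])
cases = rng.poisson(rate * 5)

X = sm.add_constant(lagged)
fit = sm.GLM(cases, X, family=sm.families.Poisson()).fit()
print(fit.params.round(4))                       # lag-specific log-rate effects
```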
ABSTRACT
Ensuring that the proposed probabilistic model accurately represents the problem is a critical step in statistical modeling, as choosing a poorly fitting model can have significant repercussions on the decision-making process. The primary objective of statistical modeling often revolves around predicting new observations, highlighting the importance of assessing the model's accuracy. However, current methods for evaluating predictive ability typically involve model comparison, which may not guarantee a good model selection. This work presents an accuracy measure designed for evaluating a model's predictive capability. This measure, which is straightforward and easy to understand, includes a decision criterion for model rejection. The development of this proposal adopts a Bayesian perspective of inference, elucidating the underlying concepts and outlining the necessary procedures for application. To illustrate its utility, the proposed methodology was applied to real-world data, facilitating an assessment of its practicality in real-world scenarios.
ABSTRACT
The edible chiton Chiton articulatus is a commercially important mollusk found in the rocky intertidal zones of the Mexican tropical Pacific. Despite intense harvesting in Acapulco Bay, Mexico, knowledge of its growth patterns is limited, hindering the development of effective management strategies. This study investigated the growth dynamics of C. articulatus using a multi-model inference approach based on size-structure data collected in four sampling periods covering four decades. Results revealed continuous recruitment throughout the year, contributing to population resilience. The species exhibited growth plasticity, highlighting its adaptive potential. We found complex temporal patterns influenced mainly by climatic events: the El Niño event showed higher growth rates and lower asymptotic length, while La Niña events showed the opposite pattern. This research provides insights into the growth dynamics of C. articulatus, highlighting the need for holistic management strategies for this commercially important species in the face of environmental change.
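The multi-model inference arithmetic (fit several candidate growth curves and weight them by AIC) can be sketched as below. The study worked from size-structure (length-frequency) data, so this length-at-age illustration with made-up parameters only shows the model-comparison step, not the study's estimation pipeline.

```python
import numpy as np
from scipy.optimize import curve_fit

# Minimal multi-model inference sketch: fit two candidate growth curves to
# simulated length-at-age data and compare them with Akaike weights.
rng = np.random.default_rng(6)
age = np.linspace(0.5, 8, 60)
length = 90 * (1 - np.exp(-0.35 * (age - 0.1))) + rng.normal(0, 3, age.size)

def von_bertalanffy(t, Linf, k, t0):
    return Linf * (1 - np.exp(-k * (t - t0)))

def gompertz(t, Linf, k, t0):
    return Linf * np.exp(-np.exp(-k * (t - t0)))

def aic(model, p0):
    params, _ = curve_fit(model, age, length, p0=p0, maxfev=10_000)
    resid = length - model(age, *params)
    n, k = length.size, len(params) + 1           # +1 for the error variance
    return n * np.log(np.mean(resid ** 2)) + 2 * k

aics = np.array([aic(von_bertalanffy, [100, 0.3, 0]), aic(gompertz, [100, 0.5, 1])])
delta = aics - aics.min()
weights = np.exp(-0.5 * delta) / np.exp(-0.5 * delta).sum()
print("Akaike weights (VB, Gompertz):", weights.round(3))
```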
Subject(s)
Polyplacophora, Population Dynamics, Animals, Mexico, Polyplacophora/physiology, Polyplacophora/growth & development
ABSTRACT
The article proposes a new regression model based on the generalized odd log-logistic family for interval-censored data. For this type of data, the survival times are not observed exactly; the event of interest is only known to have occurred within some random interval. This family is well suited to interval-censored modeling, since it generalizes several popular lifetime distributions and can accommodate various shapes of the risk function. Parameter estimation is addressed by both classical and Bayesian methods, and we examine the behavior of the estimates for several sample sizes and censoring percentages. Selection criteria, likelihood ratio tests, residual analysis, and graphical techniques are used to assess the goodness of fit of the fitted models. The usefulness of the proposed models is shown by means of two real data sets.
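The interval-censored likelihood at the heart of such models is simple: each observation contributes S(L) - S(R), the probability that the event time falls in its observed interval. The sketch below uses a plain log-logistic baseline on simulated data, not the full generalized odd log-logistic family of the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Minimal interval-censored likelihood sketch with a plain log-logistic
# baseline: each observation contributes S(L_i) - S(R_i). Data are simulated.
rng = np.random.default_rng(7)
n = 300
u = rng.uniform(size=n)
true_t = 5.0 * (u / (1 - u)) ** (1 / 1.5)        # log-logistic(alpha=5, beta=1.5)
left = np.floor(true_t)                          # inspection interval [left, left + 1)
right = left + 1.0

def surv(t, alpha, beta):                        # log-logistic survival function
    return 1.0 / (1.0 + (t / alpha) ** beta)

def neg_loglik(par):
    alpha, beta = np.exp(par)                    # keep parameters positive
    prob = surv(left, alpha, beta) - surv(right, alpha, beta)
    return -np.sum(np.log(np.maximum(prob, 1e-300)))

fit = minimize(neg_loglik, x0=np.log([5.0, 1.0]), method="Nelder-Mead")
alpha_hat, beta_hat = np.exp(fit.x)
print(f"alpha = {alpha_hat:.2f}, beta = {beta_hat:.2f}")
```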
ABSTRACT
The identification of orthologous genes is relevant for comparative genomics, phylogenetic analysis, and functional annotation. There are many computational tools for the prediction of orthologous groups, as well as web-based resources that offer orthology datasets for download and online analysis. This chapter presents a simple and practical guide to the process of orthologous group prediction, using a dataset of 10 prokaryotic proteomes as an example. The orthology methods covered are OrthoMCL, COGtriangles, OrthoFinder2, and OMA. The authors compare the number of orthologous groups predicted by these methods and present a brief workflow for the functional annotation and reconstruction of phylogenies from inferred single-copy orthologous genes. The chapter also demonstrates how to explore two orthology databases: eggNOG6 and OrthoDB.
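A common downstream step in such workflows is filtering orthogroups for those that are single-copy in every proteome before building a concatenated phylogeny. The sketch below parses an OrthoFinder-style orthogroups table; the file path and tab-separated layout (one orthogroup per row, one column per species, comma-separated gene lists) are assumptions based on OrthoFinder's typical output, not instructions taken from the chapter.

```python
import csv

# Hedged sketch: keep only orthogroups that are single-copy in every proteome,
# a common prerequisite for phylogenies from single-copy orthologous genes.
# The default path and column layout are assumed, not guaranteed.
def single_copy_orthogroups(path="Orthogroups/Orthogroups.tsv"):
    single_copy = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)                               # header: group ID + one column per species
        for row in reader:
            gene_lists = [cell.split(", ") if cell else [] for cell in row[1:]]
            if all(len(genes) == 1 for genes in gene_lists):
                single_copy.append(row[0])
    return single_copy

# Example usage (assuming 10 proteomes were analysed, as in the chapter):
# groups = single_copy_orthogroups()
# print(f"{len(groups)} single-copy orthogroups across all proteomes")
```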
Subject(s)
Genomics, Phylogeny, Genomics/methods, Computational Biology/methods, Software, Prokaryotic Cells/metabolism, Databases, Genetic, Molecular Sequence Annotation/methods, Multigene Family, Genome, Bacterial
ABSTRACT
BACKGROUND: The survival advantage of neoadjuvant systemic therapy (NST) for breast cancer patients remains controversial, especially when considering the heterogeneous characteristics of individual patients. OBJECTIVE: To discern the variability in responses to breast cancer treatment at the individual level and propose personalized treatment recommendations using deep learning (DL). METHODS: Six models were developed to offer individualized treatment suggestions. Outcomes for patients whose actual treatments aligned with the model recommendations were compared with outcomes for those whose treatments did not. The influence of selected baseline patient features on NST selection was visualized and quantified by multivariate logistic regression and Poisson regression analyses. RESULTS: Our study included 94,487 female breast cancer patients. The Balanced Individual Treatment Effect for Survival data (BITES) model outperformed the other models, showing a statistically significant protective effect with inverse probability of treatment weighting (IPTW)-adjusted baseline features [IPTW-adjusted hazard ratio: 0.51, 95% confidence interval (CI): 0.41-0.64; IPTW-adjusted risk difference: 21.46, 95% CI: 18.90-24.01; IPTW-adjusted difference in restricted mean survival time: 21.51, 95% CI: 19.37-23.80]. Adherence to BITES recommendations was associated with reduced breast cancer mortality and fewer adverse effects. BITES suggests that patients with TNM stage IIB or IIIB disease, the triple-negative subtype, a higher number of positive axillary lymph nodes, and larger tumors are most likely to benefit from NST. CONCLUSIONS: Our results demonstrate the potential of BITES to aid clinical treatment decisions and offer quantitative treatment insights. In further research, these models should be validated in clinical settings, and additional patient features as well as outcome measures should be studied in depth.
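The IPTW adjustment mentioned in the results can be illustrated in its simplest form: estimate a propensity score for receiving NST from baseline features, form stabilised inverse-probability weights, and compare weighted outcomes. The sketch below uses simulated data and a plain weighted mean difference rather than the survival outcomes analysed in the study; the covariate and coefficients are assumptions.

```python
import numpy as np
import statsmodels.api as sm

# Minimal inverse probability of treatment weighting (IPTW) sketch:
# propensity score via logistic regression, stabilised weights, and a
# weighted contrast of outcomes. All data below are simulated.
rng = np.random.default_rng(8)
n = 5000
tumor_size = rng.normal(30, 10, n)                       # mm, baseline feature
ps_true = 1 / (1 + np.exp(-(-2 + 0.05 * tumor_size)))
nst = rng.binomial(1, ps_true)                           # received NST or not
outcome = 0.7 - 0.1 * nst - 0.005 * tumor_size + rng.normal(0, 0.1, n)

X = sm.add_constant(tumor_size)
ps = sm.Logit(nst, X).fit(disp=0).predict(X)             # estimated propensity score
w = np.where(nst == 1, nst.mean() / ps, (1 - nst.mean()) / (1 - ps))

treated = np.average(outcome[nst == 1], weights=w[nst == 1])
control = np.average(outcome[nst == 0], weights=w[nst == 0])
print(f"IPTW-adjusted difference in outcome: {treated - control:.3f}")
```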