Results 1 - 20 of 346
1.
J Appl Stat ; 51(11): 2178-2196, 2024.
Article in English | MEDLINE | ID: mdl-39157271

ABSTRACT

This paper aims to evaluate the statistical association between exposure to air pollution and forced expiratory volume in the first second (FEV1) in both asthmatic and non-asthmatic children and teenagers, in which the response variable FEV1 was repeatedly measured on a monthly basis, characterizing a longitudinal experiment. Due to the nature of the data, a robust linear mixed model (RLMM), combined with a robust principal component analysis (RPCA), is proposed to handle the multicollinearity among the covariates and the impact of extreme observations (high levels of air contaminants) on the estimates. The Huber and Tukey loss functions are considered to obtain robust estimators of the parameters in the linear mixed model (LMM). A finite sample size investigation is conducted under the scenario where the covariates follow linear time series models with and without additive outliers (AO). The impact of the time-correlation and the outliers on the estimates of the fixed effect parameters in the LMM is investigated. In the real data analysis, the robust modeling strategy showed that RPCA yields three principal components (PCs), mainly related to relative humidity (Hmd), particulate matter with a diameter smaller than 10 µm (PM10) and particulate matter with a diameter smaller than 2.5 µm (PM2.5).
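As an illustration of the two loss functions named in this abstract, the following is a minimal sketch of the standard Huber and Tukey (bisquare) losses applied to a single residual. The function names and the tuning constants 1.345 and 4.685 are assumptions (they are conventional textbook defaults, not values reported in the abstract), and the sketch does not reproduce the authors' mixed-model estimation.

```python
def huber_loss(r, k=1.345):
    """Huber loss: quadratic near zero, linear in the tails (k is an assumed default)."""
    a = abs(r)
    if a <= k:
        return 0.5 * r * r
    return k * (a - 0.5 * k)


def tukey_loss(r, c=4.685):
    """Tukey bisquare loss: bounded, constant beyond |r| > c (c is an assumed default)."""
    if abs(r) <= c:
        u = 1.0 - (r / c) ** 2
        return (c * c / 6.0) * (1.0 - u ** 3)
    return c * c / 6.0


# Small residuals are treated alike; large residuals are penalized far less
# than under squared error, and under Tukey's loss not at all beyond c.
for residual in (0.5, 2.0, 10.0):
    print(residual, 0.5 * residual ** 2, huber_loss(residual), tukey_loss(residual))
```

The bounded Tukey loss gives extreme observations no additional influence once they exceed the cut-off, whereas the Huber loss only slows their growth from quadratic to linear.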

2.
Prog Brain Res ; 287: 1-24, 2024.
Article in English | MEDLINE | ID: mdl-39097349

ABSTRACT

In a recent study employing time production, a number of participants presented aberrant data, which normally would have marked them as being outliers. Given the ongoing discussion in the literature regarding the illusory nature of the flow of time, in this paper we consider whether their data may indicate discontinuity in time perception. We analyze the log-log plots for these outliers, investigating to what degree linearity is preserved for all the data points, as opposed to achieving a better fit using bisegmental regression. The current results, though preliminary, can contribute to the debate regarding the non-linearity of subjective time. It would seem that with longer target durations, the ongoing experience of time can be either one of a subjective slowing down of time (longer time units, increase in slope), or of a subjective speeding up of time (shorter time units, decrease in slope).
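The comparison described in this abstract, a single straight line versus a two-segment fit on log-log data, can be sketched as follows. This is only an illustrative sketch: the target durations and produced times are synthetic, the helper names `fit_line` and `best_two_segment` are hypothetical, and the two segments are fitted independently, which may differ from the bisegmental procedure the authors used.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept, sum of squared errors)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def best_two_segment(xs, ys, min_pts=3):
    """Try every split point (xs sorted) and keep the one with the lowest total error.

    Returns (total_sse, split_index, left_slope, right_slope).
    """
    best = None
    for k in range(min_pts, len(xs) - min_pts + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, k, left[0], right[0])
    return best


# Synthetic data: slope 1 up to 8 s, then slope 1.5 (an increase in slope).
targets = [1, 2, 4, 8, 16, 32, 64, 128]
log_t = [math.log(t) for t in targets]
break_x = math.log(8)
log_p = [x if x <= break_x else break_x + 1.5 * (x - break_x) for x in log_t]

single = fit_line(log_t, log_p)
two = best_two_segment(log_t, log_p)
print("single-line SSE:", single[2])
print("two-segment SSE:", two[0], "slopes:", two[2], two[3])
```

A markedly lower error for the two-segment fit, together with a change in slope between segments, is the kind of pattern the abstract interprets as a departure from linearity.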


Subject(s)
Psychophysics , Time Perception , Humans , Time Perception/physiology , Time Factors
3.
Sci Rep ; 14(1): 18542, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39122861

ABSTRACT

In the mechanical cutting industry, trial production is used for predicting and evaluating the quality of product processes before batch production, and it can be expressed through the qualification rate. However, the qualification rate cannot objectively and comprehensively evaluate the quality of product processes. This study optimizes the analysis of outliers and stability in mathematical statistics to better apply them in the mechanical cutting industry, and then combines them with process capability analysis. In addition, considering the non-normal distribution of process parameters, a batch production-prediction model is proposed. The reliability of the batch production-prediction model is verified using the diameter, roundness and roughness of common structural samples. For other mechanical parts in the mechanical cutting industry, the model proposed in this paper can also be used to quickly and accurately predict and evaluate batch production.

5.
Sci Rep ; 14(1): 17599, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080303

ABSTRACT

Linear regression is critical for data modelling, especially for scientists. Nevertheless, with the abundance of high-dimensional data, some datasets have more explanatory variables than observations. In such circumstances, traditional approaches fail. This paper proposes a modified sparse regression model that solves the problem of heterogeneity using seaweed big data as a use case. The modified heterogeneity models for ridge, LASSO and Elastic net were used to model the data. The robust estimators M Bi-Square, M Hampel, M Huber, MM and S were used. Based on the results, the hybrid model of sparse regression for before, after, and modified heterogeneity robust regression with the 45 high ranking variables and a 2-sigma limit can be used efficiently and effectively to reduce the outliers. The obtained results confirm that the hybrid model of the modified sparse LASSO with the M Bi-Square estimator for the 45 high ranking parameters performed better than other existing methods.

6.
Behav Res Methods ; 56(7): 8132-8154, 2024 10.
Article in English | MEDLINE | ID: mdl-39048860

ABSTRACT

When investigating unobservable, complex traits, data collection and aggregation processes can introduce distinctive features to the data such as boundedness, measurement error, clustering, outliers, and heteroscedasticity. Failure to collectively address these features can result in statistical challenges that prevent the investigation of hypotheses regarding these traits. This study aimed to demonstrate the efficacy of the Bayesian beta-proportion generalized linear latent and mixed model (beta-proportion GLLAMM) (Rabe-Hesketh et al., Psychometrika, 69(2), 167-90, 2004a, Journal of Econometrics, 128(2), 301-23, 2004c, 2004b; Skrondal and Rabe-Hesketh 2004) in handling data features when exploring research hypotheses concerning speech intelligibility. To achieve this objective, the study reexamined data from transcriptions of spontaneous speech samples initially collected by Boonen et al. (Journal of Child Language, 50(1), 78-103, 2023). The data were aggregated into entropy scores. The research compared the prediction accuracy of the beta-proportion GLLAMM with the normal linear mixed model (LMM) (Holmes et al., 2019) and investigated its capacity to estimate a latent intelligibility from entropy scores. The study also illustrated how hypotheses concerning the impact of speaker-related factors on intelligibility can be explored with the proposed model. The beta-proportion GLLAMM was not free of challenges; its implementation required formulating assumptions about the data-generating process and knowledge of probabilistic programming languages, both central to Bayesian methods. Nevertheless, results indicated the superiority of the model in predicting empirical phenomena over the normal LMM, and its ability to quantify a latent potential intelligibility. Additionally, the proposed model facilitated the exploration of hypotheses concerning speaker-related factors and intelligibility. 
Ultimately, this research has implications for researchers and data analysts interested in quantitatively measuring intricate, unobservable constructs while accurately predicting the empirical phenomena.


Subject(s)
Bayes Theorem , Entropy , Speech Intelligibility , Humans , Speech Intelligibility/physiology , Linear Models , Statistical Models , Statistical Data Interpretation
7.
Sci Rep ; 14(1): 13529, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38866829

ABSTRACT

In real-life situations, we often have to analyze data that contain atypical observations, and the presence of outliers has adverse effects on the performance of ordinary least squares estimates. In this situation, redescending M-estimators, proposed by Huber (1964), are used to tackle the effects of outliers and to increase the efficiency of least squares estimates. In this study, we introduce a redescending M-estimator designed to generate robust estimates by mitigating the influence of outlier observations, even when the tuning constant is set to low values. This innovative estimator exhibits enhanced linearity at its core and maintains continuity throughout its range. Our proposed estimator stands out for its novelty, simplicity, differentiability, and practical applicability across real-world scenarios. The results of the proposed redescending M-estimator are compared with existing robust estimators using an extensive simulation study. Two examples based on real-life data are also added to validate the performance of the suggested function. The formulated redescending M-estimator produced efficient results compared with all the other redescending M-estimators considered.
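To show what "redescending" means in practice, the sketch below contrasts the monotone Huber psi function with Tukey's bisquare psi function, which returns to zero for large residuals, and uses the bisquare weights in a simple iteratively reweighted location estimate. Tukey's bisquare is used here only as a well-known stand-in; it is not the estimator proposed in this abstract, and the function names, tuning constants and data are assumptions.

```python
import statistics


def huber_psi(r, k=1.345):
    """Monotone psi: influence is capped at +/-k but never returns to zero."""
    return max(-k, min(k, r))


def bisquare_psi(r, c=4.685):
    """Redescending psi: influence falls back to exactly zero beyond |r| > c."""
    if abs(r) > c:
        return 0.0
    return r * (1.0 - (r / c) ** 2) ** 2


def bisquare_location(data, c=4.685, iters=50):
    """Iteratively reweighted location estimate using bisquare weights."""
    mu = statistics.median(data)
    # Scale from the median absolute deviation (1.4826 makes it consistent for normal data).
    scale = 1.4826 * statistics.median([abs(x - mu) for x in data])
    if scale == 0:
        return mu
    for _ in range(iters):
        weights = []
        for x in data:
            u = (x - mu) / scale
            weights.append((1.0 - (u / c) ** 2) ** 2 if abs(u) <= c else 0.0)
        total = sum(weights)
        if total == 0:
            break
        mu_new = sum(w * x for w, x in zip(weights, data)) / total
        if abs(mu_new - mu) < 1e-10:
            mu = mu_new
            break
        mu = mu_new
    return mu


# Hypothetical sample with one gross outlier.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 50.0]
print("mean:", statistics.mean(sample))
print("bisquare location:", bisquare_location(sample))
```

Because the outlier receives zero weight, the robust estimate stays near the bulk of the data while the sample mean is pulled upward.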

8.
Planta ; 260(1): 32, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38896307

ABSTRACT

MAIN CONCLUSION: By studying Cistus albidus shrubs in their natural habitat, we show that biological outliers can help us to understand the causes and consequences of maximum photochemical efficiency decreases in plants, thus reinforcing the importance of integrating these often-neglected data into scientific practice. Outliers are individuals with exceptional traits that are often excluded from data analysis. However, this exclusion may lead to serious mistakes by failing to capture the true trajectory of the population, thereby limiting our understanding of a given biological process. Here, we studied the role of biological outliers in understanding the causes and consequences of maximum photochemical efficiency decreases in plants, using the semi-deciduous shrub C. albidus growing in a Mediterranean-type ecosystem. We assessed interindividual variability in winter, spring and summer maximum PSII photochemical efficiency in a population of C. albidus growing under Mediterranean conditions. A strong correlation was observed between maximum PSII photochemical efficiency (Fv/Fm ratio) and leaf water desiccation. While decreases in maximum PSII photochemical efficiency did not result in any damage at the organ level during winter, reductions in the Fv/Fm ratio were associated with leaf mortality during summer. However, all plants could recover after rainfalls; thus, maximum PSII photochemical efficiency decreases did not result in increased mortality at the organism level, despite extreme water deficit and temperatures exceeding 40 °C during the summer. We conclude that, once methodological outliers are excluded, not only must biological outliers not be excluded from data analysis, but focusing on them is crucial to understand the causes and consequences of maximum PSII photochemical efficiency decreases in plants.


Subject(s)
Cistus , Photosystem II Protein Complex , Plant Leaves , Seasons , Photosystem II Protein Complex/metabolism , Plant Leaves/physiology , Plant Leaves/metabolism , Cistus/physiology , Photosynthesis , Ecosystem , Water , Temperature , Chlorophyll/metabolism
9.
Stat Med ; 43(20): 3778-3791, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-38899515

ABSTRACT

Meta-analysis is an essential tool to comprehensively synthesize and quantitatively evaluate results of multiple clinical studies in evidence-based medicine. In many meta-analyses, the characteristics of some studies might markedly differ from those of the others, and these outlying studies can generate biases and potentially yield misleading results. In this article, we provide effective robust statistical inference methods using generalized likelihoods based on the density power divergence. The robust inference methods are designed to adjust the influences of outliers through the use of modified estimating equations based on a robust criterion, even when multiple and serious influential outliers are present. We provide the robust estimators, statistical tests, and confidence intervals via the generalized likelihoods for the fixed-effect and random-effects models of meta-analysis. We also assess the contribution rates of individual studies to the robust overall estimators that indicate how the influences of outlying studies are adjusted. Through simulations and applications to two recently published systematic reviews, we demonstrate that the overall conclusions and interpretations of meta-analyses can be markedly changed if the robust inference methods are applied and that only the conventional inference methods might produce misleading evidence. These methods would be recommended to be used at least as a sensitivity analysis method in the practice of meta-analysis. We have also developed an R package, robustmeta, that implements the robust inference methods.


Subject(s)
Meta-Analysis as Topic , Statistical Models , Humans , Likelihood Functions , Computer Simulation , Statistical Data Interpretation , Bias , Confidence Intervals
10.
Biomedicines ; 12(5)2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38790902

ABSTRACT

Angiotensin-converting enzyme (ACE) metabolizes a number of important peptides participating in blood pressure regulation and vascular remodeling. Elevated ACE expression in tissues (which is generally reflected by blood ACE levels) is associated with an increased risk of cardiovascular diseases. Elevated blood ACE is also a marker for granulomatous diseases. Decreased blood ACE activity is becoming a new risk factor for Alzheimer's disease. We applied our novel approach, ACE phenotyping, to characterize pairs of tissues (lung, heart, lymph nodes) and serum ACE in 50 patients. ACE phenotyping includes (1) measurement of ACE activity with two substrates (ZPHL and HHL); (2) calculation of the ratio of hydrolysis of these substrates (ZPHL/HHL ratio); (3) determination of ACE immunoreactive protein levels using mAbs to ACE; and (4) assessment of ACE conformation with a set of mAbs to ACE. The ACE phenotyping approach in screening format with special attention to outliers, combined with analysis of sequencing data, allowed us to identify a patient with a unique ACE phenotype related to a decreased ability of albumin to inhibit ACE activity, likely due to competition with high CCL18 in this patient for binding to ACE. We also confirmed recently discovered gender differences in sialylation of some glycosylation sites of ACE. ACE phenotyping is a promising new approach for the identification of ACE phenotype outliers with potential clinical significance, making it useful for screening in a personalized medicine approach.

11.
Behav Res Methods ; 56(7): 7280-7306, 2024 10.
Article in English | MEDLINE | ID: mdl-38811517

ABSTRACT

A methodological problem in most reaction time (RT) studies is that some measured RTs may be outliers; that is, they may be very fast or very slow for reasons unconnected to the task-related processing of interest. Numerous ad hoc methods have been suggested to discriminate between such outliers and the valid RTs of interest, but it is extremely difficult to determine how well these methods work in practice because virtually nothing is known about the actual characteristics of outliers in real RT datasets. This article proposes a new method of pooling cumulative distribution function values for examining empirical RT distributions to assess both the proportions of outliers and their latencies relative to those of the valid RTs. As the method is developed, its strengths and weaknesses are examined using simulations based on previously suggested ad hoc models for RT outliers with particular assumed proportions and distributions of valid RTs and outliers. The method is then applied to several large RT datasets from lexical decision tasks, and the results provide the first empirically based description of outlier RTs. For these datasets, fewer than 1% of the RTs seem to be outliers, and the median outlier latency appears to be approximately 4-6 standard deviations of RT above the mean of the valid RT distribution.


Subject(s)
Decision Making , Reaction Time , Reaction Time/physiology , Humans , Decision Making/physiology , Statistical Models , Computer Simulation
12.
ISA Trans ; 151: 164-173, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38811310

ABSTRACT

The existence of dynamic outliers poses a significant challenge to the Kalman filter (KF). In addressing this challenge, this paper presents an innovative solution: Firstly, by analyzing a period of measurement information to more accurately identify state and measurement dynamic outliers, the system's capacity to adapt to dynamic changes is significantly improved. Next, noise is modeled as a Gaussian-Student's t mixture distribution (GSTM), with mixed model parameters inferred using the variational Bayesian (VB) method based on measurement information, cleverly integrated into the Moving Horizon Estimation (MHE) framework, thus enhancing the flexibility and accuracy of the noise model. Lastly, the optimal window size was identified through simulation experiment analysis to further increase the estimation accuracy. Simulation results demonstrate that the proposed filter exhibits stronger robustness in resisting dynamic outliers compared to existing filters.

13.
Eur J Intern Med ; 127: 105-111, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38735801

ABSTRACT

BACKGROUND: The burden of acute complex patients, who are increasingly older and polypathological, presenting to Emergency Departments (ED) leads to hospital overcrowding and the outlying phenomenon. These issues highlight the need for new, adequate patient management strategies. The aim of this study is to analyse the effects on in-hospital patient flow and clinical outcomes of a high-technology and time-limited Medical Admission Unit (MAU) run by internists. METHODS: All consecutive patients admitted to the MAU from Dec-2017 to Nov-2019 were included in the study. The number of admissions from the ED and the hospitalization rate, the overall in-hospital mortality rate in the medical department, the total days of hospitalization, and the overall outlier bed days were compared with those from the previous two years. RESULTS: 2162 patients were admitted to the MAU, 2085 (95.6%) from the ED; 476 (22.0%) were directly discharged, 88 (4.1%) died, and 1598 (73.9%) were transferred to other wards, with a median in-MAU length of stay of 64.5 [0.2-344.2] hours. Compared with the preceding 24 months, despite the increase in admissions/year from the ED to the medical department (3842 ± 106 in Dec2015-Nov2017 vs 4062 ± 100 in Dec2017-Nov2019, p<0.001), the number of outlier bed days was reduced, especially in the surgical department (11.46 ± 6.25% in Dec2015-Nov2017 vs 6.39 ± 3.08% in Dec2017-Nov2019, p=0.001), and mortality in the medical area dropped from 8.74 ± 0.37% to 7.29 ± 0.57%, p<0.001. CONCLUSIONS: Over two years, a patient-centred and problem-oriented approach in a medical admission buffer unit run by internists ensured a constant flow of acute patients, with positive effects on clinical risk and quality of care, reducing medical outliers and in-hospital mortality.


Subject(s)
Hospital Emergency Service , Hospital Mortality , Length of Stay , Patient Admission , Humans , Male , Female , Aged , Middle Aged , Length of Stay/statistics & numerical data , Hospital Emergency Service/statistics & numerical data , Patient Admission/statistics & numerical data , Aged 80 and over , Hospitalization/statistics & numerical data , Retrospective Studies , Crowding , Adult
14.
BMC Med Res Methodol ; 24(1): 89, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622516

ABSTRACT

BACKGROUND: Outliers, data points that significantly deviate from the norm, can have a substantial impact on statistical inference and provide valuable insights in data analysis. Multiple methods have been developed for outlier detection; however, almost all available approaches fail to consider the spatial dependence and heterogeneity in spatial data. Spatial data has diverse formats and semantics, requiring specialized outlier detection methodology to handle these unique properties. To date, limited research exists on robust spatial outlier detection methods designed specifically under the spatial error model (SEM) structure. METHOD: We propose the Spatial-Θ-Iterative Procedure for Outlier Detection (Spatial-Θ-IPOD), which utilizes a mean-shift vector to identify outliers within the SEM. Our method enables effective detection of spatial outliers while also providing robust coefficient estimates. To assess the performance of our approach, we conducted extensive simulations and applied it to a real-world empirical study using life expectancy data from multiple countries. RESULTS: Simulation results showed that the masking and JD (Joint Detection) indicators of our Spatial-Θ-IPOD method outperformed several commonly used methods, even in high-dimensional scenarios, demonstrating stable performance. Conversely, the Θ-IPOD method proved to be ineffective in detecting outliers when spatial correlation was present. Moreover, our model successfully provided reliable coefficient estimation alongside outlier detection. The proposed method consistently outperformed other models (both robust and non-robust) in most cases. In the empirical study, our proposed model successfully detected outliers and provided valuable insights in the modeling process. CONCLUSIONS: Our proposed Spatial-Θ-IPOD offers an effective solution for detecting spatial outliers for SEM while providing robust coefficient estimates.
Notably, our approach showcases its relative superiority even in the presence of high leverage points. By successfully identifying outliers, our method enhances the overall understanding of the data and provides valuable insights for further analysis.

15.
J Peripher Nerv Syst ; 29(2): 202-212, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38581130

ABSTRACT

BACKGROUND: Caused by duplications of the gene encoding peripheral myelin protein 22 (PMP22), Charcot-Marie-Tooth disease type 1A (CMT1A) is the most common hereditary neuropathy. Despite this shared genetic origin, there is considerable variability in clinical severity. It is hypothesized that genetic modifiers contribute to this heterogeneity, the identification of which may reveal novel therapeutic targets. In this study, we present a comprehensive analysis of clinical examination results from 1564 CMT1A patients sourced from a prospective natural history study conducted by the RDCRN-INC (Inherited Neuropathy Consortium). Our primary objective is to delineate extreme phenotype profiles (mild and severe) within this patient cohort, thereby enhancing our ability to detect genetic modifiers with large effects. METHODS: We have conducted large-scale statistical analyses of the RDCRN-INC database to characterize CMT1A severity across multiple metrics. RESULTS: We defined patients below the 10th (mild) and above the 90th (severe) percentiles of age-normalized disease severity based on the CMT Examination Score V2 and foot dorsiflexion strength (MRC scale). Based on extreme phenotype categories, we defined a statistically justified recruitment strategy, which we propose to use in future modifier studies. INTERPRETATION: Leveraging whole genome sequencing with base pair resolution, a future genetic modifier evaluation will include single nucleotide association, gene burden tests, and structural variant analysis. The present work not only provides insight into the severity and course of CMT1A, but also elucidates the statistical foundation and practical considerations for a cost-efficient and straightforward patient enrollment strategy that we intend to conduct on additional patients recruited globally.
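The percentile rule described under RESULTS can be sketched as follows. This is an illustrative sketch only: the patient identifiers and scores are made up, `split_extremes` is a hypothetical helper, and the age normalization and the combination of the two severity metrics used in the study are not reproduced.

```python
import statistics


def split_extremes(scores):
    """Split patients into mild (< 10th percentile) and severe (> 90th percentile).

    `scores` maps a patient identifier to an age-normalized severity score.
    """
    # quantiles(..., n=10) returns the nine decile cut points.
    cuts = statistics.quantiles(list(scores.values()), n=10)
    low_cut, high_cut = cuts[0], cuts[-1]
    mild = [pid for pid, s in scores.items() if s < low_cut]
    severe = [pid for pid, s in scores.items() if s > high_cut]
    return mild, severe


# Hypothetical cohort of 100 patients with scores 1..100.
cohort = {f"P{i:03d}": float(i) for i in range(1, 101)}
mild, severe = split_extremes(cohort)
print(len(mild), "mild;", len(severe), "severe")
```

Recruiting only from the two tails concentrates sequencing effort on the patients most likely to carry large-effect modifiers, which is the rationale the abstract gives for the strategy.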


Subject(s)
Charcot-Marie-Tooth Disease , Charcot-Marie-Tooth Disease/genetics , Charcot-Marie-Tooth Disease/physiopathology , Humans , Adult , Male , Female , Middle Aged , Adolescent , Young Adult , Severity of Illness Index , Child , Myelin Proteins/genetics , Patient Selection , Phenotype , Aged , Modifier Genes , Preschool Child
16.
Sensors (Basel) ; 24(8)2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38676027

ABSTRACT

The variety of equipment implementing laser triangulation technology for 3D scanning makes it difficult to analyse their performance, comparability, and traceability. In this study, three laser triangulation sensors arranged in different configurations are analysed using high precision spheres made of different materials and surface finishes. Three types of reference parameters were used: diameter, form error, and standard deviation of the point cloud. The experimentation was based on studying the quality of the point clouds generated by the three sensors, which enabled us to find and quantify an edge effect in the horizon of the scanned surface. A procedure to reach the optimal filtering conditions was proposed, and a chart of recommended usage of each sphere (material and finish) was created for the different types of sensors. This filter enables removal of both spurious points and those few points that spoil the form error, greatly improving the quality of the measurement.

17.
Heliyon ; 10(8): e28934, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38681655

ABSTRACT

Various authors have put sincere effort into proposing ratio estimators for estimating the population mean and variance under different situations and sampling methods. A problem arises, however, when data are unstable, imprecise, ambiguous, incomplete or vague. In such situations, classical methods of estimation do not yield precise results, as these methods are not meant for such problems. Given this difficulty, Neutrosophic statistics is the only alternative, as it deals with indeterminacy. In this study, we therefore propose a generalized Neutrosophic robust ratio-type estimator that can provide good results in such situations, as well as in the presence of outliers. For evaluation, we used a real data set and a simulation study to check the efficacy of our suggested estimators over the existing estimators mentioned.

18.
Psychol Sci ; 35(4): 328-344, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38483515

ABSTRACT

With the rapid spread of information via social media, individuals are prone to misinformation exposure that they may utilize when forming beliefs. Over five experiments (total N = 815 adults, recruited through Amazon Mechanical Turk in the United States), we investigated whether people could ignore quantitative information when they judged for themselves that it was misreported. Participants recruited online viewed sets of values sampled from Gaussian distributions to estimate the underlying means. They attempted to ignore invalid information, which were outlier values inserted into the value sequences. Results indicated participants were able to detect outliers. Nevertheless, participants' estimates were still biased in the direction of the outlier, even when they were most certain that they detected invalid information. The addition of visual warning cues and different task scenarios did not fully eliminate systematic over- and underestimation. These findings suggest that individuals may incorporate invalid information they meant to ignore when forming beliefs.


Subject(s)
Communication , Cues (Psychology) , Adult , Humans , United States
19.
Water Res ; 255: 121499, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38552494

ABSTRACT

Recently, there has been significant advancement in water quality index (WQI) models that use data-driven approaches, especially those integrating machine learning and artificial intelligence (ML/AI) technology. However, several recent studies have revealed that data-driven models produce inconsistent results because of data outliers, which significantly affect model reliability and accuracy. The present study was carried out to assess the impact of data outliers on a recently developed Irish Water Quality Index (IEWQI) model, which relies on data-driven techniques. To the best of the author's knowledge, there has been no systematic framework for evaluating the influence of data outliers on such models. This was the first research initiative to introduce a comprehensive approach that combines machine learning with advanced statistical techniques for assessing the impact of data outliers on the water quality (WQ) model. The proposed framework was implemented in Cork Harbour, Ireland, to evaluate the IEWQI model's sensitivity to outliers in the input indicators used to assess water quality. To detect data outliers within the dataset, the study used two widely used ML techniques, Isolation Forest (IF) and Kernel Density Estimation (KDE), and predicted WQ with and without these outliers. To validate the ML results, the study used five commonly used statistical measures. The performance metric (R²) indicates that model performance improved slightly (R² increased from 0.92 to 0.95) in predicting WQ after the data outliers were removed from the input. However, the IEWQI scores revealed no statistically significant differences among the actual values, predictions with outliers, and predictions without outliers, with a 95% confidence interval at p < 0.05. 
The model uncertainty results also revealed that the model contributed <1% uncertainty to the final assessment results for both datasets (with and without outliers). In addition, all statistical measures indicated that the ML techniques provided reliable results that can be used for detecting outliers and their impacts on the IEWQI model. The findings reveal that although the data outliers had no significant impact on the IEWQI model architecture, they had moderate impacts on the model's rating schemes. This finding indicated that detecting data outliers could improve the accuracy of the IEWQI model in rating WQ and help mitigate the model eclipsing problem. In addition, the results provide evidence of how data outliers influenced the data-driven model's WQ predictions and reliability; in particular, the study confirmed that the IEWQI model could be effective for accurately rating WQ despite the presence of data outliers in the input. This could be due to the spatio-temporal variability inherent in WQ indicators. The research assesses the influence of data input outliers on the IEWQI model and underscores important areas for future investigation. These areas include expanding temporal analysis using multi-year data, examining spatial outlier patterns, and evaluating detection methods. Moreover, it is essential to explore the real-world impacts of revised rating categories, involve stakeholders in outlier management, and fine-tune model parameters. Analysing model performance across varying temporal and spatial resolutions and incorporating additional environmental data can significantly enhance the accuracy of WQ assessment. Consequently, this study offers valuable insights to strengthen the IEWQI model's robustness and provides avenues for enhancing its utility in broader WQ assessment applications. 
Moreover, the study successfully adopted the framework for evaluating how data input outliers affect a data-driven model such as the IEWQI model. The current study was carried out in Cork Harbour using only a single year of WQ data. The framework should be tested across various domains to evaluate the response of the IEWQI model in terms of the spatio-temporal resolution of the domain. The study recommends that future research adjust or revise the IEWQI model's rating schemes and investigate the practical effects of data outliers on updated rating categories. Finally, the study provides potential recommendations for enhancing the IEWQI model's adaptability and shows that it can be applied effectively in more general WQ assessment scenarios.
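One of the two detection techniques named in this abstract, kernel density estimation, can be sketched in a few lines: points that fall in low-density regions are flagged as candidate outliers. This is a one-dimensional sketch under stated assumptions; the readings are hypothetical, the bandwidth uses a common normal-reference rule of thumb, the flagged fraction is arbitrary, and none of these settings come from the study.

```python
import math
import statistics


def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


def kde_outliers(data, fraction=0.1):
    """Flag the lowest-density `fraction` of points as candidate outliers."""
    # Rule-of-thumb bandwidth: 1.06 * sd * n^(-1/5) (an assumed choice).
    bandwidth = 1.06 * statistics.stdev(data) * len(data) ** (-0.2)
    density = gaussian_kde(data, bandwidth)
    ranked = sorted(data, key=density)
    k = max(1, int(fraction * len(data)))
    return ranked[:k]


# Hypothetical indicator readings with one extreme value.
readings = [7.1, 7.3, 7.0, 7.2, 7.4, 7.1, 7.3, 7.2, 7.0, 12.5]
print(kde_outliers(readings))
```

Running a model once on the full input and once with the flagged points removed, then comparing the two sets of predictions, mirrors the with-and-without comparison the abstract describes.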

20.
Mol Oncol ; 18(6): 1460-1485, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38468448

ABSTRACT

Multiple strategies are continuously being explored to expand the drug target repertoire in solid tumors. We devised a novel computational workflow for transcriptome-wide gene expression outlier analysis that allows the systematic identification of both overexpression and underexpression events in cancer cells. Here, it was applied to expression values obtained through RNA sequencing in 226 colorectal cancer (CRC) cell lines that were also characterized by whole-exome sequencing and microarray-based DNA methylation profiling. We found cell models displaying an abnormally high or low expression level for 3533 and 965 genes, respectively. Gene expression abnormalities that have been previously associated with clinically relevant features of CRC cell lines were confirmed. Moreover, by integrating multi-omics data, we identified both genetic and epigenetic alterations underlying outlier expression values. Importantly, our atlas of CRC gene expression outliers can guide the discovery of novel drug targets and biomarkers. As a proof of concept, we found that CRC cell lines lacking expression of the MTAP gene are sensitive to treatment with a PRMT5-MTA inhibitor (MRTX1719). Finally, other tumor types may also benefit from this approach.


Subject(s)
Colorectal Neoplasms , Neoplastic Gene Expression Regulation , Transcriptome , Humans , Colorectal Neoplasms/genetics , Colorectal Neoplasms/drug therapy , Colorectal Neoplasms/pathology , Colorectal Neoplasms/metabolism , Neoplastic Gene Expression Regulation/drug effects , Tumor Cell Line , Transcriptome/genetics , Gene Expression Profiling , DNA Methylation/genetics