Results 1 - 10 of 10
1.
Article in English | MEDLINE | ID: mdl-38970685

ABSTRACT

Scientific fake papers, containing manipulated or completely fabricated data, are a problem that has reached dramatic dimensions. Companies known as paper mills (or, more bluntly, as "criminal science publishing gangs") produce and sell such fake papers on a large scale. The main drivers of the fake paper flood are the pressure in academic systems, (monetary) incentives to publish in respected scientific journals, and sometimes the personal desire for increased "prestige." Published fake papers cause substantial scientific, economic, and social damage. Numerous information sources deal with this topic from different points of view. This review aims to provide an overview of these information sources up to June 2024. Much more original research with larger datasets is needed, for example on the extent and impact of the fake paper problem and especially on how to detect fake papers, as many findings are based on small datasets, anecdotal evidence, and assumptions. A long-term solution would be to move beyond the mantra of publication metrics for evaluating scientists in academia.

2.
J Clin Epidemiol; 170: 111365, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38631528

ABSTRACT

OBJECTIVES: To describe statistical tools available for assessing publication integrity of groups of randomized controlled trials (RCTs). STUDY DESIGN AND SETTING: Narrative review. RESULTS: Freely available statistical tools have been developed that compare the observed distributions of baseline variables with the expected distributions that would occur if successful randomization occurred. For continuous variables, the tools assess baseline means, baseline P values, and the occurrence of identical means and/or standard deviation. For categorical variables, they assess baseline P values, frequency counts for individual or all variables, numbers of trial participants randomized or withdrawing, and compare reported with independently calculated P values. The tools have been used to identify publication integrity concerns in RCTs from individual groups, and performed at an acceptable level in discriminating intentionally fabricated baseline summary data from genuine RCTs. The tools can be used when concerns have been raised about RCT(s) from an individual/group and when the whole body of their work is being examined, when conducting systematic reviews, and could be adapted to aid screening of RCTs at journal submission. CONCLUSION: Statistical tools are useful for the assessment of publication integrity of groups of RCTs.
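One of the checks listed above, comparing reported with independently calculated baseline P values, can be sketched in a few lines. The summary numbers and the 0.1 discrepancy threshold below are hypothetical, not taken from the tools themselves:

```python
from scipy import stats

def recalculated_p(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Recompute a baseline P value from reported summary statistics
    (two-sample t-test on the means, equal-variance form)."""
    return stats.ttest_ind_from_stats(mean_a, sd_a, n_a,
                                      mean_b, sd_b, n_b).pvalue

# Hypothetical reported baseline row: age 54.2 (SD 9.1) vs 53.8 (SD 8.7),
# 60 participants per arm, with a reported P value of 0.80.
reported_p = 0.80
calc_p = recalculated_p(54.2, 9.1, 60, 53.8, 8.7, 60)

# Flag the row if reported and recalculated P values disagree by more
# than 0.1 (this discrepancy threshold is an assumption of the sketch).
flag = abs(calc_p - reported_p) > 0.1
print(round(calc_p, 2), flag)
```

A real screening tool would apply this row by row across every baseline table in a group of trials, flagging rows where the discrepancy cannot be explained by rounding of the reported summaries.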


Subject(s)
Randomized Controlled Trials as Topic, Randomized Controlled Trials as Topic/standards, Randomized Controlled Trials as Topic/statistics & numerical data, Randomized Controlled Trials as Topic/methods, Humans, Data Interpretation, Statistical, Publishing/standards, Research Design/standards, Publication Bias/statistics & numerical data
3.
J Clin Epidemiol; 154: 117-124, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36584733

ABSTRACT

BACKGROUND AND OBJECTIVES: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used. METHODS: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two controls, and two with publication integrity concerns. We also compared baseline calculated and reported P-values. RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and for between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P < 0.001 for both datasets). Furthermore, about one in six reported P-values for baseline categorical variables differed by > 0.1 from the calculated P-value in trials with publication integrity concerns. CONCLUSION: Comparing the observed and expected distributions and the reported and calculated P-values of baseline categorical variables may help in the assessment of publication integrity of a body of RCTs.
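The expected binomial behaviour of between-arm frequency counts can be illustrated with a short simulation. The arm size and prevalence below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def arm_count_differences(n_trials=10000, n_per_arm=50, prevalence=0.4):
    """Between-arm differences in frequency counts for one binary baseline
    variable under genuine randomization (independent binomial draws)."""
    arm_a = rng.binomial(n_per_arm, prevalence, n_trials)
    arm_b = rng.binomial(n_per_arm, prevalence, n_trials)
    return np.abs(arm_a - arm_b)

diffs = arm_count_differences()
# Under genuine randomization, small differences (<=2) are common but not
# dominant, and differences >4 are far from rare; an excess of tiny
# differences is the signal the abstract describes.
share_small = float(np.mean(diffs <= 2))
share_large = float(np.mean(diffs > 4))
print(round(share_small, 2), round(share_large, 2))
```

In a real assessment the observed counts of such differences across a body of trials would be compared with these simulated (or exact binomial) expectations.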


Subject(s)
Randomized Controlled Trials as Topic, Statistics as Topic, Humans
4.
J Clin Epidemiol; 136: 180-188, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34000386

ABSTRACT

OBJECTIVE: To examine the proposition that identical summary statistics (mean and/or SD) in different randomized controlled trials (RCTs) or clinical cohorts can be explained by common or homogeneous source populations. STUDY DESIGN: We estimated the probability of identical summary data in studies with high proportions of identical summary statistics, in simulations, and in control datasets. RESULTS: The probability of both an identical mean and an identical SD for a variable in separate RCTs is low (<~3%), unless the variable is rounded to 1 significant figure. In two RCTs with identical summary statistics for 16 of 39 shared variables, simulations indicated that the probability of the observed matches was <1 in 100,000. In 34 clinical cohorts with publication integrity concerns, the proportions of summary statistics from variables reported in ≥10 studies that were identical in ≥2 cohorts were high (42% for means, 52% for SDs, and 29% for both), and improbable based on simulations and comparisons with control datasets. CONCLUSIONS: The likelihood of multiple identical summary statistics within an individual RCT, or across a body of RCTs or cohort studies by the same research group, is low, especially when both the mean and the SD are identical, unless the variables are rounded to 1 significant figure.
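The low probability of an identical mean and SD can be checked with a small simulation. The population parameters, arm size, and one-decimal rounding below are assumptions of this sketch, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(2)

def prob_identical_summary(n_sims=20000, n=50, decimals=1):
    """Estimate the probability that two independent trials sampling the
    same population report an identical mean AND SD for one variable.
    The population (mean 25, SD 4), sample size, and rounding precision
    are assumptions of this sketch."""
    x = rng.normal(25.0, 4.0, (n_sims, n))
    y = rng.normal(25.0, 4.0, (n_sims, n))
    same_mean = (np.round(x.mean(axis=1), decimals)
                 == np.round(y.mean(axis=1), decimals))
    same_sd = (np.round(x.std(axis=1, ddof=1), decimals)
               == np.round(y.std(axis=1, ddof=1), decimals))
    return float(np.mean(same_mean & same_sd))

p_both = prob_identical_summary()
print(p_both)
```

Coarser rounding (e.g. to 1 significant figure) raises this probability sharply, which is the caveat the conclusions note.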


Subject(s)
Cohort Studies, Data Accuracy, Data Interpretation, Statistical, Data Management/statistics & numerical data, Probability, Randomized Controlled Trials as Topic/statistics & numerical data, Humans
5.
J Clin Epidemiol; 131: 22-29, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33227448

ABSTRACT

OBJECTIVES: Comparing observed and expected distributions of categorical outcome variables in randomized controlled trials (RCTs) has been previously used to assess publication integrity. We applied this technique to withdrawals from RCTs. STUDY DESIGN AND SETTING: We compared the observed distribution of withdrawals with the expected binomial distribution in six sets of RCTs: four control sets and two sets with concerns about their publication integrity. RESULTS: In the control data sets (n = 13, 115, 71, and 36 trials, respectively), the observed distributions of withdrawals were consistent with the expected distributions, both for the numbers of withdrawals per trial arm and for the differences in withdrawals between trial arms in two-arm RCTs. In contrast, in both sets of RCTs with concerns regarding publication integrity (n = 151 and 35 trials, respectively), there were striking differences between the observed and expected distributions of trial withdrawals. Two-arm RCTs from the two sets with publication integrity concerns were 2.6 (95% confidence interval 2.0-3.3) times more likely to have a difference of 0 or 1 withdrawals between trial arms than control RCTs (P < 0.001). Simulating a 50% higher rate of withdrawals in active treatment arms in the largest set of control RCTs still produced an observed distribution of withdrawals per trial arm consistent with the expected distribution. CONCLUSION: Comparing the observed and expected distribution of trial withdrawals may be a useful technique when considering publication integrity of a body of RCTs.
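The withdrawal check can be sketched by simulating genuine trials against crudely "fabricated" ones. The withdrawal rate, arm size, and the fabrication model (arms forced to be nearly identical) are assumptions of this sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

n_trials, n_per_arm, rate = 10000, 40, 0.1

# Genuine two-arm trials: withdrawals in each arm are independent
# binomial draws with a common withdrawal rate.
gen_a = rng.binomial(n_per_arm, rate, n_trials)
gen_b = rng.binomial(n_per_arm, rate, n_trials)

# Crudely "fabricated" trials (an assumption of this sketch): an author
# inventing withdrawal numbers makes the two arms nearly identical.
fab_a = rng.binomial(n_per_arm, rate, n_trials)
fab_b = np.clip(fab_a + rng.integers(-1, 2, n_trials), 0, n_per_arm)

genuine_share = float(np.mean(np.abs(gen_a - gen_b) <= 1))
fabricated_share = float(np.mean(np.abs(fab_a - fab_b) <= 1))
print(round(genuine_share, 2), round(fabricated_share / genuine_share, 1))
```

The ratio of the two shares is the kind of over-representation of 0-or-1 differences that the abstract reports for the trial sets with integrity concerns.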


Asunto(s)
Perdida de Seguimiento , Ensayos Clínicos Controlados Aleatorios como Asunto/estadística & datos numéricos , Humanos
6.
Eur J Obstet Gynecol Reprod Biol; 249: 72-83, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32381348

ABSTRACT

While updating a systematic review on ovulation induction, we observed unusual similarities in a number of randomised controlled trials (RCTs) published by two authors from the same institute, in the same disease spectrum, within a short period of time. We therefore undertook a focused analysis of the data integrity of all RCTs published by the two authors. We made pairwise comparisons to find identical or similar values in baseline characteristics and outcome tables between trials. We also assessed whether the baseline characteristics were compatible with chance, using Monte Carlo simulations and the Kolmogorov-Smirnov test. For the 35 trials published between September 2006 and January 2016, we found a large number of similarities in both the baseline characteristics and the outcomes of 26. Analysis of the baseline characteristics of the trials indicated that their distribution was unlikely to be the result of proper randomisation. The procedures demonstrated in this paper may help to assess data integrity in future attempts to verify the authenticity of published RCTs.
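The pairwise-comparison step, with a Monte Carlo null, can be sketched as follows. The table layout, population parameters, rounding, and the observed count of 40 are all hypothetical assumptions of this sketch:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)

def identical_value_count(tables):
    """Total number of identical values across all pairwise comparisons
    of the trials' baseline/outcome tables."""
    return sum(
        int(np.sum(tables[i] == tables[j]))
        for i, j in combinations(range(len(tables)), 2)
    )

def simulate_null(n_trials=10, n_vars=8, n_sims=2000):
    """Monte Carlo null model: independent trials sampled from a single
    population, with values rounded to one decimal (layout, population,
    and rounding are assumptions of this sketch)."""
    counts = []
    for _ in range(n_sims):
        tables = np.round(rng.normal(30.0, 5.0, (n_trials, n_vars)), 1)
        counts.append(identical_value_count(tables))
    return np.array(counts)

null = simulate_null()
observed = 40   # hypothetical: 40 identical values found in a suspect set
p_value = float(np.mean(null >= observed))
print(null.mean(), p_value)
```

A suspect trial set whose count of identical values sits far beyond the simulated null distribution is unlikely to have arisen from independent sampling of one population.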


Asunto(s)
Exactitud de los Datos , Ginecología , Obstetricia , Ensayos Clínicos Controlados Aleatorios como Asunto/normas , Salud de la Mujer/estadística & datos numéricos , Femenino , Humanos , Método de Montecarlo , Mala Conducta Científica
8.
J Clin Epidemiol; 112: 67-76, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31125614

ABSTRACT

OBJECTIVE: Comparing observed and expected distributions of baseline variables in randomized controlled trials (RCTs) has been used to investigate possible research misconduct, although the validity of this approach has been questioned. We explored this technique and introduced a novel metric to compare P values from baseline variables between treatment arms. STUDY DESIGN AND SETTING: We compared observed with expected distributions of baseline P values using a one-way chi-square test and by comparing the area under the curve (AUC) of the cumulative distribution function in 13 RCTs conducted by our group, two groups of RCTs known to contain fabricated data, and simulations. RESULTS: In our 13 RCTs, the distribution of P values from baseline continuous variables was consistent with the expected theoretical uniform distribution (P = 0.19, difference from expected AUC -0.03, 95% confidence interval [-0.04, 0.04]). For categorical variables, the P value distribution was not uniform. The distributions of P values from RCTs with fabricated data were highly unusual and not consistent with the uniform distribution for continuous variables, nor with the expected distribution for categorical variables, nor with the distribution of P values in genuine RCTs. CONCLUSIONS: Assessing baseline P values in groups of RCTs can identify highly unusual distributions that might raise or reinforce concerns about randomization and data integrity.
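Both checks, the chi-square test on binned P values and the area-under-the-CDF metric, can be sketched briefly. The generating model for the "suspicious" set below is purely an assumption; on [0, 1] the area under the cumulative distribution equals one minus the mean, so it is computed that way:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def uniformity_checks(p_values, n_bins=10):
    """Chi-square test of binned baseline P values against uniformity,
    plus the area under the cumulative distribution on [0, 1] (equal to
    one minus the mean; 0.5 is expected under uniformity)."""
    obs, _ = np.histogram(p_values, bins=n_bins, range=(0.0, 1.0))
    chi2_p = float(stats.chisquare(obs).pvalue)
    auc = 1.0 - float(np.mean(p_values))
    return chi2_p, auc

# Baseline P values from genuinely randomized trials should be ~uniform.
genuine = rng.uniform(0.0, 1.0, 500)
# A fabricated-looking set skewed toward high, "unremarkable" P values
# (this generating model is an assumption of the sketch).
suspicious = rng.beta(3.0, 1.0, 500)

g_chi2, g_auc = uniformity_checks(genuine)
s_chi2, s_auc = uniformity_checks(suspicious)
print(round(g_auc, 2), round(s_auc, 2))
```

As the abstract cautions for categorical variables, the uniform reference only applies to continuous baseline variables; discrete data need a simulated expected distribution instead.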


Asunto(s)
Exactitud de los Datos , Variaciones Dependientes del Observador , Ensayos Clínicos Controlados Aleatorios como Asunto , Proyectos de Investigación , Área Bajo la Curva , Distribución de Chi-Cuadrado , Interpretación Estadística de Datos , Humanos , Ensayos Clínicos Controlados Aleatorios como Asunto/ética , Ensayos Clínicos Controlados Aleatorios como Asunto/métodos , Ensayos Clínicos Controlados Aleatorios como Asunto/normas , Reproducibilidad de los Resultados , Mala Conducta Científica/estadística & datos numéricos
9.
BMJ Open; 8(5): e022079, 2018 May 9.
Article in English | MEDLINE | ID: mdl-29743333

ABSTRACT

OBJECTIVE: Newcomb-Benford's Law (NBL) proposes a regular distribution for first digits, second digits and digit combinations that applies to many naturally occurring sources of data. Testing for deviations from NBL is used in many datasets as a screening tool for identifying data trustworthiness problems. This study compares publicly available waiting list (WL) data from Finland and Spain to test NBL as an instrument for flagging potential manipulation in WLs. DESIGN: Analysis of the frequency of first digits in Finnish and Spanish WLs to determine whether their distribution matches the pattern documented by NBL. Deviations from the expected first-digit frequency were analysed using Pearson's χ2, mean absolute deviation and Kuiper tests. SETTING/PARTICIPANTS: Publicly available WL data from Finland and Spain, two countries with universal health insurance and National Health Systems but characterised by different levels of transparency and good-governance standards. MAIN OUTCOME MEASURES: Fit of the observed distribution of the numbers reported in Finnish and Spanish WL data to the distribution expected under NBL. RESULTS: WL data reported by the Finnish health system fit first-digit NBL according to all statistical tests used (p=0.6519 in the χ2 test). For the Spanish data, this hypothesis was rejected in all tests (p<0.0001 in the χ2 test). CONCLUSIONS: Testing deviations from the NBL distribution can be a useful tool for identifying problems with WL data trustworthiness and signalling the need for further testing.
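A minimal first-digit NBL test, here with a chi-square goodness-of-fit (one of several tests one could use, and both datasets below are synthetic assumptions, not the Finnish or Spanish figures):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Newcomb-Benford first-digit probabilities: P(d) = log10(1 + 1/d).
BENFORD = np.log10(1 + 1 / np.arange(1, 10))

def first_digits(values):
    """First significant digit of each positive integer."""
    return np.array([int(str(v)[0]) for v in values])

def benford_chi2_p(values):
    """P value of a chi-square goodness-of-fit of the observed first-digit
    frequencies against the Benford distribution."""
    d = first_digits(values)
    obs = np.bincount(d, minlength=10)[1:10]
    return float(stats.chisquare(obs, BENFORD * len(d)).pvalue)

# Log-uniform data spans several orders of magnitude and follows NBL;
# counts clustered around a single magnitude (as manipulated waiting-list
# figures might be) do not.
benford_like = np.floor(10 ** rng.uniform(1, 5, 2000)).astype(int)
clustered = rng.integers(400, 900, 2000)
print(benford_chi2_p(benford_like), benford_chi2_p(clustered))
```

A caveat worth keeping in mind: data confined to a narrow numeric range can fail an NBL test without any manipulation, so a failed test flags data for scrutiny rather than proving fraud.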


Asunto(s)
Estadística como Asunto/métodos , Listas de Espera , Finlandia , Humanos , Programas Nacionales de Salud , Probabilidad , Proyectos de Investigación , España , Cobertura Universal del Seguro de Salud
10.
Forensic Sci Int; 282: 24-34, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29149684

ABSTRACT

OBJECTIVES: This paper is based on the analysis of the database of operations from a large money-laundering court case involving a core company and a group of its suppliers, 26 of which had already been identified by the police as fraudulent companies. Faced with a well-founded suspicion that more companies had perpetrated criminal acts, and in order to make better use of very limited police resources, we aim to construct a tool to detect money-laundering criminals. METHODS: We combine Benford's Law and machine learning algorithms (logistic regression, decision trees, neural networks, and random forests) to find patterns of money-laundering criminals in the context of a real Spanish court case. RESULTS: After mapping each supplier's set of accounting data into a 21-dimensional space using Benford's Law and applying machine learning algorithms, additional companies that could merit further scrutiny are flagged up. CONCLUSIONS: A new tool to detect money-laundering criminals is proposed in this paper. The tool is tested in the context of a real case.
