Results 1 - 20 of 74
1.
PLoS One ; 18(4): e0284078, 2023.
Article in English | MEDLINE | ID: mdl-37053261

ABSTRACT

Non-negative matrix factorization (NMF) efficiently reduces high dimensionality for many-objective ranking problems. In multi-objective optimization, as long as only three or four conflicting viewpoints are present, an optimal solution can be determined by finding the Pareto front. When the number of the objectives increases, the multi-objective problem evolves into a many-objective optimization task, where the Pareto front becomes oversaturated. The key idea is that NMF aggregates the objectives so that the Pareto front can be applied, while the Sum of Ranking Differences (SRD) method selects the objectives that have a detrimental effect on the aggregation, and validates the findings. The applicability of the method is illustrated by the ranking of 1176 universities based on 46 variables of the CWTS Leiden Ranking 2020 database. The performance of NMF is compared to principal component analysis (PCA) and sparse non-negative matrix factorization-based solutions. The results illustrate that PCA incorporates negatively correlated objectives into the same principal component. On the contrary, NMF only allows non-negative correlations, which enable the proper use of the Pareto front. With the combination of NMF and SRD, a non-biased ranking of the universities based on 46 criteria is established, where Harvard, Rockefeller and Stanford Universities are determined as the first three. To evaluate the ranking capabilities of the methods, measures based on Relative Entropy (RE) and Hypervolume (HV) are proposed. The results confirm that the sparse NMF method provides the most informative ranking. The results highlight that academic excellence can be improved by decreasing the proportion of unknown open-access publications and short distance collaborations. The proportion of gender indicators barely correlates with scientific impact. More authors, long-distance collaborations, publications that have more scientific impact and citations on average highly influence the university ranking in a positive direction.


Subject(s)
Algorithms, Humans, Universities, Principal Component Analysis
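
The Pareto front mentioned in the abstract above can be illustrated with a short sketch. It is not the authors' code: the function name `pareto_front`, the convention that every objective is maximized, and the five score pairs are assumptions made purely for illustration.

```python
def pareto_front(points):
    """Return the indices of non-dominated points (all objectives maximized)."""
    def dominates(a, b):
        # a dominates b if it is at least as good on every objective
        # and strictly better on at least one of them
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical scores of five items on two aggregated objectives
scores = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]
front = pareto_front(scores)
print(front)
```

With only two objectives, two of the five items are dominated and drop out. As objectives are added, fewer points are dominated, which is the oversaturation the abstract describes and the reason the objectives are aggregated first.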
2.
J Chem Inf Model ; 62(14): 3415-3425, 2022 07 25.
Article in English | MEDLINE | ID: mdl-35834424

ABSTRACT

Molecular dynamics (MD) is a core methodology of molecular modeling and computational design for the study of the dynamics and temporal evolution of molecular systems. MD simulations have particularly benefited from the rapid increase of computational power that has characterized the past decades of computational chemical research, being the first method to be successfully migrated to the GPU infrastructure. While new-generation MD software is capable of delivering simulations on an ever-increasing scale, relatively less effort is invested in developing postprocessing methods that can keep up with the quickly expanding volumes of data that are being generated. Here, we introduce a new idea for sampling frames from large MD trajectories, based on the recently introduced framework of extended similarity indices. Our approach presents a new, linearly scaling alternative to the traditional approach of applying a clustering algorithm that usually scales as a quadratic function of the number of frames. When showcasing its usage on case studies with different system sizes and simulation lengths, we have registered speedups of up to 2 orders of magnitude, as compared to traditional clustering algorithms. The conformational diversity of the selected frames is also noticeably higher, which is a further advantage for certain applications, such as the selection of structural ensembles for ligand docking. The method is available open-source at https://github.com/ramirandaq/MultipleComparisons.


Subject(s)
Molecular Dynamics Simulation, Proteins, Algorithms, Cluster Analysis, Proteins/chemistry, Software
3.
Front Chem ; 10: 852893, 2022.
Article in English | MEDLINE | ID: mdl-35755260

ABSTRACT

The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed comparison of the most popular descriptor groups has been carried out for six main ADME-Tox classification targets: Ames mutagenicity, P-glycoprotein inhibition, hERG inhibition, hepatotoxicity, blood-brain-barrier permeability, and cytochrome P450 2C9 inhibition. The literature-based, medium-sized binary classification datasets (all above 1,000 molecules) were used for the model building by two common algorithms, XGBoost and the RPropMLP neural network. Five molecular representation sets were compared along with their joint applications: Morgan, Atompairs, and MACCS fingerprints, and the traditional 1D and 2D molecular descriptors, as well as 3D molecular descriptors, separately. The statistical evaluation of the model performances was based on 18 different performance parameters. Although all the developed models were close to the usual performance of QSPR models for each specific ADME-Tox target, the results clearly showed the superiority of the traditional 1D, 2D, and 3D descriptors in the case of the XGBoost algorithm. It is worth trying the classical tools in single model building because the use of 2D descriptors can produce even better models for almost every dataset than the combination of all the examined descriptor sets.

4.
J Comput Aided Mol Des ; 36(3): 157-173, 2022 03.
Article in English | MEDLINE | ID: mdl-35288838

ABSTRACT

Extended (or n-ary) similarity indices have been recently proposed to extend the comparative analysis of binary strings. Going beyond the traditional notion of pairwise comparisons, these novel indices allow comparing any number of objects at the same time. This results in a remarkable efficiency gain with respect to other approaches, since now we can compare N molecules in O(N) instead of the common quadratic O(N²) timescale. This favorable scaling has motivated the application of these indices to diversity selection, clustering, phylogenetic analysis, chemical space visualization, and post-processing of molecular dynamics simulations. However, the current formulation of the n-ary indices is limited to vectors with binary or categorical inputs. Here, we present the further generalization of this formalism so it can be applied to numerical data, i.e. to vectors with continuous components. We discuss several ways to achieve this extension and present their analytical properties. As a practical example, we apply this formalism to the problem of feature selection in QSAR and prove that the extended continuous similarity indices provide a convenient way to discern between several sets of descriptors.


Subject(s)
Drug Design, Quantitative Structure-Activity Relationship, Phylogeny
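
The linear scaling described in the abstract above comes from working on per-bit column sums rather than on all pairs of objects. The sketch below is a simplified, unweighted illustration of that idea for binary fingerprints. It is not the published implementation (see the authors' repository for that): the function name `extended_tanimoto`, the `threshold` parameter and its default, and the omission of any weighting functions are all assumptions of this sketch.

```python
def extended_tanimoto(fingerprints, threshold=None):
    """Simplified n-ary Tanimoto-like index for equal-length binary vectors.

    Each bit position is classified from its column sum k over n objects:
    a position counts as "1-similar" if 2k - n exceeds the threshold and
    as "dissimilar" if |2k - n| does not exceed it.
    """
    n = len(fingerprints)
    if threshold is None:
        threshold = n % 2  # assumed default: smallest possible |2k - n|
    # One pass over the data: cost grows linearly with the number of objects
    column_sums = [sum(column) for column in zip(*fingerprints)]
    one_similar = sum(1 for k in column_sums if 2 * k - n > threshold)
    dissimilar = sum(1 for k in column_sums if abs(2 * k - n) <= threshold)
    denominator = one_similar + dissimilar
    return one_similar / denominator if denominator else 0.0


# Hypothetical 4-bit fingerprints
pair = [[1, 1, 0, 0], [1, 0, 1, 0]]
triple = [[1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 1, 0]]
print(extended_tanimoto(pair), extended_tanimoto(triple))
```

For two objects this reduces to the usual pairwise Tanimoto coefficient (shared on-bits divided by the on-bits in either vector), in line with the statement elsewhere in these results that the extended formulas reduce to the binary ones.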
5.
PLoS One ; 17(2): e0264277, 2022.
Article in English | MEDLINE | ID: mdl-35213620

ABSTRACT

The Promethee-GAIA method is a multicriteria decision support technique that defines the aggregated ranks of multiple criteria and visualizes them based on Principal Component Analysis (PCA). In the case of numerous criteria, the PCA biplot-based visualization does not reveal how a criterion influences the decision problem. The central question is how the Promethee-GAIA-based decision-making process can be improved to gain more interpretable results that reveal more characteristic inner relationships between the criteria. To improve the Promethee-GAIA method, we suggest three techniques that eliminate redundant criteria, clearly outline which criterion belongs to which factor, and explore the similarities between criteria. These methods are the following: A) Principal factoring with rotation and communality analysis (P-PFA), B) the integration of Sparse PCA into the Promethee II method (P-sPCA), and C) the Sum of Ranking Differences method (P-SRD). The suggested methods are presented through an I4.0+ dataset that measures the Industry 4.0 readiness of NUTS 2-classified regions. The proposed methods are useful tools for handling multicriteria ranking problems when the number of criteria is large.


Subject(s)
Decision Support Techniques, Theoretical Models, Factor Analysis, Industries
6.
Comput Struct Biotechnol J ; 19: 3628-3639, 2021.
Article in English | MEDLINE | ID: mdl-34257841

ABSTRACT

Quantification of similarities between protein sequences or DNA/RNA strands is a (sub-)task that is ubiquitously present in bioinformatics workflows, and is usually accomplished by pairwise comparisons of sequences, utilizing simple (e.g. percent identity) or more intricate concepts (e.g. substitution scoring matrices). Complex tasks (such as clustering) rely on a large number of pairwise comparisons under the hood, instead of a direct quantification of set similarities. Based on our recently introduced framework that enables multiple comparisons of binary molecular fingerprints (i.e., direct calculation of the similarity of fingerprint sets), here we introduce novel symmetric similarity indices for analogous calculations on sets of character sequences with more than two (t) possible items (e.g. DNA/RNA sequences with t = 4, or protein sequences with t = 20). The features of these new indices are studied in detail with analysis of variance (ANOVA), and demonstrated with three case studies of protein/DNA sequences with varying degrees of similarity (or evolutionary proximity). The Python code for the extended many-item similarity indices is publicly available at: https://github.com/ramirandaq/tn_Comparisons.

7.
Mol Divers ; 25(3): 1409-1424, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34110577

ABSTRACT

In this review, we outline the current trends in the field of machine learning-driven classification studies related to ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study focuses only on classification models with large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis was carried out for nine different targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The comparison of the best classification models was targeted to reveal the differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and the different validation protocols. Based on the evaluation of the data, we can say that tree-based algorithms are (still) dominating the field, with consensus modeling being an increasing trend in drug safety predictions. Although one can already find classification models with great performances to hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets are still central to ADMET-related research efforts.


Subject(s)
Drug Design, Machine Learning, Molecular Models, Quantitative Structure-Activity Relationship, Algorithms, Drug-Related Side Effects and Adverse Reactions, ERG1 Potassium Channel/chemistry, ERG1 Potassium Channel/genetics, Humans, Neural Networks (Computer), Pharmacokinetics, Support Vector Machine, Tissue Distribution
8.
Food Res Int ; 143: 110309, 2021 05.
Article in English | MEDLINE | ID: mdl-33992329

ABSTRACT

In recent decades, eye-movement detection technology has improved significantly, and eye-trackers are available not only as standalone research tools but also as computer peripherals. This rapid spread gives further opportunities to measure the eye-movements of participants. The current paper provides classification models for the prediction of food choice and selects the best one. Four choice sets were presented to 112 volunteer participants, each choice set consisting of four different choice tasks, resulting in altogether sixteen choice tasks. The choice sets followed the 2-, 4-, 6- and 8-alternative forced-choice paradigm. Tobii X2-60 eye-tracker and Tobii Studio software were used to capture and export gaze data, respectively. After variable filtering, thirteen classification models were elaborated and tested; moreover, eight performance parameters were computed. The models were compared based on the performance parameters using the sum of ranking differences algorithm. The algorithm ranks and groups the models by comparing the ranks of their performance metrics to a predefined gold standard. Techniques based on decision trees were superior in all cases, regardless of the choice tasks and food product categories. Among the classifiers, Quinlan's C4.5 and cost-sensitive decision trees proved to be the best-performing ones. Future studies should focus on the fine-tuning of these models as well as their applications with mobile eye-trackers.


Subject(s)
Algorithms, Eye Movements, Humans, Software
9.
J Cheminform ; 13(1): 33, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33892799

ABSTRACT

Despite being a central concept in cheminformatics, molecular similarity has so far been limited to the simultaneous comparison of only two molecules at a time and using one index, generally the Tanimoto coefficient. In a recent contribution we have not only introduced a complete mathematical framework for extended similarity calculations (i.e. comparisons of more than two molecules at a time) but also defined a series of novel indices. Part 1 is a detailed analysis of the effects of various parameters on the similarity values calculated by the extended formulas. Their features were revealed by sum of ranking differences and ANOVA. Here, in addition to characterizing several important aspects of the newly introduced similarity metrics, we will highlight their applicability and utility in real-life scenarios using datasets with popular molecular fingerprints. Remarkably, for large datasets, the use of extended similarity measures provides an unprecedented speed-up over "traditional" pairwise similarity matrix calculations. We also provide illustrative examples of a more direct algorithm based on the extended Tanimoto similarity to select diverse compound sets, resulting in much higher levels of diversity than traditional approaches. We discuss the inner and outer consistency of our indices, which are key in practical applications, showing whether the n-ary and binary indices rank the data in the same way. We demonstrate the use of the new n-ary similarity metrics on t-distributed stochastic neighbor embedding (t-SNE) plots of datasets of varying diversity, or corresponding to ligands of different pharmaceutical targets, which show that our indices provide a better measure of set compactness than standard binary measures. We also present a conceptual example of the applicability of our indices in agglomerative hierarchical algorithms. The Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons.

10.
J Cheminform ; 13(1): 32, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33892802

ABSTRACT

Quantification of the similarity of objects is a key concept in many areas of computational science. This includes cheminformatics, where molecular similarity is usually quantified based on binary fingerprints. While there is a wide selection of available molecular representations and similarity metrics, there were no previous efforts to extend the computational framework of similarity calculations to the simultaneous comparison of more than two objects (molecules) at the same time. The present study bridges this gap, by introducing a straightforward computational framework for comparing multiple objects at the same time and providing extended formulas for as many similarity metrics as possible. In the binary case (i.e. when comparing two molecules pairwise) these are naturally reduced to their well-known formulas. We provide a detailed analysis on the effects of various parameters on the similarity values calculated by the extended formulas. The extended similarity indices are entirely general and do not depend on the fingerprints used. Two types of variance analysis (ANOVA) help to understand the main features of the indices: (i) ANOVA of mean similarity indices; (ii) ANOVA of sum of ranking differences (SRD). Practical aspects and applications of the extended similarity indices are detailed in the accompanying paper: Miranda-Quintana et al. J Cheminform. 2021. https://doi.org/10.1186/s13321-021-00504-4 . Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons .

11.
Mol Inform ; 40(7): e2060017, 2021 07.
Article in English | MEDLINE | ID: mdl-33891369

ABSTRACT

Similarity measures are widely used in various areas from taxonomy to cheminformatics. To this end, a large number of similarity and distance measures (or, collectively, comparative measures) have been introduced, with only a few studies directed to revealing their inner relationships. We present a thorough analytical study of the conditions leading to two comparative measures providing equivalent results over a given set of molecules. A key part of this work is the introduction of a novel way to study the consistency between comparative measures: the differential consistency analysis (DCA). This tool reveals how the consistency can be established in an analytical way with minimal (or no) assumptions. We found that the consensus between the Tanimoto and Cosine coefficients improved by choosing a reference whose similarity to the rest of the molecules varies less, or by representing the molecules in a way that does not depend strongly on their size (i.e. bit frequency in the chosen fingerprint representation). The presented derivations are just some generic examples; DCA can be applied widely and for all binary similarity coefficients introduced so far, independently from the molecular representations.


Subject(s)
Drug Discovery, Cheminformatics
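
A small worked example may help show what it means for two comparative measures to disagree. The sketch below uses the standard set-based definitions of the Tanimoto and cosine coefficients on binary fingerprints. The fingerprints (sets of on-bit positions) are invented for illustration and are assumed to be non-empty. It does not implement the differential consistency analysis itself.

```python
from math import sqrt


def tanimoto(a, b):
    """Tanimoto coefficient of two non-empty sets of on-bit positions."""
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def cosine(a, b):
    """Cosine coefficient of two non-empty sets of on-bit positions."""
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical fingerprints: a reference and two molecules of different size
reference = set(range(10))                   # 10 bits on
mol_x = set(range(6)) | set(range(10, 14))   # 10 bits on, 6 shared with reference
mol_y = set(range(4))                        # 4 bits on, all shared with reference

for name, mol in (("x", mol_x), ("y", mol_y)):
    print(name, round(tanimoto(reference, mol), 3), round(cosine(reference, mol), 3))
```

Here Tanimoto ranks the larger molecule x as closer to the reference, whereas cosine prefers the smaller molecule y. The difference in fingerprint size drives the disagreement, consistent with the abstract's remark that consensus improves when the representation does not depend strongly on molecular size.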
12.
Molecules ; 26(4)2021 Feb 19.
Article in English | MEDLINE | ID: mdl-33669834

ABSTRACT

Applied datasets can vary from a few hundred to thousands of samples in typical quantitative structure-activity/property (QSAR/QSPR) relationships and classification. However, the size of the datasets and the train/test split ratios can greatly affect the outcome of the models, and thus the classification performance itself. We compared several combinations of dataset sizes and split ratios with five different machine learning algorithms to find the differences or similarities and to select the best parameter settings in nonbinary (multiclass) classification. It is also known that the models are ranked differently according to the performance merit(s) used. Here, 25 performance parameters were calculated for each model, then factorial ANOVA was applied to compare the results. The results clearly show the differences not just between the applied machine learning algorithms but also between the dataset sizes and to a lesser extent the train/test split ratios. The XGBoost algorithm could outperform the others, even in multiclass modeling. The performance parameters reacted differently to the change of the sample set size; some of them were much more sensitive to this factor than the others. Moreover, significant differences could be detected between train/test split ratios as well, exerting a great effect on the test validation of our models.


Subject(s)
Algorithms, Databases as Topic, Quantitative Structure-Activity Relationship, Confidence Intervals, Machine Learning
13.
Food Chem ; 344: 128617, 2021 May 15.
Article in English | MEDLINE | ID: mdl-33221108

ABSTRACT

Finding optimal solutions usually requires multicriteria optimization. The sum of ranking differences (SRD) algorithm can efficiently solve such problems. Its principles and earlier applications will be discussed here, along with meta-analyses of papers published in various subfields of food science, such as analytics in food chemistry, food engineering, food technology, food microbiology, quality control, and sensory analysis. Carefully selected real case studies give an overview of the wide range of applications for multicriteria optimizations, using a free, easy-to-use and validated method. Results are presented and discussed in a way that helps scientists and practitioners, who are less familiar with multicriteria optimization, to integrate the method into their research projects. The utility of SRD, optionally coupled with other statistical methods such as ANOVA, is demonstrated on altogether twelve case studies, covering diverse method comparison and data evaluation scenarios from various subfields of food science.


Subject(s)
Decision Making, Food Technology, Algorithms, Analysis of Variance, Food Analysis/methods, Food Microbiology, Food Quality, Plant Proteins/chemistry, Plant Proteins/metabolism, Volatile Organic Compounds/analysis, Wine/analysis
14.
Anal Bioanal Chem ; 412(19): 4619-4628, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32472144

ABSTRACT

Extracellular vesicles (EVs) are lipid bilayer-bounded particles that are actively synthesized and released by cells. The main components of EVs are lipids, proteins, and nucleic acids, and their composition is characteristic of their type and origin and reveals the physiological and pathological conditions of the parent cells. The concentration and protein composition of EVs closely relate to their functions; therefore, total protein determination can assist in EV-based diagnostics and disease prognosis. Here, we present a simple, reagent-free method based on attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy to quantify the protein content of EV samples without any further sample preparation. After calibration with bovine serum albumin, the protein concentration of red blood cell-derived EVs (REVs) was investigated by ATR-FTIR spectroscopy. The integrated area of the amide I band was calculated from the IR spectra of REVs, which was proportional to the protein quantity in the sample, regardless of its secondary structure. A spike test and a dilution test were performed to determine the ability to use ATR-FTIR spectroscopy for protein quantification in EV samples, which resulted in linearity with R² values as high as 0.992 over the concentration range of 0.08 to 1 mg/mL. Additionally, multivariate calibration with the partial least squares (PLS) regression method was carried out on the bovine serum albumin and EV spectra. R² values were 0.94 for the calibration and 0.91 for the validation set. The results indicate that ATR-FTIR measurements provide a reliable method for reagent-free protein quantification of EVs.


Subject(s)
Erythrocytes/chemistry, Extracellular Vesicles/chemistry, Proteins/analysis, Fourier Transform Infrared Spectroscopy/methods, Animals, Cattle, Humans, Indicators and Reagents, Least-Squares Analysis, Bovine Serum Albumin/analysis
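
The linearity check reported in the abstract above amounts to fitting a straight calibration line and computing its coefficient of determination. The sketch below shows that arithmetic with ordinary least squares. The function name `fit_line` and the band-area values are invented for illustration and are not data from the study; only the 0.08 to 1 mg/mL concentration range is taken from the abstract.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept, with R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical calibration points: protein concentration (mg/mL) versus
# integrated amide I band area (arbitrary units); the areas are made up
concentration = [0.08, 0.25, 0.50, 0.75, 1.00]
band_area = [0.41, 1.22, 2.55, 3.70, 5.05]

slope, intercept, r_squared = fit_line(concentration, band_area)

# Inverting the line estimates the concentration of an unknown sample
unknown_area = 2.0
estimated_concentration = (unknown_area - intercept) / slope
print(round(r_squared, 4), round(estimated_concentration, 3))
```

The multivariate PLS calibration also mentioned in the abstract uses the whole spectrum rather than a single band area and is not covered by this univariate sketch.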
15.
PLoS One ; 15(3): e0229209, 2020.
Article in English | MEDLINE | ID: mdl-32203513

ABSTRACT

Sum of Ranking Differences is an innovative statistical method that ranks competing solutions based on a reference point. The latter might arise naturally, or can be aggregated from the data. We provide two case studies to feature both possibilities. Apportionment and districting are two critical issues that emerge in relation to democratic elections. Theoreticians invented clever heuristics to measure malapportionment and the compactness of the shape of the constituencies, yet there is no unique best method in either case. Using data from Norway and the US, we rank the standard methods both for the apportionment and for the districting problem. In the case of apportionment, we find that all the classical methods perform reasonably well, with subtle but significant differences. By a small margin the Leximin method emerges as a winner, but, somewhat unexpectedly, the non-regular Imperiali method ties for first place. In districting, the Lee-Sallee index and a novel parametric method, the so-called Moment Invariant, perform the best, although the latter is sensitive to the function's chosen parameter.


Subject(s)
Legislation as Topic, Politics, Administrative Personnel, Humans, Statistical Models, Theoretical Models, Norway, United States
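
The core computation behind the Sum of Ranking Differences, as described in the abstract above, can be sketched in a few lines: rank the objects according to each competing method, rank them according to the reference, and sum the absolute rank differences. The helper names `ranks` and `srd`, the use of the row average as the aggregated reference, and the three example methods are assumptions made for illustration. The validation step of the full procedure (comparison against random rankings) is omitted.

```python
def ranks(values):
    """Ranks starting at 1 for the smallest value; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def srd(columns, reference=None):
    """Sum of absolute rank differences between each column and the reference."""
    n_rows = len(next(iter(columns.values())))
    if reference is None:
        # Assumed aggregation: the row-wise average of all columns
        reference = [
            sum(col[r] for col in columns.values()) / len(columns)
            for r in range(n_rows)
        ]
    ref_ranks = ranks(reference)
    return {
        name: sum(abs(a - b) for a, b in zip(ranks(col), ref_ranks))
        for name, col in columns.items()
    }


# Hypothetical scores given by three methods to the same four objects
methods = {"A": [1, 2, 3, 4], "B": [1, 3, 2, 4], "C": [4, 3, 2, 1]}
srd_values = srd(methods)
print(srd_values)
```

A smaller value means the method orders the objects more like the reference. In this toy case method B reproduces the reference ordering exactly and method C, which reverses most of it, lies farthest away.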
16.
ACS Omega ; 5(7): 3670-3677, 2020 Feb 25.
Article in English | MEDLINE | ID: mdl-32118182

ABSTRACT

Quantitation of surface roughness is difficult if subtle but significant differences cause an uncommon variance. We used atomic force microscopy to measure the surface roughness of polyethylene terephthalate (PET) fibers before and after a 30 s plasma treatment of 300 W. Samples were measured multiple times at different locations, in four scan sizes. The surface roughness was expressed in terms of nine roughness parameters. Despite the large amount of data, simple statistics were not able to detect significant differences in roughness before and after plasma treatment. A factorial analysis of variance (ANOVA) of the normalized data and a sum of ranking differences analysis using four types of data preprocessing and their factorial ANOVA confirmed that (i) the plasma treatment had roughened the PET fiber surface; (ii) the roughness increases with the scanned area in the measured range; and (iii) what the best roughness parameters are in discriminating between surfaces before and after treatment. Although the compared roughness estimators were on different scales, a roughness estimation of the nanoscale surfaces was feasible, where other methods fail. The presented methodology can be applied widely and unambiguously for highly different method comparison tasks.

17.
Chemosphere ; 238: 124566, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31446272

ABSTRACT

How far-reaching is the influence of the urban area over the mineral composition of the Russula cyanoxantha mushroom? We studied the metal uptake behavior of this fungus relying on the soil properties. We sampled mushroom and soil from six forests according to an urbanization gradient, and two city parks in Cluj-Napoca (Romania). The elements were quantified using inductively coupled plasma - optical emission spectroscopy (ICP-OES). The concentrations of some elements differed significantly (p < 0.05) in the samples from the city (0.39 ± 0.35 mg kg⁻¹ for cadmium (Cd), 0.40 ± 0.19 mg kg⁻¹ for chromium (Cr), 69.1 ± 29.9 mg kg⁻¹ for iron (Fe), 10.9 ± 1.3 mg kg⁻¹ for manganese (Mn), 0.76 ± 0.45 mg kg⁻¹ for titanium (Ti)) compared with the samples from the forests (3.15-14.1 mg kg⁻¹ Cd, < 0.18 mg kg⁻¹ for Cr, 22.6-34.5 mg kg⁻¹ for Fe, 15.9-19.1 mg kg⁻¹ for Mn, 0.19-0.36 mg kg⁻¹ for Ti). We observed a definite negative trend in the mineral accumulation potential of this fungus along the urbanization gradient. The fungus turned from a cadmium-accumulator to a cadmium-excluder. This highlights a positive environmental influence of the urbanization over the toxic metal uptake of R. cyanoxantha. The hypothesis that urban soil pollution would increase the metal content of the mushroom was disproved. The possible explanation might be the elevated carbonate content of the urban soil, which is known to immobilize the metals in the soil.


Subject(s)
Agaricales/chemistry, Environmental Monitoring/methods, Heavy Metals/analysis, Soil Pollutants/analysis, Soil/chemistry, Cadmium/analysis, Chromium/analysis, Cities, Iron/analysis, Manganese/analysis, Romania, Urbanization
18.
Foods ; 10(1)2020 Dec 30.
Article in English | MEDLINE | ID: mdl-33396655

ABSTRACT

Recently, ¹H NMR (nuclear magnetic resonance) spectroscopy was presented as a viable option for the quality assurance of foods and beverages, such as wine products. Here, a complex chemometric analysis of red and white wine samples was carried out based on their ¹H NMR spectra. The extreme gradient boosting (XGBoost) machine learning algorithm was applied for the wine variety classification with an iterative double cross-validation loop, developed during the present work. In the case of red wines, Cabernet Franc, Merlot and Blue Frankish samples were successfully classified. Three very common white wine varieties were selected and classified: Chardonnay, Sauvignon Blanc and Riesling. The models were robust and were validated against overfitting with iterative randomization tests. Moreover, four novel partial least-squares (PLS) regression models were constructed to predict the major quantitative parameters of the wines: density, total alcohol, total sugar and total SO₂ concentrations. All the models performed successfully, with R² values above 0.80 in almost every case, providing additional information about the wine samples for the quality control of the products. ¹H NMR spectra combined with chemometric modeling can be a good and reliable candidate for the replacement of the time-consuming traditional standards, not just in wine analysis, but also in other aspects of food science.

19.
Data Brief ; 27: 104572, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31656835

ABSTRACT

How far-reaching is the influence of the urban area over the mineral composition of the Russula cyanoxantha mushroom? To answer this question, we monitored the metal uptake behavior of this fungus relying on the soil properties. We sampled mushroom and soil from six forests according to an urbanization gradient, and two city parks in Cluj-Napoca (Romania). The elements were quantified using inductively coupled plasma - optical emission spectroscopy (ICP-OES). The concentrations of some elements differed significantly (p < 0.05) in the samples from the city (0.39 ± 0.35 mg kg⁻¹ for cadmium (Cd), 0.40 ± 0.19 mg kg⁻¹ for chromium (Cr), 69.1 ± 29.9 mg kg⁻¹ for iron (Fe), 10.9 ± 1.3 mg kg⁻¹ for manganese (Mn), 0.76 ± 0.45 mg kg⁻¹ for titanium (Ti)) compared with the samples from the forests (3.15-14.1 mg kg⁻¹ Cd, < 0.18 mg kg⁻¹ for Cr, 22.6-34.5 mg kg⁻¹ for Fe, 15.9-19.1 mg kg⁻¹ for Mn, 0.19-0.36 mg kg⁻¹ for Ti). We observed a definite negative trend in the mineral accumulation potential of this fungus along the urbanization gradient. The fungus turned from a cadmium-accumulator to a cadmium-excluder. This highlights a positive environmental influence of the urbanization over the toxic metal uptake of R. cyanoxantha. The hypothesis that urban soil pollution would increase the metal content of the mushroom was disproved. The possible explanation might be the elevated carbonate content of the urban soil, which is known to immobilize the metals in the soil.

20.
Molecules ; 24(15)2019 Aug 01.
Article in English | MEDLINE | ID: mdl-31374986

ABSTRACT

Machine learning classification algorithms are widely used for the prediction and classification of the different properties of molecules such as toxicity or biological activity. The prediction of toxic vs. non-toxic molecules is important because testing on living animals has both ethical and cost drawbacks. The quality of classification models can be determined with several performance parameters, which often give conflicting results. In this study, we performed a multi-level comparison with the use of different performance metrics and machine learning classification methods. Well-established and standardized protocols for the machine learning tasks were used in each case. The comparison was applied to three datasets (acute and aquatic toxicities) and the robust, yet sensitive, sum of ranking differences (SRD) and analysis of variance (ANOVA) were applied for evaluation. The effect of dataset composition (balanced vs. imbalanced) and 2-class vs. multiclass classification scenarios was also studied. Most of the performance metrics are sensitive to dataset composition, especially in 2-class classification problems. The optimal machine learning algorithm also depends significantly on the composition of the dataset.


Subject(s)
Algorithms, Benchmarking, Machine Learning
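
The sensitivity of performance metrics to dataset composition noted in the abstract above can be seen in a small arithmetic sketch. The confusion-matrix counts below are invented: they describe one hypothetical 2-class classifier with fixed sensitivity (0.6) and specificity (0.9) applied to a balanced and to an imbalanced dataset. The function name `metrics` is likewise an assumption, and the formulas are the standard definitions rather than the study's own protocol.

```python
from math import sqrt


def metrics(tp, fn, fp, tn):
    """Accuracy, balanced accuracy and Matthews correlation from 2-class counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "mcc": mcc,
    }


# Same hypothetical classifier, two dataset compositions
balanced = metrics(tp=60, fn=40, fp=10, tn=90)     # 100 positives, 100 negatives
imbalanced = metrics(tp=12, fn=8, fp=18, tn=162)   # 20 positives, 180 negatives

for name, result in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Accuracy rises from 0.75 to 0.87 purely because the negative class dominates, balanced accuracy stays at 0.75, and the Matthews correlation coefficient falls. The three metrics therefore give conflicting pictures of the same classifier.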