Search | VHL Regional Portal

1.

Author Correction: Enhanced radiation use efficiency and grain filling rate as the main drivers of grain yield genetic gains in the CIMMYT elite spring wheat yield trial.

Gerard, Guillermo; Mondal, Suchismita; Piñera-Chávez, Francisco; Rivera-Amado, Carolina; Molero, Gemma; Crossa, Jose; Huerta-Espino, Julio; Velu, Govindan; Braun, Hans; Singh, Ravi; Crespo-Herrera, Leonardo.

Sci Rep ; 14(1): 17526, 2024 Jul 30.

Article in English | MEDLINE | ID: mdl-39079986

2.

Enhanced radiation use efficiency and grain filling rate as the main drivers of grain yield genetic gains in the CIMMYT elite spring wheat yield trial.

Gerard, Guillermo; Mondal, Suchismita; Piñera-Chávez, Francisco; Rivera-Amado, Carolina; Molero, Gemma; Crossa, Jose; Huerta-Espino, Julio; Velu, Govindan; Braun, Hans; Singh, Ravi; Crespo-Herrera, Leonardo.

Sci Rep ; 14(1): 10975, 2024 May 14.

Article in English | MEDLINE | ID: mdl-38744876

ABSTRACT

Common wheat (Triticum aestivum L.) is a major staple food crop, providing a fifth of food calories and proteins to the world's human population. Despite the impressive growth in global wheat production in recent decades, further increases in grain yield are required to meet future demands. Here we estimated genetic gain and genotype stability for grain yield (GY) and determined the trait associations that contributed uniquely or in combination to increased GY, through a retrospective analysis of top-performing genotypes selected from the elite spring wheat yield trial (ESWYT) evaluated internationally during a 14-year period (2003 to 2016). Fifty-six ESWYT genotypes and four checks were sown under optimally irrigated conditions in three phenotyping trials during three consecutive growing seasons (2018-2019 to 2020-2021) at Norman E. Borlaug Research Station, Ciudad Obregon, Mexico. The mean GY rose from 6.75 (24th ESWYT) to 7.87 t ha-1 (37th ESWYT), representing a cumulative increase of 1.12 t ha-1. The annual genetic gain for GY was estimated at 0.96% (65 kg ha-1 year-1), accompanied by a positive trend in genotype stability over time. The GY progress was mainly associated with increases in biomass (BM), grain filling rate (GFR), total radiation use efficiency (RUE_total), and grain weight per spike (GWS), and with a reduction in days to heading (DTH), which together explained 95.5% of the GY variation. Regression lines over the years showed significant increases of 0.015 kg m-2 year-1 (p < 0.01), 0.074 g m-2 year-1 (p < 0.05), and 0.017 g MJ-1 year-1 (p < 0.001) for BM, GFR, and RUE_total, respectively. Grain weight per spike exhibited a positive but non-significant trend (0.014 g year-1, p = 0.07), whereas a negative tendency for DTH was observed (-0.43 days year-1, p < 0.001). Analysis of the top ten highest-yielding genotypes revealed differential GY-associated trait contributions, demonstrating that improved GY can be attained through different mechanisms and indicating that no single trait criterion is adopted by CIMMYT breeders for developing new superior lines. We conclude that CIMMYT's Bread Wheat Breeding Program has continued to deliver adapted and more productive wheat genotypes to national partners worldwide, mainly driven by enhancing RUE_total and GFR, and that future yield increases could be achieved by intercrossing genetically diverse top-performing genotypes.

Subjects

Edible Grain , Genotype , Triticum , Triticum/genetics , Triticum/growth & development , Edible Grain/genetics , Edible Grain/growth & development , Phenotype , Seasons , Mexico
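The abstract of entry 2 reports genetic gain as the slope of a regression of the trait on year, expressed both in absolute units and as a percentage. A minimal sketch of that kind of calculation is given below. The function names are hypothetical, the input values are made up for illustration (they are not the trial data), and expressing the slope relative to the fitted value in the first year is an assumption, since the abstract does not state which baseline was used.

```python
def linear_trend(xs, ys):
    """Ordinary least-squares slope and intercept of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def annual_gain(years, values):
    """Return (absolute gain per year, gain per year in percent).

    Assumption: the percentage is taken relative to the fitted value
    in the earliest year; other baselines (e.g. the overall mean) are
    also used in practice.
    """
    slope, intercept = linear_trend(years, values)
    baseline = intercept + slope * min(years)
    return slope, 100.0 * slope / baseline


# Illustrative, made-up series: 6.0 t/ha in 2003, rising 0.06 t/ha per year.
years = list(range(2003, 2017))
yields = [6.0 + 0.06 * (year - 2003) for year in years]
gain_abs, gain_pct = annual_gain(years, yields)
print(round(gain_abs, 3), round(gain_pct, 2))
```

With real trial data the regression would normally be fitted on adjusted genotype means rather than on a noiseless series such as this one.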

3.

Feature engineering of environmental covariates improves plant genomic-enabled prediction.

Montesinos-López, Osval A; Crespo-Herrera, Leonardo; Pierre, Carolina Saint; Cano-Paez, Bernabe; Huerta-Prado, Gloria Isabel; Mosqueda-González, Brandon Alejandro; Ramos-Pulido, Sofia; Gerard, Guillermo; Alnowibet, Khalid; Fritsche-Neto, Roberto; Montesinos-López, Abelardo; Crossa, José.

Front Plant Sci ; 15: 1349569, 2024.

Article in English | MEDLINE | ID: mdl-38812738

ABSTRACT

Introduction: Because genomic selection (GS) is a predictive methodology, it must deliver high prediction accuracy to be useful in practice. However, since many factors affect the prediction performance of this methodology, its practical implementation still needs to be improved in many breeding programs. For this reason, many strategies have been explored to improve its prediction performance. Methods: When environmental covariates are incorporated as inputs in genomic prediction models, this information only sometimes helps increase prediction performance. For this reason, this investigation explores the use of feature engineering on the environmental covariates to enhance the prediction performance of genomic prediction models. Results and discussion: We found that, across data sets, feature engineering reduced prediction error by 761.625% across predictors relative to including the environmental covariates without feature engineering. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, since a significant gain in prediction accuracy was observed in only some data sets, further research is required to guarantee a robust feature engineering strategy to incorporate the environmental covariates.

4.

Genomic Prediction from Multi-Environment Trials of Wheat Breeding.

García-Barrios, Guillermo; Crespo-Herrera, Leonardo; Cruz-Izquierdo, Serafín; Vitale, Paolo; Sandoval-Islas, José Sergio; Gerard, Guillermo Sebastián; Aguilar-Rincón, Víctor Heber; Corona-Torres, Tarsicio; Crossa, José; Pacheco-Gil, Rosa Angela.

Genes (Basel) ; 15(4), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38674352

ABSTRACT

Genomic prediction relates a set of markers to variability in observed phenotypes of cultivars and allows for the prediction of phenotypes or breeding values of unobserved individuals. Most genomic prediction approaches predict breeding values based solely on additive effects. However, the economic value of wheat lines is not only influenced by their additive component but also encompasses a non-additive part (e.g., additive × additive epistasis interaction). In this study, genomic prediction models were implemented in three target populations of environments (TPE) in South Asia. Four models that incorporate genotype × environment interaction (G × E) and genotype × genotype (GG) were tested: Factor Analytic (FA), FA with genomic relationship matrix (FA + G), FA with epistatic relationship matrix (FA + GG), and FA with both genomic and epistatic relationship matrices (FA + G + GG). Results show that the FA + G and FA + G + GG models displayed the best and a similar performance across all tests, leading us to infer that the FA + G model effectively captures certain epistatic effects. The wheat lines tested in sites in different TPE were predicted with different precisions depending on the cross-validation employed. In general, the best prediction accuracy was obtained when some lines were observed in some sites of particular TPEs, and the worst genomic prediction was observed when wheat lines were never observed in any site of one TPE.

Subjects

Epistasis, Genetic , Gene-Environment Interaction , Genome, Plant , Genomics , Models, Genetic , Plant Breeding , Triticum , Triticum/genetics , Plant Breeding/methods , Genomics/methods , Genotype , Phenotype

5.

Multispectral and thermal infrared data, visual scores for severity of common rust symptoms, and genotypic single nucleotide polymorphism data of three F2-derived biparental doubled-haploid maize populations.

Loladze, Alexander; Rodrigues, Francelino; Petroli, Cesar D; Muñoz-Zavala, Carlos; Naranjo, Sergio; Vicente, Felix San; Gerard, Bruno; Montesinos-Lopez, Osval A; Crossa, Jose; Martini, Johannes W R.

Data Brief ; 54: 110300, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38586147

ABSTRACT

Three F2-derived biparental doubled haploid (DH) maize populations were generated for genetic mapping of resistance to common rust. Each of the three populations has the same susceptible parent, but a different resistance donor parent. Populations 1 and 3 consist of 320 lines each; population 2 consists of 260 lines. The DH lines were evaluated for their susceptibility to common rust in two years and with two replications in each year. For phenotyping, a visual score (VS) for susceptibility was assigned. Additionally, unmanned aerial vehicle (UAV) derived multispectral and thermal infrared data were recorded and combined in different vegetation indices ("remote sensing", RS). The DH lines were genotyped with the DArTseq method to obtain data on single nucleotide polymorphisms (SNPs). After quality control, 9051 markers remained. Missing values were "imputed" by the empirical mean of the marker scores of the respective locus. We used the data to compare genome-wide association studies and genomic prediction based on different phenotyping methods, that is, either VS or RS data. The data may be of interest for reuse, for instance for benchmarking genomic prediction models, for phytopathological studies addressing common rust, or for specifications of vegetation indices.
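The imputation step described in entry 5 (replacing each missing marker score with the empirical mean of the observed scores at that locus) can be sketched as follows. This is an illustrative sketch only: the function name, the use of None to mark missing values, and the tiny example matrix are assumptions made here and are not taken from the published data set.

```python
def impute_by_locus_mean(markers):
    """Fill missing scores (None) with the mean of observed scores per locus.

    `markers` is a list of rows (one row per line), each row holding one
    score per locus. A locus with no observed score at all stays None.
    """
    n_loci = len(markers[0])
    locus_means = []
    for j in range(n_loci):
        observed = [row[j] for row in markers if row[j] is not None]
        locus_means.append(sum(observed) / len(observed) if observed else None)
    return [
        [locus_means[j] if row[j] is None else row[j] for j in range(n_loci)]
        for row in markers
    ]


# Hypothetical 3 lines x 3 loci matrix with missing entries.
example = [
    [0, 1, None],
    [2, None, 1],
    [None, 1, 0],
]
print(impute_by_locus_mean(example))
```

Mean imputation keeps the marker matrix complete without changing the per-locus mean, at the cost of slightly shrinking the variance at loci with many missing calls.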

6.

Modeling within and between Sub-Genomes Epistasis of Synthetic Hexaploid Wheat for Genome-Enabled Prediction of Diseases.

Cuevas, Jaime; González-Diéguez, David; Dreisigacker, Susanne; Martini, Johannes W R; Crespo-Herrera, Leo; Lozano-Ramirez, Nerida; Singh, Pawan K; He, Xinyao; Huerta, Julio; Crossa, Jose.

Genes (Basel) ; 15(3), 2024 Feb 20.

Article in English | MEDLINE | ID: mdl-38540321

ABSTRACT

Common wheat (Triticum aestivum) is a hexaploid crop comprising three diploid sub-genomes labeled A, B, and D. The objective of this study is to investigate whether there is a discernible influence pattern from the D sub-genome with epistasis in genomic models for wheat diseases. Four genomic statistical models were employed; two models considered the linear genomic relationship of the lines. The first model (G) utilized all molecular markers, while the second model (ABD) utilized three matrices representing the A, B, and D sub-genomes. The remaining two models incorporated epistasis, one (GI) using all markers and the other (ABDI) considering markers in sub-genomes A, B, and D, including inter- and intra-sub-genome interactions. The data utilized pertained to three diseases: tan spot (TS), septoria nodorum blotch (SNB), and spot blotch (SB), for synthetic hexaploid wheat (SHW) lines. The results (variance components) indicate that epistasis makes a substantial contribution to explaining genomic variation, accounting for approximately 50% in SNB and SB and only 29% for TS. In this contribution of epistasis, the influence of intra- and inter-sub-genome interactions of the D sub-genome is crucial, being close to 50% in TS and higher in SNB (60%) and SB (60%). This increase in explaining genomic variation is reflected in an enhancement of predictive ability from the G model (additive) to the ABDI model (additive and epistasis) by 9%, 5%, and 1% for SNB, SB, and TS, respectively. These results, in line with other studies, underscore the significance of the D sub-genome in disease traits and suggest a potential application to be explored in the future regarding the selection of parental crosses based on sub-genomes.

Subjects

Ascomycota , Triticum , Triticum/genetics , Epistasis, Genetic , Phenotype , Ascomycota/genetics

7.

Data Augmentation Enhances Plant-Genomic-Enabled Predictions.

Montesinos-López, Osval A; Solis-Camacho, Mario Alberto; Crespo-Herrera, Leonardo; Saint Pierre, Carolina; Huerta Prado, Gloria Isabel; Ramos-Pulido, Sofia; Al-Nowibet, Khalid; Fritsche-Neto, Roberto; Gerard, Guillermo; Montesinos-López, Abelardo; Crossa, José.

Genes (Basel) ; 15(3), 2024 Feb 24.

Article in English | MEDLINE | ID: mdl-38540344

ABSTRACT

Genomic selection (GS) is revolutionizing plant breeding. However, its practical implementation is still challenging, since there are many factors that affect its accuracy. For this reason, this research explores data augmentation with the goal of improving its accuracy. Deep neural networks with data augmentation (DA) generate synthetic data from the original training set to increase the training set and to improve the prediction performance of any statistical or machine learning algorithm. There is much empirical evidence of their success in many computer vision applications. For this reason, DA was explored in the context of GS using 14 real datasets. We found empirical evidence that DA is a powerful tool to improve the prediction accuracy, since we improved the prediction accuracy of the top lines in the 14 datasets under study. On average, across datasets and traits, the gain in prediction performance of the DA approach relative to the conventional method in the top 20% of lines in the testing set was 108.4% in terms of the NRMSE and 107.4% in terms of the MAAPE, but a worse performance was observed on the whole testing set. We encourage more empirical evaluations to support our findings.

Subjects

Genome, Plant , Genomics , Phenotype , Machine Learning , Neural Networks, Computer

8.

Use of remote sensing for linkage mapping and genomic prediction for common rust resistance in maize.

Loladze, Alexander; Rodrigues, Francelino A; Petroli, Cesar D; Muñoz-Zavala, Carlos; Naranjo, Sergio; San Vicente, Felix; Gerard, Bruno; Montesinos-Lopez, Osval A; Crossa, Jose; Martini, Johannes W R.

Field Crops Res ; 308: 109281, 2024 Mar 15.

Article in English | MEDLINE | ID: mdl-38495466

ABSTRACT

Breeding for disease resistance is a central component of strategies implemented to mitigate biotic stress impacts on crop yield. Conventionally, genotypes of a plant population are evaluated through a labor-intensive process of assigning visual scores (VS) of susceptibility (or resistance) by specifically trained staff, which limits manageable volumes and repeatability of evaluation trials. Remote sensing (RS) tools have the potential to streamline phenotyping processes and to deliver more standardized results at higher throughput. Here, we use a two-year evaluation trial of three newly developed biparental populations of maize doubled haploid lines (DH) to compare the results of genomic analyses of resistance to common rust (CR) when phenotyping is either based on conventional VS or on RS-derived (vegetation) indices. As a general observation, for each population × year combination, the broad-sense heritability of VS was greater than or very close to the maximum heritability across all RS indices. Moreover, results of linkage mapping as well as of genomic prediction (GP) suggest that VS data were of higher quality, indicated by higher -log(p) values in the linkage studies and higher predictive abilities for genomic prediction. Nevertheless, despite the qualitative differences between the phenotyping methods, each successfully identified the same genomic region on chromosome 10 as being associated with disease resistance. This region is likely related to the known CR resistance locus Rp1. Our results indicate that RS technology can be used to streamline genetic evaluation processes for foliar disease resistance in maize. In particular, RS can potentially reduce costs of phenotypic evaluations and increase trialing capacities.

9.

Deep learning methods improve genomic prediction of wheat breeding.

Montesinos-López, Abelardo; Crespo-Herrera, Leonardo; Dreisigacker, Susanna; Gerard, Guillermo; Vitale, Paolo; Saint Pierre, Carolina; Govindan, Velu; Tarekegn, Zerihun Tadesse; Flores, Moisés Chavira; Pérez-Rodríguez, Paulino; Ramos-Pulido, Sofía; Lillemo, Morten; Li, Huihui; Montesinos-López, Osval A; Crossa, Jose.

Front Plant Sci ; 15: 1324090, 2024.

Article in English | MEDLINE | ID: mdl-38504889

ABSTRACT

In the field of plant breeding, various machine learning models have been developed and studied to evaluate the genomic prediction (GP) accuracy of unseen phenotypes. Deep learning has shown promise. However, most studies on deep learning in plant breeding have been limited to small datasets, and only a few have explored its application in moderate-sized datasets. In this study, we aimed to address this limitation by utilizing a moderately large dataset. We examined the performance of a deep learning (DL) model and compared it with the widely used and powerful genomic best linear unbiased prediction (GBLUP) model. The goal was to assess the GP accuracy in the context of a five-fold cross-validation strategy and when predicting complete environments using the DL model. The results revealed that the DL model outperformed the GBLUP model in terms of GP accuracy for two out of the five included traits in the five-fold cross-validation strategy, with similar results in the other traits. This indicates the superiority of the DL model in predicting these specific traits. Furthermore, when predicting complete environments using the leave-one-environment-out (LOEO) approach, the DL model demonstrated competitive performance. It is worth noting that the DL model employed in this study extends a previously proposed multi-modal DL model, which had been primarily applied to image data but with small datasets. By utilizing a moderately large dataset, we were able to evaluate the performance and potential of the DL model in a more informative and more challenging plant breeding scenario.

10.

A marker weighting approach for enhancing within-family accuracy in genomic prediction.

Montesinos-López, Osval A; Crespo-Herrera, Leonardo; Xavier, Alencar; Godwa, Manje; Beyene, Yoseph; Pierre, Carolina Saint; de la Rosa-Santamaria, Roberto; Salinas-Ruiz, Josafhat; Gerard, Guillermo; Vitale, Paolo; Dreisigacker, Susanne; Lillemo, Morten; Grignola, Fernando; Sarinelli, Martin; Pozzo, Ezequiel; Quiroga, Marco; Montesinos-López, Abelardo; Crossa, José.

G3 (Bethesda) ; 14(2), 2024 Feb 07.

Article in English | MEDLINE | ID: mdl-38079160

ABSTRACT

Genomic selection is revolutionizing plant breeding. However, its practical implementation is still very challenging, since predicted values do not necessarily have high correspondence to the observed phenotypic values. When the goal is to predict within-family, it is not always possible to obtain reasonable accuracies, which is of paramount importance to improve the selection process. For this reason, in this research, we propose the Adversaria-Boruta (AB) method, which combines the virtues of the adversarial validation (AV) method and the Boruta feature selection method. The AB method operates primarily by minimizing the disparity between training and testing distributions. This is accomplished by reducing the weight assigned to markers that display the most significant differences between the training and testing sets. The AB method thus builds a weighted genomic relationship matrix that is implemented with the genomic best linear unbiased predictor (GBLUP) model. The proposed AB method is compared, using 12 real data sets, with the GBLUP model that uses a nonweighted genomic relationship matrix. Our results show that the proposed AB method outperforms the GBLUP by 8.6, 19.7, and 9.8% in terms of Pearson's correlation, mean square error, and normalized root mean square error, respectively. Our results support that the proposed AB method is a useful tool to improve the prediction accuracy of a complete family; however, we encourage other investigators to evaluate the AB method to increase the empirical evidence of its potential.

Subjects

Models, Genetic , Polymorphism, Single Nucleotide , Genome , Genomics/methods , Linear Models , Phenotype , Genotype
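Entry 10 describes building a weighted genomic relationship matrix in which markers that differ most between training and testing sets receive less weight. The sketch below shows one common way to form such a matrix from centered marker scores and per-marker weights. It is an assumption-laden illustration, not the authors' implementation: the function name, the scaling by the sum of the weights, and the example weights are all hypothetical, and the abstract does not say how the weights are derived from the adversarial validation and Boruta steps.

```python
def weighted_grm(markers, weights):
    """Weighted genomic relationship matrix from a lines x markers matrix.

    Assumed form: G[i][k] = sum_j w_j * z_ij * z_kj / sum_j w_j,
    where z is the marker matrix centered by column (marker) means.
    With all weights equal this reduces to the unweighted Z Z' / m.
    """
    n_lines = len(markers)
    n_markers = len(weights)
    col_means = [sum(row[j] for row in markers) / n_lines for j in range(n_markers)]
    centered = [[row[j] - col_means[j] for j in range(n_markers)] for row in markers]
    total_weight = sum(weights)
    return [
        [
            sum(weights[j] * centered[i][j] * centered[k][j] for j in range(n_markers))
            / total_weight
            for k in range(n_lines)
        ]
        for i in range(n_lines)
    ]


# Hypothetical data: 4 lines, 2 markers coded 0/1/2.
markers = [[0, 0], [2, 0], [1, 2], [1, 2]]
unweighted = weighted_grm(markers, [1, 1])
# Down-weight the second marker, as if it differed strongly between sets.
downweighted = weighted_grm(markers, [3, 1])
print(unweighted[0][1], downweighted[0][1])
```

The resulting matrix would then take the place of the usual genomic relationship matrix in a GBLUP fit; that fitting step is outside this sketch.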

11.

Multivariate Genomic Hybrid Prediction with Kernels and Parental Information.

Montesinos-López, Osval A; Crossa, José; Saint Pierre, Carolina; Gerard, Guillermo; Valenzo-Jiménez, Marco Alberto; Vitale, Paolo; Valladares-Cellis, Patricia Edwigis; Buenrostro-Mariscal, Raymundo; Montesinos-López, Abelardo; Crespo-Herrera, Leonardo.

Int J Mol Sci ; 24(18), 2023 Sep 07.

Article in English | MEDLINE | ID: mdl-37762107

ABSTRACT

Genomic selection (GS) plays a pivotal role in hybrid prediction. It can enhance the selection of parental lines, accurately predict hybrid performance, and harness hybrid vigor. Likewise, it can optimize breeding strategies by reducing field trial requirements, expediting hybrid development, facilitating targeted trait improvement, and enhancing adaptability to diverse environments. Leveraging genomic information empowers breeders to make informed decisions and significantly improve the efficiency and success rate of hybrid breeding programs. In order to improve genomic prediction performance, we explored the incorporation of parental phenotypic information as covariates under a multi-trait framework. Approach 1, referred to as Pmean, directly utilized parental phenotypic information without any preprocessing. Approach 2, denoted as BV, replaced the direct use of phenotypic values of both parents with their respective breeding values. While an improvement in prediction performance was observed in both approaches, with a minimum 4.24% reduction in the normalized root mean square error (NRMSE), the direct incorporation of parental phenotypic information in the Pmean approach slightly outperformed the BV approach. We also compared these two approaches using linear and nonlinear kernels, but no relevant gain was observed. Finally, our results add to the empirical evidence that the integration of parental phenotypic information helps increase the prediction performance of hybrids.

Subjects

Hybridization, Genetic , Models, Genetic , Genome, Plant , Phenotype , Genomics/methods , Plant Breeding

12.

A novel method for genomic-enabled prediction of cultivars in new environments.

Montesinos-López, Osval A; Ramos-Pulido, Sofia; Hernández-Suárez, Carlos Moisés; Mosqueda González, Brandon Alejandro; Valladares-Anguiano, Felícitas Alejandra; Vitale, Paolo; Montesinos-López, Abelardo; Crossa, José.

Front Plant Sci ; 14: 1218151, 2023.

Article in English | MEDLINE | ID: mdl-37564390

ABSTRACT

Introduction: Genomic selection (GS) has gained global importance due to its potential to accelerate genetic progress and improve the efficiency of breeding programs. Objectives of the research: In this research we proposed a method to improve the prediction accuracy of tested lines in new (untested) environments. Method-1: The new method trained the model with a modified response variable (a difference of response variables) that reduces the mismatch (non-stationarity) between the training and testing distributions and improved the prediction accuracy. Comparing new and conventional method: We compared the prediction accuracy of the conventional genomic best linear unbiased prediction (GBLUP) model (M1), with or without genotype × environment interaction (GE) (M1_GE; M1_NO_GE), versus the proposed method (M2) on several data sets. Results and discussion: The gain in prediction accuracy of M2 versus M1_GE and M1_NO_GE was at least 4.3% in terms of Pearson's correlation, at least 19.5% in terms of the percentage of top-yielding lines captured when the top 10% (Best10) and 20% (Best20) of lines were selected, and at least 42.29% in terms of Normalized Root Mean Squared Error (NRMSE).
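Several abstracts in this listing compare models by Pearson's correlation and by normalized root mean squared error (NRMSE). A minimal sketch of both metrics is given below. The function names are hypothetical, the numbers are invented, and normalizing the RMSE by the mean of the observed values is an assumption: the abstracts do not state which normalization (mean, standard deviation, or range) the authors used.

```python
import math


def pearson_correlation(observed, predicted):
    """Pearson's correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def nrmse(observed, predicted):
    """RMSE divided by the mean of the observed values (assumed normalization)."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return math.sqrt(mse) / (sum(observed) / n)


# Invented example: predictions are perfectly ranked but biased upward by 1.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 3.0, 4.0]
print(pearson_correlation(observed, predicted), nrmse(observed, predicted))
```

The example illustrates why the two metrics can disagree, as some abstracts report: a constant bias leaves the correlation untouched while inflating the NRMSE.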

13.

Do feature selection methods for selecting environmental covariables enhance genomic prediction accuracy?

Montesinos-López, Osval A; Crespo-Herrera, Leonardo; Saint Pierre, Carolina; Bentley, Alison R; de la Rosa-Santamaria, Roberto; Ascencio-Laguna, José Alejandro; Agbona, Afolabi; Gerard, Guillermo S; Montesinos-López, Abelardo; Crossa, José.

Front Genet ; 14: 1209275, 2023.

Article in English | MEDLINE | ID: mdl-37554404

ABSTRACT

Genomic selection (GS) is transforming plant and animal breeding, but its practical implementation for complex traits and multi-environmental trials remains challenging. To address this issue, this study investigates the integration of environmental information with genotypic information in GS. The study proposes the use of two feature selection methods (Pearson's correlation and Boruta) for the integration of environmental information. Results indicate that the simple incorporation of environmental covariates may increase or decrease prediction accuracy depending on the case. However, optimal incorporation of environmental covariates using feature selection significantly improves prediction accuracy in four out of six datasets, by between 14.25% and 218.71% in terms of Normalized Root Mean Squared Error under a leave-one-environment-out cross-validation scenario, but no relevant gain was observed in terms of Pearson's correlation. In two datasets where environmental covariates are unrelated to the response variable, feature selection is unable to enhance prediction accuracy. Therefore, the study provides empirical evidence supporting the use of feature selection to improve the prediction power of GS.

14.

Genomic Prediction of Resistance to Tan Spot, Spot Blotch and Septoria Nodorum Blotch in Synthetic Hexaploid Wheat.

García-Barrios, Guillermo; Crossa, José; Cruz-Izquierdo, Serafín; Aguilar-Rincón, Víctor Heber; Sandoval-Islas, J Sergio; Corona-Torres, Tarsicio; Lozano-Ramírez, Nerida; Dreisigacker, Susanne; He, Xinyao; Singh, Pawan Kumar; Pacheco-Gil, Rosa Angela.

Int J Mol Sci ; 24(13), 2023 Jun 22.

Article in English | MEDLINE | ID: mdl-37445683

ABSTRACT

Genomic prediction combines molecular and phenotypic data in a training population to predict the breeding values of individuals that have only been genotyped. The use of genomic information in breeding programs helps to increase the frequency of favorable alleles in the populations of interest. This study evaluated the performance of BLUP (Best Linear Unbiased Prediction) in predicting resistance to tan spot, spot blotch and Septoria nodorum blotch in synthetic hexaploid wheat. BLUP was implemented in single-trait and multi-trait models with three variations: (1) the pedigree relationship matrix (A-BLUP), (2) the genomic relationship matrix (G-BLUP), and (3) a combination of the two matrices (A+G BLUP). In all three diseases, the A-BLUP model had a lower performance, and the G-BLUP and A+G BLUP were statistically similar (p ≥ 0.05). The prediction accuracy with the single trait was statistically similar (p ≥ 0.05) to the multi-trait accuracy, possibly due to the low correlation of severity between the diseases.

Subjects

Plant Diseases , Triticum , Humans , Triticum/genetics , Plant Diseases/genetics , Plant Breeding , Genome , Genomics , Phenotype , Genotype , Models, Genetic

15.

Statistical Machine-Learning Methods for Genomic Prediction Using the SKM Library.

Montesinos López, Osval A; Mosqueda González, Brandon Alejandro; Montesinos López, Abelardo; Crossa, José.

Genes (Basel) ; 14(5), 2023 Apr 28.

Article in English | MEDLINE | ID: mdl-37239363

ABSTRACT

Genomic selection (GS) is revolutionizing plant breeding. However, because it is a predictive methodology, a basic understanding of statistical machine-learning methods is necessary for its successful implementation. This methodology uses a reference population that contains both the phenotypic and genotypic information of genotypes to train a statistical machine-learning method. After optimization, this method is used to make predictions of candidate lines for which only genotypic information is available. However, due to a lack of time and appropriate training, it is difficult for breeders and scientists of related fields to learn all the fundamentals of prediction algorithms. With smart or highly automated software, it is possible for these professionals to appropriately implement any state-of-the-art statistical machine-learning method for their collected data without the need for an exhaustive understanding of statistical machine-learning methods and programming. For this reason, we introduce state-of-the-art statistical machine-learning methods using the Sparse Kernel Methods (SKM) R library, with complete guidelines on how to implement seven statistical machine-learning methods that are available in this library for genomic prediction (random forest, Bayesian models, support vector machine, gradient boosted machine, generalized linear models, partial least squares, feed-forward artificial neural networks). This guide includes details of the functions required to implement each of the methods, as well as others for easily implementing different tuning strategies, cross-validation strategies, and metrics to evaluate the prediction performance and different summary functions that compute it. A toy dataset illustrates how to implement statistical machine-learning methods and facilitates their use by professionals who do not possess a strong background in machine learning and programming.

Subjects

Plant Breeding , Software , Bayes Theorem , Genomics/methods , Machine Learning

16.

Optimizing Sparse Testing for Genomic Prediction of Plant Breeding Crops.

Montesinos-López, Osval A; Saint Pierre, Carolina; Gezan, Salvador A; Bentley, Alison R; Mosqueda-González, Brandon A; Montesinos-López, Abelardo; van Eeuwijk, Fred; Beyene, Yoseph; Gowda, Manje; Gardner, Keith; Gerard, Guillermo S; Crespo-Herrera, Leonardo; Crossa, José.

Genes (Basel) ; 14(4), 2023 Apr 17.

Article in English | MEDLINE | ID: mdl-37107685

ABSTRACT

While sparse testing methods have been proposed by researchers to improve the efficiency of genomic selection (GS) in breeding programs, there are several factors that can hinder this. In this research, we evaluated four methods (M1-M4) for sparse testing allocation of lines to environments under multi-environmental trials for genomic prediction of unobserved lines. The sparse testing methods described in this study are applied in a two-stage analysis to build the genomic training and testing sets in a strategy that allows each location or environment to evaluate only a subset of all genotypes rather than all of them. To ensure a valid implementation, the sparse testing methods presented here require BLUEs (or BLUPs) of the lines to be computed at the first stage using an appropriate experimental design and statistical analyses in each location (or environment). The evaluation of the four methods for allocating cultivars to environments in the second stage was done with four data sets (two large and two small) under a multi-trait and uni-trait framework. We found that the multi-trait model produced better genomic prediction (GP) accuracy than the uni-trait model and that methods M3 and M4 were slightly better than methods M1 and M2 for the allocation of lines to environments. One of the most important findings, however, was that even under a scenario where we used a training-testing relation of 15-85%, the prediction accuracy of the four methods barely decreased. This indicates that genomic sparse testing methods for data sets under these scenarios can save considerable operational and financial resources with only a small loss in precision, as shown in our cost-benefit analysis.

Subjects

Models, Genetic , Plant Breeding , Plant Breeding/methods , Genome, Plant/genetics , Phenotype , Genomics , Crops, Agricultural/genetics

17.

Multimodal deep learning methods enhance genomic prediction of wheat breeding.

Montesinos-López, Abelardo; Rivera, Carolina; Pinto, Francisco; Piñera, Francisco; Gonzalez, David; Reynolds, Mathew; Pérez-Rodríguez, Paulino; Li, Huihui; Montesinos-López, Osval A; Crossa, Jose.

G3 (Bethesda) ; 13(5), 2023 May 02.

Article in English | MEDLINE | ID: mdl-36869747

ABSTRACT

While several statistical machine learning methods have been developed and studied for assessing the genomic prediction (GP) accuracy of unobserved phenotypes in plant breeding research, few methods have linked genomics and phenomics (imaging). Deep learning (DL) neural networks have been developed to increase the GP accuracy of unobserved phenotypes while simultaneously accounting for the complexity of genotype-environment interaction (GE); however, unlike conventional GP models, DL has not been investigated for when genomics is linked with phenomics. In this study we used 2 wheat data sets (DS1 and DS2) to compare a novel DL method with conventional GP models. Models fitted for DS1 were GBLUP, gradient boosting machine (GBM), support vector regression (SVR) and the DL method. Results indicated that for 1 year, DL provided better GP accuracy than results obtained by the other models. However, GP accuracy obtained for other years indicated that the GBLUP model was slightly superior to the DL. DS2 is comprised only of genomic data from wheat lines tested for 3 years, 2 environments (drought and irrigated) and 2-4 traits. DS2 results showed that when predicting the irrigated environment with the drought environment, DL had higher accuracy than the GBLUP model in all analyzed traits and years. When predicting drought environment with information on the irrigated environment, the DL model and GBLUP model had similar accuracy. The DL method used in this study is novel and presents a strong degree of generalization as several modules can potentially be incorporated and concatenated to produce an output for a multi-input data structure.

Subjects

Deep Learning , Triticum , Triticum/genetics , Plant Breeding/methods , Models, Genetic , Phenotype , Genomics/methods , Genotype

18.

Sparse multi-trait genomic prediction under balanced incomplete block design.

Montesinos-López, Osval A; Mosqueda-González, Brandon A; Salinas-Ruiz, Josafat; Montesinos-López, Abelardo; Crossa, José.

Plant Genome ; 16(2): e20305, 2023 06.

Article in English | MEDLINE | ID: mdl-36815225

ABSTRACT

Sparse testing is essential to increase the efficiency of the genomic selection methodology, as the same efficiency (in this case prediction power) can be obtained while evaluating fewer genotypes in the field. For this reason, it is important to evaluate the existing methods for allocating lines to environments. With this goal, four methods (M1-M4) to allocate lines to environments were evaluated in the context of a multi-trait genomic prediction problem: M1 denotes the allocation of a fraction (subset) of lines in all locations, M2 denotes the allocation of a fraction of lines with some shared lines in locations but not arranged based on the balanced incomplete block design (BIBD) principle, M3 denotes the random allocation of a subset of lines to locations, and M4 denotes the allocation of a subset of lines to locations using the BIBD principle. The evaluation was done using seven real multi-environment data sets common in plant breeding programs. We found that the best method was M4 and the worst was M1, while no important differences were found between M3 and M4. We concluded that M4 and M3 are efficient in the context of sparse testing for multi-trait prediction.

Subjects

Genome, Plant , Plant Breeding , Phenotype , Genotype , Genomics
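
The allocation schemes M3 (random) and M4 (BIBD) described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that environments act as the treatments of the design and that every k-subset of environments forms one block (the "unreduced" BIBD), with lines dealt to blocks in turn. The function names `bibd_allocation`, `random_allocation`, and `shared_lines` are hypothetical.

```python
from itertools import combinations
import random


def bibd_allocation(lines, environments, k):
    """M4-style sketch (assumption): each k-subset of environments is a block,
    and lines are dealt to blocks in round-robin order."""
    blocks = list(combinations(environments, k))
    allocation = {env: [] for env in environments}
    for i, line in enumerate(lines):
        for env in blocks[i % len(blocks)]:
            allocation[env].append(line)
    return allocation


def random_allocation(lines, environments, k, seed=0):
    """M3-style sketch (assumption): each line goes to k environments at random."""
    rng = random.Random(seed)
    allocation = {env: [] for env in environments}
    for line in lines:
        for env in rng.sample(environments, k):
            allocation[env].append(line)
    return allocation


def shared_lines(allocation):
    """Number of lines that each pair of environments has in common."""
    return {
        (a, b): len(set(allocation[a]) & set(allocation[b]))
        for a, b in combinations(allocation, 2)
    }


# Illustrative data only: 12 lines, 4 environments, each line tested in 2 of them.
lines = [f"L{i}" for i in range(12)]
environments = ["E1", "E2", "E3", "E4"]

balanced = bibd_allocation(lines, environments, k=2)
randomized = random_allocation(lines, environments, k=2)

print(shared_lines(balanced))    # every pair of environments shares the same count
print(shared_lines(randomized))  # counts generally differ between pairs
```

In this sketch the balance property holds exactly only when the number of lines is a multiple of the number of blocks. Under the random scheme each line is still tested in k environments, but the overlap between pairs of environments is left to chance.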

19.

Integrating Parental Phenotypic Data Enhances Prediction Accuracy of Hybrids in Wheat Traits.

Montesinos-López, Osval A; Bentley, Alison R; Saint Pierre, Carolina; Crespo-Herrera, Leonardo; Salinas Ruiz, Josafhat; Valladares-Celis, Patricia Edwigis; Montesinos-López, Abelardo; Crossa, José.

Genes (Basel) ; 14(2), 2023 02 02.

Article in English | MEDLINE | ID: mdl-36833322

ABSTRACT

Genomic selection (GS) is a methodology that is revolutionizing plant breeding because it can select candidate genotypes without phenotypic evaluation in the field. However, its practical implementation in hybrid prediction remains challenging since many factors affect its accuracy. The main objective of this study was to investigate the genomic prediction accuracy of wheat hybrids when covariates carrying the phenotypic information of the hybrid parents are added to the model. Four different types of models (MA, MB, MC, and MD) with one covariate (same trait to be predicted) (MA_C, MB_C, MC_C, and MD_C) or several covariates (of the same trait and other correlated traits) (MA_AC, MB_AC, MC_AC, and MD_AC) were studied. We found that the four models with parental information outperformed models without parental information in terms of mean square error by at least 14.1% (MA vs. MA_C), 5.5% (MB vs. MB_C), 51.4% (MC vs. MC_C), and 6.4% (MD vs. MD_C) when parental information of the same trait was used and by at least 13.7% (MA vs. MA_AC), 5.3% (MB vs. MB_AC), 55.1% (MC vs. MC_AC), and 6.0% (MD vs. MD_AC) when parental information of the same trait and other correlated traits was used. Our results also show a large gain in prediction accuracy when covariates were considered using the parental phenotypic information, as opposed to marker information. Finally, our results empirically demonstrate that a significant improvement in prediction accuracy was gained by adding parental phenotypic information as covariates; however, this is expensive since, in many breeding programs, the parental phenotypic information is unavailable.

Subjects

Models, Genetic , Triticum , Triticum/genetics , Polymorphism, Single Nucleotide , Plant Breeding , Phenotype
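
The percentage gains in mean square error quoted in the abstract above compare a model with parental covariates against the same model without them. The abstract does not give the formula, so the sketch below assumes the usual relative reduction, 100 × (MSE_without − MSE_with) / MSE_without. The function names and all numbers are hypothetical and purely illustrative, not data from the study.

```python
def mse(observed, predicted):
    """Mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def relative_mse_reduction(mse_without, mse_with):
    """Percent reduction in MSE from adding covariates (assumed definition)."""
    return 100.0 * (mse_without - mse_with) / mse_without


# Illustrative values only (not from the paper).
observed = [5.0, 6.0, 7.0, 8.0]
pred_without_covariates = [4.0, 7.0, 6.0, 9.0]   # every prediction off by 1.0
pred_with_covariates = [4.5, 6.5, 6.5, 8.5]      # every prediction off by 0.5

gain = relative_mse_reduction(
    mse(observed, pred_without_covariates),
    mse(observed, pred_with_covariates),
)
print(f"MSE reduced by {gain:.1f}%")  # prints "MSE reduced by 75.0%"
```

Because the error enters squared, halving every prediction error cuts the MSE by 75%, not 50%. This is worth keeping in mind when reading MSE-based gains alongside correlation-based accuracy.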

20.

Enviromic-based kernels may optimize resource allocation with multi-trait multi-environment genomic prediction for tropical Maize.

Gevartosky, Raysa; Carvalho, Humberto Fanelli; Costa-Neto, Germano; Montesinos-López, Osval A; Crossa, José; Fritsche-Neto, Roberto.

BMC Plant Biol ; 23(1): 10, 2023 Jan 05.

Article in English | MEDLINE | ID: mdl-36604618

ABSTRACT

BACKGROUND: Success in any genomic prediction platform is directly dependent on establishing a representative training set. This is a complex task, even in single-trait single-environment conditions, and tends to be even more intricate when additional information from envirotyping and correlated traits is considered. Here, we aimed to design optimized training sets focused on genomic prediction, considering multi-trait multi-environment trials, and to assess how those methods may increase accuracy while reducing phenotyping costs. For that, we considered single-trait multi-environment trials and multi-trait multi-environment trials for three traits (grain yield, plant height, and ear height), two datasets, and two cross-validation schemes. Next, two strategies for designing optimized training sets were conceived, the first considering only the genomic by environment by trait interaction (GET), and the second including large-scale environmental data (W, enviromics) as genomic by enviromic by trait interaction (GWT). The effective number of individuals (genotypes × environments × traits) was assumed to be the number that represents at least 98% of each kernel (GET or GWT) variation, and those individuals were then selected by a genetic algorithm based on prediction error variance criteria to compose an optimized training set for genomic prediction purposes.
RESULTS: The combined use of genomic and enviromic data efficiently designs optimized training sets for genomic prediction, improving the response to selection per dollar invested by up to 145% when compared to the model without enviromic data, and even more when compared to the cross-validation scheme with a 70% training set or to pure phenotypic selection. Prediction models that include G × E or enviromic data + G × E yielded better prediction ability.
CONCLUSIONS: Our findings indicate that a genomic by enviromic by trait interaction kernel associated with genetic algorithms is efficient and can be proposed as a promising approach to designing optimized training sets for genomic prediction when the variance-covariance matrix of traits is available. Additionally, great improvements in the genetic gains per dollar invested were observed, suggesting that a good allocation of resources can be achieved by using the proposed approach.

Subjects

Gene-Environment Interaction , Zea mays , Zea mays/genetics , Genome, Plant/genetics , Models, Genetic , Selection, Genetic , Phenotype , Genotype , Genomics/methods , Resource Allocation
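
The abstract above defines the effective number of individuals as the number representing at least 98% of the variation in a kernel (GET or GWT), but does not say how that share is computed. One plausible reading, taken here as an assumption, is to count the leading eigenvalues of the kernel whose cumulative sum reaches 98% of the total. The sketch below uses NumPy; the function name `effective_number` and the toy kernel are hypothetical.

```python
import numpy as np


def effective_number(kernel, threshold=0.98):
    """Count the leading eigenvalues needed to reach `threshold` of the total
    kernel variation (assumed interpretation, not the paper's stated method)."""
    kernel = np.asarray(kernel, dtype=float)
    # eigvalsh is meant for symmetric matrices and returns ascending eigenvalues.
    eigenvalues = np.linalg.eigvalsh(kernel)[::-1]
    eigenvalues = np.clip(eigenvalues, 0.0, None)  # drop tiny negative round-off
    cumulative_share = np.cumsum(eigenvalues) / eigenvalues.sum()
    count = int(np.searchsorted(cumulative_share, threshold) + 1)
    return min(count, len(eigenvalues))


# Toy kernel (illustrative only): variation concentrated in a few directions.
toy_kernel = np.diag([60.0, 30.0, 9.0, 1.0])
print(effective_number(toy_kernel))  # 3: the top three eigenvalues hold 99%
```

In the study the selected individuals were then chosen by a genetic algorithm under prediction error variance criteria. That optimization step is outside the scope of this sketch.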
