Machine learning approaches to identify core and dispensable genes in pangenomes.
Plant Genome; 15(1): e20135, 2022 03.
Article | En | MEDLINE | ID: mdl-34533282
A gene in a given taxonomic group is either present in every individual (core) or absent from at least one individual (dispensable). Previous pangenomic studies have identified certain functional differences between core and dispensable genes. However, determining whether a gene belongs to the core or dispensable portion of the genome requires the construction of a pangenome, which involves sequencing the genomes of many individuals. Here we aim to leverage the previously characterized core and dispensable gene content of two grass species [Brachypodium distachyon (L.) P. Beauv. and Oryza sativa L.] to construct a machine learning model capable of accurately classifying genes as core or dispensable using only a single annotated reference genome. Such a model may mitigate the need for pangenome construction, an expensive hurdle especially in orphan crops, which often lack adequate genomic resources.
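The core/dispensable definition in the abstract can be illustrated with a minimal sketch. It assumes a presence/absence table of genes across sequenced individuals is already available. The function name `label_genes` and the toy data are hypothetical and are not taken from the article. The sketch only shows how pangenome-derived labels follow from the definition, not how the authors' machine learning model works.

```python
from typing import Dict, List


def label_genes(presence: Dict[str, List[bool]]) -> Dict[str, str]:
    """Label a gene 'core' if present in every individual, else 'dispensable'."""
    labels: Dict[str, str] = {}
    for gene, calls in presence.items():
        if not calls:
            # A gene with no calls cannot be classified under this definition.
            raise ValueError(f"no presence/absence calls for gene {gene!r}")
        # Core: present in all individuals. Dispensable: absent from at least one.
        labels[gene] = "core" if all(calls) else "dispensable"
    return labels


# Hypothetical toy data: three genes scored across four individuals.
toy_presence = {
    "geneA": [True, True, True, True],
    "geneB": [True, False, True, True],
    "geneC": [False, False, False, True],
}
toy_labels = label_genes(toy_presence)
```

Labels produced this way are the kind of ground truth a supervised classifier would be trained against; the features such a classifier would use are not specified in the abstract.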
Full text: 1
Collection: 01-internacional
Database: MEDLINE
Main subject: Oryza / Genomics
Language: En
Journal: Plant Genome
Year: 2022
Document type: Article
Country of affiliation: United States
Country of publication: United States