Search | VHL Regional Portal

A phased genome assembly of a Colombian Trypanosoma cruzi TcI strain and the evolution of gene families.

Hoyos Sanchez, Maria Camila; Ospina Zapata, Hader Sebastian; Suarez, Brayhan Dario; Ospina, Carlos; Barbosa, Hamilton Julian; Carranza Martinez, Julio Cesar; Vallejo, Gustavo Adolfo; Urrea Montes, Daniel; Duitama, Jorge.

Sci Rep; 14(1): 2054, 2024 Jan 24.

Article in English | MEDLINE | ID: mdl-38267502

ABSTRACT

Chagas is an endemic disease in tropical regions of Latin America, caused by the parasite Trypanosoma cruzi. High intraspecies variability and genome complexity have been challenges to assemble high quality genomes needed for studies in evolution, population genomics, diagnosis and drug development. Here we present a chromosome-level phased assembly of a TcI T. cruzi strain (Dm25). While 29 chromosomes show a large collinearity with the assembly of the Brazil A4 strain, three chromosomes show both large heterozygosity and large divergence, compared to previous assemblies of TcI T. cruzi strains. Nucleotide and protein evolution statistics indicate that T. cruzi Marinkellei separated before the diversification of T. cruzi in the known DTUs. Interchromosomal paralogs of dispersed gene families and histones appeared before but at the same time have a more strict purifying selection, compared to other repeat families. Previously unreported large tandem arrays of protein kinases and histones were identified in this assembly. Over one million variants obtained from Illumina reads aligned to the primary assembly clearly separate the main DTUs. We expect that this new assembly will be a valuable resource for further studies on evolution and functional genomics of Trypanosomatids.

Subjects

Chagas Disease; Trypanosoma cruzi; Humans; Trypanosoma cruzi/genetics; Colombia; Histones; Brazil

New algorithms for accurate and efficient de novo genome assembly from long DNA sequencing reads.

Gonzalez-Garcia, Laura; Guevara-Barrientos, David; Lozano-Arce, Daniela; Gil, Juanita; Díaz-Riaño, Jorge; Duarte, Erick; Andrade, Germán; Bojacá, Juan Camilo; Hoyos-Sanchez, Maria Camila; Chavarro, Christian; Guayazan, Natalia; Chica, Luis Alberto; Buitrago Acosta, Maria Camila; Bautista, Edwin; Trujillo, Miller; Duitama, Jorge.

Life Sci Alliance; 6(5), 2023 May.

Article in English | MEDLINE | ID: mdl-36813568

ABSTRACT

Building de novo genome assemblies for complex genomes is possible thanks to long-read DNA sequencing technologies. However, maximizing the quality of assemblies based on long reads is a challenging task that requires the development of specialized data analysis techniques. We present new algorithms for assembling long DNA sequencing reads from haploid and diploid organisms. The assembly algorithm builds an undirected graph with two vertices for each read based on minimizers selected by a hash function derived from the k-mer distribution. Statistics collected during the graph construction are used as features to build layout paths by selecting edges, ranked by a likelihood function. For diploid samples, we integrated a reimplementation of the ReFHap algorithm to perform molecular phasing. We ran the implemented algorithms on PacBio HiFi and Nanopore sequencing data taken from haploid and diploid samples of different species. Our algorithms showed competitive accuracy and computational efficiency, compared with other currently used software. We expect that this new development will be useful for researchers building genome assemblies for different species.
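The abstract mentions minimizers as the basis for finding relationships between reads. As a rough illustration of that general idea only, the sketch below selects window minimizers from a read and intersects the minimizer sets of two reads. It is not the authors' implementation: the function names (`kmer_hash`, `minimizers`, `shared_minimizers`), the parameters `k` and `w`, and the BLAKE2b-based hash are all assumptions made for this example, whereas the paper derives its hash function from the k-mer distribution.

```python
import hashlib


def kmer_hash(kmer: str) -> int:
    # Hypothetical stand-in hash: a generic 64-bit digest of the k-mer.
    # The paper's hash is derived from the k-mer distribution instead.
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizers(seq: str, k: int, w: int, hash_fn=kmer_hash):
    """Return sorted (position, k-mer) pairs chosen as window minimizers.

    For every window of w consecutive k-mers, the k-mer with the smallest
    hash is selected (leftmost position wins on ties). Sequences with fewer
    than w k-mers yield an empty list.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    hashes = [hash_fn(kmer) for kmer in kmers]
    selected = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        selected.add(min(window, key=lambda i: (hashes[i], i)))
    return [(i, kmers[i]) for i in sorted(selected)]


def shared_minimizers(read_a: str, read_b: str, k: int = 5, w: int = 4):
    """Return the set of minimizer k-mers that two reads have in common."""
    a = {kmer for _, kmer in minimizers(read_a, k, w)}
    b = {kmer for _, kmer in minimizers(read_b, k, w)}
    return a & b


if __name__ == "__main__":
    read_a = "TTGACCGATAGCTAGGCTA"
    read_b = read_a[6:] + "CCATG"  # overlaps the end of read_a
    print(minimizers(read_a, k=5, w=4))
    print(shared_minimizers(read_a, read_b))
```

Two reads that share an exact stretch of at least `w + k - 1` bases select the same minimizer inside that stretch, which is why a non-empty intersection can serve as cheap evidence of a candidate overlap. A real assembler would also track strand and position, which this sketch omits.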

Subjects

Algorithms; High-Throughput Nucleotide Sequencing; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Genome; Software
