ABSTRACT
Building de novo genome assemblies for complex genomes is possible thanks to long-read DNA sequencing technologies. However, maximizing the quality of assemblies based on long reads is a challenging task that requires the development of specialized data analysis techniques. We present new algorithms for assembling long DNA sequencing reads from haploid and diploid organisms. The assembly algorithm builds an undirected graph with two vertices for each read based on minimizers selected by a hash function derived from the k-mer distribution. Statistics collected during the graph construction are used as features to build layout paths by selecting edges, ranked by a likelihood function. For diploid samples, we integrated a reimplementation of the ReFHap algorithm to perform molecular phasing. We ran the implemented algorithms on PacBio HiFi and Nanopore sequencing data taken from haploid and diploid samples of different species. Our algorithms showed competitive accuracy and computational efficiency, compared with other currently used software. We expect that this new development will be useful for researchers building genome assemblies for different species.
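The minimizer idea mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract says only that minimizers are selected by a hash function derived from the k-mer distribution, so the ranking used here (rarer k-mers first, ties broken lexicographically), the toy reads, and the names `kmers`, `minimizers`, and `rank` are all assumptions made for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return all k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def minimizers(seq, k, w, rank):
    """Return sorted (position, k-mer) pairs that are the lowest-ranked
    k-mer in at least one window of w consecutive k-mers."""
    ks = kmers(seq, k)
    selected = set()
    for start in range(len(ks) - w + 1):
        window = range(start, start + w)
        # Lowest rank wins; the leftmost position breaks exact ties.
        best = min(window, key=lambda i: (rank(ks[i]), i))
        selected.add((best, ks[best]))
    return sorted(selected)


# Two toy reads that overlap by 9 bases (hypothetical data).
reads = ["ACGTTGCATGCA", "TTGCATGCAGGT"]
k, w = 4, 3

# k-mer counts over all reads stand in for the "k-mer distribution".
counts = Counter(km for r in reads for km in kmers(r, k))


def rank(kmer):
    # Assumed ordering: rarer k-mers first, then lexicographic order.
    return (counts[kmer], kmer)


# Index minimizers by k-mer so that reads sharing a minimizer can be paired.
index = {}
for read_id, r in enumerate(reads):
    for pos, km in minimizers(r, k, w, rank):
        index.setdefault(km, []).append((read_id, pos))

# Minimizers seen in more than one read are candidate overlap evidence.
shared = {km: hits for km, hits in index.items()
          if len({rid for rid, _ in hits}) > 1}
```

In this toy case the shared minimizers all imply the same offset between the two reads. That kind of agreement is the sort of evidence an overlap graph could use to support an edge.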
Subjects
Algorithms , High-Throughput Nucleotide Sequencing , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Genome , Software
ABSTRACT
In Colombia, late blight is considered one of the most limiting diseases for potato and tomato production. Recently, a new Phytophthora species, P. betacei, was described infecting tree tomato crops in the south of Colombia. However, the distribution and host range of this emerging pathogen in the country are unknown. The main aims of this study were to determine whether this novel species is confined to the south of Colombia, to assess whether P. betacei represents a genetically uniform clone across Colombia, and to determine whether the two Phytophthora species are clearly differentiated in all regions. We therefore characterized Phytophthora isolates obtained from tree tomato and potato crops in a central region of Colombia and compared them with strains from the south. First, we evaluated the genetic differentiation among Phytophthora strains obtained from tree tomato and potato crops using simple sequence repeat markers. The results showed strong population structure between P. infestans and P. betacei. However, we did not detect any genetic differentiation within P. infestans or within P. betacei among populations from different regions. Furthermore, we detected significant morphological differences between the species based on growth and sporangial morphology measurements. We also showed that the Phytophthora strains are predominantly of the A1 mating type and belong to the EC-1 and EC-3 clonal lineages for P. infestans and P. betacei, respectively. Our results describe the expanded geographical range of the new species P. betacei, which now includes the central region of Colombia.