Cooperative and out-of-core execution of the irregular wavefront propagation pattern on hybrid machines with Intel® Xeon Phi™.

Gomes, Jeremias; de Melo, Alba C M A; Kong, Jun; Kurc, Tahsin; Saltz, Joel H; Teodoro, George

Affiliation

Gomes J; Department of Computer Science, University of Brasília, Brasília-DF, Brazil.
de Melo ACMA; Department of Computer Science, University of Brasília, Brasília-DF, Brazil.
Kong J; Department of Biomedical Informatics, Emory University, Atlanta, GA, USA.
Kurc T; Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA.
Saltz JH; Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA.
Teodoro G; Department of Computer Science, University of Brasília, Brasília-DF, Brazil.

Concurr Comput; 30(14), 2018 Jul 25.

Article in En | MEDLINE | ID: mdl-30344454

ABSTRACT

The Irregular Wavefront Propagation Pattern (IWPP) is a core computing structure in several image analysis operations. Efficient implementation of IWPP on the Intel Xeon Phi is difficult because of its irregular data access and computation characteristics. The traditional IWPP algorithm relies on atomic instructions, which are not available in the SIMD instruction set of the Intel Phi. To overcome this limitation, we have proposed a new IWPP algorithm that can take advantage of the non-atomic SIMD instructions supported on the Intel Xeon Phi. We have also developed and evaluated methods to use the CPU and the Intel Phi cooperatively for parallel execution of the IWPP algorithms. Our new cooperative IWPP version is also able to handle large out-of-core images that would not fit into the memory of the accelerator. The new IWPP algorithm is used to implement Morphological Reconstruction and Fill Holes, two operations commonly found in image analysis applications. The vectorization implemented with the new IWPP has attained improvements of up to about 5× on top of the original IWPP and significant gains as compared to state-of-the-art CPU and GPU versions. The new version running on an Intel Phi is 6.21× and 3.14× faster than running on a 16-core CPU and on a GPU, respectively. Finally, the cooperative execution using two Intel Phi devices and a multi-core CPU has reached performance gains of 2.14× as compared to the execution using a single Intel Xeon Phi.
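To make the pattern concrete, the following is a minimal sequential sketch of queue-based grayscale Morphological Reconstruction, written as a generic wavefront propagation. It is not the vectorized algorithm the abstract describes, and the function name `morph_reconstruct`, the list-of-lists image layout, and the choice of 4-connectivity are assumptions made for illustration only.

```python
from collections import deque


def morph_reconstruct(marker, mask):
    """Reconstruction by dilation of `marker` under `mask` (hypothetical helper).

    Both images are lists of rows with equal shape. 4-connectivity is assumed.
    """
    rows, cols = len(mask), len(mask[0])
    # Clamp the marker so it never exceeds the mask.
    out = [[min(marker[r][c], mask[r][c]) for c in range(cols)]
           for r in range(rows)]
    # Initial wavefront: every pixel. Simple rather than efficient.
    queue = deque((r, c) for r in range(rows) for c in range(cols))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Propagate the current value, limited by the mask at the neighbor.
                candidate = min(out[r][c], mask[nr][nc])
                # In a parallel version, this compare-and-update is the step
                # that would need to be atomic.
                if candidate > out[nr][nc]:
                    out[nr][nc] = candidate
                    queue.append((nr, nc))  # neighbor joins the wavefront
    return out


if __name__ == "__main__":
    mask = [[3, 3, 1, 5, 5]]
    marker = [[3, 0, 0, 0, 0]]
    print(morph_reconstruct(marker, mask))
```

Only pixels whose value changes are pushed back onto the queue, so the active wavefront shrinks as the image converges; which pixels are active at any moment depends on the data, which is the irregularity the abstract refers to.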

Keywords

Fill Holes; Intel® Xeon Phi™; Irregular Algorithm Propagation Pattern; Morphological Reconstruction

Full text: 1 | Collections: 01-international | Database: MEDLINE | Language: En | Journal: Concurr Comput | Publication year: 2018 | Document type: Article | Affiliation country: Brazil | Publication country: United Kingdom
