ABSTRACT
In this paper, we present TreeKDE, an algorithm for clustering multidimensional data. It combines a decision tree structure with the optimization of the one-dimensional kernel density estimator function constructed from the orthogonal projections of the data onto the coordinate axes. Among the main features of the proposed algorithm, we highlight the automatic determination of the number of clusters and the enclosure of each cluster in a rectangular region. Comparative numerical experiments illustrate the performance of the proposed algorithm, and the results indicate that TreeKDE is efficient and competitive with other algorithms from the literature. Its simplicity and efficiency make the proposed algorithm an attractive and promising subject for further research, serving as a basis both for its own improvement and for the development of new clustering algorithms that combine decision trees with kernel density estimators.
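To make the central idea concrete, the following minimal Python sketch projects the data onto one coordinate axis, evaluates a one-dimensional Gaussian kernel density estimate on a grid, and splits the data at the local minima of that estimate. It is an illustration under stated assumptions, not the TreeKDE procedure itself: the function names, the fixed bandwidth, the grid search and the single-axis split are all hypothetical choices made here, since the abstract does not specify the optimization step or the tree construction rules.

```python
import math

def kde_1d(samples, x, bandwidth):
    """Gaussian kernel density estimate of `samples` evaluated at x."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

def split_points(samples, bandwidth, grid_size=200):
    """Grid locations of interior local minima of the 1D KDE.

    Assumption: a plain grid search stands in for the optimization
    step, which the abstract does not describe in detail.
    """
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    dens = [kde_1d(samples, g, bandwidth) for g in grid]
    # "<=" on the left keeps one cut per flat valley instead of none.
    return [grid[i] for i in range(1, grid_size - 1)
            if dens[i] <= dens[i - 1] and dens[i] < dens[i + 1]]

def split_on_axis(points, axis, bandwidth):
    """Partition points into groups separated by KDE minima along one axis."""
    cuts = split_points([p[axis] for p in points], bandwidth)
    groups = [[] for _ in range(len(cuts) + 1)]
    for p in points:
        # Index of the interval (between consecutive cuts) containing p.
        groups[sum(1 for c in cuts if p[axis] > c)].append(p)
    return [g for g in groups if g]

# Two well-separated blobs along the first coordinate (illustrative data).
points = [(0.0, 0.1), (0.2, -0.1), (0.1, 0.0),
          (5.0, 0.2), (5.2, 0.0), (4.9, -0.2)]
groups = split_on_axis(points, axis=0, bandwidth=0.5)
```

Applying such a split recursively, axis by axis, would yield axis-aligned rectangular cells, which is one plausible reading of how a decision tree and a projected density estimate can be combined.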
ABSTRACT
In this paper, we propose the MulticlusterKDE algorithm, which classifies the elements of a database into categories based on their similarity. MulticlusterKDE is centered on the multiple optimization of the kernel density estimator function with a multivariate Gaussian kernel. One of the main features of the proposed algorithm is that the number of clusters is an optional input parameter. Furthermore, it is simple, easy to implement, and well defined; it stops after a finite number of steps and always converges, regardless of the data set. We illustrate our findings by implementing the algorithm in R. The results indicate that MulticlusterKDE is competitive with the K-means, K-medoids, CLARA, DBSCAN and PdfCluster algorithms. Its simplicity and efficiency make the proposed algorithm an attractive and promising subject for further research, serving as a basis both for its own improvement and for the development of new density-based clustering algorithms.
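As an illustration of clustering by repeatedly maximizing a multivariate Gaussian kernel density estimate, the sketch below uses the standard mean-shift fixed-point iteration: each data point is moved uphill to a local maximum of the estimate, and points that reach the same maximum share a label. This is a swapped-in, well-known technique and not necessarily the optimization used by MulticlusterKDE; the isotropic bandwidth `h`, the tolerances and all function names are assumptions made for this example, and the optional number-of-clusters parameter mentioned above is not modeled.

```python
import math

def mean_shift_step(x, data, h):
    """One fixed-point update moving x uphill on an isotropic Gaussian KDE."""
    weights = [math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, d)) / h ** 2)
               for d in data]
    total = sum(weights)
    # The new position is the kernel-weighted mean of the data.
    return tuple(sum(w * d[k] for w, d in zip(weights, data)) / total
                 for k in range(len(x)))

def find_mode(x, data, h, tol=1e-6, max_iter=500):
    """Iterate from x until the update moves less than tol (or max_iter is hit)."""
    for _ in range(max_iter):
        new_x = mean_shift_step(x, data, h)
        if math.dist(x, new_x) < tol:
            return new_x
        x = new_x
    return x

def cluster_by_modes(data, h, merge_tol=1e-2):
    """Label each point by the KDE mode reached when starting from it."""
    modes, labels = [], []
    for p in data:
        m = find_mode(p, data, h)
        for j, known in enumerate(modes):
            if math.dist(m, known) < merge_tol:
                labels.append(j)  # same mode as an earlier point
                break
        else:
            modes.append(m)       # a new mode, hence a new cluster
            labels.append(len(modes) - 1)
    return modes, labels

# Two well-separated blobs in the plane (illustrative data).
data = [(0.0, 0.1), (0.2, -0.1), (0.1, 0.0),
        (5.0, 0.2), (5.2, 0.0), (4.9, -0.2)]
modes, labels = cluster_by_modes(data, h=0.5)
```

In this sketch the number of clusters is simply the number of distinct modes found, so it depends on the chosen bandwidth rather than on a user-supplied count.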