When the class types are unknown, the K-means algorithm selects its initial values at random, and different initial values lead to different remote sensing image classification results. To address this problem, this paper proposes an improved K-means algorithm. First, a logarithmic transformation is applied to the original data, followed by a principal component transformation. The number of principal components passed to the K-means algorithm is determined by the cumulative contribution rate (≥85%). This preprocessing weakens the noise. Kernel density estimation is then used to determine the probability density function of the first principal component, from which the initial labels for the multi-dimensional K-means algorithm can be determined efficiently, avoiding the sensitivity to randomly selected initial values. Experiments show that the proposed method achieves higher accuracy than the traditional K-means algorithm based on mean-variance.
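The steps summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the offset of 1 inside the logarithm, the normal-reference bandwidth rule, the use of density peaks (modes) of the first principal component as class seeds, and all function names, which are hypothetical.

```python
import numpy as np

def pca_by_contribution(X, threshold=0.85):
    """Project X onto the fewest principal components whose cumulative
    contribution rate (explained-variance ratio) reaches the threshold."""
    Xc = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(Xc, rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    n_comp = min(int(np.searchsorted(ratio, threshold)) + 1, eigvals.size)
    return Xc @ eigvecs[:, :n_comp]

def gaussian_kde_1d(x, grid, bandwidth):
    """Gaussian kernel density estimate of the samples x, evaluated on grid."""
    z = (grid[:, None] - x[None, :]) / bandwidth
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (x.size * bandwidth * np.sqrt(2 * np.pi))

def initial_labels_from_kde(pc1, k=None, grid_size=256):
    """Assumption: class seeds are the density peaks of the first principal
    component; each sample takes the label of its nearest peak."""
    # Normal-reference rule of thumb for the bandwidth (an assumed choice).
    bw = 1.06 * pc1.std(ddof=1) * pc1.size ** (-1 / 5)
    grid = np.linspace(pc1.min() - 3 * bw, pc1.max() + 3 * bw, grid_size)
    dens = gaussian_kde_1d(pc1, grid, bw)
    # Interior local maxima of the estimated density.
    peaks = np.flatnonzero((dens[1:-1] > dens[:-2]) & (dens[1:-1] >= dens[2:])) + 1
    if k is not None:
        if peaks.size < k:
            raise ValueError("fewer density peaks than requested classes")
        peaks = peaks[np.argsort(dens[peaks])[::-1][:k]]   # keep the k highest peaks
    modes = np.sort(grid[peaks])
    nearest = np.abs(pc1[:, None] - modes[None, :]).argmin(axis=1)
    # Relabel so that labels are contiguous and every label is in use.
    return np.unique(nearest, return_inverse=True)[1]

def kmeans_from_labels(Z, labels, n_iter=100):
    """Lloyd iterations started from given labels instead of random centers."""
    k = int(labels.max()) + 1
    centers = np.stack([Z[labels == j].mean(axis=0) for j in range(k)])
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j in range(k):
            if np.any(labels == j):            # keep the old center if a class empties
                centers[j] = Z[labels == j].mean(axis=0)
    return labels, centers

def improved_kmeans(pixels, k=None, threshold=0.85):
    # Log transform; the +1 offset is an assumption to avoid log(0).
    X = np.log(np.asarray(pixels, dtype=float) + 1.0)
    Z = pca_by_contribution(X, threshold)
    init = initial_labels_from_kde(Z[:, 0], k)
    return kmeans_from_labels(Z, init)

# Synthetic 4-band "pixels" drawn from two well-separated classes.
rng = np.random.default_rng(0)
a = rng.normal(50, 3, size=(150, 4))
b = rng.normal(200, 8, size=(150, 4))
pixels = np.clip(np.vstack([a, b]), 1, None)
labels, centers = improved_kmeans(pixels, k=2)
```

Because the starting labels come from the density estimate rather than a random draw, repeated runs on the same data give the same result. The kernel matrix here is of size grid_size × number of samples, so for a full image the density would in practice be estimated from a subsample of pixels.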