Abstract
The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition 22, 75-89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This "controlled configuration change" has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.
Original language | English |
---|---|
Pages (from-to) | 835-848 |
Number of pages | 14 |
Journal | Pattern Recognition |
Volume | 24 |
Issue number | 9 |
DOIs | |
Publication status | Published - 1991 |
Externally published | Yes |
Keywords
- Clustering problem
- Configuration
- Cost minimisation
- Global minimum
- K-means
- Random initialisation