A new clustering algorithm with multiple runs of iterative procedures

Qiwen Zhang, Roger D. Boyle*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

20 Citations (Scopus)

Abstract

The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition 22, 75-89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This "controlled configuration change" has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.

Original languageEnglish
Pages (from-to)835-848
Number of pages14
JournalPattern Recognition
Volume24
Issue number9
DOIs
Publication statusPublished - 1991
Externally publishedYes

Keywords

  • Clustering problem
  • Configuration
  • Cost minimisation
  • Global minimum
  • K-means
  • Random initialisation

Fingerprint

Dive into the research topics of 'A new clustering algorithm with multiple runs of iterative procedures'. Together they form a unique fingerprint.

Cite this