This paper investigates a mathematical programming based methodology for solving the minimum sum-of-squares clustering problem, also known as the "k-means problem", in the presence of side constraints. We propose several exact and approximate mixed-integer linear and nonlinear formulations. The approximations are based on norm inequalities and random projections, the approximation guarantees of which are based on an additive version of the Johnson-Lindenstrauss lemma. We perform computational testing (with fixed CPU time) on a range of randomly generated and real data instances of medium size, but with high dimensionality. We show that when side constraints make k-means inapplicable, our proposed methodology-which is easy and fast to implement and deploy-can obtain good solutions in limited amounts of time.
Side-constrained minimum sum-of-squares clustering: mathematical programming and random projections
Benedetto Manca
2021-01-01
Abstract
This paper investigates a mathematical programming based methodology for solving the minimum sum-of-squares clustering problem, also known as the "k-means problem", in the presence of side constraints. We propose several exact and approximate mixed-integer linear and nonlinear formulations. The approximations are based on norm inequalities and random projections, the approximation guarantees of which are based on an additive version of the Johnson-Lindenstrauss lemma. We perform computational testing (with fixed CPU time) on a range of randomly generated and real data instances of medium size, but with high dimensionality. We show that when side constraints make k-means inapplicable, our proposed methodology-which is easy and fast to implement and deploy-can obtain good solutions in limited amounts of time.File | Dimensione | Formato | |
---|---|---|---|
HAL-Side-constrained minimum sum-of-squares clustering: mathematical programming and random projections.pdf
accesso aperto
Tipologia:
versione post-print (AAM)
Dimensione
537.62 kB
Formato
Adobe PDF
|
537.62 kB | Adobe PDF | Visualizza/Apri |
Side-constrained minimum sum-of-squares clustering: mathematical programming and random projections.pdf
Solo gestori archivio
Dimensione
563.72 kB
Formato
Adobe PDF
|
563.72 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.