This paper investigates a mathematical programming based methodology for solving the minimum sum-of-squares clustering problem, also known as the "k-means problem", in the presence of side constraints. We propose several exact and approximate mixed-integer linear and nonlinear formulations. The approximations are based on norm inequalities and random projections, the approximation guarantees of which are based on an additive version of the Johnson-Lindenstrauss lemma. We perform computational testing (with fixed CPU time) on a range of randomly generated and real data instances of medium size, but with high dimensionality. We show that when side constraints make k-means inapplicable, our proposed methodology-which is easy and fast to implement and deploy-can obtain good solutions in limited amounts of time.

Side-constrained minimum sum-of-squares clustering: mathematical programming and random projections

Benedetto Manca
2021-01-01

Abstract

This paper investigates a mathematical programming based methodology for solving the minimum sum-of-squares clustering problem, also known as the "k-means problem", in the presence of side constraints. We propose several exact and approximate mixed-integer linear and nonlinear formulations. The approximations are based on norm inequalities and random projections, the approximation guarantees of which are based on an additive version of the Johnson-Lindenstrauss lemma. We perform computational testing (with fixed CPU time) on a range of randomly generated and real data instances of medium size, but with high dimensionality. We show that when side constraints make k-means inapplicable, our proposed methodology-which is easy and fast to implement and deploy-can obtain good solutions in limited amounts of time.
2021
MINLP
k-means
Random projections
Side constraints
File in questo prodotto:
File Dimensione Formato  
HAL-Side-constrained minimum sum-of-squares clustering: mathematical programming and random projections.pdf

accesso aperto

Tipologia: versione post-print (AAM)
Dimensione 537.62 kB
Formato Adobe PDF
537.62 kB Adobe PDF Visualizza/Apri
Side-constrained minimum sum-of-squares clustering: mathematical programming and random projections.pdf

Solo gestori archivio

Dimensione 563.72 kB
Formato Adobe PDF
563.72 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/351818
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 7
  • ???jsp.display-item.citation.isi??? 4
social impact