UNICA IRIS Institutional Research Information System

Antipole Tree Indexing to Support Range Search and K-Nearest-Neighbor Search in Metric Spaces

Cantone; Ferro, A.; Pulvirenti, A.; Reforgiato Recupero, Diego Angelo Gaetano; Shasha, D.

2005-01-01

Abstract

Range and k-nearest neighbor searching are core problems in pattern recognition. Given a database S of objects in a metric space M and a query object q in M, the goal of a range search is to find the objects of S within some threshold distance of q, whereas a k-nearest neighbor search must produce the k elements of S closest to q. Both problems can be solved with a linear number of distance calculations by comparing the query object against every object in the database, but the goal is to solve them much faster. We combine and extend ideas from the M-Tree, the Multivantage Point structure, and the FQ-Tree to create a new structure in the "bisector tree" class, called the Antipole Tree. Bisection is based on proximity to an "Antipole" pair of elements generated by a suitable linear randomized tournament. The final winners a, b of such a tournament are far enough apart to approximate the diameter of the set being split. If dist(a, b) is larger than the chosen cluster diameter threshold, the cluster is split. The proposed data structure is an indexing scheme suitable for (exact and approximate) best-match searching on generic metric spaces. The Antipole Tree outperforms existing structures such as List of Clusters, M-Trees, and others by a factor of approximately two and, in many cases, achieves better clustering properties.
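
The splitting step described in the abstract can be illustrated with a short Python sketch. It is not the authors' procedure: the abstract does not specify the tournament, so the rule used here (shuffle, partition into groups of three, keep the farthest pair of each group, repeat) is an assumption, and the names `approx_antipole`, `try_split`, and `group_size` are hypothetical.

```python
import itertools
import random


def farthest_pair(points, dist):
    """Exact farthest pair by brute force (quadratic, so only for small sets)."""
    return max(itertools.combinations(points, 2),
               key=lambda pair: dist(pair[0], pair[1]))


def approx_antipole(points, dist, group_size=3, rng=None):
    """Approximate the farthest ("Antipole") pair with a randomized tournament.

    Assumed rule: each round shuffles the survivors, cuts them into groups of
    `group_size`, and keeps only the farthest pair of every group. The number
    of survivors shrinks geometrically, so the total number of distance
    computations is linear in len(points) for a fixed group size.
    """
    if group_size < 3:
        raise ValueError("group_size must be at least 3 so each round shrinks the set")
    survivors = list(points)
    if len(survivors) < 2:
        raise ValueError("need at least two points")
    rng = rng or random.Random()
    while len(survivors) > group_size:
        rng.shuffle(survivors)
        winners = []
        for i in range(0, len(survivors), group_size):
            group = survivors[i:i + group_size]
            # A leftover group with a single element advances unchanged.
            winners.extend(group if len(group) < 2 else farthest_pair(group, dist))
        survivors = winners
    return farthest_pair(survivors, dist)


def try_split(points, dist, diameter_threshold, rng=None):
    """Split a cluster around its Antipole pair if the pair is far enough apart.

    Returns None when dist(a, b) does not exceed the threshold; otherwise
    returns (a, b, near_a, near_b), each point going to the closer endpoint.
    """
    a, b = approx_antipole(points, dist, rng=rng)
    if dist(a, b) <= diameter_threshold:
        return None
    near_a = [p for p in points if dist(p, a) <= dist(p, b)]
    near_b = [p for p in points if dist(p, a) > dist(p, b)]
    return a, b, near_a, near_b


def line_dist(x, y):
    """Toy metric for the usage note: absolute difference on the real line."""
    return abs(x - y)
```

For instance, `try_split(list(range(101)), line_dist, diameter_threshold=10.0)` divides the integers 0 to 100 into two groups around the pair found by the tournament. Applying the same test recursively to each group until no pair exceeds the threshold gives the bisector-tree shape that the abstract describes; the search procedures themselves are not covered by this sketch.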

Year of publication: 2005

Keywords: indexing methods; similarity measures; information search and retrieval

Type: 1.1 Journal article

Files in this record:

File	Size	Format
antipole.pdf (publisher's version, VoR; access restricted to archive managers)	7.07 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/140742
