A statistical approach to the morphological classification of Prunus sp. seeds

Frigau, Luca; Bacchetta, Gianluigi; Sarigu, Marco; Ucchesu, Mariano; Mola, Francesco
2020-01-01

Abstract

The use of digital image analysis for discriminating between and comparing groups of seeds is becoming increasingly common in taxonomic studies. In this type of study, many variables describing different kinds of data, such as size, texture and shape, are generally used as inputs to statistical algorithms without any data pre-processing, thereby generating problems with noise and with the consistency of the process for new samples. We propose an approach in which the variables for each kind of data are pre-processed separately by performing principal component analysis and Fourier analysis. Furthermore, the accuracy of the different kinds of data is measured by comparing the results obtained using several classification algorithms: k-Nearest Neighbour, Linear Discriminant Analysis, Naive Bayes, Support Vector Machines and Random Forest. As a case study, we took the seeds of 19 cultivars of Sardinian Prunus domestica L. and four cultivars referable to other Prunus species. The combination of size, texture and shape data performed well in discriminating between the seeds of Prunus sp. The present study confirms that image analysis techniques combined with the pre-processing of data are a useful tool for taxonomic investigation in plant biology and for discrimination at the cultivar level.
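The block-wise pre-processing described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the data are synthetic, the assignment of principal component analysis to the size and texture blocks and of Fourier analysis to the shape block is an assumption, the complex-contour descriptors are one possible form of Fourier analysis, and only a plain k-Nearest Neighbour classifier is shown. All function names (`fit_pca`, `apply_pca`, `fourier_descriptors`, `knn_predict`, `make_seed`) are hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Learn standardisation and principal axes from training rows only."""
    mean, scale = X.mean(axis=0), X.std(axis=0, ddof=1)
    _, _, vt = np.linalg.svd((X - mean) / scale, full_matrices=False)
    return mean, scale, vt[:n_components].T

def apply_pca(X, model):
    """Project (possibly new) samples with a previously fitted model."""
    mean, scale, axes = model
    return ((X - mean) / scale) @ axes

def fourier_descriptors(contour_xy, n_harmonics=4):
    """Harmonic magnitudes of a closed outline, scaled by the first harmonic.

    Assumes the outline is sampled counter-clockwise, so coeffs[1] dominates.
    """
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    coeffs = np.fft.fft(z)
    # coeffs[0] (centroid) is dropped; magnitudes discard rotation and start point.
    harmonics = np.concatenate([coeffs[2:n_harmonics + 1], coeffs[-n_harmonics:]])
    return np.abs(harmonics) / np.abs(coeffs[1])

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in y_train[nearest]])

def make_seed(rng, label):
    """Synthetic stand-in for one seed: class 1 is more elongated than class 0."""
    t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
    a = 1.0 + 0.5 * label + rng.normal(0.0, 0.05)
    b = 1.0 + rng.normal(0.0, 0.05)
    contour = np.column_stack([a * np.cos(t), b * np.sin(t)])
    size = np.array([np.pi * a * b, 2 * a, 2 * b]) + rng.normal(0.0, 0.05, 3)
    texture = rng.normal(0.5 * label, 1.0, 4)
    return size, texture, contour

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 40)
seeds = [make_seed(rng, lab) for lab in labels]
size = np.array([s[0] for s in seeds])
texture = np.array([s[1] for s in seeds])
shape = np.array([fourier_descriptors(s[2]) for s in seeds])

# Hold out 20 samples; PCA is fitted on the training part only, so the same
# transformation can be reused unchanged for new samples.
order = rng.permutation(len(labels))
train, test = order[:60], order[60:]

blocks = {
    "size": apply_pca(size, fit_pca(size[train], 2)),
    "texture": apply_pca(texture, fit_pca(texture[train], 2)),
    "shape": shape,
}
blocks["combined"] = np.hstack(list(blocks.values()))

# Accuracy of each kind of data on its own and of the combination.
accuracy = {
    name: float(np.mean(knn_predict(F[train], labels[train], F[test]) == labels[test]))
    for name, F in blocks.items()
}
print(accuracy)
```

Fitting each transformation on the training samples and then applying it to held-out ones is one way to address the consistency concern for new samples raised in the abstract. The same `blocks` dictionary could be passed to any of the other classifiers listed there.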
Digital image analysis; Fourier analysis; LDA; Prunus sp.; seeds; stream data

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/281589