Investigation of Seasonal Variation in Fatty Acid and Mineral Concentrations of Pecorino Romano PDO Cheese: Imputation of Missing Values for Enhanced Classification and Metabolic Profile Reconstruction
Sibono, Leonardo;Grosso, Massimiliano;Tronci, Stefania;Errico, Massimiliano;Manis, Cristina;Caboni, Pierluigi
2023-01-01
Abstract
Seasonal variation in fatty acid and mineral concentrations was investigated by analyzing Pecorino Romano cheese samples collected in January, April, and June. A fraction of the samples had missing values in their fatty acid profiles. Probabilistic principal component analysis, coupled with Linear Discriminant Analysis, was used to classify the cheese samples by production season while accounting for the missing data, and to estimate the fatty acid concentrations that were missing from those samples. The levels of rumenic acid, vaccenic acid, and omega-3 compounds were positively correlated with the spring season, while the chain length of the saturated fatty acids increased over the production seasons. For classification, the optimal number of principal components (5) achieved a cross-validation accuracy of 98%. When the model was used to impute the missing fatty acid concentrations, the optimal number of principal components gave a cross-validation R² of 99.53%.
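The abstract does not spell out the model equations. As a sketch under the standard formulation of probabilistic PCA (an assumption here, not a statement of the authors' exact implementation), each sample x of d measured concentrations is described by q latent components, and a missing block x_m is estimated by its conditional mean given the observed block x_o. The symbols W, μ, σ², z, and the subscripts o (observed) and m (missing) are notation introduced for illustration only.

```latex
\begin{aligned}
x &= W z + \mu + \varepsilon,
  \qquad z \sim \mathcal{N}(0, I_q),
  \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_d) \\
x &\sim \mathcal{N}\!\left(\mu,\; W W^\top + \sigma^2 I_d\right) \\
\hat{x}_m &= \mu_m + W_m W_o^\top \left(W_o W_o^\top + \sigma^2 I\right)^{-1} (x_o - \mu_o) \\
\hat{z} &= \left(W_o^\top W_o + \sigma^2 I_q\right)^{-1} W_o^\top (x_o - \mu_o)
\end{aligned}
```

Here W_o and W_m are the rows of W for the observed and missing variables. Under this reading, the latent scores computed from the observed entries alone would presumably serve as the inputs to the Linear Discriminant Analysis classifier, with q = 5 being the optimum reported for classification.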
Attached file: metabolites-13-00877-v2 (1).pdf (open access; description: online article; type: publisher's version, VoR; size: 1.94 MB; format: Adobe PDF)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.