An Exact Game-Theoretic Variable Importance Index for Generalized Additive Models
Amir Khorrami Chokami
2024-01-01
Abstract
Generalized Additive Models (GAMs) are widely used in statistics. This work addresses the challenge of identifying the most influential variables in a GAM. To this end, we introduce a variance allocation approach based on the Shapley value. We derive a closed-form expression for this importance index, which allows it to be computed on high-dimensional datasets and under any dependence structure. As a practical implication, a variable whose importance is negligible can be safely removed from the GAM, which simplifies the model. Through case studies, we show that Shapley values are more informative than p-values for ranking the importance of variables. All the code is available online in the supplementary material.
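To make the idea of a Shapley-based variance allocation concrete, the following Python sketch may help. It is not the paper's code, and it rests on an assumption the abstract does not spell out: that the value of a coalition S is v(S) = Var(sum of f_j(X_j) for j in S), where f_j are the additive components. Under that assumed value function, the Shapley value of variable j reduces to Cov(f_j(X_j), f(X)), and the sketch checks this against a brute-force enumeration of coalitions. The component functions and the simulated data are hypothetical stand-ins for fitted smooths.

```python
import itertools
import math
import random


def covariance(a, b):
    """Population covariance of two equal-length samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def coalition_variance(components, subset):
    """Assumed value function v(S): variance of the summed components in S."""
    if not subset:
        return 0.0
    n = len(components[0])
    partial = [sum(components[j][i] for j in subset) for i in range(n)]
    return covariance(partial, partial)


def shapley_brute_force(components):
    """Exact Shapley values by enumerating every coalition (small p only)."""
    p = len(components)
    phi = [0.0] * p
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            weight = (math.factorial(size) * math.factorial(p - size - 1)
                      / math.factorial(p))
            for subset in itertools.combinations(others, size):
                gain = (coalition_variance(components, subset + (j,))
                        - coalition_variance(components, subset))
                phi[j] += weight * gain
    return phi


def shapley_closed_form(components):
    """Closed form for the assumed variance game: Cov(f_j, sum of all f_k)."""
    total = [sum(values) for values in zip(*components)]
    return [covariance(f_j, total) for f_j in components]


# Hypothetical data: x2 is correlated with x1, x3 is independent of both.
random.seed(0)
n = 2000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.6 * a + 0.8 * random.gauss(0, 1) for a in x1]
x3 = [random.gauss(0, 1) for _ in range(n)]

# Hypothetical additive components standing in for fitted GAM smooths.
components = [
    [math.sin(a) for a in x1],
    [0.5 * b * b for b in x2],
    [0.1 * c for c in x3],
]

phi_exact = shapley_brute_force(components)
phi_closed = shapley_closed_form(components)
total_variance = coalition_variance(components, (0, 1, 2))

for j, (exact, closed) in enumerate(zip(phi_exact, phi_closed), start=1):
    print(f"variable {j}: brute force = {exact:.6f}, closed form = {closed:.6f}")
print(f"sum of shares = {sum(phi_closed):.6f}, total variance = {total_variance:.6f}")
```

In this sketch the shares sum to the total variance of the additive predictor, and the closed form costs one covariance per variable rather than an enumeration of all 2^p coalitions, which is what makes an index of this kind usable with many variables. A share near zero, such as the one for the third component here, marks a candidate for removal.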