A NOTE ON MODEL SELECTION IN STIMA
CONVERSANO, CLAUDIO
2011-01-01
Abstract
The Simultaneous Threshold Interaction Modeling Algorithm (STIMA) was recently introduced in statistical modeling as a tool that automatically selects interactions in a Generalized Linear Model (GLM) by estimating a suitably defined tree structure called a "trunk". STIMA integrates a GLM with a classification tree or a regression tree algorithm, depending on the nature of the response variable (nominal or numeric). Accordingly, it can be based on the Classification Trunk Approach (CTA) or on the Regression Trunk Approach (RTA). In both cases, interaction terms are expressed as "threshold interactions" instead of traditional cross-products. Compared with standard tree-based algorithms, STIMA uses a different splitting criterion and allows the user to "force" the first split of the trunk by manually selecting the first splitting predictor. This paper focuses on model selection in STIMA and introduces an alternative model selection procedure based on a measure that evaluates the trade-off between goodness of fit and accuracy. Its performance is compared with that of the current implementation of STIMA by analyzing two real datasets.