A NOTE ON MODEL SELECTION IN STIMA

CONVERSANO, CLAUDIO
2011-01-01

Abstract

The Simultaneous Threshold Interaction Modeling Algorithm (STIMA) was recently introduced in statistical modeling as a tool for automatically selecting interactions in a Generalized Linear Model (GLM) by estimating a suitably defined tree structure called a "trunk". STIMA integrates a GLM with either a classification tree algorithm or a regression tree algorithm, depending on whether the response variable is nominal or numeric. Accordingly, it is based on either the Classification Trunk Approach (CTA) or the Regression Trunk Approach (RTA). In both cases, interaction terms are expressed as "threshold interactions" rather than traditional cross-products. Compared with standard tree-based algorithms, STIMA uses a different splitting criterion and allows the user to "force" the first split of the trunk by manually selecting the first splitting predictor. This paper focuses on model selection in STIMA and introduces an alternative model selection procedure based on a measure that evaluates the trade-off between goodness of fit and accuracy. Its performance is compared with that of the current implementation of STIMA on two real datasets.
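
To make the contrast between a threshold interaction and a cross-product concrete, the following is a minimal sketch and not the STIMA implementation. It assumes two numeric predictors, a Gaussian response fitted by ordinary least squares, and split points `t1` and `t2` that are fixed by hand, whereas STIMA estimates the trunk from the data. The function names `threshold_interaction_design` and `fit_ols` are hypothetical.

```python
import numpy as np


def threshold_interaction_design(x1, x2, t1, t2):
    """Design matrix with main effects plus threshold-interaction indicators.

    Hypothetical two-split trunk: the first split is on x1 at t1, and the
    node with x1 > t1 is split again on x2 at t2. The region x1 <= t1 is
    the reference; the other two regions each get a 0/1 indicator column,
    which replaces the usual cross-product column x1 * x2.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    region_a = (x1 > t1) & (x2 <= t2)
    region_b = (x1 > t1) & (x2 > t2)
    return np.column_stack([
        np.ones_like(x1),          # intercept
        x1,                        # main effect of x1
        x2,                        # main effect of x2
        region_a.astype(float),    # threshold interaction, region A
        region_b.astype(float),    # threshold interaction, region B
    ])


def fit_ols(X, y):
    """Least-squares coefficients for a Gaussian GLM with identity link."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x1 = rng.uniform(0.0, 1.0, size=200)
    x2 = rng.uniform(0.0, 1.0, size=200)
    X = threshold_interaction_design(x1, x2, t1=0.5, t2=0.5)
    # Simulated response: linear main effects plus a shift in region B only.
    y = 1.0 + 2.0 * x1 - 1.0 * x2 + 3.0 * X[:, 4] + rng.normal(0.0, 0.1, size=200)
    print(fit_ols(X, y))
```

In this sketch the interaction is carried by region indicators, so its effect is a constant shift within each region rather than a term that varies with the product of the two predictors.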
Year: 2011
ISBN: 978-3-6421-1362-8
Keywords: Threshold Interaction Detection; Regression Trunk; Classification Trunk
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/33580