Objective Bayesian variable selection for censored data

PERRA, SILVIA
2013-05-20

Abstract

This thesis studies the problem of selecting a set of regressors when the response variable follows a parametric model (such as the Weibull or the lognormal) and the observations are right censored. Under a Bayesian approach, the most widely used tools are Bayes factors (BFs), which are, however, determined only up to an arbitrary multiplicative constant when improper priors are used. Two tools commonly used in the literature to resolve this indeterminacy in model selection are the intrinsic Bayes factor (IBF) and the fractional Bayes factor (FBF). Neither is an actual Bayes factor, but it can be shown that both converge asymptotically to actual BFs computed under particular priors, called intrinsic and fractional priors, respectively. Both depend on the size of a minimal training sample (MTS), and the IBF also depends on the specific MTSs used. With censored data, defining a suitable MTS is not straightforward: the sample space of the response variable must be fully explored when drawing MTSs, yet only uncensored observations are actually useful for training the improper prior into a proper posterior. In fact, an unweighted MTS consisting only of uncensored data may produce a serious bias in model selection. To overcome this problem, a sequential MTS (SMTS) is used; because each SMTS has random size, the number of possible MTSs increases. This makes the IBF impractical for exploring large model spaces. To reduce the computational cost while maintaining behavior comparable to that of the IBF, we provide a suitable definition of the FBF that gives results similar to those of the IBF computed over the SMTSs. We first define the conditional FBF on a fraction proportional to the MTS size; we then show that the marginal FBF (mFBF), obtained by averaging the conditional FBFs with respect to the probability distribution of the fraction, is consistent and also gives good results.
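For orientation, the quantities named in this paragraph can be written in one common notation. This is only a sketch based on the standard definitions in the literature, not the thesis's own formulas: the symbols $m_i^N$, $\pi_i^N$, $b$, $y(\ell)$, $L$ and $p(b)$ are assumptions introduced here, the arithmetic version of the IBF is chosen purely for illustration, and the exact form of the conditional and marginal FBF for censored data is the one given in the thesis itself.

```latex
% Assumed notation (not taken from the thesis): f_i(y | \theta_i) is the
% likelihood of model M_i, \pi_i^N an improper prior, y(\ell) the \ell-th
% training sample out of L, and p(b) the distribution of the fraction b.
\begin{align*}
m_i^N(y) &= \int f_i(y \mid \theta_i)\, \pi_i^N(\theta_i)\, d\theta_i,
& B_{ij}^N(y) &= \frac{m_i^N(y)}{m_j^N(y)}, \\
B_{ij}^{AI}(y) &= B_{ij}^N(y)\, \frac{1}{L} \sum_{\ell=1}^{L} B_{ji}^N\bigl(y(\ell)\bigr), \\
m_i^b(y) &= \int f_i(y \mid \theta_i)^{b}\, \pi_i^N(\theta_i)\, d\theta_i,
& B_{ij}^{F}(b) &= \frac{m_i^N(y) / m_i^b(y)}{m_j^N(y) / m_j^b(y)}, \\
B_{ij}^{mF} &= \sum_{b} B_{ij}^{F}(b)\, p(b).
\end{align*}
```

Because each $\pi_i^N$ is defined only up to a constant $c_i$, the naive factor $B_{ij}^N$ carries the arbitrary ratio $c_i / c_j$. In both $B_{ij}^{AI}$ and $B_{ij}^{F}(b)$ these constants cancel, which is what makes the two proposals usable with improper priors.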
Next, we recall the definitions of the intrinsic prior for the IBF and of the fractional prior for the FBF, and we compute both for the exponential model with right-censored data. In general, these priors cannot be obtained when the censoring mechanism is unknown. We also present another approach to choosing the MTS, which consists of weighting the MTS by a suitable set of weights. Specifically, we define the Kaplan-Meier minimal training sample (KMMTS), which depends on the Kaplan-Meier estimator of the survival function and contains only suitably weighted uncensored observations. This proposal could be useful when the censoring percentage is not very high, and it allows faster computations when the predictive distributions, computed only over uncensored observations, are available in closed form. The new methodologies are validated by means of simulation studies and applications to real data.
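To make the idea of weighting uncensored observations through the Kaplan-Meier estimator more concrete, the following Python sketch computes the mass that the Kaplan-Meier estimate assigns to each uncensored observation and then draws a training sample with probabilities proportional to those masses. It is an illustration under stated assumptions, not the thesis's definition of the KMMTS: the function names `kaplan_meier_weights` and `draw_weighted_training_sample`, the use of the Kaplan-Meier jumps as sampling weights, and the fixed sample size are all hypothetical choices made here.

```python
import numpy as np


def kaplan_meier_weights(times, events):
    """Mass the Kaplan-Meier estimate puts on each observation.

    times  : observed times (event or censoring time)
    events : 1/True if the time is an observed event, 0/False if censored
    Censored observations receive weight 0.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    n = len(times)
    # Sort by time; at tied times, events come before censored observations.
    order = np.lexsort((~events, times))
    weights = np.zeros(n)
    surv = 1.0  # current value of the Kaplan-Meier survival estimate
    for rank, idx in enumerate(order):
        at_risk = n - rank
        if events[idx]:
            jump = surv / at_risk  # drop of the survival curve at this event
            weights[idx] = jump
            surv -= jump
    return weights


def draw_weighted_training_sample(times, events, size, seed=None):
    """Draw `size` uncensored indices with probability proportional to the
    Kaplan-Meier weights (an assumed sampling scheme, for illustration)."""
    weights = kaplan_meier_weights(times, events)
    uncensored = np.flatnonzero(weights > 0)
    if size > len(uncensored):
        raise ValueError("not enough uncensored observations")
    probs = weights[uncensored] / weights[uncensored].sum()
    rng = np.random.default_rng(seed)
    return rng.choice(uncensored, size=size, replace=False, p=probs)


# Small usage example: the second observation is censored.
w = kaplan_meier_weights([1, 2, 3, 4], [1, 0, 1, 1])
sample = draw_weighted_training_sample([1, 2, 3, 4], [1, 0, 1, 1], size=2, seed=0)
```

In this small example the censored observation at time 2 gets weight 0 and its mass is redistributed to the later events, so uncensored observations that follow a censoring time are drawn more often. Without censoring every observation would receive weight 1/n.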
Keywords: improper priors; intrinsic prior; survival analysis
Files in this item: Perra_PhD_Thesis.pdf (doctoral thesis, open access, Adobe PDF, 1.17 MB)

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/266108