Sieve maximum likelihood estimation for a general class of accelerated hazards models with bundled parameters (Q2405160)
From MaRDI portal
scientific article
21 September 2017
The authors study sieve maximum likelihood estimation for a general class of hazard regression models and provide semiparametric efficient estimators for the parameters that are bundled inside the nonparametric component. The accelerated failure time (AFT) model (\(\log T = -\beta_0^TZ + \varepsilon\)) can be written as \[ S(t| Z) = S_0( t e^{\beta_0^TZ}), \] which avoids the constant proportionality between hazard functions that the Cox model assumes and that may not hold in practice. Estimation in the AFT model is typically carried out by least squares or rank methods, and the corresponding variances are estimated by resampling algorithms such as the bootstrap. In a randomized clinical trial, randomization makes the treatment groups essentially identical at \(t=0\) except for the treatment itself. \textit{Y. Q. Chen} and \textit{M.-C. Wang} [J. Am. Stat. Assoc. 95, No. 450, 608--618 (2000; Zbl 0995.62103)] proposed the accelerated hazards model, which replaces the survival functions by the corresponding hazard functions: \[ \lambda (t| Z) = \lambda_0 (t e^{\beta_0^T Z}). \] This model is intuitive in the sense that the hazard functions for different values of \(Z\) coincide at time \(t=0\), and the hazards in the different groups then gradually change over time as a result of the different treatments. To enhance estimation efficiency and modeling flexibility, the authors study sieve maximum likelihood estimation for the following general class of accelerated hazards regression models: \[ \Lambda (t | Z, X) = \Lambda_0 (t e^{\beta_0^T Z}) e^{\gamma_0^T X}, \] where \(\Lambda_0 (\cdot )\) is an unknown baseline cumulative hazard function and \(\beta_0\) and \(\gamma_0\) are unknown vectors of regression parameters. Asymptotic properties of the resulting estimators are established, and it is shown that the estimator of the regression parameters achieves the semiparametric efficiency bound derived in the paper.
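As a small numerical illustration of how the general class \(\Lambda_0 (t e^{\beta^T Z}) e^{\gamma^T X}\) relates to the models named in the review, the following Python sketch evaluates it with \(X = Z\) for three parameter choices: \(\gamma = 0\) (AFT form), \(\beta = 0\) (proportional hazards form) and \(\gamma = -\beta\) (accelerated hazards form). The Weibull-type baseline, the covariate value and the coefficient 0.4 are hypothetical choices made only for this example; they are not taken from the paper.

```python
import math

def cum_hazard(t, z, x, beta, gamma, Lambda0):
    """General model: Lambda(t | Z, X) = Lambda0(t * exp(beta'Z)) * exp(gamma'X)."""
    bz = sum(b * zi for b, zi in zip(beta, z))
    gx = sum(g * xi for g, xi in zip(gamma, x))
    return Lambda0(t * math.exp(bz)) * math.exp(gx)

# Hypothetical Weibull-type baseline, used for illustration only.
def Lambda0(u):
    return u ** 1.5

def lambda0(u):
    # Derivative of Lambda0, i.e. the baseline hazard.
    return 1.5 * u ** 0.5

def hazard(t, z, x, beta, gamma, h=1e-6):
    """Central-difference derivative in t of the cumulative hazard."""
    up = cum_hazard(t + h, z, x, beta, gamma, Lambda0)
    lo = cum_hazard(t - h, z, x, beta, gamma, Lambda0)
    return (up - lo) / (2 * h)

z = [1.0]          # single covariate, with X = Z in all three cases
b = [0.4]          # made-up coefficient
t = 2.0
scale = math.exp(0.4)

# gamma = 0: survival function S0(t * exp(beta'Z)), the AFT form.
aft_surv = math.exp(-cum_hazard(t, z, z, b, [0.0], Lambda0))

# beta = 0: cumulative hazard Lambda0(t) * exp(gamma'X), the Cox form.
ph_cum = cum_hazard(t, z, z, [0.0], b, Lambda0)

# gamma = -beta: hazard lambda0(t * exp(beta'Z)), the accelerated hazards form.
ah_haz = hazard(t, z, z, b, [-0.4])
```

In the third case the factor \(e^{\beta^T Z}\) produced by differentiating \(\Lambda_0 (t e^{\beta^T Z})\) cancels against \(e^{-\beta^T Z}\), which is why the hazard reduces to \(\lambda_0 (t e^{\beta^T Z})\).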
The proposed sieve MLE is easier to compute than weighted estimators, for which finding the optimal weight is challenging in practice. Furthermore, the standard error estimates are obtained directly by inverting either the observed information matrix of all the parameters or the efficient information matrix of the regression parameters. Section 2 presents the sieve maximum likelihood estimation procedure, with the explicit log-likelihood function given in (7) for the parameters \((\beta, \gamma, g)\), \[ l_n(\beta, \gamma, g) = n^{-1}\sum_{i=1}^n \bigg [\Delta_i\bigg\{\beta^T Z_i + \gamma^T X_i + g(Y_ie^{\beta^TZ_i})\bigg\} - \int_0^{Y_ie^{\beta^T Z_i}} \exp (g (s) ) ds \, e^{\gamma^T X_i}\bigg ], \] where \(g(t) = \log \lambda_0 (t)\) is the logarithm of the baseline hazard and is estimated by a spline-based method. Let \(\hat{\theta}_n = (\hat{\beta}_n, \hat{\gamma}_n, \hat{\xi}_n (\cdot, \hat{\beta}_n))\) be the sieve estimator, which maximizes the empirical log-likelihood over the sieve space, and let \(\theta_0 = (\beta_0, \gamma_0, \xi_0 (\cdot, \beta_0))\) be the true parameter. In Section 3, Theorem 1 shows that, under regularity conditions and for \(\frac{1}{2p+2}<v < \frac{1}{2p}\), the \(L^2\)-distance between the sieve estimator \(\hat{\theta}_n\) and \(\theta_0\) is \(O_p(n^{- \min \{pv, (1-v)/2\}})\). Theorem 2 shows that \(n^{1/2}((\hat{\beta}_n^T, \hat{\gamma}_n^T)^T - (\beta_0^T, \gamma^T_0)^T)\) converges in distribution to a mean-zero normal random vector whose covariance matrix attains the semiparametric efficiency bound for \((\beta_0^T, \gamma^T_0)^T\), and Theorem 3 provides a consistent estimator of the limiting covariance matrix. Section 4 is devoted to a simulation study assessing the finite-sample performance of the proposed sieve MLE. The method performs well under all four baseline hazard functions considered: the estimates of both \(\beta\) and \(\gamma\) are virtually unbiased, and the bias decreases as the sample size increases.
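To make the structure of the log-likelihood \(l_n(\beta, \gamma, g)\) concrete, here is a minimal Python sketch that evaluates it for a given log baseline hazard \(g\). It is not the authors' implementation: the paper estimates \(g\) by splines, whereas this sketch accepts any callable, and the integral is approximated by a composite Simpson rule. The helper names `integrate` and `log_lik` and the three observations in the usage lines are hypothetical.

```python
import math

def integrate(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def log_lik(beta, gamma, g, data):
    """Average log-likelihood l_n(beta, gamma, g).

    data: list of (y, delta, z, x) with observed time y, event indicator
          delta, and covariate vectors z and x given as plain lists.
    g:    callable for the log baseline hazard (a spline in the paper,
          any function in this sketch).
    """
    total = 0.0
    for y, delta, z, x in data:
        bz = sum(b * zi for b, zi in zip(beta, z))
        gx = sum(c * xi for c, xi in zip(gamma, x))
        u = y * math.exp(bz)  # rescaled time Y * exp(beta'Z)
        # Integral of exp(g(s)) over [0, u], i.e. the baseline cumulative hazard at u.
        cum = integrate(lambda s: math.exp(g(s)), 0.0, u)
        total += delta * (bz + gx + g(u)) - cum * math.exp(gx)
    return total / len(data)

# Made-up observations (y, delta, z, x), for illustration only.
data = [(1.2, 1, [1.0], [0.0]), (0.7, 0, [0.0], [1.0]), (2.5, 1, [1.0], [1.0])]

# Constant log hazard log(0.5): an exponential baseline with rate 0.5.
value = log_lik([0.3], [-0.2], lambda s: math.log(0.5), data)
```

Note that \(\beta\) enters both the linear term and the argument and integration limit of \(g\), which is the "bundled" structure the review refers to. A sieve estimator would maximize this function jointly over \(\beta\), \(\gamma\) and the spline coefficients of \(g\).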
Section 5 applies the method to a study of bone marrow transplantation in 137 patients with acute leukemia. The disease-free survival time, that is, the time to relapse, death or the end of the study, is of primary interest. Patients are grouped into three risk categories based on their disease status. The authors plot the kernel-smoothed hazard rate functions with a bandwidth of 100 days and fit the data by the proposed sieve MLE with smoothing splines. For the time-scale effects of the risk categories, the sieve MLE shows that AML low-risk patients had significantly decelerated hazards, while the time-scale effect for AML high-risk patients was not significant. Section 6 remarks that data-driven methods could be used to classify the covariates when no biological information is available as guidance. Section 7 is devoted to the technical proofs of the theorems. Theorem 1 follows from Theorem 1 in [\textit{X. Shen} and \textit{W. H. Wong}, Ann. Stat. 22, No. 2, 580--615 (1994; Zbl 0805.62008)] by verifying conditions \(C1\)--\(C3\) of that theorem: \(C1\) follows from a Taylor expansion, the Cauchy-Schwarz inequality and the regularity conditions; \(C2\) follows from the regularity conditions and a Taylor expansion; and \(C3\) follows from an estimate of the Kullback-Leibler distance. Theorem 2 is proved by verifying six properties and applying Theorem 2.1 of [\textit{Y. Ding} and \textit{B. Nan}, Ann. Stat. 39, No. 6, 3032--3061 (2011; Zbl 1246.62103)]; the result then follows from Theorem 6.1 of [\textit{J. A. Wellner} and \textit{Y. Zhang}, Ann. Stat. 35, No. 5, 2106--2142 (2007; Zbl 1126.62084)]. The proof of Theorem 3 follows from Theorem 1 and the regularity conditions.
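The kernel-smoothed hazard rate used in the Section 5 data analysis can be sketched as follows. This is one common construction, which smooths the Nelson-Aalen increments with a kernel; the review does not say which kernel or boundary treatment the authors used, so the Epanechnikov kernel, the absence of boundary correction, the function names and the toy data below are all assumptions made for illustration.

```python
def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1] (an assumed choice of kernel)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def nelson_aalen_increments(times, events):
    """Return (event time, d / n) pairs: d events among n subjects at risk."""
    order = sorted(zip(times, events))
    n_at_risk = len(order)
    increments = []
    i = 0
    while i < len(order):
        t = order[i][0]
        d = 0  # events at time t
        m = 0  # subjects leaving the risk set at time t
        while i < len(order) and order[i][0] == t:
            d += order[i][1]
            m += 1
            i += 1
        if d > 0:
            increments.append((t, d / n_at_risk))
        n_at_risk -= m
    return increments

def smoothed_hazard(t, times, events, bandwidth):
    """Kernel-smoothed hazard at time t, with no boundary correction."""
    incs = nelson_aalen_increments(times, events)
    return sum(epanechnikov((t - s) / bandwidth) * inc for s, inc in incs) / bandwidth

# Toy follow-up times in days and event indicators (1 = event, 0 = censored).
times = [50, 120, 120, 200, 300]
events = [1, 1, 0, 1, 0]
rate_at_120 = smoothed_hazard(120, times, events, bandwidth=100)
```

With a bandwidth of 100 days, only events within 100 days of the evaluation time contribute, so estimates near time 0 or near the end of follow-up are biased unless a boundary kernel is used.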
accelerated failure time model
Cox model
accelerated hazard regression model
B-spline
proportional hazards model
semiparametric efficiency bound
sieve maximum likelihood estimator
convergence rate
Donsker property
survival data