Discriminant analysis based on binary time series (Q2189750)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Discriminant analysis based on binary time series | scientific article | |
Statements
Discriminant analysis based on binary time series (English)
16 June 2020
Given a binary time series, this paper proposes a new method for classifying the series into one of two previously defined categories. The mathematical setting is as follows.
Let \((\Omega,F,P)\) be a probability space, let \(n\geq 1\) be an integer, and for each \(i=1,\dots,n\) let \(Z_{i}:\Omega\to\mathbb{R}\) be a random variable. The vector \(Z=(Z_1,\dots,Z_{n})'\) is said to have an ellipsoidal distribution of dimension \(n\) with location parameter \(\mu\in\mathbb{R}^n\), scale parameter \(\Sigma_{n}\), and function parameter \(g\) if \(Z\) has a probability density function given by \[ p_{Z}(x)=\frac{c_{n}}{\sqrt{|\Sigma_{n}|}}\,g\bigl((x-\mu)'\Sigma_{n}^{-1}(x-\mu)\bigr), \] where \(\Sigma_{n}\) is a symmetric positive definite \(n\times n\) matrix, \(|\Sigma_{n}|\) is the determinant of \(\Sigma_{n}\), \(g:[0,+\infty)\to[0,+\infty)\) is a continuous function such that \[ \int_0^{+\infty}t^{(n/2)-1}g(t)\,dt<\infty, \] and the normalizing constant is \[ c_{n}=\frac{\Gamma(n/2)}{\pi^{n/2}\int_0^{+\infty}t^{(n/2)-1}g(t)\,dt}. \]
For each \(t\in\mathbb{Z}\) let \(Z_{t}:\Omega\to\mathbb{R}\) be a random variable. The stochastic process \(Z=(Z_{t})_{t\in\mathbb{Z}}\) is said to be an ellipsoidal process if, for every \(n\in\mathbb{N}\) and every \((t_1,\dots,t_{n})\in\mathbb{Z}^n\), the vector \((Z_{t_1},\dots,Z_{t_{n}})'\) has an ellipsoidal distribution. The process \(Z\) is \(\alpha\)-mixing, or strongly mixing, if for every \(n\in\mathbb{N}\) \[ \sup_{k\in\mathbb{Z},\,A\in\mathcal{B}_{-\infty}^{k},\,B\in\mathcal{B}_{k+n}^{+\infty}}|P(A\cap B)-P(A)\,P(B)|\leq\alpha(n), \] where \(\alpha(n)\to 0\) as \(n\to\infty\). Here, for each \(-\infty\leq p\leq q\leq+\infty\), \(\mathcal{B}_{p}^{q}\) denotes the \(\sigma\)-field generated by \[ G_{p}^{q}=\{Z_{i}^{-1}(B) : p\leq i\leq q,\ B\in\mathcal{B}(\mathbb{R})\}, \] and \(\mathcal{B}(\mathbb{R})\) is the Borel \(\sigma\)-field of \(\mathbb{R}\).
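As a sanity check on the normalizing constant, the following minimal sketch evaluates \(c_{n}\) numerically, reading it as the ratio \(\Gamma(n/2)\) divided by \(\pi^{n/2}\int_0^{+\infty}t^{(n/2)-1}g(t)\,dt\). The Gaussian choice \(g(t)=e^{-t/2}\) is an illustration chosen here, not taken from the paper; for it, \(c_{n}\) should reduce to the familiar \((2\pi)^{-n/2}\). The function names, the truncation point `upper`, and the step count are assumptions of this sketch.

```python
import math

def normalizing_constant(n, g, upper=200.0, steps=200_000):
    """Approximate c_n = Gamma(n/2) / (pi^(n/2) * int_0^inf t^(n/2-1) g(t) dt).

    The integral is truncated to [0, upper] and computed with the midpoint
    rule; `upper` and `steps` are illustrative choices, not from the paper.
    """
    h = upper / steps
    integral = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th cell
        integral += t ** (n / 2 - 1) * g(t)
    integral *= h
    return math.gamma(n / 2) / (math.pi ** (n / 2) * integral)

def gaussian_generator(t):
    """Function parameter g(t) = exp(-t/2), i.e. the Gaussian special case."""
    return math.exp(-t / 2)

for n in (2, 3, 5):
    c_n = normalizing_constant(n, gaussian_generator)
    closed_form = (2 * math.pi) ** (-n / 2)
    print(n, c_n, closed_form)  # the two values should agree closely
```

Other function parameters \(g\) can be passed in the same way, provided the integral converges and the truncation point is large enough for the tail to be negligible.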
In this paper it is assumed that the observed process \(Z=(Z_{t})_{t\in\mathbb{Z}}\) is \(\alpha\)-mixing, that for each \(n\geq 1\) and \((t_1,\dots,t_{n})\in\mathbb{Z}^n\) the vector \((Z_{t_1},\dots,Z_{t_{n}})'\) has an ellipsoidal distribution with location parameter \(0\in\mathbb{R}^n\), scale parameter \(\Sigma_{n}\), and function parameter \(g\) as above, and that \[ \sup_{n\in\mathbb{N}}\alpha(n)\cdot n^{8+\delta}<\infty \] for some \(\delta>0\). The process \(X=(X_{t})_{t\in\mathbb{Z}}\), called the clipped process generated by \(Z\), is defined for each \(t\in\mathbb{Z}\) by \[ X_t= \begin{cases} 1 & \text{if }Z_{t}\geq 0,\\ 0 & \text{if }Z_{t}<0. \end{cases} \] Based on the process \(X\), the authors define a new discriminant method for the process \(Z\).
Consider classifying the series \(Z\) into one of two categories, say \(\Pi_1\) and \(\Pi_2\), with spectral densities \(f_1\) and \(f_2\). Let \(K:[-\pi,\pi]\to\mathbb{R}\), and let \(f\) and \(g\) be functions defined on \([-\pi,\pi]\) such that \[ \int_{[-\pi,\pi]}(K(\lambda)(f(\lambda)-g(\lambda)))^2\,d\lambda<\infty. \] The disparity measure is then defined as \[ DM(f,g)=\frac{1}{8\pi}\int_{[-\pi,\pi]}(K(\lambda)(f(\lambda)-g(\lambda)))^2\,d\lambda. \] Let \((Z_1,\dots,Z_{n})'\) be a sample of size \(n\) from \(Z\), and let \(f_{Z,n}\) be an estimator of the spectral density of \(Z\) that is based on \(X\). The discrimination criterion applied to \(Z\) is: \[ \text{if }D_{n}(f_{Z,n},f_1,f_2)>0\text{ then }Z\in\Pi_1, \] \[ \text{if }D_{n}(f_{Z,n},f_1,f_2)\leq 0\text{ then }Z\in\Pi_2, \] where \[ D_{n}(f_{Z,n},f_1,f_2)=DM(f_{Z,n},f_2)-DM(f_{Z,n},f_1). \] The following asymptotic consistency property is proved in the paper: \[ \lim_{n\to\infty}P(D_{n}<0\mid\Pi_1)=0=\lim_{n\to\infty}P(D_{n}>0\mid\Pi_2). \]
In a simulation study the authors analyze the robustness of the proposed discriminant method for finite samples in the presence of possible outliers. The simulation procedure is explained in detail.
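The clipping step and the decision rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the weight \(K\equiv 1\), the midpoint-rule grid, the AR(1) spectral densities standing in for \(f_1\) and \(f_2\), and the stand-in `f_hat` for the estimator \(f_{Z,n}\) are all hypothetical choices made here. The review does not specify how \(f_{Z,n}\) is built from \(X\), so the sketch takes the estimate as given.

```python
import math

def clip(z):
    """Clipped (binary) series: X_t = 1 if Z_t >= 0, else 0."""
    return [1 if value >= 0 else 0 for value in z]

def dm(f, g, kernel=lambda lam: 1.0, grid_size=2048):
    """DM(f, g) = (1 / (8*pi)) * integral over [-pi, pi] of (K*(f - g))^2.

    The integral is approximated with the midpoint rule; `kernel` plays the
    role of K and defaults to K = 1 (an assumption of this sketch).
    """
    h = 2 * math.pi / grid_size
    total = 0.0
    for k in range(grid_size):
        lam = -math.pi + (k + 0.5) * h
        total += (kernel(lam) * (f(lam) - g(lam))) ** 2
    return total * h / (8 * math.pi)

def classify(f_hat, f1, f2, **kwargs):
    """Return (category, D_n): category 1 if D_n > 0, otherwise category 2."""
    d_n = dm(f_hat, f2, **kwargs) - dm(f_hat, f1, **kwargs)
    return (1 if d_n > 0 else 2), d_n

def ar1_spectral_density(phi):
    """Spectral density of an AR(1) process with unit innovation variance
    (a hypothetical example family, not taken from the paper)."""
    return lambda lam: 1.0 / (2 * math.pi * (1 - 2 * phi * math.cos(lam) + phi ** 2))

f1 = ar1_spectral_density(0.6)     # hypothetical category Pi_1
f2 = ar1_spectral_density(-0.6)    # hypothetical category Pi_2
f_hat = ar1_spectral_density(0.5)  # stand-in for an estimate f_{Z,n}

print(clip([0.3, -1.2, 0.0, 2.5, -0.1]))  # [1, 0, 1, 1, 0]
label, d_n = classify(f_hat, f1, f2)
print(label, d_n)  # label is 1: f_hat is closer to f1 than to f2
```

Because \(D_{n}\) is the difference of two disparities, a positive value means the estimate is closer to \(f_1\) than to \(f_2\) in the weighted \(L^2\) sense, which is why the rule assigns \(Z\) to \(\Pi_1\) in that case.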
Finally, the authors test the new method on real ECG data falling into two categories: recordings from a normal patient and recordings from a patient with heart problems. In both the simulation study and the real-data application, the new method is compared with a classical one. The conclusion is that, in the presence of outliers, the classical method is unreliable whereas the proposed method remains reliable. The paper is very well written and notable for its didactic quality. The proofs of the main results are rigorous and detailed. The list of references is extensive and up to date.
stationary process
spectral density
binary time series
robustness
discriminant analysis
misclassification probability