Abstract
We present a method of data reduction using a wavelet transform in discriminant analysis when the number of variables is much greater than the number of observations. The method is illustrated with a prostate cancer study, where the sample size is 248, and the number of variables is 48,538 (generated using the ProteinChip technology). Using a discrete wavelet transform, the 48,538 data points are represented by 1,271 wavelet coefficients. Information criteria identified 11 of the 1,271 wavelet coefficients with the highest discriminatory power. The linear classifier with the 11 wavelet coefficients detected prostate cancer in a separate test set with a sensitivity of 97% and specificity of 100%.
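The two reduction steps summarized above (a discrete wavelet transform, then keeping the few coefficients that best separate the groups) can be sketched in Python. This is a minimal illustration, not the authors' implementation. The Haar wavelet, the univariate squared standardized mean difference used as the ranking score, the synthetic 64-point spectra, and all function names are assumptions introduced here. The abstract does not state which wavelet family or which information criteria were used.

```python
import random
from statistics import mean, variance

SQRT2 = 2 ** 0.5


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    pairs = range(0, len(signal) - 1, 2)  # a trailing odd sample is dropped
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]
    return approx, detail


def haar_approx(signal, levels):
    """Reduce a signal to its approximation coefficients after `levels` steps."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_step(approx)
    return approx


def separation(a, b):
    """Squared mean difference divided by the pooled variance.

    Assumed stand-in for the paper's information criteria; it requires
    a nonzero pooled variance.
    """
    pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
        len(a) + len(b) - 2
    )
    return (mean(a) - mean(b)) ** 2 / pooled


def rank_coefficients(cases, controls):
    """Return coefficient indices sorted from most to least discriminating."""
    n_coef = len(cases[0])
    scores = [
        separation([row[j] for row in cases], [row[j] for row in controls])
        for j in range(n_coef)
    ]
    return sorted(range(n_coef), key=lambda j: scores[j], reverse=True)


def synthetic_spectrum(rng, shifted):
    """Hypothetical 64-point spectrum: unit Gaussian noise, plus a raised
    region at positions 16..23 when `shifted` is True."""
    return [
        rng.gauss(0.0, 1.0) + (2.0 if shifted and 16 <= i < 24 else 0.0)
        for i in range(64)
    ]


# Synthetic data only: 20 "case" and 20 "control" spectra, each reduced
# from 64 points to 8 approximation coefficients by 3 Haar levels.
rng = random.Random(0)
cases = [haar_approx(synthetic_spectrum(rng, True), levels=3) for _ in range(20)]
controls = [haar_approx(synthetic_spectrum(rng, False), levels=3) for _ in range(20)]

ranking = rank_coefficients(cases, controls)
print("coefficients kept per spectrum:", len(cases[0]))
print("highest-ranked coefficient index:", ranking[0])
```

In a fuller analysis, the top-ranked coefficients would feed a linear classifier whose sensitivity and specificity are measured on a separate test set, as the abstract describes.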
Original language | English (US) |
---|---|
Pages (from-to) | 143-151 |
Number of pages | 9 |
Journal | Biometrics |
Volume | 59 |
Issue number | 1 |
State | Published - Mar 2003 |
Externally published | Yes |
Keywords
- Area under the ROC curve
- Divergence
- Fisher discriminant analysis
- Kullback-Leibler information
- Mahalanobis distance
- Principal components analysis
ASJC Scopus subject areas
- Statistics and Probability
- Biochemistry, Genetics and Molecular Biology (all)
- Immunology and Microbiology (all)
- Agricultural and Biological Sciences (all)
- Applied Mathematics