Vol 48(2014) N 2 p. 287-296; DOI 10.1134/S0026893314020101
L. Hua*, P. Zhou
Combining Protein-Protein Interactions Information with Support Vector Machine to Identify Chronic Obstructive Pulmonary Disease Related Genes
Biomedical Engineering Institute of Capital Medical University, Beijing 100069, China
Received - 2013-03-19; Accepted - 2013-09-30
Chronic obstructive pulmonary disease (COPD) is a complex human disease with a high mortality rate. Although the role of cigarette smoking in the genesis of COPD is well documented, research on the disease has so far not been systematic. In recent years, microarray analyses have helped to identify some potential disease-related genes. However, many published gene signatures have been criticized for their low reproducibility. It has therefore been suggested that incorporating network or pathway information into prognostic biomarker discovery might improve prediction performance. In this study, we combined protein-protein interactions (PPI) information with the support vector machine (SVM) method to identify potential COPD-related genes that would allow severe emphysema to be distinguished accurately from non-/mildly emphysematous lung tissue. We identified 8 COPD-related feature genes. Compared with another SVM method that did not use the prior PPI information, the prediction accuracy was significantly enhanced (AUC increased from 0.513 to 0.909). These results suggest that incorporating prior network knowledge into gene selection methods significantly improves classification accuracy. Gene expression profiles from human emphysematous lung tissue may therefore provide insight into the pathogenesis of COPD, and a classification algorithm based on prior biological knowledge can further strengthen prediction performance.
COPD, microarray, protein-protein interactions, support vector machine
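
The abstract compares the two classifiers by AUC (0.513 versus 0.909). As a reminder of what that metric measures, the following is a minimal sketch of AUC computed as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `auc` and the toy scores and labels are illustrative assumptions only; they are not the authors' code or data, and the abstract does not state how its AUC values were computed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking.

    scores: classifier outputs, higher means "more likely positive".
    labels: 1 for positive samples (e.g. severe emphysema), 0 for negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Made-up toy example: three of the four positive/negative pairs are ordered correctly.
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_labels = [1, 1, 0, 0]
print(auc(toy_scores, toy_labels))
```

Under this definition, a value near 0.5 corresponds to ranking no better than chance, and a value near 1.0 means almost every positive sample is ranked above every negative one.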