Author/Authors :
Huang, Canyi School of Computer - Jiangxi University of Traditional Chinese Medicine - Nanchang, China , Li, Keding School of Humanities - Jiangxi University of Traditional Chinese Medicine - Nanchang, China , Du, Jianqiang School of Computer - Jiangxi University of Traditional Chinese Medicine - Nanchang, China , Nie, Bin School of Computer - Jiangxi University of Traditional Chinese Medicine - Nanchang, China , Xu, Guoliang Jiangxi University of Traditional Chinese Medicine - Nanchang, China , Xiong, Wangping School of Computer - Jiangxi University of Traditional Chinese Medicine - Nanchang, China , Luo, Jigen School of Computer - Jiangxi University of Traditional Chinese Medicine - Nanchang, China
Abstract :
The basic experimental data of traditional Chinese medicine are generally obtained by high-performance liquid chromatography
and mass spectrometry. Such data are often high-dimensional with few samples and contain many irrelevant and redundant
features, which makes in-depth exploration of Chinese medicine material information challenging. This paper proposes a
hybrid feature selection method based on an iterative approximate Markov blanket (CI_AMB). The method first uses the
maximal information coefficient to measure the correlation between each feature and the target variable and filters out
irrelevant features according to the evaluation criteria. An iterative approximate Markov blanket strategy then analyzes
the redundancy between features, eliminates the redundant ones, and finally selects an effective feature subset.
Comparative experiments on basic experimental data on traditional Chinese medicine materials and on multiple public UCI
datasets show that, compared with Lasso, XGBoost, and the classic approximate Markov blanket method, the new method is
better at selecting a small number of highly explanatory features.
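
The two stages described above (relevance filtering, then redundancy elimination) can be illustrated with a minimal sketch. This is not the authors' CI_AMB implementation: the abstract does not give the iterative details, so the sketch uses the classic approximate Markov blanket criterion, and it substitutes absolute Pearson correlation for the maximal information coefficient so that it runs with the standard library alone. The names `abs_pearson`, `select_features`, and `threshold` are hypothetical.

```python
from math import sqrt


def abs_pearson(x, y):
    """Absolute Pearson correlation (assumed stand-in for the MIC score)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation signal
    return abs(sxy) / sqrt(sxx * syy)


def select_features(columns, target, threshold=0.3, score=abs_pearson):
    """Two-stage filter: drop irrelevant features, then redundant ones.

    columns: dict mapping feature name -> list of values
    target:  list of target values, same length as each column
    """
    # Stage 1: keep only features whose relevance to the target
    # reaches the (assumed) threshold, ranked from most to least relevant.
    relevance = {name: score(col, target) for name, col in columns.items()}
    candidates = sorted(
        (name for name in relevance if relevance[name] >= threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )

    # Stage 2: classic approximate Markov blanket test. A kept feature
    # (already at least as relevant, because of the ranking) covers a
    # candidate if their mutual score is at least the candidate's relevance.
    selected = []
    for name in candidates:
        redundant = any(
            score(columns[kept], columns[name]) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


if __name__ == "__main__":
    cols = {
        "a": [1, 2, 3, 4, 5, 6],
        "b": [2, 4, 6, 8, 10, 12],   # exact multiple of "a", hence redundant
        "c": [1, -1, 1, -1, 1, -1],  # weakly related to the target
    }
    y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9]
    print(select_features(cols, y, threshold=0.5))
```

Any dependence measure with the same call shape can be passed as `score`; replacing the stand-in with an actual MIC estimator would bring the sketch closer to the first stage described in the abstract.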
Keywords :
Hybrid , Approximation , Markov , CI_AMB