Date of Award
Ph.D. in Engineering Science
Computer and Information Science
Robert J. Doerksen
Much research combines data from multiple sources in an effort to understand the underlying problems. It is important to find and interpret the most important information from these sources. Thus it will be beneficial to have an effective algorithm that can simultaneously extract decision rules and select critical features for good interpretation while preserving the prediction performance. We propose an efficient approach, combining rule extraction and feature elimination, based on 1-norm regularized random forests. This approach simultaneously extracts a small number of rules generated by random forests and selects important features. To evaluate this approach, we have applied it to several drug activity prediction data sets, microarray data sets, a seacoast chemical sensors data set, a Stockori flowering time data set, and three data sets from the UCI repository. This approach performs well compared to state-of-the-art prediction algorithms like random forests in terms of predictive performance and generates only a small number of decision rules. Some of the decision rules extracted are significant in solving the problem being studied. It demonstrates high potential in terms of prediction performance and interpretation on studying real applications.
Liu, Sheng, "Random Forests Based Rule Learning And Feature Elimination" (2014). Electronic Theses and Dissertations. 453.
Emphasis: Computer Science