Date of Award
2014
Document Type
Dissertation
Degree Name
Ph.D. in Engineering Science
Department
Computer and Information Science
First Advisor
Yixin Chen
Second Advisor
Robert J. Doerksen
Third Advisor
Conrad Cunningham
Relational Format
dissertation/thesis
Abstract
Much research combines data from multiple sources to understand an underlying problem, which makes it important to find and interpret the most relevant information in those sources. An effective algorithm that simultaneously extracts decision rules and selects critical features, giving an interpretable model while preserving prediction performance, is therefore valuable. We propose an efficient approach, based on 1-norm regularized random forests, that combines rule extraction with feature elimination. The approach simultaneously extracts a small number of the rules generated by random forests and selects important features. To evaluate it, we applied it to several drug activity prediction data sets, microarray data sets, a seacoast chemical sensors data set, a Stockori flowering time data set, and three data sets from the UCI repository. Its predictive performance compares well with that of state-of-the-art prediction algorithms such as random forests, while it generates only a small number of decision rules. Some of the extracted decision rules are significant for solving the problem being studied. The approach shows high potential, in both prediction performance and interpretability, for the study of real applications.
Recommended Citation
Liu, Sheng, "Random Forests Based Rule Learning And Feature Elimination" (2014). Electronic Theses and Dissertations. 453.
https://egrove.olemiss.edu/etd/453
Concentration/Emphasis
Emphasis: Computer Science