A Study Of Data Informatics: Data Analysis And Knowledge Discovery Via A Novel Data Mining Algorithm
Date of Award
2014
Document Type
Dissertation
Degree Name
Ph.D. in Business Administration
Department
Management Information Systems
First Advisor
Sumali Conlon
Second Advisor
Tony Ammeter
Third Advisor
Milam Aiken
Relational Format
dissertation/thesis
Abstract
Frequent pattern mining (FPM) has become extremely popular among data mining researchers because it surfaces interesting and valuable patterns in large datasets. The decreasing cost of storage devices and the increasing availability of processing power make it possible for researchers to build and analyze very large datasets in various scientific and business domains. A filtering process is needed, however, to generate only the patterns that are relevant. This dissertation contributes to addressing that need. An experimental system named FPMIES (Frequent Pattern Mining Information Extraction System) was built to extract information from electronic documents automatically. Collocation analysis was used to analyze the relationships between words. Template mining was used to build the experimental system, which is the foundation of FPMIES. Given the rising need for improved environmental performance, a dataset based on the green supply chain practices of three companies was used to test FPMIES. The system was also tested by users, yielding a recall of 83.4%. The new algorithm's combination of semantic relationships with template mining significantly improves the recall of FPMIES. The study's results also show that FPMIES is much more efficient than manual information extraction. Finally, the performance of FPMIES was compared with that of the most popular FPM algorithm, Apriori: FPMIES achieved significantly higher recall and precision (76.7% and 74.6%, respectively) than Apriori (30% recall and 24.6% precision).
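The recall and precision figures reported above follow the standard information extraction definitions: precision is the share of extracted items that are relevant, and recall is the share of relevant items that were extracted. The sketch below is a minimal illustration of those definitions only. The function name `precision_recall` and the sample items are hypothetical and are not taken from the dissertation, whose actual evaluation procedure is not described in this record.

```python
def precision_recall(extracted, relevant):
    """Return (precision, recall) for a set of extracted items.

    extracted: items the system returned (hypothetical input)
    relevant:  items a human judged to be correct (hypothetical input)
    """
    extracted = set(extracted)
    relevant = set(relevant)
    # Items that were both extracted and judged relevant
    true_positives = len(extracted & relevant)
    # Guard against division by zero when either set is empty
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative items only; these are not data from the study.
extracted = {"recycling", "emissions", "packaging", "logistics"}
relevant = {"recycling", "emissions", "packaging", "energy", "waste"}

precision, recall = precision_recall(extracted, relevant)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this made-up example three of the four extracted items are relevant, and three of the five relevant items were found, so a system can score well on one measure while missing on the other. That is why the abstract reports both figures for each system.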
Recommended Citation
Balan, Shilpa, "A Study Of Data Informatics: Data Analysis And Knowledge Discovery Via A Novel Data Mining Algorithm" (2014). Electronic Theses and Dissertations. 914.
https://egrove.olemiss.edu/etd/914
Concentration/Emphasis
Emphasis: MIS