Honors Theses
Date of Award
2016
Document Type
Undergraduate Thesis
Department
Modern Languages
First Advisor
Allison Burkette
Relational Format
Dissertation/Thesis
Abstract
This paper takes a deep dive into one area of the interdisciplinary field of Computational Linguistics: Part-of-Speech Tagging algorithms. The author relies primarily on scholarly Computer Science and Linguistics papers to describe previous approaches to this task and the often-hypothesized asymptotic accuracy rate of around 98%, by which the task is allegedly bound. However, after further research into why the accuracy of previous algorithms has behaved in this asymptotic manner, the author identifies valid and empirically backed reasons why the accuracy of previous approaches does not necessarily reflect any general asymptotic bound on the task of automated Part-of-Speech Tagging. In response, a theoretical argument is proposed to circumvent the shortcomings of previous approaches. The argument involves abandoning the flawed status quo of training machine learning algorithms and predictive models on outdated corpora; in its place, the paper walks the reader from conception through implementation of a rule-based algorithm with roots in both practical and theoretical Linguistics. While the resulting algorithm is only a prototype that cannot currently be verified as achieving a tagging accuracy rate of over 98%, its multi-tiered methodology, meant to mirror aspects of human cognition in Natural Language Understanding, is intended to serve as a theoretical blueprint for a new and inevitably more reliable way to deal with the challenges of Part-of-Speech Tagging, and to provide much-needed advances in the popular area of Natural Language Processing.
Recommended Citation
Foley, William, "The Theoretical Argument for Disproving Asymptotic Upper-Bounds on the Accuracy of Part-of-Speech Tagging Algorithms: Adopting a Linguistics, Rule-Based Approach" (2016). Honors Theses. 948.
https://egrove.olemiss.edu/hon_thesis/948
Accessibility Status
Searchable text