Honors Theses
Date of Award
2015
Document Type
Undergraduate Thesis
Department
Computer and Information Science
First Advisor
Yixin Chen
Relational Format
Dissertation/Thesis
Abstract
The Expectation-Maximization (EM) algorithm is used to solve the maximum likelihood parameter estimation problem when some of the data are missing or incomplete, which makes the parameters of the underlying distribution difficult to estimate directly. The EM algorithm comprises two steps: the E-step and the M-step. In the E-step, the current parameter estimates are treated as the true values to compute the expected log-likelihood, and in the M-step, that expected log-likelihood is maximized to update the parameter estimates. The two steps alternate until a specified convergence criterion is met. Applications of the EM algorithm include density estimation in unsupervised clustering, estimation of class-conditional densities in supervised learning, and outlier detection. The Spatial-EM algorithm is a novel approach that replaces the sample mean and sample covariance matrix in the M-step of an EM algorithm with median-based location and rank-based scatter estimators. This enhances the stability and robustness of the algorithm for finite mixture models, and makes it especially robust to outliers. In this research, we use the trimmed Bayesian Information Criterion (BIC) to determine the optimal number of components in the mixture. The algorithm is implemented as an R package and tested on different datasets.
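As a rough illustration of the E-step, M-step, and BIC-based choice of the number of components described in the abstract, the following is a minimal sketch in Python, not the thesis's R package. It makes several assumptions that are not from the source: the data are one-dimensional, the mixture is Gaussian, the M-step uses the ordinary sample mean and variance (the estimators that Spatial-EM replaces with median-based and rank-based ones), and model selection uses the ordinary BIC rather than the trimmed BIC. The names `em_gmm_1d`, `bic`, and `select_k` are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k, iters=200, tol=1e-8):
    """Fit a k-component 1-D Gaussian mixture with plain (non-robust) EM.

    Returns (weights, means, variances, loglik), where loglik is the
    log-likelihood evaluated in the last E-step.
    """
    n = len(data)
    ordered = sorted(data)
    # Assumption: deterministic start, with initial means at evenly spaced quantiles.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    loglik, prev = -math.inf, -math.inf
    for _ in range(iters):
        # E-step: posterior membership probabilities under the current parameters.
        resp, loglik = [], 0.0
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)  # guard against log(0) on underflow
            loglik += math.log(total)
            resp.append([d / total for d in dens])
        # M-step: weighted sample mean and variance for each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component
        if abs(loglik - prev) < tol:
            break
        prev = loglik
    return weights, means, variances, loglik


def bic(loglik, k, n):
    """Ordinary BIC (lower is better); a 1-D k-component mixture has 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


def select_k(data, max_k):
    """Return the number of components in 1..max_k with the lowest BIC."""
    scores = {k: bic(em_gmm_1d(data, k)[3], k, len(data)) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get)
```

Calling `select_k(data, max_k)` on a list of floats returns the candidate number of components with the lowest BIC. Because the sample mean and variance are sensitive to extreme values, a sketch like this is not robust to outliers, which is the weakness that the robust estimators and trimmed BIC in the thesis are meant to address.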
Recommended Citation
Aloba, Aishat O., "Estimating the Number of Components of a Spatial-Em Algorithm: an R Package" (2015). Honors Theses. 13.
https://egrove.olemiss.edu/hon_thesis/13
Accessibility Status
Searchable text