Honors Theses

Date of Award

5-6-2019

Document Type

Undergraduate Thesis

Department

Computer and Information Science

First Advisor

Naeemul Hassan

Relational Format

Dissertation/Thesis

Abstract

Sexual assault remains largely under-reported, and social media movements such as #WhyIDidntReport have brought widespread awareness to the issue. To take advantage of the large volume of data the #WhyIDidntReport movement has generated, this study uses tweets to explore why victims do not report their assaults. The thesis draws on current research on assault to generate a list of explanations victims give for not reporting, and compares the distribution of those explanations with existing studies. We use supervised learning to automatically assign each tweet to one of eight categories. Unlike current research, which relies on surveys and interviews, this approach uses social sensing to determine why people do not report. Three machine learning algorithms, Naive Bayes, Random Forest, and Recurrent Neural Networks (RNNs), were used to classify whether or not a tweet contains a reason. Only Naive Bayes and Random Forest were used to categorize the reasons, because there was not enough data to train the large number of parameters in an RNN. Each algorithm produces relatively precise results, both for the binary classification and for categorizing whether a tweet references one of the following as the reason its author did not report: shame; denial/minimization; fear of consequences; hopelessness/helplessness; drugs, drinking, or disassociation; lack of information; protecting the assailant; or age. These algorithms and tweets can be used to label data in future studies. Combining current research, natural language processing, and machine learning, we were able to determine a list of reasons mentioned on Twitter under the #WhyIDidntReport movement. The distribution of the reasons differed from that reported in current research, most likely as a result of the difference in the form of data collection. However, the categories themselves were consistent with findings from other studies. The use of social sensing to determine reasons presents a new perspective on the topic and allows for comparison with other research.
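
The category-assignment step described in the abstract can be illustrated with a minimal multinomial Naive Bayes text classifier. This is only a sketch and not the thesis's implementation: the whitespace tokenizer, the Laplace smoothing value, the `NaiveBayes` class, and the toy training strings are all assumptions made for illustration. The strings are invented and are not drawn from the thesis dataset, and only three of the eight category names are used. The binary reason/no-reason stage, Random Forest, and the RNN are not shown.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy examples, invented for illustration only.
# They are NOT taken from the thesis dataset.
TRAIN = [
    ("i felt ashamed and embarrassed", "shame"),
    ("too ashamed to tell anyone", "shame"),
    ("i was afraid of losing my job", "fear of consequences"),
    ("afraid nobody would believe me", "fear of consequences"),
    ("i was only a child", "age"),
    ("i was a child and too young to understand", "age"),
]


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the thesis's
    # actual preprocessing is not described in the abstract.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, examples):
        self.class_counts = Counter()            # documents per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()
        for text, label in examples:
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        self.total_docs = sum(self.class_counts.values())
        return self

    def predict(self, text):
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            # Log prior for the label.
            score = math.log(n_docs / self.total_docs)
            denom = sum(self.word_counts[label].values()) + self.alpha * vocab_size
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    count = self.word_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)
            scores[label] = score
        # Return the label with the highest log posterior score.
        return max(scores, key=scores.get)


model = NaiveBayes().fit(TRAIN)
prediction = model.predict("ashamed and embarrassed")
```

Calling `model.predict` on a new string returns one of the labels seen during training. With so few toy examples the result is only a demonstration of the mechanics and says nothing about accuracy on real tweets.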
