Last modified: 2019-09-23
Abstract
Predictive analytics on Twitter feeds is becoming a popular field of research. A tweet holds a wealth of information on how individuals express and communicate their feelings and emotions within their social network. Large-scale collection, cleaning, and mining of tweets can help capture not only an individual’s emotions but also those of a larger group. However, capturing a large volume of tweets and identifying the emotions expressed in them is a very challenging task. Classification algorithms employed in the past for classifying emotions have yielded low-to-moderate accuracies, making it difficult to precisely predict the outcomes of an event. In this study, an emotion-based classification scheme is proposed. First, a synthetic dataset is built by randomly picking instances from different training datasets. The classifiers are then trained on this newly constructed dataset (model building). Finally, emotions are predicted on the test datasets using the generated models. Naïve Bayes Multinomial and Random Forest classifiers were trained on a synthetic dataset constructed from two well-known emotion-classified training datasets and then used to classify a test dataset containing tweets about the 2016 US presidential election. By classifying each tweet in the test dataset into one of four basic emotion types (Anger, Happy, Sadness, and Surprise) and determining the sentiments of the people, we have tried to paint the emotional swings across different camps over the 6 weeks before the election.
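The three steps described in the abstract (random sampling into a synthetic dataset, model building, and prediction) can be illustrated with a minimal sketch. It is not the implementation used in the study: the toy tweets, the function names, and the whitespace tokenizer are all assumptions made for illustration, and only a small hand-written multinomial Naïve Bayes is shown (the Random Forest classifier is omitted).

```python
import math
import random
from collections import Counter, defaultdict

def build_synthetic_dataset(sources, n_per_source, seed=0):
    """Randomly pick up to n_per_source labeled instances from each source."""
    rng = random.Random(seed)
    sample = []
    for data in sources:
        sample.extend(rng.sample(data, min(n_per_source, len(data))))
    rng.shuffle(sample)
    return sample

def train_multinomial_nb(dataset, alpha=1.0):
    """Count documents and words per class; alpha is the Laplace smoothing term."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in dataset:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}

def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in text.lower().split():
            if token in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[token] + alpha)
                                  / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy stand-ins for the two emotion-labeled training datasets.
dataset_a = [
    ("i am so angry and furious", "Anger"),
    ("what a happy joyful day", "Happy"),
    ("i feel sad and lonely", "Sadness"),
    ("wow what a shocking surprise", "Surprise"),
]
dataset_b = [
    ("this makes me furious", "Anger"),
    ("so happy and glad today", "Happy"),
    ("sad news tonight", "Sadness"),
    ("totally unexpected surprise", "Surprise"),
]

synthetic = build_synthetic_dataset([dataset_a, dataset_b], n_per_source=4)
model = train_multinomial_nb(synthetic)
print(predict(model, "furious and angry"))
print(predict(model, "sad and lonely tonight"))
```

In practice, a library implementation of both classifiers and a tweet-aware tokenizer would likely replace the hand-written parts; the sketch only shows how the sampled instances from several sources feed a single trained model.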