Economically-efficient Sentiment Stream Analysis

Title: Economically-efficient Sentiment Stream Analysis
Publication Type: Conference Proceedings
Year of Publication: 2014
Authors: Roberto Lourenco, Adriano Veloso, Adriano Pereira, Wagner Meira, Renato Ferreira, Srinivasan Parthasarathy
Conference Name: 37th International ACM SIGIR Conference on Research & Development in Information Retrieval
Pagination: 637-646
Date Published: 07/2014
Publisher: ACM
Conference Location: Gold Coast, Australia
ISBN: 978-1-4503-2257-7
Keywords: Economic Efficiency, Sentiment Analysis, Streams and Drifts
Abstract

Text-based social media channels, such as Twitter, produce torrents of opinionated data about the most diverse topics and entities. The analysis of such data (a.k.a. sentiment analysis) is quickly becoming a key feature in recommender systems and search engines. A prominent approach to sentiment analysis is based on classification techniques: content is classified according to the attitude of the writer. A major challenge, however, is that Twitter follows the data stream model, and thus classifiers must operate with limited resources, including labeled data and time for building classification models. Also challenging is the fact that the sentiment distribution may change as the stream evolves. In this paper we address these challenges by proposing algorithms that select relevant training instances at each time step, so that training sets are kept small while giving the classifier the ability to adapt to, and to recover from, different types of sentiment drift. Providing both capabilities simultaneously, however, is a conflicting-objective problem, and our proposed algorithms employ basic notions of Economics to balance them. We analyzed events that reverberated on Twitter, and a comparison against the state of the art reveals improvements both in error reduction (up to 14%) and in training resources (reduced by orders of magnitude).

DOI: 10.1145/2600428.2609612