Semester

Fall

Date of Graduation

2010

Document Type

Thesis

Degree Type

MS

College

Statler College of Engineering and Mineral Resources

Department

Lane Department of Computer Science and Electrical Engineering

Committee Chair

Donald A Adjeroh

Abstract

Sentiment analysis is an emerging field concerned with the detection of human emotions from textual data. It seeks to characterize the opinionated or evaluative aspects of natural language text, thus helping people discover valuable information in large amounts of unstructured data. Sentiment analysis can be used for grouping search engine results and for analyzing news content, blogs, web forums, and reviews of books, movies, and sports, among other sources. Sentiment (i.e., a good or bad opinion) expressed in text has been studied widely at three levels: word, sentence, and document. Several methods have been proposed for sentiment analysis, mostly based on common machine learning techniques such as Support Vector Machine (SVM), Naive Bayes (NB), and Maximum Entropy (ME).

In this thesis we explore a new methodology for sentiment analysis called proximity-based sentiment analysis. We take a different approach by considering a new set of features based on word proximities in a written text. We focus on three word-proximity-based features: proximity distribution, mutual information between proximity types, and proximity patterns. We apply this approach to the movie review domain and evaluate its performance empirically. The experimental results show that proximity-based sentiment analysis is able to extract sentiments from a specific domain, with performance comparable to the state of the art. To the best of our knowledge, this is the first attempt to use proximity-based features as the primary features in sentiment analysis.
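
As a rough illustration of what a word-proximity feature could look like, the sketch below counts the token distances between consecutive polar words in a review, grouped by the polarity of each pair. The abstract does not define the three features, so everything here is an assumption made for illustration: the toy lexicon, the pairing of consecutive polar words, and the names `polarity` and `proximity_histogram` are hypothetical and are not the thesis's method.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration only; not taken from the thesis.
POSITIVE = {"good", "great", "enjoyable"}
NEGATIVE = {"bad", "boring", "weak"}


def polarity(token):
    """Return 'pos', 'neg', or None for a single lowercase token."""
    if token in POSITIVE:
        return "pos"
    if token in NEGATIVE:
        return "neg"
    return None


def proximity_histogram(text):
    """Count (first polarity, second polarity, distance) for consecutive polar words.

    Distance is the difference in token positions after whitespace splitting.
    This is an assumed, simplified stand-in for a proximity distribution.
    """
    tokens = text.lower().split()
    polar = [(i, polarity(t)) for i, t in enumerate(tokens) if polarity(t)]
    hist = Counter()
    for (i, first), (j, second) in zip(polar, polar[1:]):
        hist[(first, second, j - i)] += 1
    return hist


review = "a good story with great acting but a boring and weak ending"
hist = proximity_histogram(review)
for key, count in sorted(hist.items()):
    print(key, count)
```

A histogram like this could be normalized and passed as a feature vector to a standard classifier such as an SVM; whether the thesis does so in this form is not stated in the abstract.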