dc.description.abstract |
The explosion of Web-based, user-generated reviews has led to the emergence of Opinion
Mining (OM) applications for discovering and analyzing users' opinions about products,
services, and policies. Opinion mining is gaining popularity due to the rapid growth in the
number of web users, online discussion forums, and other social media sites. Opinion mining
is the process of determining the feelings or opinions of other people about services, politics,
products, and policies. Because of the economic importance of these opinions, there is a
growing trend toward developing efficient and effective opinion mining systems.
The main motivation of this thesis is to extract opinions from online blogs and user reviews
using a lexicon-based approach. This work focuses on the development of an improved
lexicon-based term weighting method for polarity classification at the sentence level. Polarity
lexicons often play a pivotal role in polarity classification in OM, indicating the positivity
or negativity of a term along with a numeric score. However, the commonly available
domain-independent lexicons are not an optimal choice for all domains in OM applications,
as the polarity of a term changes from one domain to another, and such lexicons do not contain
the correct polarity of a term for every domain. This work addresses lexicon-based polarity
classification by adapting a domain-dependent polarity lexicon from a set of labeled user
reviews and a domain-independent lexicon, and proposes a unified learning framework based on
information-theoretic concepts that assigns terms correct polarity (positive or negative) scores.
The comparative experimental results show that the proposed method outperforms
the baseline methods (e.g., machine learning methods) and achieves an average accuracy
of 79% at the word level and 81% at the sentence level. The quantitative evaluation of the proposed
method against baseline methods shows that (i) for a specific domain, the proposed method
provides sufficient coverage of the required opinionated text; (ii) the adapted domain-specific
lexicons achieve improved performance on both real-world and manually built datasets;
(iii) polarity classification performance improves significantly with the resulting adapted
lexicon; and (iv) threshold adjustment increases the accuracy of polarity classification.
The proposed framework is general and capable of classifying opinionated text
in any domain. |
en_US |