PASTIC Dspace Repository

A Framework to Improve Classification of Positive and Negative Opinions in Roman Urdu-English Code Switching Environment

dc.contributor.author Hassan, Muhammad Awais
dc.date.accessioned 2019-11-13T06:54:10Z
dc.date.accessioned 2020-04-11T15:40:51Z
dc.date.available 2020-04-11T15:40:51Z
dc.date.issued 2016
dc.identifier.govdoc 18589
dc.identifier.uri http://142.54.178.187:9060/xmlui/handle/123456789/5289
dc.description.abstract In computational linguistics, sentiment analysis classifies an opinion as positive or negative. Over the last decade, sentiment analysis of the English language has been explored extensively with a variety of techniques that have improved overall performance. Urdu is the language of sixty-six million people and is widely spoken in the South Asian subcontinent. It is also the national language of Pakistan, the world's sixth most populous country according to the United Nations Population Division. Sentiment analysis of Urdu is an important tool for understanding the behavioural aspects, cultural values and social habits of the people living in this part of the world. Opinion mining is also crucial for governments, policy makers, business owners and brand ambassadors, who need to make decisions in accordance with public sentiment. However, sentiment analysis of Urdu is not as well explored as that of English. Urdu sentiment analysis has been performed with the simple Bag-of-Words (BoW) method and machine learning (ML) techniques with a limited set of features. The BoW method is not sufficient to handle complex opinions. Also, the accuracy of ML techniques with legacy features is not comparable to that achieved in sentiment classification for other languages. For English, discourse information (sub-sentence level information) has boosted the performance of both the BoW method and ML techniques. This thesis proposes a theory for Urdu sentiment analysis that extracts and uses discourse information at the sub-sentence level, and suggests a computational model that achieves more accurate results than the simple bag-of-words approach. The proposed solution segments the sentiment into two sub-opinions, extracts discourse information (discourse relation and polarity relation), proposes an extended BoW method (a rule-based method) and suggests a new small subset of features for ML techniques.
The proposed approach significantly improves (p < 0.001) recall, precision and accuracy by 37.25%, 8.46%, and 24.75%, respectively. The current research targets sentiments with two sub-opinions, which works well as long as the opinions are short messages such as those on Twitter, in forum comments or in Facebook status posts. The proposed technique can be extended to sentiments with more than two sub-opinions, such as those found in blogs, reviews, and TV talk shows. en_US
dc.description.sponsorship Higher Education Commission Pakistan en_US
dc.language.iso en_US en_US
dc.publisher University of Engineering & Technology, Lahore. en_US
dc.subject Computer Science en_US
dc.title A Framework to Improve Classification of Positive and Negative Opinions in Roman Urdu-English Code Switching Environment en_US
dc.type Thesis en_US