Abstract:
Owing to the rapid growth of multimedia content, users now demand video
summaries that represent video content precisely and compactly according to
their needs. Conventionally, video summaries have been produced using
low-level image, audio, and textual features, which are unaware of the
viewer's requirements and therefore suffer from a semantic gap. Video content
evokes certain emotions in a viewer, which can be measured and serve as a
strong source of information for generating summaries that meet the viewer's
expectations. In this research, a personalized video summarization framework
is presented that classifies the viewer's emotions, based on his/her facial
expressions and electroencephalography (EEG) signals recorded while watching
a video, in order to extract keyframes.
The first contribution of this thesis is a new strategy for recognizing facial
expressions. For this purpose, the stationary wavelet transform is used to extract
features for facial expression recognition owing to its good localization
characteristics in both the spectral and spatial domains. More specifically, a
combination of the horizontal and vertical sub-bands of the stationary wavelet
transform is used, as these sub-bands contain muscle-movement information for the
majority of facial expressions. Feature dimensionality is reduced by applying the
discrete cosine transform to these sub-bands. The resulting features are then fed to
a feed-forward neural network, trained with the back-propagation algorithm, to
recognize facial expressions.
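The following is a minimal sketch of the sub-band and DCT feature pipeline described above, assuming a cropped grayscale face image and illustrative choices for the wavelet, the number of retained coefficients, and the network size; it is not the thesis implementation.

```python
# Sketch: SWT horizontal/vertical sub-bands + DCT features for expression recognition.
import numpy as np
import pywt
from scipy.fft import dctn
from sklearn.neural_network import MLPClassifier

def swt_dct_features(face_img, wavelet="haar", keep=16):
    """face_img: grayscale face crop with sides divisible by 2 (e.g. 96x96, assumed)."""
    cA, (cH, cV, cD) = pywt.swt2(face_img, wavelet, level=1)[0]
    feats = []
    for band in (cH, cV):                           # horizontal and vertical details
        coeffs = dctn(band, norm="ortho")           # 2-D DCT for dimensionality reduction
        feats.append(coeffs[:keep, :keep].ravel())  # keep the low-frequency block
    return np.concatenate(feats)

# X: list of face images, y: expression labels (e.g. from JAFFE / CK+); names are illustrative.
# clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)  # backprop-trained feed-forward net
# clf.fit(np.vstack([swt_dct_features(img) for img in X]), y)
```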
The second contribution of this thesis is the generation of personal video summaries
using the proposed facial expression recognition scheme. A video is shown to the
viewer while his/her facial expressions are recorded simultaneously with a Microsoft
Kinect device. The frames at which different facial expressions of the viewer are
recognized are selected as keyframes of the video. The third and final contribution
of this research is a new personalized video summarization technique based on human
emotion classification using EEG signals. A video is shown to the viewer while
electrical brain activity is recorded simultaneously with EEG electrodes. Features
are extracted in the time, frequency, and wavelet domains to classify the viewer's
emotion as happy, love, sad, anger, surprise, or neutral. The frames at which these
emotions of the viewer are evoked are selected as keyframes of the video.
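The sketch below illustrates the hybrid time-, frequency-, and wavelet-domain EEG feature extraction and SVM-based emotion classification outlined above; the sampling rate, frequency bands, wavelet, and statistical descriptors are assumptions for illustration, not the thesis settings.

```python
# Sketch: hybrid time / frequency / wavelet-domain EEG features fed to an SVM.
import numpy as np
import pywt
from scipy.signal import welch
from sklearn.svm import SVC

def eeg_features(epoch, fs=128):
    """epoch: one channel of an EEG segment sampled at fs Hz (assumed)."""
    # Time domain: simple statistical descriptors
    time_feats = [epoch.mean(), epoch.std(), np.ptp(epoch)]
    # Frequency domain: band power from the Welch power spectral density
    f, psd = welch(epoch, fs=fs, nperseg=fs)
    bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
    freq_feats = [psd[(f >= lo) & (f < hi)].sum() for lo, hi in bands.values()]
    # Wavelet domain: energy of each discrete wavelet decomposition level
    wav_feats = [np.sum(c ** 2) for c in pywt.wavedec(epoch, "db4", level=4)]
    return np.array(time_feats + freq_feats + wav_feats)

# X: EEG epochs aligned with video segments, y: emotion labels; names are illustrative.
# clf = SVC(kernel="rbf").fit(np.vstack([eeg_features(e) for e in X]), y)
# Frames whose epochs are classified as non-neutral emotions could then be kept as keyframes.
```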
According to the experimental results, the proposed facial expression recognition
scheme using the stationary wavelet transform achieves accuracies of 98.8%, 96.61%,
and 94.28% on the Japanese Female Facial Expressions (JAFFE), Extended Cohn-Kanade
(CK+), and Microsoft Kinect (MS-Kinect) datasets, respectively. Furthermore, the
results show that personalized video summarization using the proposed facial
expression recognition scheme generates personal video summaries with high precision,
recall, F-measure, and accuracy, and a low error rate, hence reducing the semantic gap.
For emotion recognition using EEG signals, a classification accuracy of up to 92.83%
is achieved with a support vector machine classifier when time-, frequency-, and
wavelet-domain features are combined. Experimental results also demonstrate that the
proposed EEG-based personal video summarization framework outperforms state-of-the-art
video summarization methods.
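As a small illustration of the keyframe-level metrics named above, the following sketch computes precision, recall, and F-measure for a generated summary against a reference summary, both assumed to be sets of frame indices; the error-rate definition shown is one plausible choice, not necessarily the one used in the thesis.

```python
# Sketch: precision / recall / F-measure for keyframe-based summaries (illustrative).
def summary_metrics(generated, reference):
    generated, reference = set(generated), set(reference)
    tp = len(generated & reference)                      # correctly selected keyframes
    precision = tp / len(generated) if generated else 0.0
    recall = tp / len(reference) if reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # Assumed error rate: fraction of frames on which the two summaries disagree
    error_rate = len(generated ^ reference) / len(generated | reference)
    return precision, recall, f_measure, error_rate
```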