EVALUATION OF HIDDEN MARKOV MODEL FOR MALWARE BEHAVIORAL CLASSIFICATION

Imran, Mohammad

dc.contributor.author	Imran, Mohammad
dc.date.accessioned	2018-02-22T05:47:16Z
dc.date.accessioned	2020-04-11T15:33:39Z
dc.date.available	2020-04-11T15:33:39Z
dc.date.issued	2016
dc.identifier.uri	http://142.54.178.187:9060/xmlui/handle/123456789/4896
dc.description.abstract	Malware is a growing threat to computer systems and networks around the world. Ever since malware construction kits and metamorphic virus generators became easily available, creating and spreading obfuscated malware has become a simple matter. Cyber-security vendors receive thousands of new malware samples every day for analysis. It has become a challenging task for malware analysts to identify whether a given sample is a variant of known malware or belongs to a new breed altogether. Since an accurate decision about the nature of an unknown malware sample is crucial for updating signature databases and propagating the update to customers, vendors of cyber-security products need accurate malware classification techniques. The research community has been active in addressing this problem, and a number of diverse avenues have been explored, such as machine learning, graph theory, and finite state machines. Furthermore, many syntactic and semantic aspects of computer programs have been tried in search of the aspect that best distinguishes harmful from harmless programs and differentiates malware belonging to different families. All the proposed approaches have merits and demerits of their own, and the search continues for a solution that maximizes classification accuracy at minimal computational cost. This dissertation formulates malware classification as a sequence classification problem and evaluates a widely used sequence classification tool, the Hidden Markov Model (HMM), for the task. HMM has been a method of choice for a broad range of sequential pattern matching applications, such as speech analysis, behavior modeling, and handwriting recognition.
The dissertation first proposes and evaluates novel methods of malware classification that combine HMM with malware behavioral features, which are attributes frequently used to distinguish between normal and malicious programs and to differentiate among malware families. As another major contribution, the dissertation fills a significant research gap by studying the role of an important HMM parameter, the number of hidden states, in malware classification applications. Based on observations from comprehensive experiments conducted on a large and diverse dataset of malware behavioral reports, the dissertation concludes that although HMM shows encouraging results when used for malware classification tasks, its potential from a practical standpoint is fairly limited. As its third contribution, the dissertation proposes replacing the HMM component of the malware classification method with a Markov Chain Model (MCM) and performs a comparative evaluation of the two models. The results of the comparison show that the classification performance achieved by HMM can be attained much more efficiently by MCM, and therefore MCM should be preferred over HMM for malware classification applications.	en_US
dc.description.sponsorship	Higher Education Commission, Pakistan	en_US
dc.language.iso	en	en_US
dc.publisher	CAPITAL UNIVERSITY OF SCIENCE & TECHNOLOGY ISLAMABAD	en_US
dc.subject	Computer science, information & general works	en_US
dc.title	EVALUATION OF HIDDEN MARKOV MODEL FOR MALWARE BEHAVIORAL CLASSIFICATION	en_US
dc.type	Thesis	en_US