Improving Software Quality Prediction Using Intelligent Computing Techniques

Rana, Zeeshan Ali

dc.contributor.author	Rana, Zeeshan Ali
dc.date.accessioned	2018-08-08T09:48:36Z
dc.date.accessioned	2020-04-11T15:34:09Z
dc.date.available	2020-04-11T15:34:09Z
dc.date.issued	42510
dc.identifier.uri	http://142.54.178.187:9060/xmlui/handle/123456789/4949
dc.description.abstract	Software Quality Prediction (SQP) has been an area of interest for the last four decades. Its aim is to identify the defect-prone modules in software so that they can be improved at early stages of software development. SQP is done using models that predict the defect-prone modules. These predictions are based on software metrics. Software metrics and defect-related information are recorded in the form of datasets, which contain instances of defect-prone and not-defect-prone modules. The major motive behind quality prediction is to identify defect-prone modules correctly in the early phases of development. Imbalanced datasets and late predictions are problems that affect this motive. In most of the datasets, instances of not-defect-prone modules outnumber instances of defect-prone modules. Because of this imbalance, the defect-prone modules are not identified effectively, and achieving high Recall on the public datasets becomes a challenging task. Predictions based on code metrics are considered late. The majority of the metrics in the datasets are code metrics, which means that accurate predictions can be made only once code metrics become available. Another issue in the domain of software quality and metrics is that the software metrics used so far have inconsistent nomenclature, which makes it difficult to study certain software metrics. In this thesis, an association mining (AM) based approach is proposed that improves prediction of defect-prone modules. The proposed approach modifies the data so that a prediction model learns defect-prone modules better even when there are few instances of them. We use Recall to measure the performance of the model developed after the proposed preprocessing. 
The issue of late predictions has been handled by using a model that can work with imprecise values of software metrics. This thesis proposes a Fuzzy Inference System (FIS) based model that helps predict defect-prone modules when exact values of code metrics are not available. To handle the issue of inconsistent nomenclature, this thesis provides a unification and categorization framework that works on the principle of chronological use of metric names. The framework has been used to identify the same metrics with different names as well as different metrics with the same name. The association mining based approach has been tested using public datasets and the Naive Bayes classifier. The Naive Bayes classifier is the simplest and is considered one of the best performers. The proposed approach has increased Recall of the Naive Bayes classifier up to 40%. The performance of the proposed FIS model, used to handle the issue of late predictions, has been compared with models such as neural networks, classification trees, and linear regression based classifiers. The FIS model has performed as well as the other models, and up to 10% improvement in Recall has been observed for it. The nomenclature of approximately 140 metrics has been unified using the proposed unification framework. Of these 140 metrics, approximately 6% are different metrics that have been used with the same name in the literature. Their naming issues have been resolved based on the chronological use of the names. Achieving better Recall using the proposed approach can help avoid the costs incurred when a defect-prone module is identified late in the software lifecycle, when the cost of fixing defects is higher. The proposed FIS model can be used for early rough estimates; later, better and more accurate estimates can be made when code metrics become available.	en_US
dc.description.sponsorship	Higher Education Commission, Pakistan	en_US
dc.language.iso	en	en_US
dc.publisher	Lahore University of Management Sciences, Lahore, Pakistan	en_US
dc.subject	Technology	en_US
dc.title	Improving Software Quality Prediction Using Intelligent Computing Techniques	en_US
dc.type	Thesis	en_US