dc.contributor.author |
Rana, Zeeshan Ali |
|
dc.date.accessioned |
2018-08-08T09:48:36Z |
|
dc.date.accessioned |
2020-04-11T15:34:09Z |
|
dc.date.available |
2020-04-11T15:34:09Z |
|
dc.date.issued |
42510 |
|
dc.identifier.uri |
http://142.54.178.187:9060/xmlui/handle/123456789/4949 |
|
dc.description.abstract |
Software Quality Prediction (SQP) has been an area of interest for the last four decades. The aim
of quality prediction has been to identify the defect prone modules in software. With the help
of SQP the defect prone modules can be identified and thus improved at early stages of software
development. SQP is done using models that predict the defect prone modules. These predictions
are based on software metrics. Software metrics and defect related information are recorded in the
form of datasets. These defect datasets contain instances of defect prone and not-defect prone modules.
The major motive behind quality prediction is to identify defect prone modules correctly in early
phases of development. Imbalanced datasets and late predictions are problems that affect this
motive. In most of the datasets, the number of instances of not-defect prone modules dominates
the number of instances of defect prone modules. This creates imbalance in the datasets. The
defect prone modules are not identified effectively due to the imbalance. Effectively predicting
defect prone modules and achieving high Recall using the public datasets becomes a challenging
task. Predictions based on code metrics are considered late. The majority of the metrics in the datasets
are code metrics, which means that accurate predictions can be made only once code metrics become
available. Another issue in the domain of software quality and metrics is that the software metrics used
so far have inconsistent nomenclature, which makes it difficult to study certain software metrics.
In this thesis, an association mining (AM) based approach is proposed that improves prediction
of defect prone modules. The proposed approach modifies the data in such a manner that a prediction
model learns defect prone modules better even if there are few instances of defect prone modules.
We use Recall to measure the performance of the model developed after the proposed preprocessing. The
issue of late predictions has been handled by using a model which can work with imprecise values
of software metrics. This thesis proposes a Fuzzy Inference System (FIS) based model that helps
predict defect prone modules when exact values of code metrics are not available. To handle the
issue of inconsistent nomenclature, this thesis provides a unification and categorization framework
that works on the principle of chronological use of metric names. The framework has been used to
identify the same metrics used under different names as well as different metrics used under the same name.
The association mining based approach has been tested using public datasets and the Naive Bayes
classifier. The Naive Bayes classifier is one of the simplest classifiers and is considered one of the best performers.
The proposed approach has increased the Recall of the Naive Bayes classifier by up to 40%. The performance
of the proposed Fuzzy Inference System (FIS), used to handle the issue of late predictions, has
been compared with models like neural networks, classification trees, and linear regression based
classifiers. The FIS model has performed as well as the other models. Up to 10% improvement in
Recall has been observed in the case of the FIS model. The nomenclature unification of approximately
140 metrics has been done using the proposed unification framework. Out of these 140 metrics,
approximately 6% are different metrics that have been used with the same name in the literature. Their naming
issues have been resolved based on the chronological use of the names.
Achieving better Recall using the proposed approach can help avoid costs incurred due to identification
of a defect prone module late in the software lifecycle, when the cost of fixing defects becomes
higher. The proposed FIS model can be used for early rough estimates. Later, better and
more accurate estimates can be made when code metrics become available. |
en_US |
dc.description.sponsorship |
Higher Education Commission, Pakistan |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
Lahore University of Management Sciences, Lahore, Pakistan |
en_US |
dc.subject |
Technology |
en_US |
dc.title |
Improving Software Quality Prediction Using Intelligent Computing Techniques |
en_US |
dc.type |
Thesis |
en_US |