Boosting Based Multiclass Ensembles and their Applications in Machine Learning

Baig, Mirza Mubasher

URI: http://142.54.178.187:9060/xmlui/handle/123456789/5036

Date: 2016

Abstract:

Boosting is a generic statistical process for generating accurate classifier ensembles from only a moderately accurate learning algorithm. AdaBoost (Adaptive Boosting) is a machine learning algorithm that iteratively fits a number of classifiers to the training data and combines them linearly to form the final ensemble. This dissertation presents our three major contributions to the literature on boosting-based ensemble learning: two multiclass ensemble learning algorithms, a novel way to incorporate domain knowledge into a variety of boosting algorithms, and an application of boosting in a connectionist framework to learn a feed-forward artificial neural network. To learn a multiclass classifier, a new multiclass boosting algorithm called M-Boost is proposed that introduces novel classifier selection and classifier combination rules. M-Boost uses a simple partitioning algorithm (i.e., decision stumps) as the base classifier to handle a multiclass problem without breaking it into multiple binary problems. It uses a global optimality measure for selecting a weak classifier, whereas standard AdaBoost variants use a localized greedy approach. It also uses a confidence-based reweighting strategy for training examples instead of the standard exponential multiplicative factor. Finally, M-Boost outputs a probability distribution over classes rather than a binary classification decision. The algorithm was tested on eleven datasets from the UCI repository and performed much better on 9 of the 11 datasets in terms of classification accuracy. Another multiclass ensemble learning algorithm, CBC (Cascaded Boosted Classifiers), is also presented; it creates a multiclass ensemble by learning a cascade of boosted classifiers. It does not require an explicit encoding of the given multiclass problem; rather, it learns a multi-split decision tree and implicitly learns the encoding as well.
In our recursive approach, an optimal partition of all classes is selected from the set of all possible partitions and the training examples are relabeled. The reduced multiclass learning problem is then learned using a multiclass learner. This procedure is applied recursively to each partition in order to learn a complete cascade. For the experiments we chose M-Boost as the multiclass ensemble learning algorithm. The proposed algorithm was tested on the network intrusion detection dataset (NIDD) adopted from the KDD Cup 99 (KDD'99) dataset, prepared and managed by MIT Lincoln Labs as part of the 1998 DARPA Intrusion Detection Evaluation Program. To incorporate domain knowledge into boosting, an entirely new strategy for incorporating a prior into any boosting algorithm has also been devised. The idea behind our approach is to use the prior to modify the weight distribution over training examples during each iteration. This modification affects the selection of the base classifier included in the ensemble and hence incorporates the prior into boosting. Experimental results show that the proposed method improves the convergence rate and accuracy and compensates for a lack of training data. A novel weight adaptation method in a connectionist framework, which uses AdaBoost to minimize an exponential cost function instead of the mean squared error, is also presented in this dissertation. This change was introduced to achieve better classification accuracy, as the exponential loss function minimized by AdaBoost is more suitable for learning a classifier. Our main contribution in this regard is a new representation of decision stumps that, when used as the base learner in AdaBoost, becomes equivalent to a perceptron. This boosting-based method for learning a perceptron is called BOOSTRON. The BOOSTRON algorithm has also been extended and generalized to learn a multi-layered perceptron.
This generalization uses an iterative strategy along with the BOOSTRON algorithm to learn the weights of the hidden-layer neurons and output neurons by reducing each of these problems to that of learning a single-layer perceptron.
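
As background for the baseline that the abstract refers to, the following is a minimal sketch of standard binary AdaBoost with decision stumps, showing the exponential reweighting of training examples and the linear combination of weak classifiers. It is not M-Boost, CBC, or BOOSTRON, whose details are not given in the abstract. The function names, the exhaustive stump search, and the toy data are illustrative assumptions.

```python
import math


def fit_stump(X, y, w):
    """Exhaustively pick the stump (feature, threshold, polarity) with the
    lowest weighted error. A stump predicts `pol` when x[j] <= thr, else -pol."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] <= thr else -pol


def adaboost(X, y, rounds):
    """Standard binary AdaBoost; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # uniform initial distribution
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Exponential multiplicative update: misclassified examples gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]       # renormalize to a distribution
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted linear combination of stump votes."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Toy 1-D problem (hypothetical data): no single stump separates these labels.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
print([predict(ensemble, row) for row in X])
```

On this toy set a single stump misclassifies at least two points, while the three-round ensemble reproduces all six labels. The search is greedy: each round selects the stump with the lowest weighted error under the current distribution.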
