Abstract:
The field of computer science has greatly benefitted from the advances in combina-
torial algorithms in the last few decades. This is because the advent of high speed
computers has made it possible to generate lists of combinatorial objects in a practi-
cal amount of time. In areas such as genome science and data mining the problems
are often vaguely defined, and researchers have to look for meaningful information in
huge datasets.
In this thesis, efficient generation algorithms for subgraphs such as bicliques and
paths in cliques are developed. Cliques and bicliques are used to model various
real-world problems encountered in bio-informatics, data mining and networks. We
consider two variations of bicliques: pseudo-bicliques and c-isolated bicliques. Pseudo-
bicliques relax the rigid connectivity requirement of bicliques to cater for missing and
noisy data. On the other hand, the c-isolated bicliques enforce a restriction on the
external connectivity of the vertices in a biclique to model cohesive communities.
This thesis presents an algorithm based on reverse search to list all pseudo-bicliques
in a graph G. The algorithm takes linear time on average to generate each pseudo-
biclique. On the other hand, our generation algorithm for c-isolated bicliques exploit
underlying properties of an isolated biclique to trim the input graph. Furthermore,
the algorithm deploys the vertex cover enumeration algorithm based on fixed point
tractability (FTP) and lists all isolated bicliques in linear time, in the case where c is
constant. The performance of the proposed algorithms is evaluated on random graphs
and real-world problems. The results are quite promising and confirm our theoretical
findings.
In this research work, we also explore another combinatorial object called a clique. A
constant amortized time algorithm is proposed to generate all spanning paths and all
paths in a clique in minimal change order (an ordering in which successive elements
differ in a small way)