PASTIC Dspace Repository

A Mathematical Model Quantifying Sequence Alignment for Constructing Phylogenetic Trees and Ant-Minor Protein Structure Classification.

dc.contributor.author Khan, Muhammad Asif
dc.date.accessioned 2019-07-22T11:30:35Z
dc.date.accessioned 2020-04-11T15:37:03Z
dc.date.available 2020-04-11T15:37:03Z
dc.date.issued 2019
dc.identifier.govdoc 18066
dc.identifier.uri http://142.54.178.187:9060/xmlui/handle/123456789/5134
dc.description.abstract Biological sequence comparison is fundamental to extracting information that is valuable in applications such as protein structure prediction, structural similarity prediction, phylogenetic analysis, homology detection, function prediction, and the discovery of evolutionary relationships. Besides biologists, numerous researchers, including mathematicians, statisticians, and computer scientists, have been drawn to sequence analysis because of its role in various important applications. Protein classification has been one of the major areas of research in recent years. Despite technological advances, classifying proteins accurately is still a big challenge. In this work, we first introduce an ant-inspired data mining approach to the protein classification problem to investigate the effectiveness of a rule-based approach. A supervised classification mechanism, along with data mining concepts, establishes compact and efficient rules that classify proteins into their correct families. Towards biological sequence analysis, we propose ASIF, a novel algorithm that consists of an alignment algorithm, ASIFALIGN, and a mathematical model (dASIF) quantifying the sequence alignment. The proposed approach is based on intra-residue distance and a plausible (unbiased) penalty factor. A standard dataset of DNA sequences is tested, producing reliable and robust sequence dissimilarities/similarities. Moreover, the proposed approach is used to construct a phylogenetic tree. Phylogenetic trees constructed by our approach outperform other methods. In addition, the proposed approach is applied to the protein secondary structure classification problem. A dataset of twelve secondary structures is used to validate, for classification purposes, the distance matrix generated by the new alignment algorithm and mathematical model. The results produced by the new scoring model are very encouraging, which shows the reliability of our approach.
Our approach not only provides solid ground for its applications but also performs the fundamental job of calculating dissimilarities/similarities at a reasonable computational complexity. The results reveal the significance of our approach and provide a basis for adopting the proposed model in other biological applications such as protein function prediction, homology detection, and protein fold recognition. I would like to dedicate this thesis to My Father (a strong and gentle soul who taught me to trust in ALLAH, believe in hard work, and rest assured of the best results), My Mother (late) (for being my first mentor and a true guide in the shape of her beautiful memories and love), and My Brothers, Sisters and Family (for supporting and encouraging me throughout my studies and research). en_US
dc.description.sponsorship Higher Education Commission, Pakistan en_US
dc.language.iso en_US en_US
dc.publisher National University of Computer and Emerging Sciences Islamabad en_US
dc.subject Computer Science en_US
dc.title A Mathematical Model Quantifying Sequence Alignment for Constructing Phylogenetic Trees and Ant-Minor Protein Structure Classification. en_US
dc.type Thesis en_US

