Abstract:
Pakistan occupies a geo-strategic location in South Asia that has served as a corridor for
successive human migrations, which shaped its genetic variation. In addition, the Indo-Pak
subcontinent has a history of frequent invasions that contributed to its cultural diversity
and reshaped its genetic makeup. The Pakistani population is divided into various ethnic
groups, the major ones being Punjabi, Pathan, Baloch, Sindhi, Kashmiri, Hazara and Makrani.
I characterized the mitochondrial DNA control region, Y-chromosomal STRs and autosomal
STRs in 318 randomly sampled individuals from three ethnic groups: Sindhi, Kashmiri and
Hazara. Mitochondrial DNA control region analysis revealed that the major proportion of the
Sindhi and Kashmiri maternal lineages was contributed by South Asian and Eurasian
haplogroups, respectively, while a minor proportion was contributed by East Asian, American
and African haplogroups. In the Hazara population, the major maternal components were
Eurasian and South Asian, whereas the minor maternal components were American and East
Asian.
A series of invasions is reflected in the Y-chromosome gene pool of the Pakistani
population. Analysis of paternally inherited Y-chromosome STRs showed high haplotype
diversity in the Sindhi (0.999677), Kashmiri (0.99752) and Hazara (0.99989) populations,
illustrated through a median-joining network based on haplotype frequencies. The allele
frequency distribution showed that locus DYS385b was the most diverse and polymorphic in the
Kashmiri (0.8001), Sindhi (0.8373) and Hazara (0.8373) populations, whereas locus DYS391 was
the least diverse in the Kashmiri population (0.4374) and locus DYS392 had the lowest
diversity in the Sindhi (0.4515) and Hazara (0.4515) populations.
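
As an illustration of how haplotype diversity values such as those reported above are commonly obtained, the following minimal sketch uses Nei's unbiased estimator, HD = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not state which estimator or software was used, so the formula choice, the function name `haplotype_diversity` and the toy haplotypes are assumptions made for illustration only.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum(p_i^2)).

    `haplotypes` is a list of hashable items, e.g. tuples of repeat
    counts across Y-STR loci (hypothetical encoding, one per individual).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    # Sum of squared relative frequencies of each distinct haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy data (not from the study): four individuals typed at three loci,
# two of whom share the same haplotype.
toy = [(14, 11, 13), (14, 11, 13), (15, 10, 13), (16, 11, 12)]
print(haplotype_diversity(toy))
```

Applied to the alleles observed at a single locus instead of whole haplotypes, the same function yields a per-locus gene diversity of the kind quoted for DYS385b, DYS391 and DYS392.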
Moreover, all 318 individuals from the Sindhi, Kashmiri and Hazara populations were
genotyped for 15 autosomal STRs. Allele frequency distributions and forensic efficiency
parameters, including Power of Exclusion (PE), Matching Probability (MP) and Power of
Discrimination (PD), were estimated for each population. Locus D2S1338 exhibited the
maximum power of discrimination in the Sindhi (0.9594), Kashmiri (0.963) and Hazara (0.967)
populations. Pairwise linkage disequilibrium was also estimated. At a probability level of
p<0.05, three loci (D3S1358, TPOX and D8S1179) in the Sindhi population deviated
significantly from Hardy-Weinberg equilibrium; after applying the Bonferroni correction
(p<0.003), only TPOX remained deviated. Loci D18S51 and D19S433 in the Kashmiri and Hazara
populations, respectively, displayed deviation at the probability level of p<0.005; however,
no deviation was observed after Bonferroni correction (p<0.003).
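
The forensic parameters named above can be sketched from observed genotype counts at a single locus. The example below assumes the conventional definitions (MP as the sum of squared genotype frequencies, PD = 1 - MP, and PE = h^2 * (1 - 2 * h * H^2), with h the observed heterozygosity and H = 1 - h); the abstract does not specify the formulas or software used, so these definitions, the function names and the toy genotypes are illustrative assumptions. The Bonferroni threshold of about 0.003 is consistent with dividing 0.05 by the 15 loci tested, which is also an inference rather than a stated method.

```python
from collections import Counter


def forensic_parameters(genotypes):
    """Return MP, PD and PE for one locus from a list of (allele, allele) pairs."""
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Sort each pair so that (17, 18) and (18, 17) count as the same genotype
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    mp = sum((c / n) ** 2 for c in counts.values())      # matching probability
    pd = 1.0 - mp                                         # power of discrimination
    h = sum(c for (a, b), c in counts.items() if a != b) / n  # heterozygosity
    hom = 1.0 - h                                         # homozygosity
    pe = h ** 2 * (1.0 - 2.0 * h * hom ** 2)              # power of exclusion
    return {"MP": mp, "PD": pd, "PE": pe}


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Toy genotypes (not from the study) at one hypothetical locus
toy_genotypes = [(17, 18), (18, 17), (17, 17), (19, 20)]
print(forensic_parameters(toy_genotypes))
print(bonferroni_threshold(0.05, 15))
```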
The pattern of heterogeneous admixture and genetic variation in the selected Pakistani
populations was further examined by comparison with local and global populations through
Principal Component Analysis (PCA), Multidimensional Scaling (MDS) and phylogenetic
analysis. PCA based on mitochondrial haplogroup frequencies revealed the genetic closeness
of all Pakistani populations to each other and to Uzbekistan. MDS based on Y-chromosome
haplotypes placed the Kashmiri population near Greece and Serbia, whereas the Sindhi
population showed genetic affinity with East Anatolia and Iran.
Biparental phylogenetic analysis placed the Sindhi population close to Iraq and the
Kashmiri population close to South India, while the Hazara population shared ancestry with
Siberian and Mongol populations. The data generated in this comprehensive study can be used
to establish the lineages of the Sindhi, Kashmiri and Hazara populations and to develop a
database of the Pakistani population for forensic purposes.