Genomics era and complex disorders: Implications of GWAS with special reference to coronary artery disease, type 2 diabetes mellitus, and cancers
R Pranavchand, MM Reddy
Molecular Anthropology Group, Biological Anthropology Unit, Indian Statistical Institute, Hyderabad, Andhra Pradesh, India
Correspondence Address: Source of Support: None, Conflict of Interest: None DOI: 10.4103/0022-3859.186390
The Human Genome Project (HGP) has identified millions of single nucleotide polymorphisms (SNPs) and their association with several diseases, apart from successfully characterizing the Mendelian/monogenic diseases. However, dissecting the precise etiology of complex genetic disorders still poses a challenge for human geneticists. This review outlines the landmark results of genome-wide association studies (GWAS) with respect to major complex diseases: coronary artery disease (CAD), type 2 diabetes mellitus (T2DM), and predominant cancers. A brief account of the current Indian scenario is also given. All the relevant publications till mid-2015 were accessed through web databases such as PubMed and Google. Several databases providing genetic information related to these diseases were tabulated and, in particular, a list of the most significant SNPs identified through GWAS was made, which may be useful for designing functional validation studies. Post-GWAS implications and emerging concepts such as epigenomics and pharmacogenomics are also discussed.
Keywords: Candidate gene approach, epigenetics, monogenic disease, pharmacogenomics, pleiotropism
With advancing scientific technologies and methods, novel approaches have been employed in understanding the genetic etiology of human diseases. The rapid emergence of genomics and epigenomics from genetics has contributed greatly to revealing the etiology of complex diseases. A large number of novel genetic variants have been identified, apart from those identified by the candidate gene approach and family-based linkage studies. Genomic technologies are widely used in developing personalized medicine and novel drug delivery systems. This review outlines the landmark results of the genome-wide association study (GWAS) approach with respect to major complex diseases, particularly coronary artery disease (CAD), type 2 diabetes mellitus (T2DM), and predominant cancers.
Genomics, a term coined by Tom Roderick in 1986, is a branch of genetics that represents the study of genomes. Genome was the term first used by Winkler in 1920 in the context of the haploid chromosome set. The suffix "ome," which represents wholeness, has its origin in Greek. Later, "omics" was widely used as a suffix for studies involving the collective characterization and quantification of biological molecules. Downstream of genomics is proteomics, which is the study of the total protein constituents of organisms. Although DNA was first isolated in 1869, the history of genomics began after the introduction of Sanger's DNA sequencing method during the 1970s and Mullis's polymerase chain reaction (PCR) amplification of DNA during the 1980s [Figure 1]. These methods laid the foundation for the Human Genome Project (HGP), 1990 to 2003, which was completed ahead of the targeted date; the first draft was published in the February 2001 issues of both "Nature" and "Science." Technological advances in DNA sequencing and the advent of microarray technology provided a platform for several omics studies. For these microarrays, called DNA chips or protein chips, researchers developed silicon microchips carrying an array of millions of probe molecules against nucleotides of DNA or amino acids of proteins. The identification of millions of single nucleotide polymorphisms (SNPs) in the human genome and their association with several diseases revolutionized medical genetics. The GWAS design potentially identified and characterized several mutations in monogenic diseases and provided a base strategy for unraveling the genetic etiology of complex diseases. This transition in research from genetics to genomics, by way of screening millions of SNPs throughout the genome, became possible only with the emergence of ultrahigh throughput genotyping platforms such as Invader assays, the Perlegen Genotyping Platform, Affymetrix GeneChips, and Illumina's Infinium BeadChips.
Custom designed high throughput SNP assays such as molecular inversion probes (Affymetrix), iPlex assays on the MassARRAY platform (Sequenom), the Centaurus Assay (Nanogen), SNPlex (Applied Biosystems), Golden Gate and Infinium assays (Illumina), the TaqMan assay and the OpenArray System (Applied Biosystems), and SNPstream (Beckman Coulter) were developed through a combination of automation, microfluidics, and nanotechnologies and are used specifically in replication and validation studies. Significant SNPs identified across populations globally are maintained at the National Human Genome Research Institute's (NHGRI) GWAS database. As of June 15, 2015, the catalog of this site included 2,204 studies and 15,187 SNPs, along with 16,976 SNP-trait associations related to different human genetic diseases. On the other hand, existing information on the genetic aspects of diseases generated via conventional strategies (candidate gene studies) still remains scattered, and attempts are being made to provide curated databases for future use. Some of these databases, specifically developed for predominant complex diseases, are listed with the available information and links to these resources in [Table 1]. A description of 1,552 databases (including database resources such as NCBI, EBI, and JGI), updated and sorted into 14 categories and 41 subcategories related to several aspects of molecular biology, has also been provided by the journal "Nucleic Acids Research."
We have read through 140 publications till 2015 that were accessed through public websites such as PubMed and Google by utilizing the literature search terms such as "genomics," "genomics of complex diseases," "genomics of CAD," "genomics of T2DM," "genomics of cancers," "epigenetics," "epigenomics," and "pharmacogenetics." Publications not listed in the above websites were obtained through personal correspondence with the authors. While preparing the list of the most important SNPs specific to each of the complex diseases under study, publications/data of GWAS were downloaded in the Excel spreadsheet format from NHGRI GWAS catalog at the NHGRI website (http://www.genome.gov/gwastudies). The other databases searched are as listed in [Table 1].
Degenerative man-made diseases such as cardiovascular diseases (CVDs), diabetes, cancers, and chronic obstructive pulmonary disease (COPD) are the most prevalent diseases worldwide and the major cause of socioeconomic burden in all the World Health Organization (WHO) classified geographical locations. These diseases are caused by multiple genetic factors with low or moderate effects that make an individual susceptible under certain environmental triggers; hence, they are called complex genetic diseases (CGDs). Most of these diseases do not follow a clear-cut pattern of inheritance that can be explained by the genetic architecture. There has been a huge debate during the past two decades about the genetic etiology of these diseases. The common disease-rare variant (CDRV) hypothesis argues that multiple rare DNA sequence variants, each with relatively high penetrance, are the major contributors to common diseases. On the other side, the common disease-common variant (CDCV) hypothesis assumes that "the genetic risk for common diseases will often be due to disease-producing alleles found at relatively high frequencies (>1%)." The basic evidence for the CDRV hypothesis is provided by BRCA1 and BRCA2 mutations in breast cancer and by several point mutations in the CFTR gene associated with cystic fibrosis, whereas the Apo E4 polymorphism and its association with several complex diseases such as Alzheimer's disease, coronary disease, T2DM, and metabolic syndrome provide evidence for the CDCV hypothesis. However, in the latter case the associated polymorphisms are also found to be quite common in the general population, not only in affected individuals.
Therefore, the CDCV hypothesis is extended to the common variant multiple disease (CVMD) hypothesis, stating that "the common alleles which contribute to a given disease under a certain combination of interacting genes and environmental conditions, may act in other genetic backgrounds influenced by other environmental factors resulting in different, possibly related clinical outcomes." This is consistent with a recent study on the subjects of the Human Genome Diversity Panel, which observed considerable ethnic variation in the risk allele frequencies of 25 disease SNPs belonging to six major complex diseases, without any substantial difference between the disease SNPs and random SNPs. For example, the Crohn's disease-associated risk allele rs10761659 is near fixation among non-African populations but is not found in Africans, whereas the risk allele of one of the T2DM-associated SNPs (rs564316), found at low to intermediate frequencies in some populations, is near fixation among Africans. Although such findings plausibly suggest a role for positive selection of these alleles in the observed geographical variation in the prevalence of complex diseases, more in-depth analysis of diverse populations is required to properly address this question.
Many of the complex disease-associated SNPs are observed to show modest phenotypic effect in such a way that they cannot be distinguished as disease-causative or disease-susceptible variants (synonymously used as driver and passenger mutations in case of cancers). For most of the complex diseases, the GWAS-identified variants are different from those that are identified by familial linkage based or candidate gene approaches. The list of significant SNPs found in GWAS (threshold P value less than 10⁻⁸), along with chromosomal region, nearby genes, and their functional context, is presented with reference to a few complex diseases in [Table 2]. [Figure 2] represents the number of GWAS conducted and the number of SNPs associated with these diseases. We have outlined below the salient features of the findings pertaining to CAD, T2DM, and cancers.
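The significance cut-off used for [Table 2] can be illustrated with a minimal Python sketch. It assumes each association is available as a (SNP label, P value) pair; the labels and P values below are placeholders invented for illustration and are not taken from the GWAS catalog or from this review.

```python
# Threshold quoted in the text: P value less than 10^-8.
GENOME_WIDE_THRESHOLD = 1e-8


def significant_snps(associations, threshold=GENOME_WIDE_THRESHOLD):
    """Keep (snp, p_value) pairs with p_value below the threshold,
    ordered from most to least significant."""
    hits = [(snp, p) for snp, p in associations if p < threshold]
    return sorted(hits, key=lambda pair: pair[1])


# Placeholder records (hypothetical labels and P values, illustration only).
example_associations = [
    ("snp_B", 5e-9),
    ("snp_C", 2e-6),   # fails the cut-off
    ("snp_A", 3e-12),
]

if __name__ == "__main__":
    for snp, p in significant_snps(example_associations):
        print(snp, p)
```

In practice the same filter would be applied to the P value column of a downloaded association table; the function keeps the threshold as a parameter so that a different cut-off can be tried.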
Coronary artery disease
CAD is a predominant cardiovascular condition with an estimated 7.2 million deaths in the year 2012. This disease has become a challenging phenotype for researchers because of its peculiar subclinical heterogeneity. Among the GWAS findings, genes such as CDKN2A/2B, CELSR2-PSRC1-SORT1, PHACTR1, C6orf105, MIA3, CXCL12, and the APO A1-CIII-AIV-AV gene cluster are most consistently replicated for CAD. However, these genes are different from the set of genes identified by candidate gene studies, such as LDLR, ABC, ApoE, ApoB, and ACE. Most of the GWAS were conducted on European populations, with the chromosomal locus 9p21.3 being consistently replicated among them and subsequently validated in other populations such as Asian Indians. In spite of the novel loci reported by meta-analyses, 9p21.3 has remained the most significant region and needs to be considered for functional evaluation. Deep sequencing and functional analysis revealed that the conserved sequence of this gene desert chromosomal region is the second densest interval for enhancers, which epigenetically regulate adjacent genes such as CDKN2A/2B and MTAP through physical interaction between their chromatin domains. Experiments on mouse models with deletions in the sequences homologous to the human 9p21.3 chromosomal region showed increased risk of CAD progression through altered vascular cell proliferation. A conserved sequence (CNS3) of this chromosomal region shows enhancer activity, leading to high expression levels of the short antisense noncoding RNA in the INK4 locus (ANRIL) transcript among individuals homozygous for the risk allele of rs1333045.
Quantitative real time (QRT) PCR analysis demonstrated upregulation of cell proliferation gene sets in patients with increased expression of the ANRIL short variant, whereas a decrease in CDKN2B gene expression was observed. Gene expression data obtained from donor heart and vascular tissues with reference to rs1333049, a representative SNP at the 9p21.3 chromosomal region, suggest that several transcripts are differentially expressed with no definite pattern. However, canonical pathway modeling of these expression data identified the cell cycle G1 phase progression pathway to be activated by proteins encoded by the CDKN2A and CDKN2B genes, which are adjacent to the 9p21.3 risk locus. Atherosclerotic tissue-specific expression of ANRIL and the differentially expressed regulatory genes in this 9p21.3 region, which are both associated with multiple complex diseases, remain to be explored as therapeutic targets.
Another significant observation from GWAS is that despite the major role of dyslipidemia and blood pressure as modifiable risk factors in the manifestation of CAD, only 12 and 5 of the 41 most significant susceptibility loci so far identified for this disease were found to be associated with lipid traits and blood pressure, respectively. Moreover, the results of functional studies revealed a higher risk contribution of inflammatory mechanisms to CAD progression than of abnormal lipid traits. Nevertheless, the SNP rs964184 from the 11q23.3 chromosomal region, which contains Apo genes that are key regulators of cholesterol metabolism, is consistently associated with CAD, MI phenotypes, and several abnormal lipid traits, and therefore needs to be explored for any underlying molecular regulatory role on the adjacent apolipoprotein genes.
Diabetes is a metabolic disorder that is most prevalent among the complex diseases. WHO estimated that about 1.5 million deaths occurred worldwide due to diabetes in the year 2012. The International Diabetes Federation (IDF) reports that 382 million people have diabetes at present, a number projected to rise to 592 million by the year 2035. T2DM is the most common form of diabetes and is a major cause of disability, more by making the individual prone to other conditions such as heart disease, stroke, hypertension, nephropathy, neuropathy, skin complications, eye complications, and mental illness than by direct involvement in the individual's death. The candidate gene studies identified CAPN10, PPARG, KCNJ11, ABCC8, HNF1A, HNF4A, GCK, PC-1/ENPP1, IRS, PTPN1, and LMNA to be the most replicated susceptibility genes of T2DM. TCF7L2 is a candidate gene that was identified through a linkage study on the Icelandic population and mapped to chromosome 10. It was subsequently replicated in Danish, European, and US cohorts and is currently known to be associated across ethnic groups worldwide. It encodes a transcription factor and is an important regulatory gene of glucose homeostasis in pancreatic islets, as determined through several in vitro experiments on mouse models. Apart from TCF7L2, GWAS found 38 more loci to be associated with the disease, among which HHEX, CDKN2A/2B, IGF2BP2, SLC30A8, CDKAL1, HMGA2, KCNQ1, and NOTCH-ADAM30 are the most replicated ones. Of these, the CDKN2A/CDKN2B genes in the 9p21.3 chromosomal region were commonly associated with T2DM and CAD and were explored for their specificity in manifesting these diseases in the Human Genome Diversity Project (HGDP) subjects. It was observed that this chromosomal region contains two haplotype blocks, a 44-kb region specifically associated with CAD and another 4-kb region associated with T2DM.
In a study of Han Chinese population, three SNPs of T2DM-associated haplotype block were also found to be associated with CAD. Given the functional implication of these genes in cell proliferation pathway, more studies are necessary to explore the precise roles of SNPs of these CDKN2A/CDKN2B genes as well as those in the adjacent chromosomal regions.
Cancer is more of a genetic disease in the sense that a change in the genetic material responsible for controlling cell division or growth is necessary to cause the disease. There could be several factors that are mutagenic in nature. Not all mutations can cause cancer. Only mutations in an individual cell type that confer the capability to proliferate more than the neighboring cells, with subsequent invasion of the tissue and metastasis, can lead to different kinds of cancers. Therefore, cancer is considered to be an evolutionary disease in which abnormally dividing cells are selected in the microenvironment. Most cancers are sporadic in nature and caused by dominant mutations. Cancers due to germline mutations are rare and often affect multiple organs. Cancer accounted for 8.2 million of the global deaths in 2012 and is one of the leading causes of death due to noncommunicable diseases (NCDs). In terms of the number of cases and deaths reported for the most predominant cancers in the year 2012, lung and breast cancers rank at number 1 and 2, respectively [Figure 3].
The genetic architecture of cancer is well understood through linkage, candidate gene, and GWAS approaches. A meta-analysis of candidate gene studies reported a noteworthy association with various cancers of the GSTM1, GSTT1, and NAT2 genes belonging to xenobiotic metabolism; MTHFR, CHEK2, XPD, and XRCC1 of the DNA synthesis and repair mechanisms; an inflammatory gene (RNASEL); and the MDM2 and TGFB1 genes involved in tumor suppression. This study did not include APC and BRCA1/BRCA2, the low-frequency, high-penetrance genes, or the H Ras gene, which is mostly replicated for breast and lung cancers. Only about 20% of cancers are familial and caused by high-penetrance genes, and the others involve several of the low to moderate penetrance genes [Table 3].
BRCA1 and BRCA2 genes, which produce tumor suppressor proteins, are strong candidates for breast and ovarian cancers with an incidence of 55-65% and 11%, respectively. Ras (H Ras, K Ras, and N Ras) genes provide an interesting link between cancer and the cell cycle by encoding a G protein, a well-known signaling molecule of receptor-mediated signal transduction. Because of their tissue-specific expression and complex downstream effector signaling network, tumor-specific oncogenesis is observed with mutations in any of these Ras genes. GWAS have also found FGFR2 and MSMB to be novel cancer susceptibility candidate genes for breast and prostate cancers, respectively. Most of the genes associated with cancer belong to either cell signaling or cell cycle regulatory pathways, implicating their prime mechanistic role in oncogenesis. Another prominent finding of GWAS is the common association of the 8q24 chromosomal region with various cancers. A meta-analysis of this region from nine GWAS for seven types of tumors (breast, prostate, pancreatic, lung, ovarian, colon, and glioma) found SNP rs6983267 to be the most significant among the 6,686 SNPs spanning 128 Mb of this gene-poor region. Chromatin analysis narrowed this region down to 1.5 kb containing enhancers that might influence cancer risk via regulation of gene expression. Kim et al. provided insights into the functional aspects of the 8q24 region by showing that this gene desert region transcribes lncRNAs, which are termed cancer-associated region long noncoding RNAs (CARLos). Expression analysis of CARLos revealed that CARLo-5 is an important regulator of the cell cycle and tumor progression through the long-range interaction of its promoter region with the cancer-associated variant rs6983267 in the MYC enhancer. Taking CARLo-5 as a potential target for cancer therapy, this study developed an approach to investigate the functional relevance of disease-related variants in gene desert regions.
The future investigations in this direction may determine the clinical importance of this region, which might help in cancer prognosis.
Despite technological advances in identifying disease-susceptible genotypes, we are able neither to characterize the subclinical phenotypes of any complex disease nor to make a prognosis of the disease itself. Pleiotropism and noncoding localization of the GWAS-identified SNPs have been the major confounding factors in this regard, apart from the implicit genetic heterogeneity, ethnic susceptibility, gene-environment interactions, and epigenetic mechanisms involved in the manifestation of a disease. These are the limiting factors in the potential translation of GWAS results into clinical benefits. On the other hand, the agnostic and unbiased design of the GWAS approach, which ignores the disease pathobiology, together with flaws that arise in data collection and experimental execution and the limited use of biostatistical methods, has a huge impact on the success of research. Hence, there is a need for reinterpretation of the GWAS data by computational means in order to understand the genotype-phenotype relationship, which is characterized by genetic heterogeneity and gene-gene and gene-environment interactions. Bioinformatics tools of data mining, machine learning, and computational modeling using Random Forests or multifactor dimensionality reduction (MDR) analysis, followed by algorithm-based attribute selection, would help to implement post-GWAS strategies such as pathway/SNP enrichment analysis, gene-environment modeling, in vitro functional experiments, and/or in vivo studies on model organisms. For example, using gene-centered and comparative toxicogenomics databases, SNP Enrichment/Pathway Enrichment Analysis (SEPEA) identified the metabolism of xenobiotics by cytoP450, retinol metabolism, Janus kinase (JAK)-signal transducer and activator of transcription (STAT) metabolism, toll-like receptor signaling, and adipocytokine signaling pathways to be five critical pathways in cardiovascular and metabolic disease progression.
Other tools such as SNP Set Enrichment Analysis (SSEA) and the SNP Prioritization Online Tool (SPOT), which analyze information from the HapMap, Gene Expression, and Metabolic Pathway Network databases, are designed to prioritize a biologically relevant enriched set of SNPs. Since living systems work via dynamic interactions between genes, proteins, and other biochemical molecules, computational analysis that integrates data related to several etiological aspects of diseases may lead us to a better understanding of complex disease physiology in the new era called systems biology. However, extensive care is to be taken while evaluating the bioinformatic output as well as the results of functional studies on causal relationships to different diseases in aggregated biological systems. Future benefits of bioinformatics depend on the extent of collaboration among molecular biologists, biostatisticians, and computer professionals.
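The idea behind the pathway/SNP enrichment analysis mentioned above can be sketched in Python with a generic over-representation test. This is only an illustration based on the hypergeometric distribution; it is not the algorithm used by SEPEA, SSEA, or SPOT, and the function name, parameters, and gene counts are assumptions made for this example.

```python
from math import comb


def enrichment_p_value(total_genes, pathway_genes, hit_genes, overlap):
    """Hypergeometric upper-tail probability P(X >= overlap).

    total_genes   : number of genes in the background set
    pathway_genes : background genes annotated to the pathway
    hit_genes     : genes implicated by associated SNPs
    overlap       : implicated genes that fall in the pathway
    """
    max_overlap = min(pathway_genes, hit_genes)
    denominator = comb(total_genes, hit_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, hit_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Toy numbers (hypothetical): 10 background genes, 5 in the pathway,
    # 5 implicated genes, all 5 of which fall in the pathway.
    print(enrichment_p_value(10, 5, 5, 5))
```

A small P value suggests that the implicated genes fall in the pathway more often than expected by chance; when many pathways are tested, a correction for multiple testing would also be needed.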
Comprising one-sixth of the world's population, India ranks among the top two countries for deaths due to major NCDs, which have now assumed endemic proportions. As per the WHO global report, 60% of the total deaths in India are due to NCDs, with cardiovascular diseases, COPD, cancer, and diabetes accounting for 26%, 13%, 7%, and 2%, respectively. There is also an increasing prevalence of coronary disease, diabetes, and dyslipidemia among South Indians, which makes it imperative to explore these populations urgently for their genetic predisposition. As soon as the NHGRI's GWAS database was made available, validation studies were initiated for the most replicated and significant GWAS-identified SNPs specific to these complex diseases on Indians and Indian migrants living in Western countries. The prominent findings of GWAS, such as the association of the TCF7L2 gene with T2DM and of CDKN2A/2B with both T2DM and CAD, were replicated among them. However, some of the major T2DM genes of GWAS, such as IGF2BP2 and SLC30A8, which are consistently replicated in other ethnic groups, were not found to be associated with the disease in South Indians, whereas they were in North Indians. These results suggest a lack of consistency in the pattern of association of disease-specific SNPs among ethnic groups, both within India and elsewhere. The unique predisposition of Indians toward complex diseases could be due to their unique genetic constitution, as suggested by an earlier study that observed a common MYBPC3 25-bp deletion variant only in South Asians, associated with chronic risk of heart failure. Given the unique genetic makeup of Indians, with diverse ethnic and linguistic groups among them, multiethnic GWAS is imperative for complex diseases and may throw light on novel genes and pathways.
A couple of GWAS conducted on Indians during the past 3 years were in line with this; for example, a GWAS on the Punjabi population identified a novel intronic variant, rs9552911, at chromosomal locus 13q12 harboring the SGCG gene, associated with T2DM. Another GWAS, conducted simultaneously by the Indian Diabetes Consortium (INDICO) and comprising both North and South Indian samples, identified a novel variant, rs998451, which lies in intron 2 of the TMEM163 gene, to be associated with T2DM. Apart from replicating a large number of significant loci of T2DM, the above studies implicitly convey that India-specific susceptibility variants need to be explored extensively. Similarly, a GWAS of rheumatoid arthritis conducted among North Indians identified ARL15 as a novel risk factor specific to this population. Except for T2DM and rheumatoid arthritis, there have been no other GWAS on Indians for complex diseases such as coronary disease, cancer, and COPD, which need to be the primary focus of Indian human geneticists.
As mentioned earlier in this paper, there is a lack of clarity on the heritability of complex diseases, and this could be partly attributed to the epigenetic mechanisms operating on them. Epigenetics is the "study of stable genetic modifications that result in changes in gene expression and function without a corresponding alteration in DNA sequence." So far, the best known epigenetic modifications are DNA methylation, histone modifications, and micro-RNA-mediated gene regulation, which, by the site-specific docking of enzyme complexes, modify accessibility to DNA regulatory elements and open reading frames. The epigenetic modifications with the respective functional status of chromatin (active and inactive state of chromatin) are outlined in [Figure 4]. Among these, DNA methylation is a dynamic epigenetic modification that regulates gene expression throughout the course of development of multicellular organisms. The methylation of cytosine residues preceding guanine nucleotides in CpG islands by the DNA methyltransferases, namely DNMT1, DNMT3a, and DNMT3b, is best understood in terms of their coordinated action and participation in de novo and maintenance methylation functions. The other well-understood epigenetic mechanisms are posttranslational modifications of the amino-terminus of core histone proteins, such as acetylation, methylation, phosphorylation, ubiquitination, and sumoylation, with acetylation and methylation of lysine at the histone tail being common. Apart from these, several noncoding RNAs such as miRNA, piRNA, snoRNA, and lncRNA are observed to regulate gene expression by silencing the target genes, with miRNA, piRNA, and lncRNA acting via the RNA interference mechanism and snoRNAs guiding these functional RNA molecules in their stable folding.
Expression of the eNOS gene in endothelial cells with hypermethylated DNA and acetylated H3K9 and H4K12 histone proteins, and HDAC1-activated hypoxia-induced angiogenesis, clearly describe epigenetic regulation of the vascular system. Recent epigenetic studies have observed histone acetylation-mediated signaling of inflammatory molecules via TNFα and NFκB promoters, with a prominent role in the recruitment of inflammatory cells, ultimately leading to progression of atherosclerosis and restenosis. A variety of dysregulated histone modifying enzymes such as lysine methyltransferases, lysine demethylases, and lysine methylation readers were found to be carcinogenic and represent novel targets for therapy. Histone acetyltransferases (HATs) activate transcription, whereas histone deacetylases (HDACs) are transcriptional repressors. These two classes of enzymes are very important in the dynamic function of chromatin. A number of cancer cell lines are found to be hypoacetylated when compared to normal cell lines, suggesting that histone deacetylation could be the primary event of carcinogenesis. This is also evident from studies confirming overexpression of HDACs in breast, prostate, and colorectal tumors. Preclinical experiments on mouse models suggest that HDAC inhibitors and DNA demethylating agents are promising chemotherapeutic agents to treat cancer. Epigenetic characterization of diseases is a prerequisite to developing potential treatment strategies for complex diseases. To meet this major challenge, molecular methods have been developed to understand the methylation and histone modification status of chromatin. Bisulfite modification of DNA (which converts cytosine to uracil under conditions where 5-methylcytosine remains unaltered), followed by analysis of the methylation pattern, is the basic principle of the techniques developed for studying this type of epigenetic modification.
A spectrum of microarray-based methylation profiling techniques based on methylation sensitive restriction enzyme analysis was also developed to capture large-scale epigenetic modifications, which led to the emergence of the concept of "epigenomics." Methylated DNA immunoprecipitation (MeDIP) and chromatin immunoprecipitation (ChIP)-like techniques, which are based on DNA-protein interactions and followed by sequencing (MeDIP-Seq/ChIP-Seq), were used in the Encyclopedia of DNA Elements (ENCODE) project to characterize all kinds of sequences, including regulatory, transcriptional factor, coding, noncoding, and other functional sequences in the human genome.
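The principle of DNA modification described above, in which unmethylated cytosine is converted to uracil while 5-methylcytosine remains unaltered, can be shown with a toy Python simulation. This is a conceptual sketch only, not a laboratory or analysis protocol; the sequence, the methylated position, and the function names are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate conversion of one strand: unmethylated C becomes U,
    methylated C (0-based index listed in methylated_positions) stays C."""
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylated_positions(original, converted):
    """Positions where a C in the original is still read as C after conversion."""
    return [
        index
        for index, (before, after) in enumerate(zip(original.upper(), converted))
        if before == "C" and after == "C"
    ]


if __name__ == "__main__":
    reference = "ACGTCCG"  # hypothetical sequence
    treated = bisulfite_convert(reference, methylated_positions={1})
    print(treated)
    print(infer_methylated_positions(reference, treated))
```

Comparing the treated sequence with the untreated reference recovers the methylated positions, which is the logic that methylation pattern analysis builds on.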
The human body's response to a drug is a complex trait. Research on individual differences in drug metabolism began after incidents such as G6PD deficiency-induced hemolysis in individuals receiving the antimalarial drug primaquine during World War II, and peripheral neuropathy in individuals receiving isoniazid who had an impaired N-acetyl transferase enzyme and were labeled "slow acetylators" post World War II. However, the absolute beginning of pharmacogenetic studies can be traced back to Kalow's findings on the slow metabolizing response of some of his patients to succinylcholine, a muscle relaxant. It was later found (in the year 1957) that this abnormal response was due to an inherited atypical cholinesterase enzyme present in their bodies. Subsequent studies extrapolating candidate gene approaches to understand the implications of such genetic variations in pharmacokinetics and pharmacodynamics laid the foundations for the discipline of "pharmacogenetics," a term coined by Vogel in 1959. The importance of pharmacogenetics is underlined by data showing that ~100,000 patients die every year due to adverse drug reactions (ADRs) in the US alone, deaths that could be avoided if clinicians had prior knowledge of the patient's genetic profile of the drug metabolizing enzymes that determine drug response. Pharmacogenetic testing involves a careful examination (sequencing) for the presence or absence of specific SNPs, and this genetic information can be used in predicting the drug response. To quote Sir William Osler in 1892:
"If it were not for the great variability among individuals, medicine might as well be a science."
The following candidate gene pharmacogenetic approach-based findings are classic illustrations of pharmacogenetic-guided therapy:
Among the candidate gene studies, the cytochrome P450 class of hemoprotein enzymes was the most studied, with CYP2C9, CYP2C19, and CYP2D6 polymorphisms being commonly associated with varied drug metabolism. This could be because of their ubiquitous nature and major role in phase I drug metabolism. Further, more comprehensive studies covering whole genome variants and a wide range of diseases are in demand for the clinical utility of pharmacogenetics. Implementing the GWAS approach to achieve this has led to pharmacogenomics, which replicated the association of HLA-B*5701 with hypersensitivity to abacavir and drug-induced liver injury with flucloxacillin. So far, a number of adverse drug reactions such as chemotherapy-induced neutropenia, alopecia, and hypersensitivity have been studied by this GWAS approach. Considerable data have been published on pharmacogenomic aspects of complex traits such as myocardial infarction, leukemia, schizophrenia, and depression using Affymetrix and Illumina platforms. Information on gene-drug-disease relationships is curated and maintained at the PharmGKB database (http://www.pharmgkb.org/aboutUs.jsp). However, the study design, genetic heterogeneity, phenotypic characterization, etc., were found to be confounding factors in determining the pharmacogenomic implications of identified genetic variants. Despite this limitation, pharmacogenomics promises a better management strategy for diseases by providing drug therapy based on the individual's genetic profile.
Advances in genomic technologies have drastically changed the field of disease genetics by reducing the cost of DNA sequencing/SNP genotyping. Today, any monogenic disease can be easily diagnosed and characterized based on the individual's genetic profile. Simultaneously, a large number of disease-specific variants have been identified for complex diseases. Although no concrete conclusions have been reached so far for complex diseases, next generation sequencing and exome sequencing methods are now attempting to achieve this by aiming to characterize the disease-causing and disease-susceptible variants. Some of the prominent findings, such as the association of CDKN2A/CDKN2B with almost all the complex diseases, of TCF7L2 with T2DM, and of the gene-poor 8q24 chromosomal region with many cancers, are landmark discoveries for complex diseases in the genomic era. Dissecting the precise functional role of these genetic factors in the manifestation of complex diseases would help in developing better disease management strategies. The rapidly increasing number of disease-specific genetic variants demands integrated genomic-proteomic technologies in order to functionally annotate them. Simultaneously, these genomic technologies have a potential role in therapeutics, which should be realized by conducting clinical trials in genetically defined populations and by developing methods of personalized treatment. These efforts would give rise to better, more effective, and safer therapeutic outcomes in due course of time.
Financial support and sponsorship
Indian Statistical Institute, Kolkata.
Conflicts of interest
There are no conflicts of interest.
[Figure 1], [Figure 2], [Figure 3], [Figure 4]
[Table 1], [Table 2], [Table 3]