Current status of understanding of the genetic etiology of coronary heart disease
R Pranavchand, BM Reddy
Molecular Anthropology Group, Biological Anthropology Unit, Indian Statistical Institute, Hyderabad, Andhra Pradesh, India
Coronary heart disease (CHD), synonymously known as coronary artery disease (CAD), is the most predominant among the cardiovascular diseases and ranks number one in prevalence among the developing countries. CHD is a multifactorial disease involving both genetic and environmental factors and is primarily caused by atherosclerosis, a process of progressive damage of the coronary arteries. We present here a comprehensive review of molecular genetic studies conducted so far on CAD. The information was gathered through the internet using appropriate search terms for CHD/CAD. We also compiled the relevant information from the following websites: http://www.bioguo.org/CADgene/ and http://www.genome.gov. Besides several Mendelian forms of CHD, ~300 more genes have been identified in different studies through the candidate gene approach. Additionally, 32 more loci have been identified through genome wide association studies, including 9p21.3, the most replicated genetic locus across the globe. Nevertheless, overall, these studies have been characterized by a relative lack of consistency in the association pattern across populations. A fair degree of ethnic variation in the nature of association of different genetic variants with the disease has also been apparent. Pleiotropic effects of genes, existence of subclinical phenotypes and genetic heterogeneity appear to have been the limiting factors for developing a genetic risk profile test for the disease. Given the high prevalence of this disease in India, the presence of environmental triggers and genetic variation, it would be prudent to conduct multi-ethnic large-scale studies in India, representing the subcontinent as a whole; there have been only a very limited number of molecular genetic studies on Indian populations.
Keywords: Atherosclerosis, candidate gene association study, ethnicity, genome wide association study
Diseases that are related to the heart and blood vessels are called cardiovascular diseases (CVDs). CVDs are the major cause of morbidity and mortality, accounting for 17 million of 36 million non-communicable disease (NCD) deaths out of the total 57 million deaths that occurred during 2008, and are represented as a pie diagram in the [Figure 1], including deaths due to other major NCDs.  By 2030, the number of CVD deaths is expected to rise to 23.6 million out of the total 44 million NCD deaths.  There are many kinds of CVD conditions. However, due to common genetic and environmental factors and their interactions across the CVD conditions and the interrelated metabolic pathways involved in the functioning of the organ system, it is difficult to classify them precisely. Nevertheless, for the sake of convenience, they can be categorized according to the problem involved as outlined in [Table 1]. Among these conditions, coronary artery disease is the most predominant accounting for 90-95% of all cases and deaths.
The disease usually occurs due to a block or clot in the coronary blood vessels, leading to clinical phenotypes ranging from less severe angina (both stable and unstable) to more severe forms like acute myocardial infarction (MI), ischemia (low oxygen supply to the heart) or even sudden cardiac death. Understanding the mechanisms of disease progression is crucial in the management of a disease, particularly CAD, where problems with phenotypic characterization are well recognized. Simple biochemical tests for triglycerides, low-density lipoprotein (LDL) cholesterol and high-density lipoprotein (HDL) cholesterol, or biophysical methods like the electrocardiogram and echocardiogram, are only preliminary methods for CAD diagnosis; only a coronary angiogram confirms the disease. The primary cause of these conditions is atherosclerosis or arteriosclerosis, a progressive damage of the coronary arteries, and hence the disease is also called atherosclerotic heart disease, coronary artery disease or coronary heart disease.
Atherosclerosis is defined as a disease of blood vessels consisting of both degenerative and regenerative processes that initially affect the intima (inner layer of the blood vessel) and at a later stage the media (middle layer of the blood vessel) at the bifurcations of the major arteries. This process of hardening of arteries was initially thought to occur due to lipid deposition, but emerging research has found a role for the inflammatory response in the development of a hard fibrous cap in the intimal layer of blood vessels. The process usually begins at a young age, with one in every six teenagers having evidence of atherosclerosis, and takes a decade for symptoms to manifest. A number of theories have been proposed to explain the pathogenesis of atherosclerosis. However, the response to injury hypothesis is the most widely accepted. It encompasses the essential elements of all the earlier hypotheses and states that atherosclerosis begins with endothelial injury, which makes the endothelium susceptible to the accumulation and deposition of lipids.
CAD is a multifactorial disease involving both genetic and environmental factors as well as interaction between them. Family history, age, gender, diabetes, smoking, hypertension, dyslipidemia, and obesity have been considered as traditional markers of the disease [Figure 2]. In a multiethnic US study whose cohort participants were 38% White, 28% Black, 22% Hispanic, and 12% Chinese, it was found that family history of premature CAD was associated with a higher prevalence and magnitude of the disease, independent of other risk factors. The risk of developing CAD increases with age and is generally greater in men, although the risk for women increases after menopause. This gender difference is attributed to the protective antioxidant effect of the estrogen hormone in women. The large-scale INTERHEART study observed that all these traditional factors were consistently adverse in all ethnic groups across the world. There is a strong influence of environmental factors like smoking, diet, lack of physical activity, alcohol consumption, infection of fetal environment and air pollution on the disease. Tobacco consumption in the form of smoking is the major risk factor among the environmental factors and is involved in many atherogenic metabolisms. It induces oxidative stress and further oxidation of LDLs and is found to increase the risk independent of the presence of classic risk alleles like Apo E4. Regular physical activity and moderate alcohol intake increase HDL cholesterol levels, which is known to be protective, but over-consumption of alcohol is a risk factor.
Besides these environmental factors, genetic risk factor assessment studies have been coming up with many nontraditional markers like elevated levels of homocysteine, fibrinogen, C-reactive protein and low levels of HDL cholesterol, insufficiency in CoQ10, nitric oxide and vitamins like D, K.  Most of these risk factors have their own genetic makeup and independent genetic contribution to the disease and behave variably in different environments. Significant genetic components behind these risk factors have been observed: Elevated LDL and VLDL (very low density lipoproteins) cholesterol (40-60%), low HDL cholesterol (45-75%), elevated triglycerides (40-80%), increased BMI (body mass index) (25-60%), elevated SBP (systolic blood pressure) (50-70%), elevated DBP (diastolic blood pressure) (50-65%), Lipoprotein A levels (90%), homocysteine levels (≈45%), T2DM (Type 2 diabetes mellitus) (40-80%), fibrinogen (20-50%) and elevated C-reactive protein (≈40%). ,
The heritability of CAD has been established since the early twin studies. A Swedish twin registry study based on 21,004 twins showed higher concordance rates of CAD deaths for monozygotic twins than for dizygotic twins in both men and women; this study also observed genetic effects to be greater in younger patients than in older age groups. Apart from twin studies, linkage analysis, allele sharing methods, association studies and analysis of large crosses in model organisms are other approaches in the genetic dissection of CAD and other complex diseases. Initial family-based studies of CAD discovered the monogenic forms like familial hypercholesterolemia, familial ligand-defective apolipoprotein B 100, sitosterolemia and autosomal recessive hypercholesterolemia (ARH), occurring due to mutations in the genes coding for low-density lipoprotein receptor (LDLR), apolipoprotein B 100 (APOB 100), adenosine triphosphate (ATP) binding cassette transporters (ABCG5 and ABCG8), and ARH, respectively. Abifadel et al. found three families with autosomal dominant hypercholesterolemia due to missense mutations in the proprotein convertase subtilisin/kexin 9 (PCSK9) gene, which codes for a 692 amino acid protein that belongs to the proprotein convertase family, coordinates LDL catabolism and is expressed predominantly in the liver, intestine, and kidney. Subsequent studies in humans and mice showed that inactivation of this gene resulted in hypocholesterolemia. All these disease forms are associated with elevated levels of LDL cholesterol except for Tangier disease, which is characterized by the absence of HDL cholesterol. Mutations in the ATP binding cassette transporter family gene ABCA1 have been identified as the cause of Tangier disease. The Mendelian forms identified through linkage analysis represent only a minor fraction of CAD, with the majority being multifactorial.
The application of cell culture and other techniques in understanding the genetics of these Mendelian forms, particularly familial hypercholesterolemia, unraveled the molecular metabolisms of lipids, such as the role of enzyme HMG Co A reductase, a rate-limiting enzyme in cholesterol synthesis. Further research on lipids and their metabolism found the significance of cholesterol and other lipoprotein complexes of blood in heart diseases. 
Given the multifactorial nature of CAD with many susceptibility genes (genes with small to moderate effects), a combination of both linkage and association studies has been conducted to identify these genes. In these studies, the candidate gene approach and genome wide scans have been widely used during the past two decades. These studies mostly included CAD and MI phenotypes, whereas the inclusion criteria were based on a broad definition of CAD, covering all clinical phenotypes. We present here a comprehensive and critical review of the molecular genetic studies conducted so far on CAD. The literature search was made through the internet, including the National Centre for Biotechnology Information (NCBI) website, using search terms such as coronary artery disease, genetics of coronary artery disease, epidemiology of coronary artery disease, candidate genes of coronary artery disease, genome wide association study (GWAS) for coronary artery disease, genetics of coronary artery disease-Indian scenario and ethnic variation. Publications from the past two decades (till February 2012) were reviewed extensively. For GWAS and candidate gene studies of CAD, we compiled the relevant information from websites such as http://www.bioguo.org/CADgene/ and http://www.genome.gov. For articles to which we did not have free online access, we requested PDF versions from the respective authors or downloaded them at other institutes with online access to those journals.
Candidate gene approach is an assessment of the association of a particular allele or variant of a gene that might play a role in the manifestation of disease. This approach involves choosing a candidate gene polymorphism and testing for its frequency distribution and linkage disequilibrium in random samples of affected cases and unrelated controls.  The genetic associations at the human leukocyte antigen (HLA) locus with auto-immune diseases which were initially discovered and replicated by association studies and then confirmed by linkage studies show the robustness of candidate gene approach in genetic dissection of any disease. , So far, more than 300 candidate genes for CAD have been identified by this approach, which belong to a wide range of metabolic pathways like lipid metabolism, blood coagulation, blood pressure, inflammation, cell cycle regulation, and so on. A comprehensive database for CAD candidate genes which includes information from 1300 publications is provided on the website http://www.bioguo.org/CADgene/. These genes are classified into 12 functional categories which are relatively independent of each other except for the category "Others" under which only multitasking genes are included. The number of genes reported under each of these categories is shown in [Figure 3].  Of these metabolic categories, a relatively much larger number of genes was screened for immune and inflammation, lipid and lipoprotein metabolism, endothelial integrity, thrombosis and oxidation-reduction states. A list of extensively studied candidate genes with identified chromosomal loci and relevant metabolic function are given in [Table 2] and the relative proportion of studies showing significant associations for each gene is represented in [Figure 4].
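The case-control comparison described above can be illustrated with a minimal sketch. It assumes a simple allelic test (a 2x2 table of allele counts, Pearson chi-square with one degree of freedom, and an odds ratio with a Woolf-type 95% confidence interval); the function name and the allele counts are hypothetical and are not taken from any of the reviewed studies, which may have used other tests or adjustments.

```python
import math

def allelic_association(case_risk, case_other, ctrl_risk, ctrl_other):
    """Allelic case-control test on a 2x2 table of allele counts.

    Illustrative only: no continuity correction, no covariate adjustment.
    Returns (chi-square, p-value, odds ratio, 95% confidence interval).
    """
    a, b, c, d = case_risk, case_other, ctrl_risk, ctrl_other
    n = a + b + c + d
    # Pearson chi-square for a 2x2 table (1 degree of freedom)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 d.f. the upper-tail probability equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    # Woolf (log odds ratio) standard error and 95% confidence interval
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
          math.exp(math.log(odds_ratio) + 1.96 * se))
    return chi2, p_value, odds_ratio, ci

# Hypothetical allele counts: 200 cases and 200 controls (400 alleles each)
chi2, p_value, odds_ratio, ci = allelic_association(240, 160, 200, 200)
print(f"chi2={chi2:.2f}, p={p_value:.4f}, OR={odds_ratio:.2f}, "
      f"95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```

An odds ratio above 1 whose confidence interval excludes 1 is what such studies report as a significant association of the allele with the disease; replication in independent samples is still needed before the association is accepted.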
We will now discuss candidate genes that are best understood for their etiological mechanism and widely studied in different populations. Many genes are involved in the removal of LDL cholesterol, including those coding for apolipoproteins, lipases and ATP binding proteins. Apo E, as a ligand for receptor-mediated clearance of chylomicrons, chylomicron remnants and excess cholesterol, has a prominent role in determining plasma cholesterol levels. Three isoforms of this protein (E2, E3 and E4) exist, which differ from each other in the amino acid residues at positions 112 and 158. E2 has cysteine residues at both sites, E3 has cysteine at position 112 and arginine at 158, and E4 has arginine at both sites. The E4 allele is found to confer increased susceptibility to atherosclerosis, whereas E2 is found to be both atherogenic and anti-atherogenic in Finnish, Caucasian, Indian and Chinese populations. Consistent association is also observed for other lipoprotein genes clustered at the chromosomal locus Apo AI-CIII-AIV, by traditional restriction fragment length polymorphism (RFLP) marker-based linkage and case control studies among Americans, Europeans and Asians. Two common variants, D9N (aspartic acid replaced by asparagine) and N291S (asparagine replaced by serine), of another prominent gene, LPL, which codes for lipoprotein lipase, a rate-limiting enzyme in lipolysis of triglyceride-rich lipoproteins, are also associated with the disease phenotype.
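The isoform definitions given above amount to a small lookup on the residues at positions 112 and 158. The following sketch only restates that rule; the function name is hypothetical, and residue combinations not described in the text return None rather than a guessed label.

```python
def apoe_isoform(residue_112, residue_158):
    """Return the Apo E isoform implied by the residues at positions 112 and 158.

    Encodes only the three combinations described in the text; any other
    combination returns None.
    """
    isoforms = {
        ("Cys", "Cys"): "E2",  # cysteine at both sites
        ("Cys", "Arg"): "E3",  # cysteine at 112, arginine at 158
        ("Arg", "Arg"): "E4",  # arginine at both sites
    }
    return isoforms.get((residue_112, residue_158))

for pair in [("Cys", "Cys"), ("Cys", "Arg"), ("Arg", "Arg")]:
    print(pair, "->", apoe_isoform(*pair))
```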
Steps that control cholesterol delivery and disposal are also regulated by membrane transporters of the ATP binding cassette super family and 4 genes of this family, namely ABCA1, ABCG1, ABCG5 and ABCG8, were found to be associated with monogenic forms like Tangier disease and sitosterolemia. Single nucleotide polymorphism (SNP) analysis of ABCA1 identified 20 common variants in the coding, promoter, and 5'untranslated region of the gene.  Kyriakou et al., genotyped for 15 common SNPs in the proximal promoter region and identified a functional polymorphism-407 G > C associated with higher age at onset of the symptoms in angiographically confirmed 1164 British European CAD patients. 
Similarly, a 287 bp I/D polymorphism in intron 16 of ACE, a gene of the Renin Angiotensin System (RAS), was initially found associated with MI by Francois Cambien in 1992. Since then it has been replicated in several studies among Caucasians, Europeans, Germans, Indians, Japanese and other populations, and a meta-analysis involving 43,733 CAD cases and 82,606 controls from 118 studies revealed a 25% increased risk of the disease for the DD genotype. Later, variants in other genes of the RAS, such as the M235T variation of the angiotensinogen gene (AGT), the 1166 A to C polymorphism of the angiotensin II receptor type 1 gene (AGTR1) and the -1332G variant of the angiotensin II receptor type 2 gene (AGTR2), have been found associated with the disease. Common variants of important genes with populations studied are provided in [Table 3].
Although genes such as ACE, AGTR1, AGT, MTHFR, NOS3, PON1, SERPINE1 and IL6 are found associated and widely replicated across populations, a relatively greater proportion of studies failed to show association as compared to genes of lipid metabolism, implying the prominent role of lipids in heart ailments. On the other hand, studies on established factors like chemokine receptors, interleukins of the inflammatory pathway, coagulation factors like F5 and F7, and thrombospondin genes are relatively few; hence more replication studies for such genes are required in the near future to gauge their relative importance in the pathophysiology of CAD.
Despite limitations like the pleiotropic gene effect, study design flaws, lack of statistical power and non-replicability of results,  candidate gene studies have potentially validated a large number of genes to have been associated with CAD. But the ultimate aim of providing a simple diagnostic test by genetic screening remains unfulfilled for CAD or for any other complex disease. Towards realizing this possible goal, genome wide scans for nonconventional genetic markers have been considered useful and carried out widely in the recent past.
Genome wide scanning: Family-based linkage analysis for microsatellite markers
Genome wide scanning was initially done for microsatellite markers that are distributed evenly across the genome at intervals of about ten centimorgans (approximately ten million base pairs). A significant linkage peak, which is defined by a LOD score of 3.5, indicates a gene near or within the marker that is in linkage disequilibrium with the disease. The first genome wide linkage scan involved screening of 303 microsatellite markers in 156 Finnish families including at least two affected individuals with premature CAD. Affected sib pair analysis of the population yielded two point LOD scores of 3.7 and 2.9 for the chromosomal regions 2q21.1-22 and Xq23-26, respectively. Similar studies on other populations with identified chromosomal loci are given in [Table 4].
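To make the LOD score threshold concrete, the sketch below computes a two-point LOD score under a deliberately simplified assumption: fully informative, phase-known meioses in which recombinants can be counted directly. This is not the affected sib pair method used in the Finnish study; the function names and the counts (20 meioses, 2 recombinants) are hypothetical.

```python
import math

def two_point_lod(n_meioses, n_recombinants, theta):
    """LOD(theta) = log10[ L(theta) / L(0.5) ] for phase-known meioses.

    L(theta) = theta^r * (1 - theta)^(n - r), with r recombinants among
    n informative meioses. Requires 0 < theta <= 0.5.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    r, n = n_recombinants, n_meioses
    return (r * math.log10(theta)
            + (n - r) * math.log10(1 - theta)
            - n * math.log10(0.5))

def max_lod(n_meioses, n_recombinants):
    """Scan theta = 0.01 .. 0.50 and return (maximum LOD, theta at maximum)."""
    grid = [i / 100 for i in range(1, 51)]
    return max((two_point_lod(n_meioses, n_recombinants, t), t) for t in grid)

# Hypothetical data: 2 recombinants observed in 20 informative meioses
lod, theta = max_lod(20, 2)
print(f"maximum LOD = {lod:.2f} at theta = {theta:.2f}")
print("exceeds the 3.5 threshold used in the text:", lod >= 3.5)
```

A LOD score of 3 corresponds to odds of 1000:1 in favor of linkage over free recombination, which is why peaks near or above this range are treated as evidence that a disease gene lies close to the marker.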
So far only one locus, 2p11, which is a secondary locus found in Americans, has been replicated in the British Heart Foundation (BHF) study. Strikingly, three studies (Pajukanta et al., Harrap et al., and Samani et al.) have identified nearby regions on chromosome 2, suggesting that this stretch may harbor disease-susceptibility genes. Helgadottir et al. identified a novel candidate gene, arachidonate 5-lipoxygenase activating protein (ALOX5AP), involved in leukotriene B4 production, in an Icelandic population at the chromosomal locus 13q12-13. Further screening for SNPs and LD mapping of the locus revealed a significant association of a four-SNP haplotype called Hap A with MI patients (P=0.005). While determining the risk conferred by these SNP variants in a British population, the same group found another four-SNP haplotype called Hap B (P=0.046). These two haplotypes are mutually exclusive. They also found elevated levels of leukotriene B4 in the blood neutrophils of MI patients, suggesting a gain of function mutation in the variant Hap A. A subsequent replication study in a Scottish population observed only Hap A being significantly associated with ischemic stroke.
Another study scanned 382 microsatellite markers in a large family with 13 CAD patients showing an autosomal dominant pattern of disease inheritance and found positive linkage for the marker D15S120 at the chromosomal locus 15q26, which is designated as adCAD1. A 21 bp deletion in the MEF2A (myocyte enhancer factor 2A) gene, which codes for a transcription factor, was identified at this locus in all living affected individuals of the family. The deletion, which was absent in normal family members and in 119 other individuals with normal angiograms, results in a loss of seven amino acids of the protein. However, these results could not subsequently be replicated in an Italian population. The LDL receptor-related protein 6 (LRP6) gene at the chromosomal locus 12p13 is another novel gene, with a missense mutation found in a linkage study covering four generations of a family with early-onset coronary disease and osteoporosis transmitted in an autosomal dominant fashion. For further replication, this study needs to be conducted in families with a similar phenotype.
Similarly, several other genes like Galectin 2 (LGALS2), Arachidonate 5 Lipoxygenase (ALOX5), Phosphodiesterase 4 D (PDE4D), Leukotriene A 4 hydrolase (LTA4H), Lymphotoxin alpha (LTA) and Connexin 37 (CX37) were identified as susceptibility genes by this approach. But whole genome microsatellite linkage analysis was not successful because of the underlying difficulty in identifying genes in or near such a large marker region. The inconsistency in the results of these studies could also be attributed to several other factors like age of the subjects, diagnostic clinical phenotype, ethnicity, number of families recruited for the study and different statistical methods used in the analysis.
Genome wide association studies based on SNPs
While the family-based studies failed to replicate, the advent of microarray technologies and use of SNP markers have introduced a modified approach called Genome Wide Association Scan. This was possible only with the successful completion of the human genome project, after discovering millions of SNPs in the human genome, and the International HapMap Project that characterized the linkage disequilibrium patterns.  The identification of complement factor-H as a causal factor for Age-Related Macular Degeneration (ARMD) and identification of genetic factors like TCF7L2, SLC30A8, KCNJ11 and HHEX for Type II Diabetes are two successful Genome Wide Association Studies (GWAS) for complex disorders. 
Employing this approach for coronary disease resulted in the identification of 9p21.3, the most replicated genetic signal around the world, highlighting the reliability of GWAS. Since then several loci have been identified by study groups like the Wellcome Trust Case Control Consortium, the Ottawa Heart Study, the German Myocardial Infarction Family Studies, deCODE and the Myocardial Infarction Genetics Consortium. A catalogue of their published studies is provided on the website http://www.genome.gov by the National Human Genome Research Institute (NHGRI). A meta-analysis of 14 GWAS by the Coronary ARtery DIsease Genome-wide Replication And Meta-analysis (CARDIoGRAM) group replicated ten already established loci and identified 13 novel loci plus two loci in a more focused design. The International Consortium for Coronary Artery Disease is another group involved in large-scale meta-analyses of GWAS and has come up with three additional loci. So far 32 novel loci have been established from both independent GWAS and meta-analyses for the CAD/MI phenotype. Some of the significant loci from GWAS with nearby genes identified and the associated risk alleles are given in [Table 5].
Pleiotropism and intergenic or intronic localization in the genome are characteristic features of many GWAS-identified variants. Of the total 32 loci, 9p21.3 and 11q23.3 are the most pleiotropic, harboring 26 and 20 statistically significant associations, respectively, with various phenotypes. The variants for the 9p21.3 locus are found in the intergenic region of the cell cycle regulating genes CDKN2A and CDKN2B, confer a 29% increased risk for CAD and are also associated with other complex diseases like diabetes, Alzheimer's, stroke, cancer and aneurysms. Of these SNP variants, rs10757274(G) and rs2383206(G) are the two common alleles found associated with CAD in three independent cohorts of a Caucasian population, and rs2383207(G) and rs10757278(G) with MI in Germans. However, one of the studies on Caucasians failed to show an association between the 9p21.3 locus and carotid intima media thickness, suggesting the need for mechanistic studies to delineate the role of SNPs at this locus in plaque instability and intra-arterial thrombosis. Along with 9p21.3, another locus, 6q25.3, is observed to increase the CAD risk by 51% and is found to harbor the LPA, LPAL2 and SLC22A3 genes belonging to lipid metabolism. Besides these, GWAS has covered several additional loci that are related to CAD risk factors, including: low-density lipoprotein (LDL) cholesterol (1p13.3, 1p32.3, 9q34.2, 11q23.3, and 19p13.2), high-density lipoprotein (HDL) cholesterol (11q23.3, 17p13.3, and 19p13.2), total cholesterol (1p13.3, 9q34.2, 11q23.3, and 19p13.2), body mass index (BMI) (6p21.3 and 17q21.32), triglyceride (11q23.3), blood lipid trait (1p13.3), obesity (6p21.31, 6q23.2 and 7q22.3), adiposity (1q41), alcoholism (6p21.31), smoking behavior (9q34.2 and 15q25.1), diastolic blood pressure (DBP) (12q24.12 and 17q21.32), and systolic blood pressure (SBP) (10q24.12 and 12q24.32).
But very few of the earlier reported candidate genes are identified in the marker loci of GWAS, e.g., APOA1, APOA5, and APOC3 in the 11q23.3 locus. While developing technology integrated with statistical methods has made GWAS a potential method for finding new genetic markers for CAD, pleiotropic effects of genes, existence of subclinical phenotypes and genetic heterogeneity are limiting factors in attempts to use the method for developing a genetic risk profile test.
From the epidemiological and etiological studies, it is evident that ethnic variation is often associated with CAD in terms of incidence, prevalence and estimated effects of risk factors, and it has been a limiting factor in replication of genetic studies. BHF statistical database of ethnic inequalities among Whites, Blacks and South Asians suggests that the incidence of CAD is more in South Asian migrants, accounting for a quarter of all deaths in England, whereas stroke incidence rates are more in Blacks than in the White ethnic group for both sexes.  However, no significant ethnic differences have been found for systolic blood pressure, serum cholesterol and smoking among Whites and Blacks, but for all causes of cardiovascular deaths the incidence is found to be higher in Blacks than Whites.  These findings are confirmed in three Canadian ethnic cohorts of South Asian, Chinese and European ancestry, reporting maximum Intima Media Thickness (IMT) or subclinical atherosclerosis in South Asians.  Among the South Asians which include people of India, Pakistan and Bangladesh, the incidence of CAD is highest in India.  Indian migrants were also found to have high risk as compared to Malay and Chinese migrants in Singapore.  Angiographic patterns in South Asians are similar to those observed in diabetes, and as diabetes is more common among South Asians, this could account for ethnic difference in the disease. 
India, with 2.9 million deaths in the age group 25-69, accounted for 25% of CAD deaths in developing countries in the year 1990, and this is likely to increase by 111% by 2020, as compared to 77% for China, 106% for other Asian countries and 15% for developed countries. Epidemiological studies of CAD, which began in the early twentieth century in India, were confined to areas like Delhi, Agra, Jaipur and Chandigarh in the north and only Kerala and Chennai in the south, and confirmed the increasing prevalence of the disease in urban India. The prevalence of CAD in urban adults of India has increased fourfold in 40 years, and even in rural areas it has doubled during that time. The factors responsible for the increased prevalence of CAD in India include adoption of unhealthy lifestyles comprising lack of exercise and tobacco consumption, nutrition transition towards an atherogenic, cholesterol-rich diet and socioeconomic transition associated with urbanization and industrialization.
The Indian subcontinent, with one-sixth of the world's population and with diverse ethnic, linguistic and cultural groups, has always been a centre of focus for genetic studies. With evidence of the increasing prevalence of CAD during the past two decades among Indians and Indian migrants in other countries, a large number of genetic association studies were conducted among them, which replicated the association pattern of many of the candidate gene polymorphisms. Despite inherent difficulties of the design, two recent studies attempted affected sib pair analysis in order to determine the familial nature of CAD. The Indian Atherosclerotic Research Study (IARS) is an ongoing genetic study investigating the molecular basis of CAD in Indian families with a strong family history. The participants for this study were drawn from clinics in Bangalore and Mumbai, and this group has identified the APOC3-Sac1 SNP, an important genetic variant that is associated with CAD through its interaction with plasma lipids. Another group from the Indian Statistical Institute, Kolkata, screened 209 SNP markers in 31 genes for ten quantitative traits, namely apolipoprotein B (ApoB), C-reactive protein (CRP), fibrinogen (FBG), homocysteine (HCY), lipoprotein (a) (LPA), cholesterol - total (CHOL-T), cholesterol - HDL (CHOL-H), cholesterol - LDL (CHOL-L), cholesterol - VLDL (CHOL-V) and triglyceride (TG), in 144 nuclear families of a homogenous Marwari population from Kolkata. Through Q-TDT analysis, they found nine SNPs of four genes (SELE, VEGFA, FBG and NFKB1) to impact significantly on quantitative precursors of CAD. However, genetic association studies replicated very few of the major genes like ApoE, ApoA5, ACE, MTHFR, eNOS and PON1 among the Indian populations [Table 6].
With reference to the number of genes associated with CAD worldwide, the candidate gene studies conducted so far on Indian populations are insufficient to draw any comparison. Nevertheless, the prominent locus 9p21.3 found in GWAS is replicated for three SNPs (rs10116277, rs1333040 and rs2383206) in the North Indians and for two SNPs (rs2383207 and rs10757278) in the South Indian population. This 9p21 locus, which includes a large 53 kb region, has been hypothesized to regulate expression of genes controlling cell proliferation pathways, leading to atherosclerosis. Hence there is a need for an analysis of the risk conferred by this region among different ethnic groups in India.
Unlike diabetes, ARMD, Alzheimer's and other complex diseases, where GWAS found TCF7L2, CFH, ApoE-like major genes, respectively, results of genetic research of CAD are quite unsatisfactory in the sense that no major gene has yet been identified. There is inconsistency in the results making it difficult for researchers to identify a demarcating line between the disease-causing genes and disease-susceptible genes of CAD. Next generation sequencing such as exome sequencing may resolve this problem by identifying disease-causing major genes, including the rare ones. However, given its complex nature with a large number of genes showing modest effects, CAD has become a promising phenotype for the geneticists. Mismatch of cases and controls in terms of age and ethnicity and lack of statistical power due to small sample size might have led to spurious associations which cannot be replicated. Another challenge in studying such common diseases is in finding ideal control samples for association studies and/or finding large families to replicate the inheritance patterns from linkage studies. Therefore, an appropriate design for association study considering homogenous phenotype, large sample size with cases and controls drawn from similar age group and ethnicity should be employed. With increasing number of novel genetic factors being identified, multi-ethnic replication studies are necessary to identify populations that are genetically more predisposed to CAD, which may help in devising preventive measures.
There is also a need for surveillance studies in highly populated developing countries like India and China in order to timely assess the disease prevalence. Unless there are effective prevention strategies these nations will have to bear the adverse consequences of the increasing socio-economic burden of these complex genetic disorders. Despite the evidence that Indian migrants in other countries are relatively more susceptible to CAD, a very limited number of candidate genes studies have hitherto been conducted on Indian populations. Given the high prevalence of this disease, the presence of environmental triggers and genetic variation, it is feasible to conduct a multi-ethnic large-scale study in India that would represent the whole of South Asia.
[Figure 1], [Figure 2], [Figure 3], [Figure 4]
[Table 1], [Table 2], [Table 3], [Table 4], [Table 5], [Table 6]