Equine Genomics & Genetics: The Impact on Disease Diagnosis, Research & Treatment
Molly E. McCue, DVM, MS, PhD, DACVIM
St. Paul, MN, USA
Within the last 15 to 20 years there has been an explosion of information on medical genetics that has greatly influenced the practice of human medicine. Most of these advances have been the result of the Human Genome Project, which sequenced the entire human genome and located the estimated 20,000 to 25,000 human genes on their respective chromosomes. The sequencing of the human genome has not only allowed for the discovery of numerous disease mutations, it has also resulted in the development of "genomic medicine." Genomics is a relatively recent term that refers not just to the study of single genes and their effects, but to the functions and interactions of all the genes in the genome. Genetics in human medicine is moving away from simply the diagnosis and treatment of single gene diseases and towards understanding how a patient's underlying genetic susceptibilities affect health, disease risk and treatment choices.
In 2006 The Broad Institute of Harvard and MIT sequenced the genome of the Thoroughbred mare Twilight and made the equine genome sequence publicly available early in 2007. Through the efforts of the Broad Institute and previous and current work done in the equine genetics community, over 90% of the approximately 2.7 billion base pairs of the equine genome sequence have been identified by their linear arrangement and anchored on chromosomes. Through the Morris Animal Foundation's Consortium in Equine Medical Genetics this equine sequence information has been used to develop powerful tools that allow equine genetic researchers to study simple genetic diseases more effectively and that open the door to the study of more complex equine diseases.
Impact of Genetic Traits in Equine Medicine
Until recently, genetics in equine medicine has focused primarily on single-gene (monogenic) diseases, in which a mutation within a single gene, inherited in a simple Mendelian pattern (autosomal dominant, autosomal recessive, sex-linked), results in significant disease in the affected individual. To date, all known equine disease mutations cause single-gene diseases; these include Hyperkalemic Periodic Paralysis (HYPP)1, Severe Combined Immunodeficiency (SCID)2, Overo Lethal White Syndrome (OLWS)3, Junctional Epidermolysis Bullosa (JEB)4, Glycogen Branching Enzyme Deficiency (GBED)5, Malignant Hyperthermia (MH)6, Hereditary Equine Regional Dermal Asthenia (HERDA)7 and type 1 Polysaccharide Storage Myopathy (type 1 PSSM).8 Although this list is short, single gene diseases are surprisingly prevalent, which is likely due to selective breeding practices. Selective breeding in animal populations may give rise to a common founder that can disseminate the same genetic mutation to all affected offspring. If highly popular sires carry a genetic mutation, their descendants can very rapidly produce many thousands of related offspring carrying the same mutation. This was certainly the case with the stallion Impressive and HYPP, which is estimated to affect 4% of the Quarter Horse population.9
In the simple view of monogenic diseases, the disease genotype predicts the presence or absence of the disease phenotype. However, even for single gene diseases, this relationship is not always clear. For example, the phenotype of horses with the GYS1 mutation responsible for type 1 PSSM varies considerably from subclinical disease to severe rhabdomyolysis which can result in death. The role of environmental factors in this disease has been well established. Horses with PSSM can be managed clinically with a high-fat, low-starch diet and consistent daily exercise.10 However, we have recently demonstrated that horses with the GYS1 mutation that have a concurrent mutation in a second gene have a clinically more severe phenotype that is less responsive to management. Thus, although type 1 PSSM is the result of a single gene autosomal dominant mutation, management of this disease is complicated by both environmental and genetic factors.
Many other genetic traits are thought to be caused by the effects of multiple genes (polygenic traits) or the combination of more than one gene and environmental factors (multifactorial traits). The inheritance and expression of these traits are complex, although it can be demonstrated that they have a genetic component. Examples of polygenic and/or multifactorial traits in equine medicine include Recurrent Airway Obstruction (RAO), Equine Metabolic Syndrome (MetS), Osteochondrosis/Osteochondritis Dissecans (OC/OCD) and Cervical Stenotic Myelopathy (CSM). Quantitative traits, or those traits that are measured on a continuous numerical scale, are also examples of multifactorial traits. Genetic variation in multifactorial disorders may have a pathologic or protective role in the disease process.
While monogenic diseases are common in domestic animal populations when compared to human populations, they still account for only a relatively small portion of clinical practice. Multifactorial diseases, however, account for an even greater portion of clinical equine practice. This may be due in part to the fact that the underlying genetic predispositions to these diseases go unnoticed when environmental conditions are favorable, or that genetic predisposition is not recognized as a component of the disease, resulting in genetically susceptible individuals remaining in the breeding population. Monogenic diseases in equine medicine have had a large impact on the health of a relatively small number of equine patients, whereas polygenic and multifactorial diseases tend to have a moderate impact on the health of a larger number of patients.
In clinical equine patients, the detection of disease-causing mutations in monogenic disease is typically pursued as a definitive diagnosis, or as part of a pre-breeding assessment. In cases such as PSSM and HYPP, where appropriate management (diet and exercise) and treatment (acetazolamide) are available, genetic testing allows for preventative care. Similarly, identification of the underlying genetic predispositions in multifactorial diseases such as RAO or MetS will allow for the identification of susceptible individuals and the implementation of preventative measures prior to clinical manifestation of disease.
Underlying genetic predispositions are also important in the susceptibility to infectious diseases. The classic example of this is the protective effect of a 32 base pair deletion in the CCR5 gene. Humans homozygous for this CCR5 mutation are highly resistant to infection by the HIV strains that use CCR5 as a co-receptor. In the future, identification of genetic susceptibility to equine infections such as Rhodococcus equi could help the clinician to identify susceptible individuals on a farm where R. equi is endemic.
Beyond disease diagnosis and genetic susceptibility, a patient's genetic make-up could also eventually be used to predict a patient's drug response. It is estimated that 20-95% of the variation in drug disposition and effect in human patients is due to genetic effects. This impact of genetic factors has led to the formation of an entirely new discipline termed "pharmacogenetics."
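As an illustration of how a genotype result can inform a pre-breeding assessment, the following minimal sketch computes the expected genotype proportions among offspring at a single autosomal locus. It assumes simple Mendelian segregation (each parental allele transmitted with probability 1/2) and ignores the variable expression discussed above. The function name and the allele symbols "M" (mutant) and "n" (normal) are hypothetical and chosen only for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(sire, dam):
    """Expected offspring genotype proportions at one autosomal locus.

    Each parent is a two-character string of alleles, e.g. "Mn".
    Assumes each parental allele is transmitted with probability 1/2.
    """
    # Enumerate the four equally likely allele combinations (Punnett square)
    # and sort the alleles so that "nM" and "Mn" count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(sire, dam))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygous sire bred to a normal dam (dominant disease: half affected).
print(offspring_genotypes("Mn", "nn"))
# Two carriers of a recessive mutation bred together.
print(offspring_genotypes("Mn", "Mn"))
```

Under these assumptions, a heterozygous parent mated to a normal parent is expected to transmit the mutation to half of the offspring, and a carrier-to-carrier mating is expected to produce one quarter homozygous mutant offspring.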
Candidate Genes and Genetic Mapping: Finding Disease Mutations and Underlying Genetic Susceptibility
The first step in moving towards genomic medicine in equine practice is the identification of the genetic factors responsible for monogenic and polygenic diseases as well as disease susceptibility. Clinicians play a critical role in this process. Heritable traits should be suspected when clusters of affected individuals are identified, the trait resembles a heritable disease in other species, or specific family lines are apparent. Because heritable traits usually occur at a relatively low rate in populations, it may take a considerable amount of time to recognize that several related individuals are presenting with a similar disease phenotype. This is particularly true with diseases such as HERDA and PSSM, where the disease often does not manifest itself until the animals are > 2 years of age.7,10 This problem of recognition is magnified in the multifactorial diseases, where particular environmental triggers are necessary for disease to be manifested. Therefore it takes the efforts of astute clinicians to recognize particular disease patterns and make the connection to genetic disease.
Two different methods are used to identify disease mutations: the candidate gene approach and genetic mapping approaches. The candidate gene approach involves identification of one or more genes likely to contain the causative mutation, based on the known biochemical function of the gene product or a homologous disease in other species. The candidate gene approach was used to identify the mutations responsible for HYPP, SCID, OLWS, JEB, GBED and MH in horses. The suspect gene is sequenced in both normal and affected individuals to determine whether a detrimental mutation occurs in the affected individuals. A candidate gene can be relatively rapidly sequenced for coding sequence mutations by reverse transcriptase PCR from affected and control tissues. Sequencing of candidate genes has been dramatically simplified by the equine genome sequence.
Although candidate gene approaches are powerful and relatively rapid, they rely upon correct identification of a candidate gene. Identification of a candidate is difficult when no homologous disease is known, biochemical evidence is lacking or potential candidates based on phenotype have been ruled out. In these scenarios, genetic mapping approaches are used to identify "positional candidate genes". Genetic mapping approaches use the inheritance of DNA markers to identify the region of the genome containing the disease gene. The two broad categories of genetic mapping are linkage analysis and association mapping. Both methods rely on the basic genetic principle that the closer two loci are together on a chromosome, the less likely recombination will separate them, and the more likely they will be inherited together. When two genetic loci are inherited together they are "linked". Thus linkage and association analysis aim to identify DNA marker alleles that are always inherited with the disease phenotype. These marker alleles should be linked to the disease gene allele. Because the chromosomal location of the genetic marker is known, the location of the disease gene can be identified.
Linkage analysis evaluates the inheritance of DNA markers within families and is the most powerful mapping approach. However, whole genome scanning using linkage analysis in horses has been limited by the difficulty of developing appropriate, accurately phenotyped families. Access to large, multigenerational, highly informative equine families is difficult for several reasons. First, the development of large full sibling resource families is time consuming and expensive. The gestational period in horses is 335 to 342 days, resulting in at most one healthy offspring per mating per year. A minimum of two additional years is required for the F1 generation to reach sexual maturity, resulting in a minimum 3 year period to the birth of the F2 generation. Attaining naturally occurring multigenerational families in which a genetic defect is segregating is also difficult. Gaining the cooperation of owners and breeders is not easy, due to their financial interest in the horses involved and the concern over confidentiality when a genetic disease is present in their breeding herd. In addition, most equine breeding operations are in the business of producing horses to sell for profit prior to physical and sexual maturity. Thus the offspring produced may be sold before the disease phenotype is manifested, and few individuals are maintained in the breeding program, limiting the number of related individuals in subsequent generations. Furthermore, the practice of mating mares to different stallions in subsequent years is commonplace, resulting in a large number of half-sibling pairs with few full sibling families, which decreases the informativeness of these naturally occurring pedigrees when compared to planned full-sibling breeding trials.
The use of association analysis is particularly attractive in equine genetic research because it has the potential to overcome these limitations. Association mapping utilizes populations of unrelated cases and controls and is based on the principle of linkage disequilibrium (LD). Linkage disequilibrium is the nonrandom association between genetic loci.11-13 LD typically refers to allelic association: certain alleles occur together more frequently than would be explained by random chance because the alleles are closely linked on a chromosome.11 This nonrandom association of alleles can be utilized to identify the location of disease-causing mutations when the gene is unknown. When a genetic mutation occurs within an individual (the founder), the flanking chromosomal segment has a particular set of alleles. When the founder individual passes the chromosomal segment with the mutation on to offspring, the offspring also inherit this particular set of flanking alleles. In initial generations the size of the conserved chromosomal segment is large; however, over time the size of the conserved chromosomal segment flanking the mutation is eroded by recombination, mutation and genetic drift.11-13 Despite the erosion of the length of the chromosomal segment (and LD) over time, the marker alleles most closely linked to the mutation are much less likely to be altered by recombination and will be conserved. Individuals that have the disease-causing mutation will therefore still carry a portion of the surrounding chromosomal segment from the founder many generations later. The linkage between an unknown disease-causing mutation and particular alleles can be utilized to determine the chromosomal position of the unknown gene by genetic (allelic) association.12,14 Because the actual disease-causing mutation is unknown, cases are defined by the presence of the disease phenotype and controls are defined by the absence of the disease phenotype.
When a marker locus (observed) is in linkage disequilibrium with the disease locus (unobserved), the allele frequencies in cases will differ from the allele frequencies in controls.12,13 This difference in allele frequencies can be statistically detected with Pearson's chi-square (χ²) test of independence.12
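The erosion of LD by recombination described above can be illustrated with a standard population-genetics approximation that is not taken from this paper: if r is the recombination fraction between a marker and the mutation, the expected LD remaining after t generations is the initial LD multiplied by (1 - r)^t. The sketch below assumes random mating and ignores mutation and genetic drift; the function names are hypothetical.

```python
import math


def ld_remaining(r, generations):
    """Fraction of the founder's LD expected to remain after `generations`.

    r is the recombination fraction between marker and mutation per
    generation. Assumes random mating; ignores mutation and drift.
    """
    return (1.0 - r) ** generations


def generations_to_halve(r):
    """Number of generations for the expected LD to fall to one half."""
    return math.log(0.5) / math.log(1.0 - r)


# Compare a loosely linked marker (r = 0.01) with a tightly linked one
# (r = 0.001) after 100 generations.
for r in (0.01, 0.001):
    print(r, round(ld_remaining(r, 100), 3), round(generations_to_halve(r)))
```

With these illustrative values, roughly a third of the original LD remains after 100 generations for the marker at r = 0.01, while about 90% remains for the marker at r = 0.001, which is why only the most closely linked markers stay associated with an old mutation.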
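The test described above can be sketched for a single biallelic marker as follows. This is a minimal illustration using only the Python standard library, not the analysis pipeline used by the author. The allele counts in the example are invented for demonstration, and the function name is hypothetical. The p-value uses the fact that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(χ²/2)).

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square test of independence on a 2x2 allele-count table.

    case_counts and control_counts are (count of allele 1, count of allele 2).
    Assumes both alleles and both groups are observed at least once, so that
    no expected count is zero. Returns (chi2, p_value) for 1 degree of freedom.
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if allele frequency were independent of status.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variate with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented example: 50 cases and 50 controls, so 100 alleles per group.
chi2, p = allelic_chi_square(case_counts=(60, 40), control_counts=(30, 70))
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
```

In a whole genome scan this test would be repeated at every marker, so the resulting p-values need to be interpreted with a correction for multiple testing.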
There are two types of DNA markers typically used for genetic mapping: the microsatellite marker (MS) and the single nucleotide polymorphism (SNP). Until recently, MS were the only markers available for equine mapping projects. The use of association mapping requires that a marker allele is in LD with the disease locus and is therefore dependent on the availability of MS and their genome coverage. PSSM and HERDA are both examples of diseases whose loci were successfully mapped using MS in the horse. With only about 3000 MS markers mapped on the equine genome, and uneven genome coverage, however, MS have not been powerful enough to map other equine diseases. Limitations with association mapping due to genome coverage will be greatly reduced by the availability of the equine genome sequence. There are now tens of thousands of MS that can be identified from the genome sequence. Additionally, the Broad Institute of Harvard and MIT's SNP discovery effort in Twilight and seven representatives from both recently developed and ancient breeds has identified 1.5 million SNPs in the equine genome. An equine whole genome SNP genotyping chip (~20 SNPs/Mb) is now available, which greatly enhances the power to detect genetic association and find disease genes by analyzing 60,000 SNP markers on a single horse at the same time. We predict that these SNP genotyping chips will allow for rapid identification of monogenic disease genes and will provide the power to map polygenic and multifactorial traits.
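The difference in genome coverage between the two marker sets can be made concrete with a little arithmetic. The sketch below uses the figures given in the text (a genome of approximately 2.7 billion base pairs, about 3000 mapped MS and 60,000 SNPs on the chip) and assumes evenly spaced markers, which the text notes is not true of the MS map. The Bonferroni threshold is included only as one common, conservative convention for multiple testing; it is an assumption of this example and not a method described in the paper. All function names are hypothetical.

```python
GENOME_SIZE_BP = 2.7e9  # approximate equine genome size given in the text


def markers_per_mb(n_markers, genome_bp=GENOME_SIZE_BP):
    """Average marker density in markers per megabase."""
    return n_markers / (genome_bp / 1e6)


def mean_spacing_kb(n_markers, genome_bp=GENOME_SIZE_BP):
    """Average distance between adjacent markers in kilobases,
    assuming (unrealistically) perfectly even spacing."""
    return genome_bp / n_markers / 1e3


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-marker significance threshold under a Bonferroni correction
    (an illustrative convention, not taken from the source)."""
    return alpha / n_tests


for label, n in (("microsatellites", 3000), ("SNP chip", 60000)):
    print(
        f"{label}: {markers_per_mb(n):.1f} per Mb, "
        f"~{mean_spacing_kb(n):.0f} kb apart, "
        f"Bonferroni p < {bonferroni_threshold(n):.1e}"
    )
```

Under these assumptions the MS map averages about one marker per megabase, while the SNP chip averages about 22, consistent with the approximately 20 SNPs/Mb quoted above.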
With the new genetic tools available, the rate-limiting step in the identification of disease traits will be the identification of large, well-phenotyped populations to study. This is why the clinician is critical in this process. Once a heritable pattern is recognized, it is critical to collect samples from as many affected individuals as possible. Ideally, 10-20 ml of blood should be collected from both affected and control individuals, and the buffy coat collected and stored at -80°C for future DNA extraction. Whenever possible, tissues should also be collected and flash frozen for future mRNA extraction. Tissues should be selected based on knowledge of disease expression.
Beyond Mapping Disease Genes: Other Research Made Possible by New Genetic Tools
In addition to identifying disease genes, SNP markers and new genomic tools can be used to answer a wide variety of additional research questions. For example, the variation in haplotype, or linear arrangement of SNP markers on chromosomes, was used to determine that the PSSM GYS1 mutation had the same ancestral origin in all the breeds in which it has been identified.8 This same haplotype variation was used to statistically determine that the GYS1 mutation arose approximately 1200-1500 years ago, prior to the creation of modern draft horse breeds.8 We are also in the process of determining whether the high prevalence of the mutation in some horse breeds is due to a selective advantage.
In addition to answering questions about disease mutations, new genetic tools can be utilized to answer questions about normal biology and disease pathology. The availability of the whole genome sequence will also allow for the development of tools such as equine-specific expression arrays. Whole genome expression arrays are now used in human, mouse and economically important agricultural species to identify genes regulating normal growth and development, disease resistance and progression and the manifestation of complex traits. Although the etiology and tissue pathology for a range of equine diseases are known, very little is known about the actual genes that are involved, how their patterns of expression are altered in affected tissues, or their effects on the initiation, progression and manifestation of disease. There is also inadequate knowledge of the expression levels of groups of genes under a variety of normal conditions, including normal growth and development. Availability of equine expression arrays will lead to research to identify the relevant genes, elucidate the DNA sequence variation within these genes, and characterize their expression. A range of developmental bone diseases, infectious diseases, laminitis, respiratory diseases, allergic diseases and other conditions that involve a combination of environment, heredity and management factors will be better understood by the application of expression microarray studies than by the application of any of the currently available tools and resources.
1. Spier SJ, Carlson GP, Harrold D, et al. Genetic study of hyperkalemic periodic paralysis in horses. J Am Vet Med Assoc 1993;202(6):933-937.
2. McGuire TC, Poppie MJ, Banks KL. Combined (B- and T-lymphocyte) immunodeficiency: a fatal genetic disease in Arabian foals. J Am Vet Med Assoc 1974;164(1):70-76.
3. Santschi EM, Purdy AK, Valberg SJ, et al. Endothelin receptor B polymorphism associated with lethal white foal syndrome in horses. Mamm Genome 1998;9(4):306-309.
4. Baird JD, Millon LV, Dileanis S, et al. Junctional epidermolysis bullosa in Belgian draft horses. 2003;2003.
5. Ward TL, Valberg SJ, Adelson DL, et al. Glycogen branching enzyme (GBE1) mutation causing equine glycogen storage disease IV. Mamm Genome 2004;15(7):570-577.
6. Aleman M, Riehl J, Aldridge BM, et al. Association of a mutation in the ryanodine receptor 1 gene with equine malignant hyperthermia. Muscle Nerve 2004;30(3):356-365.
7. Tryon RC, White SD, Bannasch DL. Homozygosity mapping approach identifies a missense mutation in equine cyclophilin B (PPIB) associated with HERDA in the American Quarter Horse. Genomics 2007;90(1):93-102.
8. McCue ME, Valberg SJ, Miller MB, et al. Glycogen synthase (GYS1) mutation causes a novel skeletal muscle glycogenosis. Genomics 2008; accepted for publication.
9. Bowling AT, Byrns G, Spier S. Evidence for a single pedigree source of the hyperkalemic periodic paralysis susceptibility gene in quarter horses. Anim Genet 1996;27(4):279-281.
10. Valberg SJ, MacLeay JM, Billstrom JA, et al. Skeletal muscle metabolic response to exercise in horses with 'tying-up' due to polysaccharide storage myopathy. Equine Vet J 1999;31(1):43-47.
11. Abecasis GR, Ghosh D, Nichols TE. Linkage disequilibrium: ancient history drives the new genetics. Hum Hered 2005;59(2):118-124.
12. Ott J. Nonparametric methods. In: Analysis of Human Genetic Linkage. Third ed. Baltimore: The Johns Hopkins University Press, 1999;272-296.
13. Gordon D, Finch SJ. Factors affecting statistical power in the detection of genetic association. J Clin Invest 2005;115(6):1408-1418.
14. Peltonen L. Positional Cloning of Disease Genes: Advantages of Genetic Isolates. 2000;50(1):66-75.