Technologies from the Field
QUANTITATIVE TRAIT LOCUS ANALYSIS: MULTIPLE CROSS AND HETEROGENEOUS STOCK MAPPING
Robert Hitzemann, Ph.D.; John K. Belknap, Ph.D.; and Shannon K. McWeeney, Ph.D.
ROBERT HITZEMANN, PH.D., is a professor in and chair of the Department of Behavioral Neuroscience, Oregon Health & Science University, and a research pharmacologist in the Research Service, Portland VA Medical Center, both in Portland, Oregon.
JOHN K. BELKNAP, PH.D., is a professor in the Department of Behavioral Neuroscience, Oregon Health & Science University and a senior career scientist in the Research Service, Portland VA Medical Center, both in Portland, Oregon.
SHANNON K. MCWEENEY, PH.D., is an associate professor in the Division of Biostatistics in the Department of Public Health and Preventative Medicine and an associate professor in the Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon.
KEY WORDS: Genetic theory of alcohol and other drug use; genetic factors; environmental factors; behavioral phenotype; behavioral trait; quantitative trait gene (QTG); inbred animal strains; recombinant inbred (RI) mouse strains; quantitative traits; quantitative trait locus (QTL) mapping; multiple cross mapping (MCM); heterogeneous stock (HS) mapping; microsatellite mapping; animal models
Until well into the 1990s, both preclinical and clinical research focused on finding “the” gene for human diseases, including alcoholism. This focus was reinforced by the emergence of technologies to either inactivate (i.e., knock out) a gene or add extra copies of an existing gene in a living organism, which clearly demonstrated that over- or underexpressing a single gene could have a profound effect on behavior. However, a small but vocal group of scientists, including many alcohol researchers, argued that behaviors, including alcohol-related behaviors, were complex traits and therefore no one gene likely would have a large effect. This view was consistent with a large body of genetic research conducted in plants and fruit flies (e.g., Paterson et al. 1988) indicating that, for example, even a presumably simple characteristic, such as the size of a tomato, was determined by several genes. However, it was difficult to convince the scientific community that, in terms of its genetic determination, behavior was similar to the size of a tomato. Only with the advent of new genetic tools did it become possible to prove that many different genes contribute to complex behavioral characteristics. These tools included the following (see Phillips 2002):
- Panels of recombinant inbred (RI) mouse strains. RI strains generally are generated by repeatedly inbreeding brother–sister pairs from the second-generation (F2) offspring of two genetically distinct parent inbred strains. Each F2 animal has a slightly different combination of the parental genes. By repeated inbreeding of brother–sister pairs, researchers can generate numerous distinct inbred animal strains.
- Quantitative trait locus (QTL) mapping. Quantitative traits are characteristics such as height or sensitivity to alcohol that differ in the extent to which an individual possesses that characteristic. The variation in these traits is determined by both genetic and environmental factors. As noted above, the genetic contribution typically involves multiple genes, and each of these genes may exist in several variants (i.e., alleles). QTL analysis allows one to map, with some precision, the genomic position of these alleles.
For many researchers in the alcohol field, the breakthrough with respect to the genetic determination of alcohol-related behaviors occurred when Plomin and colleagues (1991) made the seminal observation that a specific panel of RI mice (i.e., the BXD panel) could be used to identify the physical location of (i.e., to map) QTLs for behavioral phenotypes. Because the phenotypes of the different strains in this panel had been determined for many alcohol-related traits, researchers could readily apply the strategy of RI–QTL mapping (Gora-Maslak et al. 1991). Although investigators recognized early on that this panel was not extensive enough to answer all questions, the emerging data illustrated the rich genetic complexity of alcohol-related phenotypes (Belknap 1992; Plomin and McClearn 1993).
The next advance came with the development of microsatellite maps (Dietrich et al. 1992, 1996). Microsatellites are short pieces of DNA characterized by the repetition of short (i.e., two to four nucleotide) sequences.¹ (¹Nucleotides are the building blocks of DNA. There are four different nucleotides called adenosine [A], cytosine [C], guanosine [G], and thymidine [T]. Microsatellites are characterized by the repetition of two-to-four nucleotide patterns, such as CACACACA.) The number of repetitions of some microsatellites differs among individuals or inbred strains and therefore can be used as a marker, allowing researchers to track how specific microsatellite sequences are inherited. Researchers have mapped the locations of thousands of such microsatellites in the mouse as well as the human genome. Tracking microsatellite markers at specific known sites in the genome is useful because one can simultaneously track the gene variants linked to these markers. With these tools available, the first QTL study mapping a behavioral trait (i.e., activity in a novel environment) in F2 offspring of two genetically distinct inbred mouse strains was published by Flint and colleagues (1995). This study detected numerous QTLs that were significantly associated with the behavior under investigation (see Lander and Kruglyak for a discussion of how a QTL is determined to be significant). Subsequently, there was an explosion of behavioral QTL mapping studies, including studies that focused on alcohol-related traits. In a summary of the behavioral mapping data in mice and rats, Flint (2003) reported that hundreds of QTLs had been detected and that, as expected, most of these had very small effects (i.e., accounted for less than 5 percent of the phenotypic variance). Although there has been no detailed summary of behavioral QTL mapping data since 2003, it is reasonable to assume that the number of QTLs detected just in animal models has increased by an order of magnitude.
Of course, the easiest, most convenient strategy to map QTLs in mice would be to cross animals from two inbred strains that differ in the behavior under investigation (e.g., sensitivity to alcohol) and then study the offspring to identify relevant QTLs and eventually determine which gene located in the vicinity of the QTL actually is responsible for the observed effect. The main problem with mapping QTLs in such simple intercrosses is that the DNA region in which the QTL most likely is located (i.e., the 95 percent confidence interval [CI]² [²As with other parameters, the location of a QTL cannot be determined exactly based on measurements in just a few animals. Instead, a CI is calculated using statistical models that give an estimated range of values that likely includes the unknown parameter. A 95 percent CI means that the likelihood of the estimated range including the actual value is 95 percent. The CI is calculated from a given set of sample data, and the more data are available, the more accurate the estimate will be and the smaller the range of values for the CI will be. A very wide CI indicates that more data should be collected before the parameter can be determined accurately.] of the QTL) frequently is very large and may, in some cases, include an entire chromosome. Darvasi and Soller (1995) provided a simple equation³ (³According to this equation, CI = 1,500/(N × d²), where N = the number of animals and d = the standardized gene effect; the constant 1,500 was determined empirically from computer simulations and is not related to genome size.) to calculate the 95 percent CI. Based on this equation, if researchers used 600 F2 animals to map a QTL with an effect size of 5 percent, the DNA region that would contain the QTL with 95 percent certainty would encompass 25 centiMorgan (cM) or, for most chromosomes, between 35 and 50 million nucleotides, a region that typically contains hundreds of genes.
To reduce this interval to a size that can be analyzed more easily (i.e., to about 1 cM), one would have to study 15,000 animals, which obviously is not feasible. It therefore seems safe to say that the issue of reducing the QTL interval (given the generally modest effect size of most behavioral QTLs) has been the biggest impediment in moving from identifying QTLs to identifying the actual quantitative trait gene(s) (QTGs) and eventually even the relevant nucleotides in those genes (i.e., the quantitative trait nucleotide[s] [QTNs]). Accordingly, relatively few QTGs have been identified unambiguously that contribute to behavioral phenotypes (e.g., Yalcin et al. 2004), and only one of these, a gene called Mpdz, is associated with an alcohol-related trait (i.e., acute alcohol withdrawal) (Fehr et al. 2002; Shirley et al. 2004).
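The arithmetic behind these two figures can be checked with a short sketch. It assumes the Darvasi and Soller relation as quoted in footnote 3 (CI in cM equals 1,500 divided by the product of N and d squared). The value d² = 0.1 is not stated in the text; it is back-calculated here from the article's own example (600 animals giving 25 cM), and the function names are illustrative only.

```python
def ci_width_cm(n_animals: float, d_squared: float) -> float:
    """Approximate 95% CI width in cM: CI = 1500 / (N * d^2)."""
    return 1500.0 / (n_animals * d_squared)


def animals_needed(target_ci_cm: float, d_squared: float) -> float:
    """Invert the same relation to get N for a target CI width."""
    return 1500.0 / (target_ci_cm * d_squared)


# Assumption: d^2 = 0.1, inferred from the 600-animal / 25-cM example.
D_SQUARED = 0.1

print(ci_width_cm(600, D_SQUARED))     # about 25 cM
print(animals_needed(1.0, D_SQUARED))  # about 15,000 animals
```

Because the CI shrinks only in proportion to 1/N, narrowing the interval 25-fold requires 25 times as many animals, which is why a single F2 intercross scales so poorly for QTLs of modest effect.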
In recent years, however, several strategies have emerged that may help reduce the QTL interval and thereby facilitate the identification of QTGs. This article briefly describes two approaches—multiple cross and heterogeneous stock mapping. Additional approaches are described in the following article by Denmark and colleagues (pp. 266–269).
Multiple Cross Mapping
The concept of combining (i.e., integrating) data obtained from intercrosses of several inbred strains (i.e., multiple crosses) is being used widely to improve QTL characterization for traits of agricultural value (see, for example, Christiansen et al. 2006; Khatkar et al. 2004). The application of this approach, which has been termed multiple cross mapping (MCM), to traits of physiological and behavioral interest also is becoming more frequent (e.g., Hitzemann et al. 2000, 2002, 2003; Jagodic and Olsson 2006; Li et al. 2005; Malmanger et al. 2006; Park et al. 2003; Wergedal et al. 2007; Wittenburg et al. 2005). Our interest in MCM was triggered by the observation that QTL data generated by three different mouse F2 intercrosses in three different laboratories⁴ (⁴The studies involved crosses of C57BL/6J mice with BALB/cJ mice [Flint et al. 1995], A/J mice [Gershenfeld et al. 1997], and DBA/2J mice [Koyner et al. 2000].) apparently all detected the same QTL on a part of mouse chromosome 1 that was associated with open-field activity (Flint et al. 1995; Gershenfeld et al. 1997; Koyner et al. 2000); however, the QTL was not detected in a cross of two other mouse strains⁵ (⁵This intercross involved BALB/cJ mice and LP/J mice [Hitzemann et al. 2000].) (Hitzemann et al. 2000). Hitzemann and colleagues (2000) proposed that the information obtained with multiple crosses could be used to develop an empirical algorithm for sorting microsatellite markers in order to detect chromosomal regions with the highest probability of containing QTLs.
The principle underlying this theory was that because the inbred mouse strains used actually are closely related, the data described above suggest that there must be a region or regions on chromosome 1 where three strains (i.e., DBA/2J, BALB/cJ, and A/J strains) are identical and different from the fourth strain (i.e., C57BL/6J strain). It is perhaps easiest to visualize this in binary terms, where 0 and 1 represent different nucleotides; in a region of interest, the three similar strains could have the structure “0100011100” while the C57BL/6J strain would have the structure “1011100011.” These different patterns are termed differences in haplotype structure. Accordingly, the three strains carry one unit of haplotype structure and the C57BL/6J strain carries a different unit. The haplotype difference could involve a single nucleotide polymorphism (SNP) or, as in the example above, multiple SNPs. Knowing the regions where the strains are similar and where they differ enhances QTL analyses because it provides additional information and thus greater statistical power. Even more details of this haplotype structure became available when researchers developed dense maps that showed the location of SNPs in multiple mouse strains (e.g., Wade et al. 2002). These maps confirmed that some regions of the genome contain very few SNPs, whereas others contain many SNPs. A QTL was presumed to have a greater likelihood of being associated with the SNP-dense region than with the SNP-poor region where there is very little genetic variation.
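The binary illustration above can be written out as a small sketch. The haplotype strings are the hypothetical ones used in the text, not real genotype data, and the function names are invented for this example; the sketch only shows how strains can be grouped by shared haplotype, and it is not the MCM algorithm itself.

```python
from collections import defaultdict

# Illustrative haplotypes from the text: 0 and 1 stand for alternative nucleotides.
haplotypes = {
    "DBA/2J": "0100011100",
    "BALB/cJ": "0100011100",
    "A/J": "0100011100",
    "C57BL/6J": "1011100011",
}


def group_by_haplotype(haps):
    """Group strain names that share an identical haplotype string."""
    groups = defaultdict(list)
    for strain, hap in haps.items():
        groups[hap].append(strain)
    return dict(groups)


def differing_positions(hap_a, hap_b):
    """Return the 0-based positions at which two haplotypes differ."""
    return [i for i, (a, b) in enumerate(zip(hap_a, hap_b)) if a != b]


groups = group_by_haplotype(haplotypes)
for hap, strains in groups.items():
    print(hap, strains)

# A cross between strains from different groups segregates in this region;
# a cross within one group does not, so it would not be expected to reveal the QTL.
print(differing_positions(haplotypes["DBA/2J"], haplotypes["C57BL/6J"]))
```

In this toy region the two haplotype units differ at every position, which corresponds to the multiple-SNP case mentioned in the text; a single differing position would correspond to the single-SNP case.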
When conducting MCM analyses, researchers often use “crosses of convenience”; that is, they draw on data obtained in studies that they and other groups have conducted with the strains they were using to address specific research questions. One problem associated with this approach, however, is that often there are missing data. For example, consider the data that originally led to our development of MCM. The three studies on which the analysis was based involved four different inbred mouse strains, but only three different crosses of these animals were analyzed; data for the remaining possible crosses were not available. Without this information, however, the true haplotype structure of the QTL cannot be determined. To address this issue, we created a balanced panel of crosses from four inbred strains in which every strain was crossed with every other strain and have used this panel to map QTLs for open-field activity and alcohol-induced locomotion (Hitzemann et al. 2003; Malmanger et al. 2006). With this approach, the MCM algorithm markedly reduced the QTL CIs and correctly predicted QTL position and haplotype structure as determined by heterogeneous stock (HS) mapping, which is described in the following section.
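The gap in the convenience data can be made concrete with a short enumeration. The sketch below lists every pairwise cross among the four strains from the three original studies and reports which crosses had no data. It illustrates the missing-data problem only; it does not describe the balanced panel the authors later bred, and the variable names are illustrative.

```python
from itertools import combinations

# The four strains involved in the three original F2 studies.
strains = ["C57BL/6J", "BALB/cJ", "A/J", "DBA/2J"]

# Crosses for which F2 data existed, as cited in the text.
available = {
    frozenset(("C57BL/6J", "BALB/cJ")),
    frozenset(("C57BL/6J", "A/J")),
    frozenset(("C57BL/6J", "DBA/2J")),
}

# A balanced panel crosses every strain with every other strain.
all_crosses = [frozenset(pair) for pair in combinations(strains, 2)]
missing = [sorted(cross) for cross in all_crosses if cross not in available]

print(len(all_crosses), "pairwise crosses in a balanced four-strain panel")
for cross in missing:
    print("no data for:", " x ".join(cross))
```

Every available cross involves C57BL/6J, so the data alone cannot show directly whether the other three strains differ from one another at the QTL; that is the information a balanced panel supplies.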
Heterogeneous Stock Mapping
The problem associated with conducting de novo MCM rather than using a convenience sample of already available, but incomplete, crosses is that it requires a lot of work and many animals. Assuming that, as described above, at least about 600 animals are needed to identify a QTL using an intercross of two inbred strains, then the genetic makeup (i.e., genotype) and relevant behavioral and physical characteristics (i.e., the phenotype) of 2,400 animals would have to be determined to obtain a balanced panel for four inbred strains. Although genotyping has become much easier with the availability of high-throughput devices to map SNPs, the overall effort is still considerable and costly. These considerations have prompted the emergence of HS mapping as an alternative strategy that is precise and provides good information on haplotype structure.
In heterogeneous populations, all individuals have diverse genetic backgrounds. For example, one commonly used heterogeneous mouse stock was generated by interbreeding animals from eight genetically diverse inbred strains (Phillips 2002). HS mapping was first described by Talbot and colleagues (1999), who used it to identify QTLs associated with the phenotype of open-field activity. The investigators were able to map numerous QTLs with high precision. However, the analyses did not detect QTLs associated with this phenotype that previously had been mapped in an F2 intercross population. Mott and colleagues (2000) provided a solution to this problem by developing a mapping algorithm termed HAPPY, which was designed to map QTLs in any HS population derived from known inbred strains without requiring further pedigree information.⁶ (⁶With this algorithm, the analysis basically occurs in two steps: first, the ancestral haplotype is reconstructed using dynamic programming, and then QTLs are analyzed using linear regression.) The HAPPY algorithm found the previously detected QTLs in the HS mapping and also determined that the QTLs had the expected haplotype structure. Knowing the QTL signature is of considerable value when integrating QTL, gene expression, and gene sequence data.
There can be differences between the results achieved with HS mapping and those achieved with mapping in F2 intercross populations (see figure 15). For example, when analyses of a QTL on chromosome 2 that is associated with alcohol-induced locomotor response were conducted using F2 animals obtained by crossing C57BL/6J and DBA/2J mice, the resulting QTL interval was very broad (Demarest et al. 1999). Moreover, the investigators determined that those QTL alleles that the animals had inherited from the C57BL/6J mice were associated with a decreased response to alcohol. The same researchers then attempted to map the QTLs related to the ethanol response phenotype in an HS that was formed by crossing eight inbred mouse strains, including C57BL/6J and DBA/2J animals (Demarest et al. 2001). The analysis relied on microsatellite genotyping and simply classified alleles as either being similar to those found in C57BL/6J or being different from C57BL/6J alleles. This analysis detected multiple QTLs in the region of interest; furthermore, the C57BL/6J alleles were associated with both increased and decreased ethanol response. These findings suggest that the HS mapping approach is more sensitive than the F2 intercross approach and generates a greater variety of QTLs because none of the data suggest that these multiple QTLs also were present in the F2 intercross (although they also would have been invisible to the type of analysis used by Demarest and colleagues). Finally, Malmanger and colleagues (2006) performed QTL mapping for the ethanol response phenotype in an HS population generated by crossing four inbred strains (i.e., C57BL/6J, DBA/2J, LP/J, and BALB/cJ mice).⁷ (⁷The mapping was done at generation 19, which represents an approximately 10-fold expansion of the genetic map because with each generation additional recombinations are added that allow for finer mapping. In addition, the researchers used a dense SNP panel.)
This approach also detected a QTL peak in the region of interest that spanned a region of 1 to 2 million nucleotides. Moreover, the investigators determined the haplotype structure of the QTL and noted that the C57BL/6J (B6) allele was associated with decreased ethanol response. The integration of these data (i.e., position and haplotype of the QTL) with gene expression databases suggests a strong candidate QTG called Scg5 (also known as 7B2 and Sgne1), which encodes a protein called secretogranin 5 (Malmanger et al. 2006).
Figure 15. Three strategies for mapping a quantitative trait locus (QTL) on mouse chromosome 2 that is associated with acute ethanol locomotor response. The characteristic (i.e., phenotype) tested is the difference in activity between the administration of saline and the administration of 1.5 g/kg ethanol, measured in 5-minute intervals between 0 and 20 minutes after the injection. The top panel illustrates the result of a QTL mapping analysis in a C57BL/6J x DBA/2J F2 intercross (N = 600) (Demarest et al. 1999). The second panel illustrates mapping of the same phenotype in heterogeneous stock [HS-NPT] mice (N = 500) at generation 32 (Demarest et al. 2001). Data were analyzed in a marker-by-marker design; all markers were microsatellites and were classified as C57BL/6J–like or different. A positive F value indicates that a non-C57 allele is associated with an increased ethanol response. The HS analysis detected several QTLs that were not found in the F2 intercross analysis. The bottom panel shows the results of mapping the same phenotype using heterogeneous stock [HS4] animals (N = 575) at generation 19 and using a panel of closely spaced SNPs as markers (Malmanger et al. 2006). The bar at the top shows the haplotype structure across the region of interest.
NOTE: The LOD (logarithm [base 10] of odds) is a measure of the degree of linkage between a given DNA region or gene and a specific trait.
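As a minimal sketch of what the LOD score in the note expresses, the snippet below computes the base-10 logarithm of a likelihood ratio. The likelihood values are hypothetical numbers chosen for illustration, not results from the studies in the figure, and the function names are invented here. The conversion to a likelihood-ratio statistic uses the standard identity 2 × ln(10) × LOD.

```python
import math


def lod_score(likelihood_qtl: float, likelihood_null: float) -> float:
    """LOD = log10 of (likelihood with a QTL / likelihood with no QTL)."""
    return math.log10(likelihood_qtl / likelihood_null)


def lod_to_lrt(lod: float) -> float:
    """Convert a LOD score to a likelihood-ratio statistic: 2 * ln(10) * LOD."""
    return 2.0 * math.log(10.0) * lod


# Hypothetical likelihoods: the QTL model is 1,000 times more likely than the null.
print(lod_score(1000.0, 1.0))  # odds of 1,000 to 1 correspond to a LOD of about 3
print(lod_to_lrt(3.0))         # roughly 13.8
```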
Currently, four mouse HS populations are available to investigators. One of these, the HS/Ibg, which was formed by crossing eight inbred laboratory mouse strains,⁸ (⁸Two of these strains are no longer available.) is available through the Institute for Behavioral Genetics. The other three populations are maintained by the first author and include the HS4 population described in the previous paragraph, the HS–NPT population (see Valdar et al. 2006), and the HS–CC population (an eight-way cross that contains three mouse strains derived from the wild). These HS populations are freely available.
QTL mapping has become an important aspect of efforts to determine the genetic basis of complex behaviors, such as alcohol-drinking behaviors. With new approaches to gene mapping, such as multiple cross mapping and HS mapping, which improve the accuracy with which QTLs can be located on the chromosomes, the identification of additional candidate QTGs likely is only a matter of time.
The authors declare that they have no competing financial interests.
References
Beck, J.A.; Lloyd, S.; Hafezparast, M.; et al. Genealogies of mouse inbred strains. Nature Genetics 24(1):23–25, 2000. PMID: 10615122
Belknap, J.K. Empirical estimates of Bonferroni corrections for use in chromosome mapping studies with the BXD recombinant inbred strains. Behavior Genetics 22:677–684, 1992. PMID: 1290453
Christiansen, M.J.; Feenstra, B.; Skovgaard, I.M.; and Andersen, S.B. Genetic analysis of resistance to yellow rust in hexaploid wheat using a mixture model for multiple crosses. TAG. Theoretical and Applied Genetics 112:581–591, 2006. PMID: 16395570
Darvasi, A., and Soller, M. Advanced intercross lines, an experimental population for fine genetic mapping. Genetics 141:1199–1207, 1995. PMID: 8582624
Demarest, K.; McCaughran, J., Jr.; Mahjubi, E.; et al. Identification of an acute ethanol response quantitative trait locus on mouse chromosome 2. Journal of Neuroscience 19:549–561, 1999. PMID: 9880575
Demarest, K.; Koyner, J.; McCaughran, J., Jr.; et al. Further characterization and high-resolution mapping of quantitative trait loci for ethanol-induced locomotor activity. Behavior Genetics 31:79–91, 2001. PMID: 11529277
Dietrich, W.; Katz, H.; Lincoln, S.E.; et al. A genetic map of the mouse suitable for typing intraspecific crosses. Genetics 131:423–447, 1992. PMID: 1353738
Dietrich, W.F.; Miller, J.; Steen, R.; et al. A comprehensive genetic map of the mouse genome. Nature 380:149–152, 1996. PMID: 8600386
Fehr, C.; Shirley, R.L.; Belknap, J.K.; et al. Congenic mapping of alcohol and pentobarbital withdrawal liability loci to a <1 centimorgan interval of murine chromosome 4: Identification of Mpdz as a candidate gene. Journal of Neuroscience 22(9):3730–3738, 2002. PMID: 11978849
Flint, J. Analysis of quantitative trait loci that influence animal behavior. Journal of Neurobiology 54:46–77, 2003. PMID: 12486698
Flint, J.; Corley, R.; DeFries, J.C.; et al. A simple genetic basis for a complex psychological trait in laboratory mice. Science 269:1432–1435, 1995. PMID: 7660127
Gershenfeld, H.K.; Neumann, P.E.; Mathis, C.; et al. Mapping quantitative trait loci for open-field behavior in mice. Behavior Genetics 27:201–210, 1997. PMID: 9210791
Gora-Maslak, G.; McClearn, G.E.; Crabbe, J.C.; et al. Use of recombinant inbred strains to identify quantitative trait loci in psychopharmacology. Psychopharmacology 104:413–424, 1991. PMID: 1780413
Hitzemann, R.; Demarest, K.; Koyner, J.; et al. Effects of genetic cross on the detection of quantitative trait loci and a novel approach to mapping QTLs. Pharmacology, Biochemistry, and Behavior 67:767–772, 2000. PMID: 11166067
Hitzemann, R.; Malmanger, B.; Cooper, S.; et al. Multiple cross mapping (MCM) markedly improves the localization of a QTL for ethanol-induced activation. Genes, Brain, and Behavior 1:214–222, 2002. PMID: 12882366
Hitzemann, R.; Malmanger, B.; Reed, C.; et al. A strategy for the integration of QTL, gene expression, and sequence analyses. Mammalian Genome 14:733–747, 2003. PMID: 14722723
Jagodic, M., and Olsson, T. Combined-cross analysis of genome-wide linkage scans for experimental autoimmune encephalomyelitis in rat. Genomics 88:737–744, 2006. PMID: 17010567
Khatkar, M.S.; Thomson, P.C.; Tammen, I.; and Raadsma, H.W. Quantitative trait loci mapping in dairy cattle: Review and meta-analysis. Genetics, Selection, Evolution 36:163–190, 2004. PMID: 15040897
Koyner, J.; Demarest, K.; McCaughran, J., Jr.; et al. Identification and time dependence of quantitative trait loci for basal locomotor activity in the BXD recombinant inbred series and a B6D2 F2 intercross. Behavior Genetics 30:159–170, 2000. PMID: 11105390
Lander, E., and Kruglyak, L. Genetic dissection of complex traits: Guidelines for interpreting and reporting linkage results. Nature Genetics 11:241–247, 1995. PMID: 7581446
Li, R.; Lyons, M.A.; Wittenburg, H.; et al. Combining data from multiple inbred line crosses improves the power and resolution of quantitative trait loci mapping. Genetics 169:1699–1709, 2005. PMID: 15654110
Malmanger, B.; Lawler, M.; Coulombe, S.; et al. Further studies on using multiple-cross mapping (MCM) to map quantitative trait loci. Mammalian Genome 17:1193–1204, 2006. PMID: 17143586
Mott, R.; Talbot, C.J.; Turri, M.G.; et al. A method for fine mapping quantitative trait loci in outbred animal stocks. Proceedings of the National Academy of Sciences of the United States of America 97:12649–12654, 2000. PMID: 11050180
Park, Y.G.; Clifford, R.; Buetow, K.H.; and Hunter, K.W. Multiple cross and inbred strain haplotype mapping of complex-trait candidate genes. Genome Research 13:118–121, 2003. PMID: 12529314
Paterson, A.H.; Lander, E.S.; Hewitt, J.D.; et al. Resolution of quantitative traits into Mendelian factors by using a complete linkage map of restriction fragment length polymorphisms. Nature 335(6192):721–726, 1988. PMID: 2902517
Phillips, T. Animal models for the genetic study of human alcohol phenotypes. Alcohol Research & Health 26:202–207, 2002. PMID: 12875048
Plomin, R., and McClearn, G.E. Quantitative trait loci (QTL) analyses and alcohol-related behaviors. Behavior Genetics 23:197–211, 1993. PMID: 8512533
Plomin, R.; McClearn, G.E.; Gora-Maslak, G.; and Neiderhiser, J.M. Use of recombinant inbred strains to detect quantitative trait loci associated with behavior. Behavior Genetics 21:99–116, 1991. PMID: 2049054
Shirley, R.L.; Walter, N.A.; Reilly, M.T.; et al. Mpdz is a quantitative trait gene for drug withdrawal seizures. Nature Neuroscience 7:699–700, 2004. PMID: 15208631
Talbot, C.J.; Nicod, A.; Cherny, S.S.; et al. High-resolution mapping of quantitative trait loci in outbred mice. Nature Genetics 21:305–308, 1999. PMID: 10080185
Valdar, W.; Solberg, L.C.; Gauguier, D.; et al. Genome-wide genetic association of complex traits in heterogeneous stock mice. Nature Genetics 38:879–887, 2006. PMID: 16832355
Wade, C.M.; Kulbokas, E.J., 3rd; Kirby, A.W.; et al. The mosaic structure of variation in the laboratory mouse genome. Nature 420:574–578, 2002. PMID: 12466852
Wergedal, J.E.; Ackert-Bicknell, C.L.; Beamer, W.G.; et al. Mapping genetic loci that regulate lipid levels in a NZB/B1NJxRF/J intercross and a combined intercross involving NZB/B1NJ, RF/J, MRL/MpJ, and SJL/J mouse strains. Journal of Lipid Research 48:1724–1734, 2007. PMID: 17496333
Wittenburg, H.; Lyons, M.A.; Li, R.; et al. Association of a lithogenic Abcg5/Abcg8 allele on chromosome 17 (Lith9) with cholesterol gallstone formation in PERA/EiJ mice. Mammalian Genome 16:495–504, 2005. PMID: 16151694
Yalcin, B.; Fullerton, J.; Miller, S.; et al. Unexpected complexity in the haplotypes of commonly used inbred strains of laboratory mice. Proceedings of the National Academy of Sciences of the United States of America 101:9734–9739, 2004. PMID: 15210992