Jody Hey                  Evolutionary Genetics

  Professor    -     Department of Genetics     -   Rutgers University




A Reduction of "Species" Resolves the Species Problem - Jody Hey, January 1997



EMPIRICAL CONSIDERATIONS

Biologists often face the question of whether a sample of organisms comes from one or more than one species. Under the genetic species concept it is also possible that some or all of the organisms are not part of any species. In practice, assessments of genetic drift are expected to fall into two different categories, instantaneous and recent. An instantaneous determination is an assessment based on ongoing patterns of reproduction in contemporaneous organisms. For obligately sexual organisms, an assessment of genetic species status would be the same as an assessment of Mendelian population status (Dobzhansky, 1950). Those organisms that are exchanging genes are necessarily also sharing in a process of genetic drift. In this case, an assessment of genetic species status is reduced to determining whether the pattern of genetic drift has sharp partitions, such that there are distinct groups with gene flow within them and little gene flow between them. For organisms that are not exchanging genes, an instantaneous assessment of drift could be made on ecological and demographic grounds by an assessment of demographic exchangeability. However, this is tantamount to measuring the fundamental niche for each of the organisms in the sample (Templeton, 1989), which is generally impractical.

An alternative to instantaneous measures is an assessment of recent patterns of genetic drift based on patterns of genetic variation. With electrophoretic data on protein variation or with comparative DNA sequence data, the patterns of variation can be interpreted in terms of the genetic drift that has occurred in the time since the variation arose. However, the genetic species concept is a contemporaneous one, and so an assessment of genetic drift based on patterns of genetic variation must admit two kinds of uncertainty. First, a group of organisms that seems to have a recent history of shared genetic drift may not currently share genetic drift. Second, a group of organisms may currently share genetic drift, but this may be due to a recent mixture of historically separated organisms that did not occur in a single species. The current situation may be one genetic species, but the recent history, which is reflected in the pattern of genetic variation, may include zero or multiple genetic species.

It is not the purpose of this report to develop practical criteria for the identification and delineation of genetic species. Rather, the purpose is to show that an assessment of genetic drift is required if species are to be identified and distinguished. In this view, the task of identifying species and understanding the details of the causes of speciation falls squarely within the domain of population genetics. Since the time of Wright (1931), much of the field of population genetics has consisted of research on ways to assess genetic drift and on the effects of genetic drift. The point that a population genetic approach must be used to identify species and understand the causes of species has been repeatedly emphasized by Templeton (1981, 1989, 1994).

It is useful to provide a qualitative description of some criteria that can be used for the case of a set of DNAs, one from each member in a sample of organisms. It is possible to describe the kinds of gene tree histories that can occur for a set of DNAs under different evolutionary models (Hudson, 1990), and it is possible to estimate the true tree for a sample of DNAs if there is some variation in their sequences. Although DNA sequences and gene tree estimates are not the only way to study genetic drift, they are increasingly referred to in the context of the population genetic causes of speciation (Hey, 1994; Templeton, 1994; Baum and Shaw, 1995).

Consider a sample of two sets of homologous DNAs, one DNA from each of several organisms in each of two candidate species, and consider the null model that the sample comes from a single species (Templeton, 1994). The gene tree history of the entire sample will, in the absence of recombination, be representable as a bifurcating diagram (Fig. 1). If the ancestral DNAs collectively underwent genetic drift, then the times between successive nodes of the tree are also a function of a genetic drift process. For several quite simple demographic and linkage/selection models, the distribution of times between nodes has been solved (Tavaré, 1984; Hudson and Kaplan, 1988; Takahata, 1988). These theoretical compositions are called "coalescent" models (Kingman, 1982), reflecting the pattern of a collapsing sample as one proceeds from the present into the past. The general prediction is that the most recent nodes of a tree will be more closely spaced in time than the more distant nodes, and that the expected time between successive nodes is proportional to the population size (Hudson, 1990). However, the details of this prediction vary considerably depending on the demographic model (Tajima, 1989).
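
The general prediction can be illustrated with a minimal sketch. It assumes the standard neutral coalescent for a diploid population of constant effective size N, under which the waiting time while k lineages remain is exponentially distributed with mean 4N/(k(k-1)) generations. That particular model, and the function names used here, are assumptions chosen for illustration; other demographic models give different details.

```python
import random


def expected_intervals(n, pop_size):
    """Expected waiting times between successive nodes, present to past.

    Assumption (not from the text): standard neutral coalescent, diploid,
    constant effective size `pop_size`. While k lineages remain, the
    expected waiting time is 4N / (k(k-1)) generations.
    """
    return [4.0 * pop_size / (k * (k - 1)) for k in range(n, 1, -1)]


def simulate_intervals(n, pop_size, rng):
    """Draw one random set of waiting times under the same assumption."""
    intervals = []
    for k in range(n, 1, -1):
        rate = k * (k - 1) / (4.0 * pop_size)  # coalescence rate with k lineages
        intervals.append(rng.expovariate(rate))
    return intervals


# Hypothetical example: 4 sampled DNAs, effective size 1000.
rng = random.Random(0)
expected = expected_intervals(4, 1000)
one_draw = simulate_intervals(4, 1000, rng)
```

The expected intervals lengthen toward the past (the first entry is the most recent node), and doubling `pop_size` doubles every expected interval. A single simulated draw can depart widely from these expectations.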

If a sample includes DNAs from two groups of organisms with a collective history of a single genetic drift process followed by a shift (speciation) to two drift processes, then the structure of the gene tree will differ in two general ways from the tree for a sample from a single genetic species. First, the distribution of times between successive nodes after the speciation will be the result of two independent systems of genetic drift. In the terms of a coalescent model, the waiting time between successive nodes will be a function of two random variables instead of a single one. Second, the tree will be partitioned into sections that are exclusive of DNAs from one of the sampled groups (Baum and Shaw, 1995). Figure 3 shows three examples of multiple DNAs from each of two independently drifting populations, one with a faster rate of drift (i.e., short time intervals between nodes) and one with a slower rate of drift. In Fig. 3A, the two samples form separate subtrees, while in Fig. 3B, one sample forms a gene tree that is nested within the gene tree for the other sample. Groups Y and Z of Fig. 3A and group Z in Fig. 3B are monophyletic, meaning that the group includes all of the descendants of a particular common ancestor (Hennig, 1979, p. 73). Figure 3C shows relatively little partitioning of the gene tree with respect to sample. This pattern could occur if the ancestors of one group recently became geographically separated from others of the same genetic species, so that relatively little drift has occurred within each group.
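
The monophyly criterion can be checked mechanically on a rooted gene tree. The sketch below is illustrative only: it represents a tree as nested tuples with string leaf labels, and the three toy trees are hypothetical stand-ins for the separate, nested, and weakly partitioned patterns described above, not reproductions of Fig. 3.

```python
def leaves(tree):
    """Return the set of leaf labels below a node.

    A leaf is a string; an internal node is a tuple of child nodes.
    """
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found


def is_monophyletic(tree, group):
    """True if some node of the rooted tree has exactly `group` as its leaves."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    # Only a child that contains the whole group can hold the required node.
    return any(is_monophyletic(child, group)
               for child in tree if group <= leaves(child))


Y = {"Y1", "Y2", "Y3", "Y4"}
Z = {"Z1", "Z2", "Z3", "Z4"}

# Hypothetical toy trees (assumed topologies, for illustration only).
separate = ((("Y1", "Y2"), ("Y3", "Y4")), (("Z1", "Z2"), ("Z3", "Z4")))
nested = (("Y1", "Y2"), ("Y3", ("Y4", (("Z1", "Z2"), ("Z3", "Z4")))))
mixed = ((("Y1", "Z1"), ("Y2", "Z2")), (("Y3", "Z3"), ("Y4", "Z4")))
```

With these toy trees, both groups are monophyletic in `separate`, only Z is monophyletic in `nested`, and neither group is monophyletic in `mixed`.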

If genetic drift proceeds among the descendants of groups Y and Z, then the gene trees depicted for group Y in Fig. 3B and groups Y and Z in Fig. 3C will eventually be replaced by monophyletic gene trees (see e.g. Avise and Ball, 1990). The emergence of monophyletic gene trees is caused by the forward shift of the pattern of ancestry that occurs within a group of DNAs that share genetic drift (Fig. 2).
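
A rough sense of how quickly monophyly emerges can be had from a small simulation. The sketch assumes the standard neutral coalescent with constant effective size and no gene flow after the separation; the function names and parameter values are hypothetical. It estimates the probability that all sampled DNAs from one group trace back to a single ancestor more recently than the separation. That condition is sufficient, though not necessary, for the group's sample to be monophyletic.

```python
import random


def time_to_common_ancestor(n, pop_size, rng):
    """Simulated generations until n sampled lineages reach one ancestor.

    Assumption: neutral coalescent, diploid, constant effective size, so the
    waiting time with k lineages is exponential with rate k(k-1)/(4N).
    """
    total = 0.0
    for k in range(n, 1, -1):
        total += rng.expovariate(k * (k - 1) / (4.0 * pop_size))
    return total


def prob_coalesced_since_split(n, pop_size, split_time, reps=10000, seed=1):
    """Fraction of replicates whose sample coalesces within `split_time`."""
    rng = random.Random(seed)
    hits = sum(time_to_common_ancestor(n, pop_size, rng) < split_time
               for _ in range(reps))
    return hits / reps


# Hypothetical values: 4 DNAs per group, separation 3000 generations ago,
# a small group (N = 500) and a large group (N = 5000).
p_small = prob_coalesced_since_split(4, 500, 3000)
p_large = prob_coalesced_since_split(4, 5000, 3000)
```

Under these assumptions the smaller group, with its faster rate of drift, is more likely to have coalesced since the separation, which is consistent with a nested pattern in which the smaller group is monophyletic while the larger one is not yet.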

Fig. 3. Three hypothetical gene trees. In each case, four DNAs have been randomly sampled from each of two genetic species, Y and Z. Species Y is larger, with a slower rate of drift and longer intervals between nodes than species Z. See text for further explanation.

The condition of having previously identified candidate species (as in Fig. 3) is useful for articulating various patterns of genetic drift. However, in practice, researchers must consider cases where multiple species may exist, but where the DNAs have not been labeled beforehand as belonging to candidate species. This uncertainty adds considerably to the difficulty of identifying species simply because a posteriori hypotheses have much stricter criteria of statistical acceptance than a priori hypotheses. For sexual species, this burden can be reduced by generating hypotheses regarding genetic species using data from one locus. A second, unlinked, locus can then be studied to test these hypotheses, which are now a priori (Hey and Kliman, 1993; Hey, 1994).

For sexual organisms in a genetic species, different parts of the genome will have different gene tree histories, though nearly all portions are expected to share genetic drift. The actual rate of drift will vary among genomic regions, both by chance and because natural selection and variation in recombination rates affect the rate of drift in each region. One kind of natural selection that can frustrate a gene tree assessment of species status for sexual organisms, and thus require the study of multiple portions of the genome, is balancing selection. This type of natural selection occurs when there exists a stable pattern of multiple sequences, or alleles, for some region of the genome. The persistence of multiple functional forms will create gene trees like those in Fig. 3A, in this case with designations Y and Z referring to different alleles. Genetic drift will occur within each allele class, but natural selection prevents the replacement of one allele class by descendants of the other.



 

 



© 1997 Jody Hey