Jody Hey                  Evolutionary Genetics

Professor - Department of Genetics - Rutgers University



 

Spreadsheet for

Hey, J., and R. M. Kliman. 2002. Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics 160:595-608.

 

Miscellaneous DNA Sequence Data Sets in SITES Format

SITES reads data files that contain multiple aligned DNA sequences, and it conducts analyses that are often useful in a population genetics context. This page contains FTP links to a number of data files that have been compiled and that can be analyzed using SITES. In general these are data sets in which multiple copies of a genomic region (i.e. homologous DNA sequences) have been obtained from a species or population. Most of these files also contain data from multiple populations or multiple closely related species. Please consult the original references that are listed to obtain information about the design of sampling schemes and the source of samples.

These data sets are in Unix-style text format (i.e. each line ends with a line-feed character, with no carriage returns). They should be readable by text editors and by SITES regardless of what operating system is used.

These files are in zip format, and need to be unzipped. If these FTP links do not work with your browser, you can try to FTP directly to eve.rutgers.edu. The files are in subdirectories of the /pub/data directory. Also in those directories are the native, unzipped, forms of the files.
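For readers who prefer to script the unzipping step, the following is a minimal Python sketch, not part of SITES. It assumes a zip archive has already been downloaded to local disk; the function name `extract_and_normalize` and any file names passed to it are hypothetical. It unzips every file and rewrites any carriage returns as plain line feeds, so the result matches the Unix-style format described above.

```python
import zipfile
from pathlib import Path


def extract_and_normalize(zip_path, out_dir):
    """Unzip every file in zip_path into out_dir, forcing Unix line endings.

    Hypothetical helper for illustration only. Returns the list of
    paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            raw = archive.read(info)
            # Unix-style text uses a bare line feed (\n). Convert CRLF first,
            # then any remaining lone CR, so no carriage returns survive.
            clean = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
            # Keep only the base file name so nothing is written outside out_dir.
            target = out_dir / Path(info.filename).name
            target.write_bytes(clean)
            written.append(target)
    return written
```

A call such as `extract_and_normalize("example.zip", "sites_data")` (both names are placeholders) would leave the normalized data files in the `sites_data` directory. Files that are already in Unix format pass through unchanged.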

Almost all of these data sets were generated from GenBank sequences.

More data sets will be added as I have time. If you have a data set for which SITES might be a useful tool, and you would like to make it available to others, please let me know, and we can arrange for a link to it to be placed on this page.




Drosophila melanogaster subgroup data sets


- asense sequences from melanogaster, simulans, mauritiana and sechellia:
  Hilton,H; Kliman,RM; Hey,J (1994): Using hitchhiking genes to study adaptation and divergence during speciation within the Drosophila melanogaster complex. Evolution 48, 1900-1913.
- Original Adh data from D. melanogaster:
  Kreitman,M (1983): Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304, 412-417.
- bride-of-sevenless (boss) data for melanogaster and simulans:
  Ayala,FJ; Hartl,DL (1993): Molecular drift of the Bride of Sevenless (boss) gene in Drosophila. Mol. Biol. Evol. 10, 1030-1040.
- cubitus interruptus sequences from melanogaster, simulans, mauritiana and sechellia:
  Berry,AJ; Ajioka,JW; Kreitman,M (1991): Lack of polymorphism on the Drosophila fourth chromosome resulting from selection. Genetics 129, 1111-1117.
  Hilton,H; Kliman,RM; Hey,J (1994): Using hitchhiking genes to study adaptation and divergence during speciation within the Drosophila melanogaster complex. Evolution 48, 1900-1913.
- Mitochondrial cytochrome B sequences from melanogaster, simulans and yakuba:
  Ballard,JWO; Kreitman,M (1994): Unraveling selection in the mitochondrial genome of Drosophila. Genetics 138, 757-772.
- Esterase 6 data for melanogaster and simulans:
  Cooke,PH; Oakeshott,JG (1989): Amino acid polymorphisms for esterase 6 in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 86, 1426.
  Hasson,E; Eanes,WF (1996): Contrasting histories of three gene regions associated with In(3L)Payne of Drosophila melanogaster. Genetics 144 (4), 1565-1575.
  Karotam,J; Delves,AC; Oakeshott,JG (1993): Conservation and change in structural and 5' flanking sequences of esterase 6 in sibling Drosophila species. Genetica 88, 11-28.
- even-skipped regulatory region data from melanogaster and simulans:
  Ludwig,MZ; Kreitman,M (1995): Evolutionary dynamics of the enhancer regions of even-skipped in Drosophila. Mol. Biol. Evol. 12, 1002-1011.
- G6pdh data in melanogaster and simulans:
  Eanes,WF; Kirchner,M; Yoon,J (1993): Evidence for adaptive evolution of the G6pd gene in the Drosophila melanogaster and Drosophila simulans lineages. Proc. Natl. Acad. Sci. USA 90, 7475-7479.
- Heat Shock protein 83 (Hsp83) gene, partial cds for melanogaster:
  This gene is close to an inversion breakpoint; the data are from two types of chromosomes.
  Hasson,E; Eanes,WF (1996): Contrasting histories of three gene regions associated with In(3L)Payne of Drosophila melanogaster. Genetics 144 (4), 1565-1575.
- larval cuticle protein pseudogene sequences from melanogaster:
  Unpublished data from Pritchard and Schaeffer, GenBank accession numbers U17196-U17205.
- Mst26A and Mst26B data from melanogaster:
  Aguadé,M; Miyashita,N; Langley,CH (1992): Polymorphism and divergence in the Mst26A male accessory gland gene region in Drosophila. Genetics 132, 755-770.
- myosin alkali light chain data from melanogaster and simulans:
  Leicht,BG; Muse,SV; Hanczyc,M; Clark,AG (1995): Constraints on intron evolution in the gene encoding the myosin alkali light chain in Drosophila. Genetics 139, 299-308.
- period locus data from melanogaster, simulans, mauritiana and sechellia:
  Kliman,RM; Hey,J (1993): DNA sequence variation at the period locus within and among species of the Drosophila melanogaster complex. Genetics 133, 375-387.
- glucose-6-phosphate isomerase data from melanogaster and simulans:
  McDonald,JH; Kreitman,ME (1994): The glucose-6-phosphate isomerase locus in four species of Drosophila. Unpublished.
- prune sequences from melanogaster and simulans:
  Simmons,GM; Kwok,W; Matulonis,P; Venkatesh,T (1994): Polymorphism and divergence at the prune locus in Drosophila melanogaster and D. simulans. Mol. Biol. Evol. 11, 666-671.
- rhabdovirus resistance ref(2)P sequences in melanogaster:
  Wayne,ML; Contamine,D; Kreitman,M (1996): Molecular population genetics of ref(2)P, a locus which confers viral resistance in Drosophila. Mol. Biol. Evol. 13, 191-199.
- Rh3 sequences from melanogaster and simulans:
  Ayala,FJ; Chang,BSW; Hartl,DL (1993): Molecular evolution of the Rh3 gene in Drosophila. Genetica 92, 23-32.
- superoxide dismutase variation in melanogaster:
  Hudson,RR; Bailey,K; Skarecky,D; Kwiatowski,J; Ayala,FJ (1994): Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster. Genetics 136, 1329-1340.
- vermilion data from multiple populations of melanogaster:
  Begun,DJ; Aquadro,CF (1995): Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and D. simulans. Genetics 140, 1019-1032.
- white locus data from melanogaster:
  Kirby,DA; Stephan,W (1995): Haplotype test reveals departure from neutrality in a segment of the white gene of Drosophila melanogaster. Genetics 141, 1483-1490.
- yolk protein 2 data in melanogaster, simulans, mauritiana and sechellia:
  Hey,J; Kliman,RM (1993): Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Mol. Biol. Evol. 10, 804-822.
- zeste data from melanogaster, simulans, mauritiana and sechellia:
  Hey,J; Kliman,RM (1993): Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Mol. Biol. Evol. 10, 804-822.


Human and Great Ape data sets


- Mitochondrial control region data from human Mongolian populations:
  Kolman,CJ; Sambuughin,N; Bermingham,E (1996): Mitochondrial DNA analysis of Mongolian populations and implications for the origin of New World founders. Genetics 142, 1321-1334.
- Mitochondrial NADH dehydrogenase subunit 3 (ND3) data from humans:
  Nachman,MW; Brown,WM; Stoneking,M; Aquadro,CF (1996): Nonneutral mitochondrial DNA variation in Humans and Chimpanzees. Genetics 142, 953-963.
- Human mitochondrial control region data of Vigilant et al., with 1 chimp:
  Vigilant,L; Stoneking,M; Harpending,H; Hawkes,K; Wilson,AC (1991): African populations and the evolution of human mitochondrial DNA. Science 253, 1503-1507.
- Beta globin sequences, 2 human population samples, also 1 chimp and 1 gorilla:
  Fullerton,SM; Harding,RM; Boyce,AJ; Clegg,JB (1994): Molecular and population genetic analysis of allelic sequence diversity at the human beta-globin locus. Proc. Natl. Acad. Sci. USA 91, 1805-1809.
- Pyruvate dehydrogenase E1-alpha subunit data from humans, chimpanzees, gorillas and orangutans:
  The human data are published in Hey (1997), Mol. Biol. Evol. 14, 166-172. The ape data are not published; if you use them, please acknowledge Jody Hey.
- Pyruvate dehydrogenase E1-alpha subunit data from humans:
  Hey,J (1997): Mitochondrial and nuclear genes present conflicting portraits of human origins. Mol. Biol. Evol. 14, 166-172.
- Mitochondrial cytochrome oxidase II (COII) data from humans, chimpanzees, pygmy chimpanzees and gorillas:
  Ruvolo,M; Pan,D; Zehr,S; Goldberg,T; Disotell,TR; vonDornum,M (1994): Gene trees and hominoid phylogeny. Proc. Natl. Acad. Sci. USA 91, 8900-8904.
  Horai,S; Satta,Y; Hayasaka,K; Kondo,R; Inoue,T; Ishida,T; Hayashi,S; Takahata,N (1992): Man's place in Hominoidea revealed by mitochondrial DNA genealogy. J. Mol. Evol. 35, 32-42.
  Anderson et al. (1981): Nature 290, 457-465.


Escherichia coli and related enteric bacteria data sets


- glyceraldehyde-3-phosphate dehydrogenase (gapA) data for a wide variety of enteric bacteria:
  Two independent data sets that overlap almost completely in the sequenced region.
  The E. coli sequences and most of the Salmonella sequences are from:
  Nelson,K; Whittam,TS; Selander,RK (1991): Nucleotide polymorphism and evolution in the glyceraldehyde-3-phosphate dehydrogenase gene (gapA) in natural populations of Salmonella and Escherichia coli. Proc. Natl. Acad. Sci. USA 88, 6667-6671.
  One Salmonella sequence and all the others are from:
  Lawrence,JG; Ochman,H; Hartl,DL (1991): Molecular and evolutionary relationships among enteric bacteria. J. Gen. Microbiol. 137, 1911-1921.
- Malate dehydrogenase (mdh) sequences from E. coli and Salmonella:
  Boyd,EF; Nelson,K; Wang,FS; Whittam,TS; Selander,RK (1994): Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica. Proc. Natl. Acad. Sci. USA 91, 1280-1284.
- 6-phosphogluconate dehydrogenase (gnd) in many enteric bacteria:
  This is a large data set of 137 sequences, with many sequences from each of many taxa.
  Several references:
  Thampapillai, Lan and Reeves (1994): Mol. Biol. Evol. 11 (6), 813-828.
  Nelson and Selander (1994): Proc. Natl. Acad. Sci. USA 91, 10227-1023
  Dykhuizen and Green (1991): J. Bacteriol. 173, 7257-7268.
  Bisercic, Feutrier and Reeves (1991): J. Bacteriol. 173, 3894-3900.
- Alkaline phosphatase data from E. coli:
  DuBose,RF; Dykhuizen,DE; Hartl,DL (1988): Genetic exchange among natural isolates of bacteria: Recombination within the phoA gene of Escherichia coli. Proc. Natl. Acad. Sci. USA 85, 7036-7040.
- Proline permease (putP) data from E. coli and Salmonella:
  Nelson,K; Selander,RK (1992): Evolutionary genetics of proline permease gene (putP) and the control region of the proline utilization operon in populations of Salmonella and Escherichia coli. J. Bacteriol. 174, 6886-6895.
- PTS enzyme III cel (celC) from E. coli:
  Hall,BG; Sharp,PM (1992): Molecular population genetics of Escherichia coli: DNA sequence diversity at the celC, crr and gutB loci of natural isolates. Mol. Biol. Evol. 9, 654-665.
- PTS enzyme III glc (crr) from E. coli:
  Hall,BG; Sharp,PM (1992): Molecular population genetics of Escherichia coli: DNA sequence diversity at the celC, crr and gutB loci of natural isolates. Mol. Biol. Evol. 9, 654-665.
- PTS enzyme III glucitol (gutB) from E. coli:
  Hall,BG; Sharp,PM (1992): Molecular population genetics of Escherichia coli: DNA sequence diversity at the celC, crr and gutB loci of natural isolates. Mol. Biol. Evol. 9, 654-665.

Data from Kliman, R. M., P. Andolfatto, J. A. Coyne, F. Depaulis, M. Kreitman, A. J. Berry, J. McCarter, J. Wakeley and J. Hey. 2000. The population genetics of the origin and divergence of the Drosophila simulans complex species. Genetics 156: 1913-31.

In this paper we analyzed variation at 14 different loci among D. simulans, D. mauritiana and D. sechellia. Each of these data sets is in SITES format. Right click (save as) to copy these files to your local drive. Please see Kliman et al. (2000) for details about the data.
