|
Miscellaneous DNA Sequence Data Sets in SITES Format
SITES reads data files that contain multiple aligned DNA sequences, and it conducts analyses
that are often useful in a population genetics context. This page contains FTP links to a number of data
files that have been compiled and that can be analyzed using SITES. In general these are data sets in
which multiple copies of a genomic region (i.e. homologous DNA sequences) have been obtained from a
species or population. Most of these files also contain data from multiple populations or multiple
closely related species. Please consult the original references that are listed to obtain information about
the design of sampling schemes and the source of samples.
These data sets are in Unix-style text format (i.e. each line ends with a line-feed character,
with no carriage returns). They should be readable by text editors and by SITES regardless of
what operating system is used.
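If a file has picked up carriage returns along the way (for example, after being edited and saved on another operating system), its line endings can be checked and converted back to Unix style. The following is a minimal Python sketch and is not part of SITES; the function names are hypothetical and chosen only for illustration.

```python
def has_carriage_returns(data: bytes) -> bool:
    """Return True if the raw file contents contain any carriage-return byte."""
    return b"\r" in data


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to a single line feed (LF)."""
    # Replace CRLF pairs first so they do not turn into two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

To use it, read the data file in binary mode (so that Python does not translate line endings itself), pass the bytes through to_unix_line_endings, and write the result back in binary mode.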
These files are in zip format, and need to be unzipped. If these FTP links do not work with your
browser, you can try to FTP directly to eve.rutgers.edu.
The files are in subdirectories of the /pub/data directory. Those directories also contain the native,
unzipped forms of the files.
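Once a zipped data set has been saved locally, it can be unzipped with any standard tool. As one possible approach, the sketch below uses Python's standard zipfile module; the function name and the archive and folder names in the usage note are hypothetical and do not refer to actual files on the server.

```python
import zipfile
from pathlib import Path


def unzip_data_set(archive_path, dest_dir):
    """Extract every member of a zip archive into dest_dir.

    Returns the sorted list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())
```

For example, unzip_data_set("example_data.zip", "sites_data") would place the contents of a hypothetical archive example_data.zip into a folder named sites_data.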
Almost all of these data sets were generated from GenBank sequences.
More data sets will be added as I have time. If you have a data set for which SITES might be a useful tool, and you would like to make it available to others, please let me know, and we can arrange for a link to it to be placed on this page.
 |
asense sequences from melanogaster, simulans, mauritiana and sechellia:
Hilton,H; Kliman,RM; Hey,J (1994): Using hitchhiking genes to study
adaptation and divergence during speciation within the Drosophila
melanogaster complex. Evolution 48, 1900-1913. |
 |
Original Adh data from D. melanogaster: Kreitman, M., 1983 Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature 304: 412-417. |
 |
bride-of-sevenless (boss) data for melanogaster and simulans: Ayala,FJ; Hartl,DL (1993): Molecular drift of the Bride of Sevenless (boss) gene in Drosophila. Mol. Biol. Evol. 10, 1030-1040. |
 |
cubitus interruptus sequences from melanogaster, simulans, mauritiana and sechellia: Berry,AJ;
Ajioka,JW; Kreitman,M (1991): Lack of polymorphism on the
Drosophila fourth chromosome resulting from selection. Genetics 129,
1111-1117.
Hilton,H; Kliman,RM; Hey,J (1994): Using hitchhiking genes to study
adaptation and divergence during speciation within the Drosophila
melanogaster complex. Evolution 48, 1900-1913. |
 |
Mitochondrial cytochrome B sequences from melanogaster, simulans and yakuba:
Ballard,JWO; Kreitman,M (1994): Unraveling selection in the
mitochondrial genome of Drosophila. Genetics 138, 757-772. |
 |
Esterase 6 data for melanogaster and simulans:
Cooke,PH; Oakeshott,JG (1989): Amino acid polymorphisms for esterase 6
in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 86, 1426.
Hasson,E. and Eanes,W.F. (1996) Contrasting histories of three gene regions associated with In(3L)Payne of Drosophila melanogaster. Genetics 144 (4), 1565-1575.
Karotam,J; Delves,AC; Oakeshott,JG (1993): Conservation and change in
structural and 5' flanking sequences of esterase 6 in sibling Drosophila
species. Genetica 88, 11-28. |
 |
even-skipped regulatory region data from melanogaster and simulans:
Ludwig,MZ; Kreitman,M (1995): Evolutionary dynamics of the enhancer
regions of even-skipped in Drosophila. Mol. Biol. Evol. 12, 1002-1011. |
 |
G6pdh data in melanogaster and simulans: Eanes,WF;
Kirchner,M; Yoon,J (1993): Evidence for adaptive evolution of
the G6pd gene in the Drosophila melanogaster and Drosophila simulans
lineages. Proc. Natl. Acad. Sci. USA 90, 7475-7479. |
 |
Heat Shock protein 83 (Hsp83) gene, partial cds for melanogaster
This gene is close to an inversion breakpoint; the data are from two types of chromosomes.
Hasson,E. and Eanes,W.F. (1996) Contrasting histories of three gene regions associated with In(3L)Payne of Drosophila melanogaster Genetics 144 (4), 1565-1575 |
 |
larval cuticle protein pseudogene sequences from melanogaster: unpublished data, from Pritchard and Schaeffer,
GenBank accession numbers U17196-U17205 |
 |
Mst26A and Mst26B data from melanogaster: Aguadé,M; Miyashita,N; Langley,CH (1992): Polymorphism and divergence in the Mst26A male accessory gland gene region in Drosophila. Genetics 132, 755-770. |
 |
myosin alkali light chain data from melanogaster and simulans: Leicht,BG;
Muse,SV; Hanczyc,M; Clark,AG (1995): Constraints on intron
evolution in the gene encoding the myosin alkali light chain in
Drosophila. Genetics 139, 299-308. |
 |
period locus data from melanogaster, simulans, mauritiana and sechellia:
Kliman,RM; Hey,J (1993): DNA sequence variation at the period locus
within and among species of the Drosophila melanogaster complex.
Genetics 133, 375-387. |
 |
glucose-6-phosphate isomerase data from melanogaster and simulans: McDonald,J.H. and Kreitman,M.E. The glucose-6-phosphate isomerase locus in four species of
Drosophila. Unpublished (1994) |
 |
prune sequences from melanogaster and simulans: Simmons,GM;
Kwok,W; Matulonis,P; Venkatesh,T (1994): Polymorphism and
divergence at the prune locus in Drosophila melanogaster and D. simulans. Mol. Biol. Evol. 11, 666-671. |
 |
rhabdovirus resistance ref(2)P sequences in melanogaster: Wayne,ML;
Contamine,D; Kreitman,M (1996): Molecular population genetics
of ref(2)P, a locus which confers viral resistance in Drosophila. Mol. Biol. Evol. 13, 191-199. |
 |
Rh3 sequences from melanogaster and simulans: Ayala,FJ;
Chang,BSW; Hartl,DL (1993): Molecular evolution of the Rh3
gene in Drosophila. Genetica 92, 23-32. |
 |
superoxide dismutase variation in melanogaster: Hudson,RR;
Bailey,K; Skarecky,D; Kwiatowski,J; Ayala,FJ (1994): Evidence for positive selection in the superoxide dismutase (Sod) region of Drosophila melanogaster. Genetics 136, 1329-1340. |
 |
vermilion data from multiple populations of melanogaster: Begun,DJ; Aquadro,CF (1995): Molecular variation at the vermilion locus in geographically diverse populations of Drosophila melanogaster and D. simulans. Genetics 140, 1019-1032. |
 |
white locus data from melanogaster: Kirby,DA; Stephan,W (1995): Haplotype test reveals departure from neutrality in a segment of the white gene of Drosophila melanogaster. Genetics 141, 1483-1490. |
 |
yolk protein 2 data in melanogaster, simulans, mauritiana and sechellia: Hey,J; Kliman,RM (1993): Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Mol. Biol. Evol. 10, 804-822. |
 |
zeste data from melanogaster, simulans, mauritiana and sechellia: Hey,J; Kliman,RM (1993): Population genetics and phylogenetics of DNA sequence variation at multiple loci within the Drosophila melanogaster species complex. Mol. Biol. Evol. 10, 804-822. |
 |
Mitochondrial Control Region data from Human Mongolian Populations
Kolman,CJ; Sambuughin,N; Bermingham,E (1996): Mitochondrial DNA analysis of Mongolian populations and implications for the origin of New World founders. Genetics 142, 1321-1334. |
 |
Mitochondrial NADH dehydrogenase subunit 3 (ND3) data from humans
Nachman,MW; Brown,WM; Stoneking,M; Aquadro,CF (1996): Nonneutral
mitochondrial DNA variation in Humans and Chimpanzees. Genetics 142,
953-963. |
 |
Human mitochondrial control region data of Vigilant et al, with 1 chimp
Vigilant,L; Stoneking,M; Harpending,H; Hawkes,K; Wilson,AC (1991):
African populations and the evolution of human mitochondrial DNA.
Science 253, 1503-1507. |
 |
Beta Globin sequences, 2 human population samples, also 1 chimp and 1 gorilla Fullerton,SM;
Harding,RM; Boyce,AJ; Clegg,JB (1994): Molecular and
population genetic analysis of allelic sequence diversity at the human
beta-globin locus. Proc. Natl. Acad. Sci. USA 91, 1805-1809. |
 |
Pyruvate Dehydrogenase E1-alpha subunit data from
humans, chimpanzees, gorillas and orangutans. The human data are published in Hey 1997, Molecular Biology and Evolution 14:166.
The ape data are not published; if you use them, please acknowledge Jody Hey. |
 |
Pyruvate Dehydrogenase E1-alpha subunit data from humans Hey,J (1997): Mitochondrial and nuclear genes present conflicting portraits of human origins. Mol. Biol. Evol. 14, 166-172. |
 |
Mitochondrial cytochrome oxidase II (COII) data from humans, chimpanzees, pygmy chimpanzees and gorillas Ruvolo,M;
Pan,D; Zehr,S; Goldberg,T; Disotell,TR; vonDornum,M (1994): Gene trees and hominoid phylogeny. Proc. Natl. Acad. Sci. USA 91, 8900-8904.
Horai, S., Y. Satta, K. Hayasaka, R. Kondo, T. Inoue, T. Ishida, S. Hayashi and N. Takahata. 1992. Man's place in hominoidea revealed by mitochondrial DNA genealogy. J. Mol. Evol. 35: 32-42.
Anderson et al., (1981) Nature 290:457-465 |
 |
glyceraldehyde-3-phosphate dehydrogenase (gapA) data for a wide variety of enteric bacteria
Two independent data sets that overlap almost completely in sequenced region
The E. coli sequences and most of the Salmonella sequences are from:
Nelson,K; Whittam,TS; Selander,RK (1991): Nucleotide polymorphism and evolution in the glyceraldehyde-3-phosphate dehydrogenase gene (gapA) in natural populations of Salmonella and Escherichia coli. Proc. Natl. Acad. Sci. USA 88, 6667-6671.
One Salmonella sequence and all the others are from:
Lawrence,JG; Ochman,H; Hartl,DL (1991): Molecular and evolutionary relationships among enteric bacteria. J. Gen. Microbiol. 137, 1911-1921 |
 |
Malate dehydrogenase (mdh) sequences from E. coli and Salmonella Boyd,EF;
Nelson,K; Wang,FS; Whittam,TS; Selander,RK (1994): Molecular genetic basis of allelic polymorphism in malate dehydrogenase (mdh) in natural populations of Escherichia coli and Salmonella enterica. Proc. Natl. Acad. Sci. USA 91, 1280-1284. |
 |
6-phosphogluconate dehydrogenase (gnd) in many enteric bacteria This is a large data set, 137 sequences, with many sequences from each of many taxa.
Several references:
Thampapillai, Lan and Reeves (1994) Mol. Biol. Evol. 11 (6), 813-828
Nelson and Selander (1994) Proc. Natl. Acad. Sci. U.S.A. 91, 10227-1023
Dykhuizen and Green (1991) J. Bacteriol. 173, 7257-7268
Bisercic, Feutrier and Reeves (1991) J. Bacteriol. 173, 3894-3900 |
 |
Alkaline phosphatase data from E. coli
DuBose,R.F., Dykhuizen,D.E. and Hartl,D.L. (1988) Genetic exchange among natural isolates of bacteria: Recombination within the phoA gene of Escherichia coli Proc. Natl. Acad. Sci. U.S.A. 85, 7036-7040 |
 |
Proline permease (putP) data from E. coli and Salmonella
Nelson,K. and Selander,R.K. (1992) Evolutionary genetics of proline permease gene (putP) and the control region of the proline utilization operon in populations of Salmonella and Escherichia coli J. Bacteriol. 174, 6886-6895 |
 |
PTS enzyme III cel (celC) from E. coli
Hall,B.G. and Sharp,P.M. 1992 Molecular population genetics of Escherichia coli: DNA sequence diversity at the celC, crr and gutB loci of natural isolates Mol. Biol. Evol. 9, 654-665 |
 |
PTS enzyme III glc (crr) from E. coli
Hall,B.G. and Sharp,P.M. 1992 Molecular population genetics of Escherichia coli: DNA sequence diversity at the celC, crr and gutB loci of natural isolates Mol. Biol. Evol. 9, 654-665 |
 |
PTS enzyme III glucitol (gutB) from E. coli
Hall,B.G. and Sharp,P.M. 1992 Molecular population genetics of Escherichia coli: DNA sequence diversity at the celC, crr and gutB loci of natural isolates Mol. Biol. Evol. 9, 654-665 |
Data from Kliman, R. M.,
P. Andolfatto, J. A. Coyne, F. Depaulis, M. Kreitman, A. J. Berry, J.
McCarter, J. Wakeley and J. Hey, 2000 The population genetics of the
origin and divergence of the Drosophila simulans complex species. Genetics 156:
1913-31.
In this paper we analyzed variation at 14 different loci among D.
simulans, D. mauritiana and D. sechellia. Each of these data sets
is in SITES format. Right click (save as) to copy these files to your
local drive. Please see Kliman et al. for details
about the data.
|