Welcome to flyDIVaS_v1.2
Last updated: June 2, 2016

flyDIVaS was developed by Craig E. Stanley Jr. and Rob J. Kulathinal.

flyDIVaS is a comparative genomics database resource of Drosophila divergence and selection. It is based on current genomic assemblies, FlyBase annotations, and OrthoDB orthology calls of the original 12 sequenced Drosophila species (Clark et al. 2007). Please cite this original paper, as well as ours, when using these data in publications:

  Clark AG et al. 2007. Evolution of genes and genomes on the Drosophila phylogeny. Nature 450:203-18.
  Stanley CE and Kulathinal RJ. 2016. flyDIVaS: A comparative genomics resource for Drosophila divergence and selection. In review at G3: Genes|Genomes|Genetics.

Version History:

flyDIVaS_v1.2
  - Corrected missing value of dS in Gene Summary section of web interface for Dmel_Dsim
  - Corrected capitalization and order of columns for Dmel_Dsim analysis results on Downloads page

flyDIVaS_v1.1
  - Corrected sign on lnl values for models M7 and M8
  - Corrected LRT and associated p-values for tests of selection using M7 and/or M8

General information:

You can download a tarball (.tar.gz file) containing a complete set of alignments for each of four taxonomic datasets.

  1. Dmel_Dsim
  2. mel_subgroup: Dmel, Dsim, Dsec, Dyak, Dere
  3. mel_group: Dmel, Dsim, Dsec, Dyak, Dere, Dana
  4. **12_species: Dmel, Dsim, Dsec, Dyak, Dere, Dana, Dpse, Dper, Dwil, Dmoj, Dvir, Dgri

**Divergence estimates should be used with caution due to the saturation of dS at the 12 species phylogenetic distance (see Box 2 in Larracuente et al. 2008). Also, please be aware of the potential for false positives, especially when using the M7 vs M8 comparison.

Based on our pipeline, each dataset contains three sets of alignments for each gene. FASTA CDS files were first downloaded from FlyBase. These nucleotide files were then translated and aligned using MUSCLE to generate the first set of amino acid alignments (unmasked), gene_trx_AA.afa.
Then, these aligned FASTA files were back-translated into nucleotide alignments, creating our second set of unmasked alignments, gene_trx_DNA.afa. Finally, we applied masks that filter out nucleotides flanking aligned indels, producing gene_trx_DNA.afa.mask, primarily for use in downstream analyses such as PAML.

  i)   gene_trx_AA.afa (protein alignment, unmasked)
  ii)  gene_trx_DNA.afa (CDS alignment, unmasked)
  iii) gene_trx_DNA.afa.mask (CDS alignment, masked)

In addition, each taxonomic dataset contains a large table summarizing PAML results for each gene. Columns are described below.

  gene: FBgn and FBtr (FlyBase accessions for gene and longest transcript)
  symbol: FlyBase gene symbol
  cg: FlyBase CG number
  id: FBgn
  trx: FBtr
  species: Number of species in this dataset
  coverage: Average percentage of unambiguous bases across all species in dataset
  omega: Estimate of dN/dS according to PAML model M0
  dN: Estimate of dN according to Nei and Gojobori (1986)
  dS: Estimate of dS according to Nei and Gojobori (1986)
  lnl_1: log-likelihood for PAML model M1a
  lnl_2: log-likelihood for PAML model M2a
  lrt_1v2: likelihood ratio test statistic for models M1a vs M2a
  pval_1v2: p-value for lrt_1v2 test
  pval_1v2_adjust: FDR for lrt_1v2 test
  pos_12: Does this gene show a statistically significant signal of positive selection via M1a vs M2a?
  lnl_7: log-likelihood for PAML model M7
  lnl_8: log-likelihood for PAML model M8
  lnl_8a: log-likelihood for PAML model M8a
  lrt_7v8: likelihood ratio test statistic for models M7 vs M8
  lrt_8v8a: likelihood ratio test statistic for models M8 vs M8a
  pval_7v8: p-value for lrt_7v8 test
  pval_7v8_adjust: FDR for lrt_7v8 test
  pos_78: Does this gene show a statistically significant signal of positive selection via M7 vs M8?
  pval_8v8a: p-value for lrt_8v8a test
  pval_8v8a_adjust: FDR for lrt_8v8a test
  pos_88a: Does this gene show a statistically significant signal of positive selection via M8a vs M8?
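
For readers who want to sanity-check the test columns, the following Python sketch shows one conventional way to get from a pair of log-likelihoods (for example lnl_1 and lnl_2) to a test statistic, a p-value, and an FDR-adjusted p-value. It is an illustration, not a description of the flyDIVaS pipeline. This README does not state the degrees of freedom or the FDR procedure that were used, so the sketch assumes the common choices: a chi-square distribution with 2 degrees of freedom for M1a vs M2a and M7 vs M8, 1 degree of freedom for M8a vs M8, and Benjamini-Hochberg adjustment. The function names and the example values are hypothetical.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio test statistic 2 * (lnL_alt - lnL_null), floored at 0."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf(x, df):
    """Upper-tail chi-square probability, using closed forms for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    # Walk from the largest p-value to the smallest, keeping a running minimum.
    order = sorted(range(n), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = n - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (lnl_1, lnl_2) pairs; these are NOT values from the database.
example = {
    "geneA": (-1234.56, -1228.10),
    "geneB": (-980.00, -979.95),
    "geneC": (-2100.40, -2100.40),
}

names = list(example)
stats = [lrt_statistic(*example[g]) for g in names]
pvals = [chi2_sf(s, df=2) for s in stats]  # assumed df = 2 for M1a vs M2a
fdrs = bh_adjust(pvals)

for g, s, p, q in zip(names, stats, pvals, fdrs):
    print(f"{g}\tlrt_1v2={s:.3f}\tpval_1v2={p:.4g}\tpval_1v2_adjust={q:.4g}")
```

The same three functions apply to the M7 vs M8 columns (lnl_7, lnl_8) and, with df=1 under the assumption above, to the M8a vs M8 columns (lnl_8a, lnl_8). Any threshold applied to the adjusted p-value to decide a pos_* column would be a further assumption, since this README does not give one.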