NR-PKS
non-reducing polyketide synthase diversity in fungi

This dataset contains supplementary data for the 2014 article on a phylogenomic "roadmap" for non-reducing polyketide synthases. It contains or references the original version of the dataset.


Updated/expanded versions of the analysis will be linked below, when available.


21/09/2015: added a comparison of DTL-RANGER results with ALE results. Both the amalgamated and the original topologies were compared; results on the original BI consensus were benchmarked on the 352 bipartitions shared with the amalgamated tree.

15/08/2015: corrected the broken link to the PDF version of the reconciled gene tree in the Downloads section; the link pointed to the reference NR-PKS listing instead.

The species history is based on 149 fungal genomes (Caenorhabditis elegans was used as an outgroup). A curated alignment of 21 single-copy orthologs (chosen based on FUNYBASE recommendations) was used to infer the phylogeny. A relaxed, log-normal autocorrelated clock with soft bounds under a birth-death prior was used to calibrate the tree and obtain the chronogram.
A set of KS-AT modules extracted from 413 non-reducing PKS genes was used to construct the gene tree. Reconciliation data are based on ALE v0.3 and DTL-RANGER (with random sampling of multiple optimal reconciliations).
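
The concordance between DTL-RANGER and ALE is reported on bipartitions shared between gene tree topologies. The following is a minimal sketch of how shared bipartitions between two Newick trees could be counted. It is not the procedure used for this dataset: the function names and the toy trees are hypothetical, the parser handles only unquoted labels, and both trees are assumed to carry the same leaf set.

```python
def parse_newick(text):
    """Parse a Newick string into nested lists (leaf = str, internal node = list).

    Assumption: labels are unquoted; branch lengths and supports are ignored.
    """
    text = text.strip().rstrip(";")
    pos = 0

    def read_label():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        # Keep only the name, e.g. "A:0.1" -> "A"
        return text[start:pos].split(":")[0].strip()

    def read_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [read_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(read_node())
            pos += 1  # consume ")"
            read_label()  # skip internal label, support, or branch length
            return children
        return read_label()

    return read_node()


def bipartitions(tree):
    """Return the non-trivial bipartitions of a tree, treated as unrooted."""
    clades = set()

    def leaves_below(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves_below(child) for child in node))
        clades.add(below)
        return below

    all_leaves = leaves_below(tree)
    reference = min(all_leaves)  # fixed leaf used to pick a canonical side
    result = set()
    for side in clades:
        other = all_leaves - side
        if len(side) > 1 and len(other) > 1:
            result.add(side if reference in side else other)
    return result


def shared_bipartitions(newick_a, newick_b):
    """Bipartitions present in both trees (same leaf set assumed)."""
    return bipartitions(parse_newick(newick_a)) & bipartitions(parse_newick(newick_b))


# Toy trees (hypothetical, not taken from the dataset)
tree_a = "((A,B),(C,D),E);"
tree_b = "((A:0.1,B:0.2)0.9:0.3,(C,E),D);"
print(len(shared_bipartitions(tree_a, tree_b)), "shared bipartition(s)")
```

Storing each bipartition as the side that contains a fixed reference leaf makes the comparison independent of where either tree happens to be rooted.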

species tree infographic (with numbered nodes), additional tree infographic containing BI/ML supports


gene tree infographic with annotated events* and available functional annotations


* - individual reports on reconciliation/evolutionary events inferred by ALE are available after clicking on gene node numbers (e.g. g766) on the gene tree graphic

  1. Species tree data:
    • concatenated alignment of orthologs: PHYLIP
    • partition file: NEXUS
    • Bayesian Inference species tree (calibrated chronogram, taxon numbers by NCBI/Taxonomy): Newick
    • Maximum Likelihood species tree (taxon numbers by NCBI/Taxonomy): Newick
    • list of species/model genomes: PDF
    • list of single-copy orthologs: PDF
    • dating constraints used: PDF
    • rendering of Bayesian Inference species tree topology: PDF
    • rendering of Maximum Likelihood species tree topology: PDF

  2. Gene tree data:
    • final alignment of 414 KS-AT modules (including FUM1 HR-PKS outgroup): PHYLIP
    • Bayesian Inference gene tree (w/o FUM1 outgroup): Newick
    • Bayesian Inference gene tree, post-ALE reconciliation (no supports): Newick
    • Reference NR-PKSs of known activity/end products: PDF
    • rendering of gene tree (post-ALE reconciliation), with annotated gene structure and domain architecture: PDF
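
A quick sanity check on the gene tree alignment is to confirm that it holds 414 sequences (413 NR-PKS KS-AT modules plus the FUM1 outgroup). The sketch below reads the two integers from a PHYLIP header line (number of sequences, then number of columns). The function name and the in-memory example are hypothetical, and it assumes the first non-blank line of the file is a standard PHYLIP header.

```python
def phylip_dimensions(lines):
    """Return (n_sequences, n_columns) from the first non-blank PHYLIP line.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    for line in lines:
        if line.strip():
            n_sequences, n_columns = line.split()[:2]
            return int(n_sequences), int(n_columns)
    raise ValueError("no PHYLIP header found")


# In-memory stand-in for a PHYLIP file (toy data, not from the dataset)
example = [
    " 3 8\n",
    "seq_a     ACGTACGT\n",
    "seq_b     ACGTACGA\n",
    "seq_c     ACGAACGT\n",
]
n_sequences, n_columns = phylip_dimensions(example)
print(f"{n_sequences} sequences, {n_columns} columns")
```

For the downloaded alignment, the same function can be called on an open file handle, and the first returned value compared against 414.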

  3. Other supplementary data:
    • mapping of sequence identifiers to gene names: CSV (Excel)
      NOTE: positive identifiers correspond to NCBI/GenBank GIs, negative identifiers are from the internal database (with sequences from Ensembl/JGI/individual published genomes)
    • splicing sites significantly associated with monophyletic clades (post-mapping to domain structure): PDF
    • CHGs (Candidate Homolog Groups) elucidated with MCL: FASTA
    • concordance of DTL-RANGER results with ALE predictions (note: statistics for the original BI consensus tree are based on 352 shared bipartitions): PDF
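
The sign convention for sequence identifiers noted in the mapping entry above can be applied programmatically. This is a minimal sketch with a hypothetical function name. It assumes identifiers are stored as integers or integer-like strings, and it treats zero as invalid because the note does not cover it.

```python
def identifier_source(identifier):
    """Classify a sequence identifier by its sign.

    Positive: NCBI/GenBank GI. Negative: internal database
    (sequences from Ensembl/JGI/individual published genomes).
    """
    value = int(identifier)
    if value > 0:
        return "NCBI/GenBank GI"
    if value < 0:
        return "internal database"
    # Assumption: zero is not a valid identifier under the documented convention
    raise ValueError("identifier 0 is not covered by the documented convention")


# Toy identifiers (hypothetical, not taken from the mapping file)
for raw in ["123456", "-42"]:
    print(raw, "->", identifier_source(raw))
```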