Replacement tree splicer, date-aware, using ete #120

@lentinj

Description

The OneZoom tree is made up of lots of different parts:

  • BespokeTree/include_noAutoOTT: A hand-curated set of trees (either original work or lifted from papers) that lack OTT annotations. The annotations are added and the results saved as BespokeTree/include_OTT3.7 (a capitalised PHY extension means OTTs still need to be added)
  • OpenTreeParts/OpenTree_all: Chunks of the OpenTree that have been downloaded
  • OpenTreeParts/OT_required: Orphan chunks of the OpenTree that were downloaded previously and no longer exist in the OpenTree
  • token_to_oz_tree_file_mapping.py: Maps each splice name to its full filename, plus a joining edge length and a name for the root node (which the sub-tree probably doesn't have)

The Newick format has been extended with a syntax for splicing trees together, and the full tree is assembled by starting from Base.PHY and splicing in the parts above.

For generating a dated tree, we need to prototype a replacement splicer that:

  • Uses ete to do this, so we can see how well it copes with this many trees in memory
  • Translates subtrees from branch lengths into dates (a non-extinct leaf has date 0, and the rest follows from the branch lengths)
  • Splices as per the splicing syntax described
  • Produces a tree with date annotations instead of branch lengths
  • Writes out a date-annotated tree, which can then be passed to the date interpolator in a subsequent step. Whether the rest of the pipeline keeps working with dates or we convert back to branch lengths is currently undecided.
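
The branch-length-to-date step could be sketched roughly as below. This is a standard-library illustration only, not the ete-based prototype itself: the `Node` class, `assign_dates`, `to_nhx` and the `date` annotation key are all hypothetical names, and it assumes every leaf is extant (date 0). Where a subtree is not ultrametric and the children disagree, it simply takes the oldest estimate, which is one possible policy among several.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Minimal stand-in for a tree node (hypothetical, not ete's TreeNode)."""
    name: str = ""
    dist: float = 0.0             # branch length to the parent
    children: List["Node"] = field(default_factory=list)
    date: Optional[float] = None  # age before present, filled in by assign_dates


def assign_dates(node: Node) -> float:
    """Set node.date bottom-up: leaves are 0, parents follow from branch lengths."""
    if not node.children:
        node.date = 0.0  # assumption: every leaf is extant
    else:
        # Children may disagree if the subtree is not ultrametric; take the oldest.
        node.date = max(assign_dates(c) + c.dist for c in node.children)
    return node.date


def to_nhx(node: Node) -> str:
    """Serialise with one NHX-style date annotation per node and no branch lengths."""
    label = f"{node.name}[&&NHX:date={node.date:g}]"
    if not node.children:
        return label
    return "(" + ",".join(to_nhx(c) for c in node.children) + ")" + label


# Toy subtree: ((A:1,B:1)AB:2,C:3)root;
tree = Node("root", children=[
    Node("AB", dist=2.0, children=[Node("A", dist=1.0), Node("B", dist=1.0)]),
    Node("C", dist=3.0),
])
assign_dates(tree)
newick = to_nhx(tree) + ";"
print(newick)
```

Both functions are recursive for brevity; a tree as deep as the full OneZoom tree would probably need an iterative post-order traversal, and extinct leaves would need a non-zero starting date.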

Possibly we want to rework the syntax to be slightly more NHX-friendly (i.e. have annotations that include the splicing operations), but this isn't essential.

For this to work, we'll also need to:

  • Rework Base.PHY to contain date annotations instead of branch lengths (as the only internal subtree).
