Still under development! A stable version has been released, but it works only when a closely related mitogenome is available.
Documentation: Wiki
Source install
Run the commands below:
git clone https://github.com/npbhavya/MitoBee.git
cd MitoBee
mamba create -y -n mitobee python=3.13
conda activate mitobee
pip install -e .
Once I have a stable release, I will upload it to conda and pip as well.
Note: This code works only on paired-end metagenomes for now.
This workflow is modular:
mitobee run
If a representative, closely related mitogenome is available, provide it as the host sequence (--host_seq) and get started.
#Running mitobee with test files available in the repo
mitobee run --input test-files/metagenomes --extn fastq.gz \
--pattern_r1 _R1 --pattern_r2 _R2 \
--host_seq test-files/am-dh4.fasta \
--output output
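Since the workflow expects paired-end reads, it can help to confirm that every sample has both mates before starting a run. The sketch below is a hypothetical helper, not part of MitoBee. It assumes file names of the form <sample>_R1.fastq.gz and <sample>_R2.fastq.gz, which may not match how MitoBee itself pairs files.

```shell
# Hypothetical helper, not part of MitoBee: report R1 files that lack an R2 mate.
# Assumes names like <sample>_R1.fastq.gz / <sample>_R2.fastq.gz (a naming assumption).
# Arguments: input directory, extension, R1 pattern, R2 pattern.
check_pairs() {
    local dir="$1" extn="$2" r1="$3" r2="$4"
    local missing=0 f base sample rest mate
    for f in "$dir"/*"$r1"*."$extn"; do
        [ -e "$f" ] || continue        # the glob matched nothing
        base="${f##*/}"                # file name without the directory
        sample="${base%%$r1*}"         # text before the R1 pattern
        rest="${base#*$r1}"            # text after the R1 pattern
        mate="$dir/$sample$r2$rest"    # expected name of the R2 file
        if [ -e "$mate" ]; then
            echo "paired: $sample"
        else
            echo "unpaired: $base"
            missing=1
        fi
    done
    return "$missing"                  # non-zero if any mate is missing
}
```

After defining the function, calling check_pairs test-files/metagenomes fastq.gz _R1 _R2 prints one line per R1 file and returns a non-zero status if any mate is missing.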
mitobee tree
Once the mitochondrial genomes are built from each metagenome sample, run this module to build a tree from these mitogenomes and other references.
#After the mitogenomes are built by mitobee run, add other references to build a tree
#Again, an example with the test files
mitobee tree --input test-files/mitogenomes --extn fasta --output output -k all
mitobee search
If no closely related mitogenome is available, run this step first to search against a set of mitogenomes or mitochondrial genes. This module provides an overview of which reference to use.
#If a closely related mitochondrial genome is not available, but a gene is, such as cox or rRNA genes
#Download the reference genes you would like to use from the closely related genomes
#To search against a reference set of mitogenomes
mitobee search --input test-files/mitogenomes --extn fastq.gz \
--pattern_r1 _R1 --pattern_r2 _R2 \
--ref_seq test-files/ref-set-genome --output output \
-k all --mode mitogenome
#To search against a reference set of mitochondrial genes
mitobee search --input test-files/mitogenomes --extn fastq.gz \
--pattern_r1 _R1 --pattern_r2 _R2 \
--ref_seq test-files/ref-set-genes --output output \
-k all --mode genes
Input files:
- Input directory with metagenomes
- Reference directory
  - If running the run or tree module, provide a (one) reference genome.
  - If running the gene module, provide a reference gene set.
Output files: Provide the output folder; it will contain these subdirectories:
- PROCESSING: Folder containing intermediate files
- REPORTS: Final results, including the mitogenome fasta files from (hopefully) each metagenome sample.
  Also includes the QC reports, with stats on how many reads were processed and how many were not.
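To see quickly which samples produced a mitogenome, one option is to count the sequences in each FASTA file under REPORTS. The sketch below is a hypothetical helper, not part of MitoBee. It assumes the mitogenomes are written with a .fasta extension somewhere under <output>/REPORTS, so adjust the pattern if the actual file names differ.

```shell
# Hypothetical helper, not part of MitoBee: count FASTA records per file in <output>/REPORTS.
# Assumes a .fasta extension for the mitogenome files (an assumption; adjust as needed).
# Argument: the output folder passed to mitobee with --output.
count_report_seqs() {
    local reports="$1/REPORTS" fa
    if [ ! -d "$reports" ]; then
        echo "no REPORTS folder in $1" >&2
        return 1
    fi
    find "$reports" -type f -name '*.fasta' | sort | while IFS= read -r fa; do
        # grep -c counts header lines, i.e. the number of sequences in the file
        printf '%s\t%s\n' "${fa##*/}" "$(grep -c '^>' "$fa")"
    done
}
```

Calling count_report_seqs output prints one tab-separated line per file: the file name and the number of sequences it contains. A sample whose file is absent or reports 0 is worth checking against the QC reports.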
