# Binning #
Single genomes are reconstructed from metagenome assemblies using [MetaBAT](https://bitbucket.org/berkeleylab/metabat/src/master/), which serves one of the main purposes of genosysmics. MetaBAT uses the tetranucleotide frequency and the abundance of each contig to build a pairwise distance matrix between contigs and reconstruct genome bins.
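To make the idea of a composition-based distance concrete, here is a minimal Python sketch. It is a toy illustration only: it uses raw tetranucleotide frequencies and a plain Euclidean distance, ignores abundance and reverse complements, and is not MetaBAT's actual distance model. The function names `tnf`, `euclidean` and `distance_matrix` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides over the DNA alphabet.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tnf(seq):
    """Return the tetranucleotide frequency profile of a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only count 4-mers made of A, C, G, T (skips N and other symbols).
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean(u, v):
    """Plain Euclidean distance between two profiles (an assumption, not MetaBAT's metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(contigs):
    """Pairwise distances for a mapping of contig name -> sequence."""
    profiles = {name: tnf(seq) for name, seq in contigs.items()}
    return {
        (a, b): euclidean(profiles[a], profiles[b])
        for a in profiles
        for b in profiles
    }
```

Contigs with a similar composition end up close to each other in this matrix, which is the intuition behind grouping them into the same bin.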
# Binning Strategies #
Genosysmics offers two binning strategies, single-binning and co-binning. Combined with the two assembly strategies (single-assembly and co-assembly), this gives four ways to recover MAGs from metagenomes:

- single-binning from single-assembly
- single-binning from co-assembly
- co-binning from single-assembly
- co-binning from co-assembly
## Single-binning ##
Single-binning consists of reconstructing genome bins per assembly. Cluster reads or sample reads are mapped against their own assembly (cluster assembly or sample assembly). Genosysmics uses the MetaBAT utility `jgi_summarize_bam_contig_depths` to recover the depth of coverage of each contig. Genome bins are then reconstructed from the specified assembly.
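The last two steps can be sketched as command lines built in Python. This is a sketch under assumptions, not the way Genosysmics necessarily invokes the tools: it assumes the reads have already been mapped and stored as sorted BAM files, that the `metabat2` binary is the one in use, and the helper name `single_binning_commands` and the output layout are hypothetical.

```python
def single_binning_commands(assembly, sorted_bams, out_dir):
    """Build the depth-summary and binning commands for one assembly.

    assembly    -- path to the assembly FASTA file
    sorted_bams -- paths to sorted BAM files of reads mapped to that assembly
    out_dir     -- directory for the depth table and the bins (hypothetical layout)
    """
    depth_file = f"{out_dir}/depth.txt"

    # Per-contig depth of coverage from the BAM files.
    depth_cmd = [
        "jgi_summarize_bam_contig_depths",
        "--outputDepth", depth_file,
        *sorted_bams,
    ]

    # Binning of the assembly using the depth table; -o is the bin file prefix.
    bin_cmd = [
        "metabat2",
        "-i", assembly,
        "-a", depth_file,
        "-o", f"{out_dir}/bins/bin",
    ]
    return depth_cmd, bin_cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the input files exist.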
## Co-binning ##
Co-binning reconstructs genome bins from a concatenation of several assemblies.

In this case, assemblies are first concatenated into a single FASTA file (concatenated cluster assembly or concatenated sample assembly). Reads are then mapped back against the concatenated file. Genosysmics uses the MetaBAT utility `jgi_summarize_bam_contig_depths` to recover the depth of coverage of each contig, and genome bins are then reconstructed from the concatenated assembly.
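The concatenation step can be illustrated with a short Python sketch. It works on FASTA text held in memory and prefixes each contig name with the name of its assembly so that identifiers stay unique in the merged file. That prefixing scheme and the function name `concatenate_assemblies` are assumptions made for illustration; the source does not describe how Genosysmics names contigs.

```python
def concatenate_assemblies(assemblies):
    """Merge several assemblies into one FASTA string.

    assemblies -- mapping of assembly name -> FASTA text
    Contig headers are rewritten as ">{assembly}_{contig}" (hypothetical scheme)
    to avoid name collisions between assemblies.
    """
    merged = []
    for name, fasta in assemblies.items():
        for line in fasta.splitlines():
            if not line.strip():
                continue  # skip empty lines
            if line.startswith(">"):
                merged.append(f">{name}_{line[1:]}")
            else:
                merged.append(line)
    return "\n".join(merged) + "\n"
```

For example, two assemblies that both contain a contig named `c1` yield `>s1_c1` and `>s2_c1` in the merged output, so reads mapped back to the concatenated file can still be traced to their assembly of origin.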
## MAGs filtering ##
MAGs may suffer from contamination. Genosysmics uses the contig taxonomy produced by [CAT](https://github.com/dutilh/CAT) to detect and filter out potential contaminants. For a given MAG, genosysmics keeps only the contigs that share the consensus taxonomy of the MAG (i.e., the taxonomy supported by the highest number of contigs). If the consensus taxonomy is resolved only to the family level, contigs assigned to any genus or species within that family are kept, which avoids an overly stringent filter.
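The filtering rule can be sketched in Python as follows. This is a minimal illustration, not the Genosysmics implementation: it assumes each contig's taxonomy is already available as a tuple of ranks ordered from the highest rank down, does not parse CAT output, and breaks ties between equally frequent lineages arbitrarily. The function name `filter_mag` is hypothetical.

```python
from collections import Counter


def filter_mag(contig_lineages):
    """Keep the contigs of a MAG that agree with its consensus taxonomy.

    contig_lineages -- mapping of contig name -> lineage tuple,
                       e.g. ("Bacteria", "Proteobacteria", "Enterobacteriaceae")
    Returns the names of the contigs to keep, in input order.
    """
    if not contig_lineages:
        return []

    # Consensus = the lineage carried by the highest number of contigs.
    consensus, _ = Counter(contig_lineages.values()).most_common(1)[0]
    depth = len(consensus)

    # A contig is kept when its lineage starts with the consensus lineage,
    # so deeper assignments within the consensus taxon are retained.
    return [
        name
        for name, lineage in contig_lineages.items()
        if lineage[:depth] == consensus
    ]
```

With a consensus at family level, a contig assigned to a genus of that family passes the filter, while a contig from another family is discarded.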