---
title: Binning
---

# :construction: work in progress #

MAGs are reconstructed from metagenome assemblies using [MetaBAT](https://bitbucket.org/berkeleylab/metabat/src/master/), which serves one of the main purposes of Magneto. MetaBAT uses the tetranucleotide frequency (TNF) and the abundance of each contig to build a pairwise distance matrix between contigs and reconstruct genome bins.
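
To make the idea of a TNF and abundance based distance more concrete, here is a minimal Python sketch. It is only an illustration: the `contig_distance` function and its `abundance_weight` parameter are hypothetical and do not reproduce MetaBAT's actual probabilistic distances, and the contigs and depths are toy values.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tnf(seq):
    """Return the tetranucleotide frequency vector (256 values) of a sequence."""
    counts = dict.fromkeys(KMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows containing N or other symbols are skipped
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]


def contig_distance(a, b, abundance_weight=0.5):
    """Hypothetical distance mixing TNF and abundance; not MetaBAT's formula."""
    (seq_a, depth_a), (seq_b, depth_b) = a, b
    tnf_dist = sqrt(sum((x - y) ** 2 for x, y in zip(tnf(seq_a), tnf(seq_b))))
    top = max(depth_a, depth_b)
    depth_dist = abs(depth_a - depth_b) / top if top else 0.0
    return (1 - abundance_weight) * tnf_dist + abundance_weight * depth_dist


def distance_matrix(contigs):
    """Pairwise distances for {name: (sequence, mean_depth)}, keyed by name pair."""
    names = sorted(contigs)
    return {(m, n): contig_distance(contigs[m], contigs[n]) for m in names for n in names}


# Toy contigs: (sequence, mean depth of coverage)
contigs = {
    "contig_1": ("ACGTACGTACGTACGT", 12.0),
    "contig_2": ("ACGTACGTACGTACGA", 11.0),
    "contig_3": ("GGGGCCCCGGGGCCCC", 2.0),
}
matrix = distance_matrix(contigs)
print(matrix[("contig_1", "contig_2")] < matrix[("contig_1", "contig_3")])
```

Contigs with similar composition and similar depth end up close to each other in the matrix, which is the property a binner relies on when grouping contigs into genome bins.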

MAGs are reconstructed from metagenome assemblies using several binning tools:

- [MetaBAT](https://bitbucket.org/berkeleylab/metabat/src/master/)
- [CONCOCT](https://github.com/BinPro/CONCOCT)
- [SemiBin2](https://github.com/BigDataBiology/SemiBin)

Several binning tools with different methodologies are used in order to **obtain and group complementary results**. This takes advantage of the strengths of each tool while compensating for its weaknesses. Each tool can **provide bins that the other tools would not necessarily have found**, making it possible to recover more genomes.

### Consensus Binning ###

Although using several tools yields a greater number and variety of bins, a certain number of **similar genomes** can be reconstructed more than once. To overcome this, [DAS Tool](https://github.com/cmks/DAS_Tool) is used to perform consensus binning, i.e. to compute an optimized, non-redundant set of bins from the bins produced by the different binning tools.
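
The redundancy problem can be illustrated with a short Python sketch. Consensus tools such as DAS Tool typically take one tab-separated contig-to-bin table per binner as input; the helper names, bin names, and contig lengths below are hypothetical, and the overlap measure is only a toy illustration of why two bins can be redundant, not DAS Tool's scoring method.

```python
def contig_to_bin_rows(bins):
    """Flatten {bin_id: [contig_ids]} into sorted (contig_id, bin_id) rows."""
    return sorted((contig, bin_id) for bin_id, contigs in bins.items() for contig in contigs)


def shared_fraction(contigs_a, contigs_b, lengths):
    """Fraction of the smaller bin (in bases) that is also present in the other bin."""
    shared = sum(lengths[c] for c in set(contigs_a) & set(contigs_b))
    size_a = sum(lengths[c] for c in contigs_a)
    size_b = sum(lengths[c] for c in contigs_b)
    smaller = min(size_a, size_b)
    return shared / smaller if smaller else 0.0


# Toy data: contig lengths in bases and one bin from each of two binners.
lengths = {"k1": 1000, "k2": 5000, "k3": 4000, "k4": 500}
binner_a = {"binner_a.1": ["k1", "k2", "k3"]}
binner_b = {"binner_b.7": ["k2", "k3", "k4"]}

# One tab-separated contig-to-bin table per tool.
table_a = "\n".join("\t".join(row) for row in contig_to_bin_rows(binner_a))
print(table_a)

overlap = shared_fraction(binner_a["binner_a.1"], binner_b["binner_b.7"], lengths)
print(f"shared fraction: {overlap:.2f}")
```

Two bins that share most of their bases very likely describe the same genome, so keeping both would inflate the number of recovered MAGs; a consensus step keeps a single, best-scoring representative.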

The binning part can be started with the following command:

```
magneto run binning **snakemake.args
```

This will implicitly run the [assembly](Modules/assembly) module.

# Binning Strategies #

Magneto offers two binning strategies, single sample binning (SB) and multi sample binning (CB). Combined with the two assembly strategies (single assembly and co-assembly), this gives four ways to recover MAGs from metagenomes:

- single-binning from single-assembly
- single-binning from co-assembly
- co-binning from single-assembly
- co-binning from co-assembly

Magneto offers two binning strategies, single sample binning (SB) and co-abundance binning (CB). Combined with the two assembly strategies (single assembly and co-assembly), this gives four ways to recover MAGs from metagenomes:

- single-binning from single-assembly (**SASB**)
- single-binning from co-assembly (**CASB**)
- co-abundance binning from single-assembly (**SACB**)
- co-abundance binning from co-assembly (**CACB**)

Edit the [config](Magneto-configuration) file to choose one or multiple strategies.

:warning: It is not possible to launch several strategies that rely on different assembly methods in the same run. For example, **SASB and CASB are not compatible**, but you can activate **SASB and SACB for the same run**.

## Single sample binning ##

Single sample binning consists of reconstructing genome bins per assembly. Cluster reads or sample reads are mapped back against their own assembly (cluster assembly or sample assembly). Magneto uses the MetaBAT utility `jgi_summarize_bam_contig_depths` to recover the depth of coverage of each contig. Genome bins are then reconstructed from the specified assembly.

Single sample binning consists of reconstructing genome bins per assembly using abundance information. Cluster reads or sample reads are mapped back against their own assembly (cluster assembly or sample assembly). Magneto uses tool-specific utilities to recover the **abundance** of each contig. Genome bins are then reconstructed from the specified assembly.

## Multi sample binning ##

## Co-abundance binning ##

Multi sample binning reconstructs genome bins using co-abundance. Cluster reads or sample reads are mapped back against **all** the assemblies (cluster assemblies or sample assemblies).
In this case, the assemblies are first concatenated into a single FASTA file (concatenated cluster assembly or concatenated sample assembly). The reads are then mapped back against the concatenated file. Magneto uses the MetaBAT utility `jgi_summarize_bam_contig_depths` to recover the coverage of each contig, and genome bins are reconstructed from the concatenated assembly.

Multi sample binning reconstructs genome bins per assembly using co-abundance information. Cluster reads or sample reads are mapped back against **all** the assemblies (cluster assemblies or sample assemblies). Magneto uses tool-specific utilities to recover the **co-abundance** of each contig. Genome bins are then reconstructed from the specified assembly.

-----

[Previous - Genes collection (Module)](Modules/genes_collection)