Metagenomic assembly, removing redundant contigs.
What is the purpose of the step described in the following excerpt from a paper using metagenomic assemblies?
Redundancies of sequences from the same organism within the metagenome
were removed by clustering all contigs at 95% identity with CD-hit
v4.6.6 (72), and only the longest contig per cluster was kept
I understand what is being done, but I don’t know why. I could see it being useful before binning, perhaps to reduce computation time, but otherwise I am not sure. Would it reduce annotation time or something similar?
The excerpt is from the methods section “Metagenome sequencing and assembly” of this paper:
journals.asm.org/doi/10.1128/mSphere.00165-19
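To make the quoted step concrete, here is a minimal Python sketch of the "keep only the longest contig per cluster" part. It assumes the clustering produced a CD-HIT style `.clstr` file; the function name `longest_per_cluster` and the contig names in the example are hypothetical, and this is not the paper's actual code. As far as I understand, CD-HIT already uses the longest sequence of each cluster as its representative, so this only illustrates what that selection amounts to.

```python
import re

# Matches member lines such as "0\t2799nt, >contig_1... *"
MEMBER = re.compile(r"^\d+\s+(\d+)(?:nt|aa), >(.+?)\.\.\.")

def longest_per_cluster(clstr_text):
    """Return {cluster_id: (contig_name, length)} for the longest member of each cluster."""
    best = {}
    cluster = None
    for line in clstr_text.splitlines():
        line = line.strip()
        if line.startswith(">Cluster"):
            # Header line, e.g. ">Cluster 0"
            cluster = int(line.split()[1])
        elif cluster is not None:
            m = MEMBER.match(line)
            if m:
                length, name = int(m.group(1)), m.group(2)
                # Keep the member only if it is longer than the current best
                if cluster not in best or length > best[cluster][1]:
                    best[cluster] = (name, length)
    return best

# Hypothetical example in .clstr layout (made-up contigs, not data from the paper)
example = """>Cluster 0
0\t2799nt, >contig_1... *
1\t2214nt, >contig_7... at +/97.50%
>Cluster 1
0\t1500nt, >contig_3... *
"""

kept = longest_per_cluster(example)
print(sorted(name for name, _ in kept.values()))
```

In this toy input, `contig_7` is at least 95% identical to the longer `contig_1`, so it is treated as redundant and dropped, while `contig_3` has no near-duplicate and is kept.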