Removal of host sequences without reference genome
Suppose I have a collection of viral reads from NGS (Illumina) sequencing in FASTQ format. After the usual pre-processing step (done with fastp), I need to remove the host sequences (contaminants), but no host reference genome is available, so I cannot map the reads with bowtie2 and filter them with samtools. I have read about some approaches, but I am still not sure which one to use. Can someone suggest an appropriate strategy or starting point? Thanks for your support.