Remove vector sequences from genome database
I’m building a database of RefSeq genome sequences from selected bacterial species, which will be used for Nanopore sequencing of environmental samples.
To reduce the chance of false positives, I screened the genomes against the UniVec database to locate potential contamination and got substantial hits to several vectors.
I am fairly new to bioinformatics, so I would like to hear whether anyone has ideas on how to mask or remove the contamination from the genome sequences.
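To make the question concrete, here is a minimal sketch of the kind of masking I have in mind. It assumes the genomes were the BLAST query and UniVec the subject database, and that the hits are in the default tabular format (`-outfmt 6`), where columns 7 and 8 are `qstart` and `qend`. The function names `parse_hits` and `mask_sequence` are hypothetical, and the approach (overwriting hit regions with `N`) is only one possible option, not a recommended pipeline.

```python
from collections import defaultdict


def parse_hits(lines):
    """Collect 1-based inclusive (start, end) intervals per query ID.

    Assumes default BLAST tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, qstart, qend = fields[0], int(fields[6]), int(fields[7])
        # Store the interval with start <= end regardless of orientation
        hits[qseqid].append((min(qstart, qend), max(qstart, qend)))
    return hits


def mask_sequence(seq, intervals, mask_char="N"):
    """Return seq with every 1-based inclusive interval replaced by mask_char."""
    chars = list(seq)
    for start, end in intervals:
        # Convert to 0-based indices and clamp to the sequence bounds
        for i in range(max(start, 1) - 1, min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


# Toy example: one hit covering positions 3-6 of a 10 bp contig
example_hits = parse_hits(["contig1\tvectorX\t99.0\t4\t0\t0\t3\t6\t1\t4\t1e-5\t50.0"])
masked = mask_sequence("ACGTACGTAC", example_hits["contig1"])
```

Masking with `N` keeps the coordinates of the rest of the genome unchanged, whereas cutting the regions out would shift them; which of the two is preferable for this use case is part of what I am asking.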