Extract fastq reads by lists of sequences
I have lists of sequences, and I would like to find the fastq reads that contain them.
Is there a tool, or some way to script this, that extracts fastq reads matching a specific list of sequences?
My lists of sequences look like the following:
I have used grep to do this one sequence at a time, but it is taking too long (I have 40k 19-mers):
grep -A 2 -B 1 "CTCAAAAAAAAACAAAGGA" input.fastq |grep -v "^--$" > output.fastq
Also, there is a problem with overlapping reads.
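One possible way around running grep 40k times is to read the fastq file once and test every 19-base window of each read against a set of the query sequences. Below is a minimal Python sketch of that idea, not a tested tool. It assumes plain 4-line fastq records (no wrapped sequence lines), one query sequence per line in the list file, all queries of the same length k, and forward-strand matches only. The function names are made up for illustration. Because each record is read and written as a unit, a read containing several of the query sequences is written only once, which may also help with the overlapping-reads problem.

```python
def load_kmers(path):
    """Read one query sequence per line into a set (blank lines skipped)."""
    with open(path) as fh:
        return {line.strip().upper() for line in fh if line.strip()}


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples, assuming 4 lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield header, seq, plus, qual


def contains_kmer(seq, kmers, k):
    """True if any length-k window of seq is in the query set."""
    return any(seq[i:i + k] in kmers for i in range(len(seq) - k + 1))


def filter_fastq(fastq_in, kmers, fastq_out, k=19):
    """Copy matching records from fastq_in to fastq_out; return the count kept."""
    kept = 0
    for record in read_fastq(fastq_in):
        # Only the sequence line is searched, never the header or quality line.
        if contains_kmer(record[1].strip().upper(), kmers, k):
            fastq_out.writelines(record)
            kept += 1
    return kept
```

With hypothetical file names, this would be called as `filter_fastq(open("input.fastq"), load_kmers("kmers.txt"), open("output.fastq", "w"))`. Each window lookup in the set takes roughly constant time, so the running time depends mainly on the total number of bases in the fastq file rather than on the number of query sequences.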