How to search for primer sequences in fastq files generated after amplicon sequencing

Hi all,

I need some help with grep or any other command that will help do the job. I am very new to the command line. Any help is appreciated, thank you.

I recently did some amplicon sequencing of a multiplexed PCR reaction. I used nearly 90 primer pairs in a single multiplexed PCR to generate the amplicons. Sequencing libraries were made from these amplicons and read on a MiSeq instrument. Four such reactions, differing in some of the primer pairs, were sequenced. I now have the fastq files, and I want to see the representation of each primer product in each fastq file to decide which primer pool I should proceed with for my actual experiments. The MiSeq run was single-end, so I want to look for the forward primer sequence in the resulting fastq files.

I have been using grep to get answers, but I only know how to do it for one primer at a time:


-c to count the matching lines
^ to search for the string at the beginning of the sequence line

The result that I get from this is a single count:


Then I take the number and paste it into an Excel file. I know, terrible!!!

I have been searching for help with something similar to what I need, but with no luck.

My request here is:

I have a tab-delimited file, forwardprimers.txt, with (col1) primer name and (col2) primer sequence, for 90 primers.

I have 4 fastq files to query for these primer sequences.

Is there a way to query the fastq files with the sequences in the primer file and get the count for each primer name in a new output file? Thank you.
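To make the request concrete, here is a minimal sketch in Python of the kind of thing I am after. It is only an illustration under several assumptions: the primer file has no header line and the primer names are unique, each fastq record is exactly 4 lines (as MiSeq output normally is), and a read counts only if it starts with the exact primer sequence (no mismatches allowed). The function names (read_primers, count_primers, write_table) are made up for this example.

```python
import csv
import gzip
from itertools import islice


def read_primers(path):
    """Read (name, sequence) pairs from a two-column tab-delimited file.

    Assumes no header line and unique primer names.
    """
    primers = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2 and row[1].strip():
                primers.append((row[0].strip(), row[1].strip().upper()))
    return primers


def count_primers(fastq_path, primers):
    """Count the reads in one fastq file that start with each primer."""
    counts = {name: 0 for name, _ in primers}
    # Plain-text and gzip-compressed fastq files are both handled.
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        # In a 4-line fastq record the sequence is the 2nd line,
        # so take lines 2, 6, 10, ... and skip headers and qualities.
        for line in islice(handle, 1, None, 4):
            seq = line.strip().upper()
            for name, primer in primers:
                if seq.startswith(primer):
                    counts[name] += 1
    return counts


def write_table(primer_path, fastq_paths, out_path):
    """Write one row per primer and one count column per fastq file."""
    primers = read_primers(primer_path)
    per_file = [count_primers(path, primers) for path in fastq_paths]
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(["primer", "sequence"] + list(fastq_paths))
        for name, seq in primers:
            writer.writerow([name, seq] + [c[name] for c in per_file])
```

With this, something like write_table("forwardprimers.txt", ["pool1.fastq", "pool2.fastq"], "primer_counts.tsv") would give a table that opens directly in Excel (the fastq and output file names here are placeholders). Unlike grep with ^ on the whole file, this only looks at the sequence line of each record, so quality lines that happen to start with the same letters are not counted.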
