I was recently asked to preprocess some FASTQ files produced on an Illumina instrument (I don’t know which machine generated the data).
This is the information for one of the FASTQ files (forward):
When I asked the company for the adapter sequences, they gave me D710-501 TCCGCGAATATAGCCT (this is for the forward and reverse reads of one sample).
When I checked the header of the FASTQ file, the index appears as TCCGCGAA+AGGCTATA.
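The relationship between the two notations can be checked with a short script. This is only a minimal sketch: it assumes the 16-base string from the company is the 8-base i7 index (D710) followed by the 8-base i5 index (D501), and the function names are my own, not from any library.

```python
# Complement table for the four DNA bases (N is kept for ambiguous calls).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_dual_index(combined: str, i7_len: int = 8) -> tuple:
    """Split a concatenated dual index into (i7, i5).

    Assumption: the i7 index comes first and is i7_len bases long.
    """
    return combined[:i7_len], combined[i7_len:]


company_index = "TCCGCGAATATAGCCT"   # sequence provided by the company
header_index = "TCCGCGAA+AGGCTATA"   # index field from the FASTQ header

i7, i5 = split_dual_index(company_index)
header_i7, header_i5 = header_index.split("+")

# The first half matches the header as-is.
print("i7 matches header:", i7 == header_i7)
# The second half matches only after reverse-complementing.
print("reverse complement of i5 matches header:",
      reverse_complement(i5) == header_i5)
```

Under that assumption both checks print `True`: the first half is identical, and the second half in the header (AGGCTATA) is the reverse complement of the second half of the company's sequence (TATAGCCT).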
On the other hand, Illumina’s documentation gives the following information:
TruSeq DNA and RNA CD Indexes
Index 1 (i7) Adapters
I want to remove the adapters from the FASTQ files, but I am a little confused about how to specify the adapter sequences in the adapter file that will be passed to fastp or Trimmomatic.
Is it okay to write just TCCGCGAATATAGCCT in the adapter FASTA file, or should I specify the full adapter? By that I mean something like this (replacing the i7 placeholder in Illumina’s documentation with the sequence given in the header of the FASTQ file):