Separate read1 and read2 from a merged FASTQ file and align against a reference genome

Hi, I am processing a merged fastq file.

I used the following command to split the read 1s and read 2s into separate files for alignment with bwa mem.

paste - - - - - - - - < merged.fq |
  tee >(cut -f 1-4 | tr "\t" "\n" > read1.fq) |
  cut -f 5-8 | tr "\t" "\n" > read2.fq

Here are the read 1s of the first 3 sequences:

@SRR10359518.1.1 1 length=26
TTATGAAATTCCTAGGCAAATGGATG
+SRR10359518.1.1 1 length=26
??????????????????????????
@SRR10359518.2.1 2 length=26
CCCTTATGCAGCTCGAGAAGGCGGAC
+SRR10359518.2.1 2 length=26
??????????????????????????
@SRR10359518.3.1 3 length=26
TCAGTCGTCCCAACATCGGACGCTTC
+SRR10359518.3.1 3 length=26
??????????????????????????

Here are the read 2s of the same 3 sequences:

@SRR10359518.1.2 1 length=26
TGGGTATCCTAAGTTTCTGGGCTAAN
+SRR10359518.1.2 1 length=26
??????????????????????????
@SRR10359518.2.2 2 length=26
TAGCAACCACAGATCCAACATGATTC
+SRR10359518.2.2 2 length=26
??????????????????????????
@SRR10359518.3.2 3 length=26
CCTCCAAGCAAACCCCACTGACCCCN
+SRR10359518.3.2 3 length=26
??????????????????????????

When I run the alignment

bwa mem -o my.sam ref.Genome read1.fq read2.fq

I get the following error, which says that the paired reads have different names:

paired reads have different names: "SRR10359518.1.1", "SRR10359518.1.2"

Do you have any idea how I can fix the issue?

Thanks
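
A possible fix, sketched under the assumption that the error comes from the mate suffix: the two names in the message differ only in their trailing .1 and .2, and bwa mem expects both mates of a pair to carry the same read name. The helper below removes that suffix from the read name on each header line with awk and reduces the + line to a bare +. The function name strip_mate_suffix and the *.renamed.fq file names in the comments are hypothetical, and the sketch assumes strict 4-line FASTQ records.

```shell
# Hypothetical helper: read FASTQ on stdin, write FASTQ on stdout.
# Assumes every record is exactly 4 lines (no wrapped sequences).
strip_mate_suffix() {
    awk '
        # Header line: drop a trailing ".1" or ".2" from the read name
        # (the first whitespace-separated field); the rest is kept.
        NR % 4 == 1 { sub(/\.[12]$/, "", $1); print; next }
        # Separator line: the repeated name is optional, so keep only "+".
        NR % 4 == 3 { print "+"; next }
        # Sequence and quality lines pass through unchanged.
        { print }
    '
}

# Possible usage (output file names are assumptions):
#   strip_mate_suffix < read1.fq > read1.renamed.fq
#   strip_mate_suffix < read2.fq > read2.renamed.fq
#   bwa mem -o my.sam ref.Genome read1.renamed.fq read2.renamed.fq
```

With the sample records above, @SRR10359518.1.1 and @SRR10359518.1.2 would both become @SRR10359518.1, so the mates in the two files share a name while staying in the same order.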


