Bad Per sequence GC content

Hello, Biostars!

I have two FASTQ files of paired-end reads that I want to use for SNV calling. Quality checking in FastQC showed bad Per base sequence content and a couple of warnings, in both Per sequence GC content and Sequence Length Distribution. You can see them in the pictures below.
Per base sequence content before trimming
GC content before trimming
Sequence Length Distribution

My idea was to cut off the first 6 bases and around 10 at the end. I used Trimmomatic with the following command:

TrimmomaticPE -threads 32 -phred33 R1.fastq R2.fastq Trimmed/FP.fastq Trimmed/FUN.fastq Trimmed/RP.fastq Trimmed/RUN.fastq ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10 HEADCROP:6 SLIDINGWINDOW:4:30 CROP:90
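
To make the effect of the trimming steps easier to reason about, here is a minimal Python sketch of what HEADCROP:6, SLIDINGWINDOW:4:30 and CROP:90 do to a single read when applied in the order given on the command line. This is not Trimmomatic's code: the function names, the 100-base read length and the quality values are hypothetical, the sliding window is a simplified approximation (cut at the start of the first window whose mean quality is below the threshold), and ILLUMINACLIP is left out.

```python
def headcrop(seq, qual, n):
    """Drop the first n bases (approximates HEADCROP:n)."""
    return seq[n:], qual[n:]


def sliding_window(seq, qual, window, min_avg):
    """Simplified SLIDINGWINDOW:window:min_avg.

    Scan from the 5' end and cut the read at the start of the first
    window whose mean quality falls below min_avg.
    """
    for start in range(0, len(qual) - window + 1):
        if sum(qual[start:start + window]) / window < min_avg:
            return seq[:start], qual[:start]
    return seq, qual


def crop(seq, qual, length):
    """Keep at most `length` bases from the start (approximates CROP:length)."""
    return seq[:length], qual[:length]


def trim(seq, qual):
    # Same order as on the command line: HEADCROP, SLIDINGWINDOW, CROP.
    seq, qual = headcrop(seq, qual, 6)
    seq, qual = sliding_window(seq, qual, 4, 30)
    seq, qual = crop(seq, qual, 90)
    return seq, qual


# Hypothetical 100-base reads with made-up Phred scores.
good = trim("A" * 100, [36] * 100)             # high quality throughout
bad = trim("A" * 100, [36] * 60 + [20] * 40)   # quality drops after base 60

print(len(good[0]), len(bad[0]))  # 90 52
```

Under these assumptions, CROP:90 runs after HEADCROP:6, so a 100-base read loses only 4 bases from its end rather than 10, and any read that SLIDINGWINDOW cuts comes out shorter than 90. The trimmed reads therefore have different lengths.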

After this I got a rather strange GC content, which appears to be worse than it was before trimming, and the Sequence Length Distribution still has a warning.

Bad Per sequence GC content

The basic statistics before and after trimming are as follows:
Statistics before trimming
Statistics after trimming

Does anyone have any idea why this happened, and what to do to improve the quality of data? Any help is appreciated!
