Why does Cutadapt output much larger files than I am inputting?

I am using usegalaxy.org to work with paired-end RNA-seq data. I am using Cutadapt to trim adapter sequences, and the Cutadapt output files are larger than the files I am inputting. For example, for my first sample, SRR6467550, the forward-read input (fastqsanger.gz) is 2.1 GB. After running Cutadapt, the output (fastqsanger.gz) is 8.1 GB. This fills my disk quota much faster and makes it difficult to work with the amount of data I have (226 samples; I will already have to work in batches). Is this problem avoidable in any way? Is there a way to obtain a smaller output?
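
One thing that could be checked here is whether the input and output are both really gzip-compressed, since a roughly fourfold size difference might come from compression rather than from the reads themselves. That is only an assumption, not something confirmed above. A minimal Python sketch for files downloaded from the history follows; the helper name `is_gzipped` is made up for illustration, and the file paths are whatever is passed on the command line.

```python
import sys

# Every gzip stream begins with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


if __name__ == "__main__":
    # Usage: python check_gzip.py input.fastq.gz cutadapt_output.fastq.gz
    for path in sys.argv[1:]:
        label = "gzip-compressed" if is_gzipped(path) else "not gzip-compressed"
        print(f"{path}: {label}")
```

If the input reports as gzip-compressed and the output does not, the size difference would be explained by compression alone.
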
My full input for reference:

Paired-end collection: My Data

Read 1 (3′): AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

Read 2 (3′): AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

Minimum length (R1): 20

Quality cutoff: 20

Outputs Selector: Report: Cutadapt’s per-adapter statistics. You can use this file with MultiQC.
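
For comparison outside Galaxy, the settings above correspond roughly to the following Cutadapt command line, built here as a Python argument list. This is a sketch under assumptions: the function name and all file names are hypothetical, and the exact command the Galaxy wrapper runs may differ. The flags themselves (`-a`, `-A`, `-q`, `-m`, `-o`, `-p`) are standard Cutadapt options.

```python
import shlex


def build_cutadapt_command(r1_in, r2_in, r1_out, r2_out):
    """Build a paired-end Cutadapt command matching the settings listed above."""
    return [
        "cutadapt",
        "-a", "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",  # 3' adapter, read 1
        "-A", "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",  # 3' adapter, read 2
        "-q", "20",                                  # quality cutoff
        "-m", "20",                                  # minimum read length
        "-o", r1_out,                                # trimmed read 1 output
        "-p", r2_out,                                # trimmed read 2 output
        r1_in,
        r2_in,
    ]


# Hypothetical file names, for illustration only.
command = build_cutadapt_command(
    "SRR6467550_1.fastq.gz",
    "SRR6467550_2.fastq.gz",
    "trimmed_1.fastq.gz",
    "trimmed_2.fastq.gz",
)
print(shlex.join(command))
```

On the command line, Cutadapt decides whether to compress an output file from its name, so output names ending in `.gz` are written gzip-compressed.
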

