Question: Mean and SD read length from a range of fastq files
Hi all,
I’m trying to write some code to generate mean read length data from a range of fastq files.
awk '{if(NR%4==2) print NR"\t"$0"\t"length($0)}' HG1.fastq > readLength.txt
I've got this far from looking through other posts and trying to improve on them, but I'm stuck on a couple of things. This command only works on a single file, and it reports the length of each read within that file separately.
I want to run a single command so that the mean and standard deviation of read lengths for all .fastq files within a folder are reported in a single .txt file, one sample per line. I guess SD might be difficult to calculate in a one-line command, so even just the mean read length would help.
E.g. the first 5 files in my folder are:
ru1.fastq
ru2.fastq
hg3.fastq
hg25.fastq
ru7.fastq
Obviously I'm a bit of a novice at this, so all help would be appreciated!
Thanks a lot
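One possible way to get what is described above is sketched here. It is only a sketch under some assumptions: every read occupies exactly four lines (so the sequence is line 2 of each record, as in the command above), the files use Unix line endings, and the SD wanted is the population SD (dividing by n rather than n-1). The function name read_length_stats is made up for illustration.

```shell
# Hypothetical helper: prints "file<TAB>mean<TAB>sd" for each FASTQ file given.
# Assumes strict 4-line FASTQ records; the sequence is line 2 of each record.
read_length_stats() {
    for f in "$@"; do
        awk -v name="$f" '
            # Accumulate count, sum and sum of squares of read lengths.
            NR % 4 == 2 { n++; len = length($0); sum += len; sumsq += len * len }
            END {
                if (n == 0) { print name "\tNA\tNA"; exit }
                mean = sum / n
                var = sumsq / n - mean * mean   # population variance
                if (var < 0) var = 0            # guard against rounding error
                printf "%s\t%.2f\t%.2f\n", name, mean, sqrt(var)
            }' "$f"
    done
}

# Example use from inside the folder (one line per sample):
#   read_length_stats *.fastq > readLength.txt
```

Running awk once per file keeps the counters separate for each sample, so no per-file reset logic is needed. If the sample SD is preferred, the variance line would need to be changed to divide by n-1 instead.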