What is the most efficient way to count the nucleotide bases in each fastq file in my directory using UNIX commands?
I have a bunch of fastq files, and I need to write a one-line UNIX command that prints the character count (as wc -c would) of the nucleotides in EACH file, not the total across all files. The output should look like this:
321903 1.fastq
314156 2.fastq
13515 3.fastq
...
and so on.
So far I have
cat *.fastq | awk 'NR%4 == 2 {print $0}' | tr -d '\n' | wc -c
but that doesn't work: because cat merges all the files first, I get a single total instead of one count per file. I can't find an answer this specific anywhere.
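One possible approach, offered as a sketch rather than a definitive answer: run awk once per file instead of concatenating everything with cat, so each count stays tied to its filename. The function name `count_bases` is made up for illustration. The sketch assumes uncompressed FASTQ with exactly four lines per record (no wrapped sequence lines) and Unix line endings, since a trailing carriage return would be counted as a base.

```shell
# count_bases: hypothetical helper name, not part of any standard tool.
# Assumes plain-text FASTQ with exactly 4 lines per record.
count_bases() {
  for f in "$@"; do
    # Line 2 of every 4-line record holds the sequence; sum its length.
    # "total + 0" forces a numeric 0 when the file has no records.
    n=$(awk 'NR % 4 == 2 { total += length($0) } END { print total + 0 }' "$f")
    printf '%s %s\n' "$n" "$f"
  done
}
```

Called as `count_bases *.fastq`, this prints one `count filename` pair per line. The same loop can be typed on a single line as `for f in *.fastq; do ...; done` if a one-liner is required. Using `length($0)` inside awk avoids the `tr -d '\n' | wc -c` step entirely.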