Split multi-fasta file and keep structure


Hey everyone,

I have a multi-fasta file, and when I want to split it into individual fasta files, I use a command like this:

    awk '{
        # A header line starts with ">": name the output file after it
        if (substr($0, 1, 1) == ">") { filename = substr($0, 2) ".fna" }
        print $0 > filename
    }' myfile

However, each individual fasta file represents a contig, and each contig belongs to a given bacterial genome. So, if I have a multi-fasta with four contigs (for example PS_A_1, PS_A_2 and the corresponding PS_B contigs), the above command will generate 4 individual fasta files.

My objective is to split the file so that PS_A_1 and PS_A_2 are concatenated in the same file (PS_A.fasta). The same for PS_B and so on.
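
To make the goal concrete, here is a rough sketch of the grouping I have in mind, not a tested solution. It assumes every header has the form `>GENOME_CONTIG` (such as `>PS_A_1`), so the genome name is everything before the last underscore. The function name `split_by_genome` is made up for illustration.

```shell
# Sketch: write each record to a file named after its genome.
# Assumption: headers look like ">GENOME_CONTIG" (e.g. ">PS_A_1"),
# so the genome name is everything before the last underscore.
split_by_genome() {
    awk '
    /^>/ {
        genome = substr($1, 2)        # first word of the header, without ">"
        sub(/_[^_]*$/, "", genome)    # drop the trailing "_<contig>": PS_A_1 becomes PS_A
        filename = genome ".fasta"
    }
    filename != "" { print > filename }   # header and sequence lines go to the current genome file
    ' "$1"
}
```

Calling `split_by_genome myfile` would then write PS_A.fasta, PS_B.fasta and so on into the current directory, with the original contig headers kept inside each file. Because every genome file stays open until awk exits, an input with a very large number of genomes may hit the open-file limit of some awk implementations.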

Thanks a lot!


