Sorting and writing multifasta entries to new fasta files

Hi, first post here.
So I’m trying to extract the CDS from various species’ orthologous sequences. I’m running on a Linux server, and am mainly aiming to use BioPython or standard Linux programs for this.

I’ve run OrthoFinder on 28 species of seaweed, which produced roughly 10,000 orthogroup sequence fasta files, each of which is a multifasta file. I’ve concatenated them all into one huge multifasta file, and now I want to split the entries by species, writing each species’ sequences to its own new multifasta file (so 10k files -> 1 file -> 28 files, one per species).

How do I do this? I’m still fairly new to BioPython, so I’m still wrapping my head around things. I know I’ll definitely need SeqIO, but I’m not sure what other libraries I’ll need. I already have a text file with all the species listed, one per line.

Thanks heaps for any help.
Lachlan
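
For reference, here is a minimal sketch of the splitting step described above. It rests on an assumption the post does not confirm: that every fasta header starts with the species name exactly as it is written in the species list (for example `>Ulva_lactuca|gene1`). If the species is encoded differently in the headers, the matching line needs to change. The function names `read_fasta` and `split_by_species` are made up for illustration. The sketch uses a small plain-Python fasta reader so it runs without extra packages; with Biopython installed, `SeqIO.parse(path, "fasta")` could replace `read_fasta`, matching on `record.id` instead of the header string.

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs; the header excludes the leading '>'."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def split_by_species(fasta_path, species_path, out_dir):
    """Write one fasta file per species and return {species: record count}.

    ASSUMPTION: each header begins with the species name as listed in
    species_path. Records matching no species are skipped.
    """
    lines = Path(species_path).read_text().splitlines()
    species = [name.strip() for name in lines if name.strip()]
    # Try longer names first so a name that is a prefix of another
    # name cannot claim the other species' records.
    species.sort(key=len, reverse=True)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    handles = {}
    counts = {name: 0 for name in species}
    try:
        for header, seq in read_fasta(fasta_path):
            match = next((s for s in species if header.startswith(s)), None)
            if match is None:
                continue
            if match not in handles:
                handles[match] = open(out_dir / f"{match}.fasta", "w")
            # Sequences are written unwrapped, on a single line.
            handles[match].write(f">{header}\n{seq}\n")
            counts[match] += 1
    finally:
        for handle in handles.values():
            handle.close()
    return counts
```

A call such as `split_by_species("all_orthogroups.fasta", "species.txt", "by_species")` (hypothetical file names) would then create one `<species>.fasta` file per listed species inside `by_species/`. The returned counts make it easy to spot a species with zero matches, which usually means the header assumption does not hold for that species.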


BioPython


OrthoFinder


fasta
