Exclude specified range of bases from multiple sequences in a FASTA file

Hi,
I am trying to remove ranges of bases from several sequences in a FASTA file, based on the header IDs and positions that I specify.

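For example, I have a file, A.fa (the bases to be removed are marked with ** here; the asterisks are not part of the actual file):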

>ID1
TTGTTCAACGGATCCACCTGTTGCCAAGAGTGCTTCAGTACATTGCTCACGGCTGAATCCCATATCCATCAAAGCACAAGATTTGAATTCACTCGAGGATCTGCTTCGTCGACCATTGGAAATGAAAAAATTACAATTACACATTGAATTTGTAAAGCTTGAAATTAATGAACTTACCAAAATAGATTTGCACACAGAAGCAACAGCTTGGCCGTGTTACAACTTGTAACGGGTAAAGACAAAATCGCTAACAACGGTTGTAGGCCACCATGTTCCACAAATTCACGACA

>ID2
ATGGTCGTCCGTTGAATTGT**TACTCAAAAT**TGCGTCGACAAATTTCATCACGTTCATAATGTAGTCAATGAGAACGATTGGAATGCGTTCGGAAGTAGATGATGAAGTCTGTGCAGATTCTTGTTCTGTATTCCCAGTTGCATTT

>ID3
TCTGCA**TTCT**GTCCA**TTGTC**ATCTCTGTGATTGTTGTACGGTGACGTACTTGCTTCTTCTTAGTCTTCATCTTCATCATCATTGCTACCTGCATTCATATCCGGATTATTTGTATAAGATTATTGGAAATGCCTAGCTACACAAATCCTTAAAATAAAAATAGGAAAAAAGTGTAAAAAAATAAAAGAAAAAAAATATTGAATGTAACTCACCTAAAGTAATA

I have another file, X.txt, that lists the FASTA header IDs and the positions to remove. It looks like this:

  ID start end 

  ID2 20...30 

  ID3  6...10, 15...20

I would like to modify A.fa so that the bases between positions 20 and 30 are removed from ID2, and the bases between positions 6 and 10 and between positions 15 and 20 are removed from ID3. The result, B.fa, should look like this:

>ID1
TTGTTCAACGGATCCACCTGTTGCCAAGAGTGCTTCAGTACATTGCTCACGGCTGAATCCCATATCCATCAAAGCACAAGATTTGAATTCACTCGAGGATCTGCTTCGTCGACCATTGGAAATGAAAAAATTACAATTACACATTGAATTTGTAAAGCTTGAAATTAATGAACTTACCAAAATAGATTTGCACACAGAAGCAACAGCTTGGCCGTGTTACAACTTGTAACGGGTAAAGACAAAATCGCTAACAACGGTTGTAGGCCACCATGTTCCACAAATTCACGACA

>ID2
ATGGTCGTCCGTTGAATTGTTGCGTCGACAAATTTCATCACGTTCATAATGTAGTCAATGAGAACGATTGGAATGCGTTCGGAAGTAGATGATGAAGTCTGTGCAGATTCTTGTTCTGTATTCCCAGTTGCATTT

>ID3
TCTGCAGTCCAATCTCTGTGATTGTTGTACGGTGACGTACTTGCTTCTTCTTAGTCTTCATCTTCATCATCATTGCTACCTGCATTCATATCCGGATTATTTGTATAAGATTATTGGAAATGCCTAGCTACACAAATCCTTAAAATAAAAATAGGAAAAAAGTGTAAAAAAATAAAAGAAAAAAAATATTGAATGTAACTCACCTAAAGTAATA

X.txt contains more than 100 IDs, each with different positions to remove from A.fa.
Any help would be appreciated.
Thank you very much.
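
One possible approach, as a minimal Python sketch rather than a definitive solution. It assumes that a range written as 20...30 means "remove bases 21 through 30, counting from 1", which is what the marked region in ID2 suggests, and that the ID in X.txt equals the first word of the FASTA header. The function names (parse_ranges, cut, read_fasta, remove_ranges) are made up for this illustration, and only the standard library is used.

```python
import re


def parse_ranges(lines):
    """Turn lines such as 'ID3  6...10, 15...20' into {id: [(start, end), ...]}.

    Lines without any 'start...end' pair (e.g. the 'ID start end' header
    or blank lines) are skipped.
    """
    ranges = {}
    for line in lines:
        fields = line.split(None, 1)
        if len(fields) < 2:
            continue
        pairs = re.findall(r"(\d+)\s*\.{2,3}\s*(\d+)", fields[1])
        if pairs:
            ranges[fields[0]] = [(int(s), int(e)) for s, e in pairs]
    return ranges


def cut(seq, spans):
    """Remove bases start+1 .. end (1-based) for every (start, end) span.

    ASSUMPTION: this start-exclusive, end-inclusive convention is inferred
    from the example in the question; adjust it if your ranges differ.
    """
    kept = []
    pos = 0
    for start, end in sorted(spans):
        kept.append(seq[pos:start])  # empty if spans overlap
        pos = max(pos, end)
    kept.append(seq[pos:])
    return "".join(kept)


def read_fasta(lines):
    """Yield (header, sequence) pairs; a sequence may span several lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def remove_ranges(fasta_in, ranges_in, fasta_out):
    """Write fasta_out with the listed ranges removed from each record."""
    with open(ranges_in) as handle:
        ranges = parse_ranges(handle)
    with open(fasta_in) as src, open(fasta_out, "w") as out:
        for header, seq in read_fasta(src):
            seq_id = header.split(None, 1)[0] if header else ""
            # IDs that are not listed in the ranges file are written unchanged.
            out.write(f">{header}\n{cut(seq, ranges.get(seq_id, []))}\n")


# Example call, using the file names from the question:
# remove_ranges("A.fa", "X.txt", "B.fa")
```

Each sequence is written on a single line, and the spans are sorted and applied against the original coordinates, so removing 6...10 does not shift the positions used for 15...20.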

