Split read based on conserved sequence repeat


I have reads that contain repeats of a known, conserved 10 nt sequence. I want to split each read into subunits, using the 10 nt sequence as a "marker" for where to split, so that each subunit ends with the marker.

For example (the conserved sequence is cccgggttta), given this read:
>
acagtacccgggtttaatcgatcgatcgtacccgggtttagtacgtacgatcgtcccgggtttatgctgtcgtc

I would like to get:
>
acagtacccgggttta
>
atcgatcgatcgtacccgggttta
>
gtacgtacgatcgtcccgggttta
>
tgctgtcgtc

Any help is appreciated, thank you.
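
To make the intended behaviour concrete, here is a minimal sketch in Python of the split described above. It is only one possible approach, not a tested pipeline: the function name `split_after_marker` is hypothetical, and it assumes that the marker appears as an exact match in the same case as the read, that matches are taken left to right without overlap, and that any sequence left after the last marker is kept as a final subunit.

```python
def split_after_marker(read, marker):
    """Split `read` into subunits that each end with `marker`.

    Assumes exact, same-case matches, scanned left to right without overlap.
    Any sequence remaining after the last marker becomes the final subunit.
    """
    if not marker:
        raise ValueError("marker must be a non-empty string")

    pieces = []
    start = 0
    while True:
        hit = read.find(marker, start)   # next marker at or after `start`
        if hit == -1:
            break
        end = hit + len(marker)          # cut just after the marker
        pieces.append(read[start:end])
        start = end

    if start < len(read):                # trailing piece without a marker
        pieces.append(read[start:])
    return pieces


if __name__ == "__main__":
    read = ("acagtacccgggtttaatcgatcgatcgtacccgggttta"
            "gtacgtacgatcgtcccgggtttatgctgtcgtc")
    for piece in split_after_marker(read, "cccgggttta"):
        print(">")                       # empty header, as in the example above
        print(piece)
```

Run on the example read, this prints the four subunits listed above, each under an empty `>` header. Reads with mismatches in the marker or reverse-complemented markers would need a different approach, which this sketch does not attempt.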


conserved


repeat


Split-Read
