A regex to convert operon names to genes?

Hi,

I would like to convert operon names to gene names (and the reverse). I think this should be possible with a regex, but I’m not fluent enough with regexes to work it out.

Conventionally, operons are named like this:

genes           operon_name  strand
oneA,oneB,oneC  oneABC       +
oneA,oneB,oneC  oneCBA       -
oneA,oneB,twoD  oneAB-twoD   +

Occasionally operons can also come out as “oneA-oneB-oneC” or “someID-someotherID”.
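For reference, here is a minimal sketch in base R of the operon-to-genes direction described above. It is not a definitive solution. It assumes that a standard gene name is exactly three lowercase letters followed by one uppercase letter, that "-" separates blocks with different stems, and that the gene list for a "-" strand operon is the operon order reversed (as in the oneCBA row). The function name `operon_to_genes` is made up for illustration. Blocks that do not fit the pattern, such as “someID”, are returned unchanged.

```r
# Expand an operon name into its member gene names.
# Assumption: a gene name is a 3-letter lowercase stem plus one uppercase
# letter; anything else is treated as an opaque ID and kept as-is.
operon_to_genes <- function(operon, strand = "+") {
  # Split "oneAB-twoD" into blocks: "oneAB", "twoD"
  blocks <- strsplit(operon, "-", fixed = TRUE)[[1]]
  genes <- unlist(lapply(blocks, function(block) {
    m <- regmatches(block, regexec("^([a-z]{3})([A-Z]+)$", block))[[1]]
    if (length(m) == 0) return(block)   # no match: keep the ID unchanged
    # m[2] is the stem, m[3] the run of suffix letters
    paste0(m[2], strsplit(m[3], "")[[1]])
  }))
  # Assumption: on the minus strand the operon name lists genes in reverse
  if (strand == "-") genes <- rev(genes)
  genes
}

operon_to_genes("oneAB-twoD")      # "oneA" "oneB" "twoD"
operon_to_genes("oneCBA", "-")     # "oneA" "oneB" "oneC"
```

To apply it to a whole table, something like `Map(operon_to_genes, df$operon_name, df$strand)` should work, assuming a data frame `df` with those column names.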

Any tips on how to get this to work, preferably in R? It doesn’t have to work in all cases, but it would help a lot if it reduced the amount of manual intervention.
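For the reverse direction (genes to operon name), a rough base R sketch might look like the following. It assumes the same naming pattern (three lowercase letters plus one uppercase letter), a comma-separated gene string as in the table, and that consecutive genes sharing a stem are collapsed into one block. The function name `genes_to_operon` is hypothetical. It always produces the compact form, so it will not reproduce names written like “oneA-oneB-oneC”.

```r
# Collapse a comma-separated gene list into an operon name.
# Assumption: standard gene names are 3 lowercase letters + 1 uppercase letter.
genes_to_operon <- function(genes, strand = "+") {
  genes <- strsplit(genes, ",", fixed = TRUE)[[1]]
  # Assumption: minus-strand operons are named in reverse gene order
  if (strand == "-") genes <- rev(genes)

  is_std <- grepl("^[a-z]{3}[A-Z]$", genes)
  stem   <- ifelse(is_std, substr(genes, 1, 3), genes)  # non-standard IDs kept whole
  suffix <- ifelse(is_std, substr(genes, 4, 4), "")

  # Group consecutive genes that share the same stem
  runs   <- rle(stem)
  ends   <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  blocks <- mapply(
    function(s, e, st) paste0(st, paste(suffix[s:e], collapse = "")),
    starts, ends, runs$values
  )
  paste(blocks, collapse = "-")
}

genes_to_operon("oneA,oneB,twoD")        # "oneAB-twoD"
genes_to_operon("oneA,oneB,oneC", "-")   # "oneCBA"
```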

Thanks a lot.


Tags: regex, r, gene, operon
