I want to use the HMST-Seq analyzer tool (www.sciencedirect.com/science/article/pii/S2001037020304232) to analyze my directional RRBS data, but I am stuck at the first step. The HMST-Seq analyzer pipeline needs a CpG.txt or CpG.bed file as input. I ran Bismark on my control and mutated .fastq files and then fed the .txt/.bed file that Bismark produced to HMST-Seq analyzer, but the run failed. Bismark produces CpG.txt output, and I am not sure which file I should feed to HMST-Seq analyzer.
I also looked at their demo input CpG.txt file, which runs without problems. Its first lines look like this:
chr1 3010973 3010973 100.0 22
chr1 3557623 3557623 30.0 10
chr1 3612391 3612391 100.0 11
chr1 3661625 3661625 5.56 18
chr1 3661511 3661511 9.09 11
The first lines of my Bismark CpG.txt output look completely different:
NB551229:61:HKFNMBGXB:1:11101:22518:1840_1:N:0:TGACCA - 13 29215126 z
NB551229:61:HKFNMBGXB:1:11101:22518:1840_1:N:0:TGACCA - 13 29215128 z
NB551229:61:HKFNMBGXB:1:11101:22518:1840_1:N:0:TGACCA - 13 29215242 z
NB551229:61:HKFNMBGXB:1:11101:16178:1843_1:N:0:TGACCA - 22 7263020 z
NB551229:61:HKFNMBGXB:1:11101:14118:1840_1:N:0:TGACCA + 10 1746041 Z
So I am not sure what additional steps I need to perform to get a file that resembles their demo file and that HMST-Seq analyzer accepts. I am new to methylation analysis and would really appreciate it if someone could help me solve this puzzle.
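For reference, here is a minimal sketch of one conversion that might bridge the two formats. It rests on assumptions the post does not confirm: that the demo columns are chromosome, start, end, percent methylation, and read coverage, and that in the Bismark per-read file the fifth column is the call, with Z for a methylated CpG and z for an unmethylated one. The function name aggregate_cpg_calls and the parameters min_coverage and chrom_prefix are hypothetical. The prefix is there only because the demo uses names like chr1 while the Bismark output above uses 13. Bismark's methylation extractor also has a --bedGraph option that writes per-position coverage files, which may be a more direct starting point than this script.

```python
from collections import defaultdict


def aggregate_cpg_calls(lines, min_coverage=1, chrom_prefix=""):
    """Collapse per-read CpG calls into one row per genomic position.

    Assumed input columns: read ID, methylation state, chromosome, position, call.
    Assumed output columns: chromosome, start, end, percent methylation, coverage.
    """
    # (chromosome, position) -> [methylated calls, total calls]
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        _read_id, _state, chrom, pos, call = fields
        if call not in ("Z", "z"):
            continue  # skip the version header line and non-CpG calls
        entry = counts[(chrom, int(pos))]
        if call == "Z":
            entry[0] += 1
        entry[1] += 1

    rows = []
    for (chrom, pos), (methylated, total) in sorted(counts.items()):
        if total < min_coverage:
            continue
        percent = round(100.0 * methylated / total, 2)
        rows.append((chrom_prefix + chrom, pos, pos, percent, total))
    return rows


def format_rows(rows):
    """Render rows as tab-separated text, one position per line."""
    return "\n".join("\t".join(str(value) for value in row) for row in rows)
```

To use it, one could open the Bismark CpG .txt file, pass the file object to aggregate_cpg_calls (for example with chrom_prefix="chr"), and write the result of format_rows to a new CpG.txt. Whether HMST-Seq analyzer accepts that file would still need to be checked against its demo input.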
P.S. I tried contacting the authors, but that did not help. They also don't have a GitHub page where I could raise an issue.