bash script
Hello everyone,
I have a file like this:
chr1_169894240_G_T_b38 chr1_169894240_G_T_b38
chr1_169894240_G_T_b38 chr1_169891332_G_A_b38
chr1_169891332_G_A_b38 chr1_169891332_G_A_b38
chr1_169661963_G_A_b38 chr1_169661963_G_A_b38
chr1_169661963_G_A_b38 chr1_169697456_A_T_b38
chr1_169697456_A_T_b38 chr1_169697456_A_T_b38
chr1_27636786_T_C_b38 chr1_27636786_T_C_b38
chr1_196651787_C_T_b38 chr1_196651787_C_T_b38
chr6_143501715_T_C_b38 chr6_143501715_T_C_b38
I want to extract just this:
chr1_169894240 chr1_169894240
I don't want the other info, just chr_pos.
I am confused about how to extract this because the length varies: in one case it is 9 and in another it is 10. So if I use the cut command with fixed character positions, for some lines it shows the right value, chr_pos, but for others it shows chr_pos_.
Can anyone please help me out with this?
$ sed -r 's/_\w_\w_\w{3}//g' test.txt
$ awk -v OFS="\t" -F '[_\t]' '{print $1"_"$2,$6"_"$7}' test.txt
$ parallel --colsep "_|\t" echo {1}_{2} {6}_{7} :::: test.txt | sed 's/\s/\t/'
chr1_169894240 chr1_169894240
chr1_169894240 chr1_169891332
chr1_169891332 chr1_169891332
chr1_169661963 chr1_169661963
chr1_169661963 chr1_169697456
chr1_169697456 chr1_169697456
chr1_27636786 chr1_27636786
chr1_196651787 chr1_196651787
chr6_143501715 chr6_143501715
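All three one-liners avoid the fixed-width problem by splitting on the `_` delimiter instead of counting characters, so the length of the position no longer matters. A minimal sketch of the same idea as a reusable function, assuming every ID has the form chr_pos_ref_alt_build and that the two columns are separated by a tab or a single space (the function name `chr_pos` is made up for illustration):

```shell
# chr_pos: hypothetical helper that keeps only chr_pos from each of the two columns.
# Assumption: IDs look like chr_pos_ref_alt_build; columns are split by tab or space.
chr_pos() {
  # Split on "_", space, or tab, so fields 1-5 belong to the first ID
  # and fields 6-10 to the second; print chr_pos for each, tab-separated.
  awk -F '[_ \t]' -v OFS='\t' '{ print $1 "_" $2, $6 "_" $7 }' "$@"
}

# Positions of different lengths are handled the same way.
printf 'chr1_27636786_T_C_b38 chr1_169894240_G_T_b38\n' | chr_pos
```

It also accepts a file name, e.g. `chr_pos test.txt`.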
For the win, you can even do a fancy regex with sed:
cat data.tsv
chr1_169894240_G_T_b38 chr1_169894240_G_T_b38
chr1_169894240_G_T_b38 chr1_169891332_G_A_b38
chr1_169891332_G_A_b38 chr1_169891332_G_A_b38
chr1_169661963_G_A_b38 chr1_169661963_G_A_b38
chr1_169661963_G_A_b38 chr1_169697456_A_T_b38
chr1_169697456_A_T_b38 chr1_169697456_A_T_b38
chr1_27636786_T_C_b38 chr1_27636786_T_C_b38
chr1_196651787_C_T_b38 chr1_196651787_C_T_b38
chr6_143501715_T_C_b38 chr6_143501715_T_C_b38
sed 's/_[ATGC]_[ATGC]_[a-z][0-9]*//g' data.tsv
chr1_169894240 chr1_169894240
chr1_169894240 chr1_169891332
chr1_169891332 chr1_169891332
chr1_169661963 chr1_169661963
chr1_169661963 chr1_169697456
chr1_169697456 chr1_169697456
chr1_27636786 chr1_27636786
chr1_196651787 chr1_196651787
chr6_143501715 chr6_143501715
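The pattern above expects single-base alleles (`[ATGC]`). If a file also contained multi-base alleles such as indels (an assumption, since the sample here has none), a sketch that keeps the chr_pos prefix of each ID and drops everything up to the next whitespace might look like this (the function name `strip_alleles` is made up for illustration):

```shell
# strip_alleles: hypothetical helper, not part of the answer above.
# Keeps "<chr>_<digits>" and removes the rest of each ID up to the next whitespace.
# -E enables extended regular expressions (GNU and BSD sed).
strip_alleles() {
  sed -E 's/([^_[:space:]]+_[0-9]+)_[^[:space:]]*/\1/g' "$@"
}

# The second ID has a two-base allele, which the [ATGC] pattern would not match.
printf 'chr1_169894240_G_T_b38\tchr1_1234_GA_G_b38\n' | strip_alleles
```

This assumes the chromosome name itself contains no underscore.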