Command to isolate unique reads with a unique subject ID
As shown below, I have one file with the following columns. Some reads align to multiple subject IDs, and I want to keep only one alignment per read, the one with the highest bit score. Before that, I first want to filter the alignments by their position on the subject. Each subject sequence is 150 bp, and I want to keep only those subject IDs whose alignment to the query extends at least 20 bp upstream and downstream of the 75 bp midpoint of the subject sequence in the FASTA file.
Query_id Subject_id %_identity alignment_length mismatches gap_openings query_start query_end subject_start subject_end e-value bit_score
8d4dc875-da7b-4b31-8782-365968de9c74 101426 100 81 0 0 70 150 81 1 9.80E-40 161
4b1a4327-fba6-49d1-bd93-01df56452128 1589 96.55 87 3 0 260 346 87 1 1.70E-39 160
4c415d86-0c12-4b07-ab19-0d7fcbe8b1c7 1006 100 62 0 0 267 328 79 18 9.90E-28 121
4c415d86-0c12-4b07-ab19-0d7fcbe8b1c7 1006 98.48 66 0 1 199 263 150 85 1.50E-26 117
4c415d86-0c12-4b07-ab19-0d7fcbe8b1c7 5045 100 61 0 0 268 328 78 18 3.60E-27 119
4c415d86-0c12-4b07-ab19-0d7fcbe8b1c7 5045 91.55 71 0 3 58 122 150 80 6.70E-21 99
4c415d86-0c12-4b07-ab19-0d7fcbe8b1c7 4733 98.48 66 0 1 199 263 150 85 1.50E-26 117
4c415d86-0c12-4b07-ab19-0d7fcbe8b1c7 4733 100 58 0 0 268 325 58 1 2.20E-25 113
8717832e-e326-4b2d-8497-a59aa3f23642 1592 88.17 93 10 1 169 260 58 150 4.90E-33 139
8717832e-e326-4b2d-8497-a59aa3f23642 1592 92.86 28 2 0 134 161 22 49 1.10E-06 51
8717832e-e326-4b2d-8497-a59aa3f23642 1592 100 13 0 0 114 126 1 13 4.80E+01 26
455c720c-635e-4c03-8977-6272e169e567 101427 100 82 0 0 27 108 82 1 3.90E-40 162
d0071450-54e7-4f84-a6d4-b25ea757c5ea 1592 89.39 66 7 0 151 216 27 92 2.10E-24 110
d0071450-54e7-4f84-a6d4-b25ea757c5ea 1592 89.13 46 3 1 233 276 105 150 2.30E-12 70
d0071450-54e7-4f84-a6d4-b25ea757c5ea 1592 100 16 0 0 123 138 1 16 6.70E-01 32
a28de8d0-e08b-43b5-9de2-07df4404ea8c 37060 93.41 91 3 2 66 153 92 2 1.90E-34 144
a28de8d0-e08b-43b5-9de2-07df4404ea8c 37060 84.44 45 0 2 17 61 137 100 1.60E-07 54
a28de8d0-e08b-43b5-9de2-07df4404ea8c 37060 100 8 0 0 1 8 148 141 5.80E+04 16
2116d815-5edb-4124-998b-398be6161c56 1592 93.81 97 4 2 185 280 53 148 1.20E-38 157
2116d815-5edb-4124-998b-398be6161c56 1592 100 13 0 0 126 138 1 13 4.80E+01 26
ea6cc8d6-8f6a-4699-96ee-eff64d87d4be 1459 96.88 64 0 2 91 152 52 115 3.80E-22 103
ea6cc8d6-8f6a-4699-96ee-eff64d87d4be 1459 96.55 29 1 0 59 87 18 46 4.80E-07 52
ea6cc8d6-8f6a-4699-96ee-eff64d87d4be 1459 93.1 29 0 1 154 180 119 147 4.50E-04 43
f8261c28-e1ce-41a8-82cf-5f9ef39e248b 27775 92.45 106 5 1 96 198 15 120 1.70E-42 170
f8261c28-e1ce-41a8-82cf-5f9ef39e248b 27775 100 15 0 0 211 225 133 147 5.30E+00 29
Expected output:
Query_id Subject_id %_identity alignment_length mismatches gap_openings query_start query_end subject_start subject_end e-value bit_score
4c415d86-0c12-4b07-ab19-0d7fcbe8b1c7 1006 100 62 0 0 267 328 79 18 9.90E-28 121
8717832e-e326-4b2d-8497-a59aa3f23642 1592 88.17 93 10 1 169 260 58 150 4.90E-33 139
d0071450-54e7-4f84-a6d4-b25ea757c5ea 1592 89.39 66 7 0 151 216 27 92 2.10E-24 110
a28de8d0-e08b-43b5-9de2-07df4404ea8c 37060 93.41 91 3 2 66 153 92 2 1.90E-34 144
2116d815-5edb-4124-998b-398be6161c56 1592 93.81 97 4 2 185 280 53 148 1.20E-38 157
ea6cc8d6-8f6a-4699-96ee-eff64d87d4be 1459 96.88 64 0 2 91 152 52 115 3.80E-22 103
f8261c28-e1ce-41a8-82cf-5f9ef39e248b 27775 92.45 106 5 1 96 198 15 120 1.70E-42 170
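
The two steps described in the question (filter on subject coordinates, then keep the highest bit score per read) could be sketched in Python roughly as follows. This is only a sketch under stated assumptions: it reads "at least 20 bp upstream and downstream of the 75 bp midpoint" as "the aligned subject interval must span positions 55 to 95", it assumes whitespace-separated columns in the order shown above, and the names `covers_window` and `best_hits` are made up for illustration. Under this strict reading, fewer rows pass than appear in the second table above, so `flank` may need adjusting to match the intended rule.

```python
MIDPOINT = 75  # assumed centre of a 150 bp subject sequence
FLANK = 20     # assumed minimum coverage on each side of the midpoint


def covers_window(s_start, s_end, midpoint=MIDPOINT, flank=FLANK):
    """Return True if the subject interval spans midpoint - flank .. midpoint + flank."""
    # Reverse-strand hits have subject_start > subject_end, so order them first.
    lo, hi = sorted((s_start, s_end))
    return lo <= midpoint - flank and hi >= midpoint + flank


def best_hits(lines, midpoint=MIDPOINT, flank=FLANK):
    """Keep, for each query, the window-covering hit with the highest bit score."""
    best = {}  # query_id -> (bit_score, fields); dicts keep first-seen order
    for line in lines:
        fields = line.split()
        if len(fields) < 12 or fields[0] == "Query_id":
            continue  # skip blank or short lines and the header row
        s_start, s_end = int(fields[8]), int(fields[9])
        if not covers_window(s_start, s_end, midpoint, flank):
            continue
        score = float(fields[11])
        query = fields[0]
        if query not in best or score > best[query][0]:
            best[query] = (score, fields)
    return [fields for _, fields in best.values()]


# Example use (the file name is hypothetical):
#     with open("blast_hits.txt") as handle:
#         for fields in best_hits(handle):
#             print("\t".join(fields))
```

Ties on bit score keep the first hit encountered. Passing `flank=0` relaxes the filter to alignments that merely cross position 75.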