Print line based on partial match
I have two files with several hundred entries each. File 1 contains several 5-base sequences; file 2 contains more entries, with longer sequences. The first 5 bases of some sequences in file 2 match a sequence in file 1, and I want to print those lines from file 2. I tried a few grep and awk approaches, but they did not work for a partial (prefix) match like this. For example:
File 1:
ATGCC
TTGCA
GGAAC
........
........
File 2:
ATTTCGGGAAAATT
ATGCCTTAAGACCT
GGAACTAAGGGGA
............
............
Expected outcome:
ATGCCTTAAGACCT
GGAACTAAGGGGA
Any help is much appreciated!
Thanks!
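To make the intended matching concrete, here is a minimal Python sketch of one way it could be done, assuming one sequence per line and that only the first 5 bases of each file 2 entry are compared. The function names (load_prefixes, filter_by_prefix) and the parameter k are made up for illustration, and the lists stand in for the two files from the example above.

```python
def load_prefixes(lines, k=5):
    """Collect the first k bases of every non-empty line into a set."""
    return {line.strip()[:k] for line in lines if line.strip()}


def filter_by_prefix(prefixes, lines, k=5):
    """Return the lines whose first k bases appear in prefixes, in input order."""
    stripped = (line.strip() for line in lines)
    return [seq for seq in stripped if seq and seq[:k] in prefixes]


# Example data copied from the question (stand-ins for the real files).
file1 = ["ATGCC", "TTGCA", "GGAAC"]
file2 = ["ATTTCGGGAAAATT", "ATGCCTTAAGACCT", "GGAACTAAGGGGA"]

matches = filter_by_prefix(load_prefixes(file1), file2)
for seq in matches:
    print(seq)
# prints:
# ATGCCTTAAGACCT
# GGAACTAAGGGGA
```

With real files, open file handles can be passed in place of the lists, since both functions only iterate over lines. Storing the prefixes in a set keeps each lookup constant-time, so the cost grows with the size of file 2 rather than with the product of both file sizes.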