I have two data frames.
The first is vcf. Its contents are:
head(vcf)
  X.CHROM       POS           ID      CHROM_POS
1    chr1 100000421 rs1047982323 chr1_100000421
2    chr1 100000827 rs1375386196 chr1_100000827
3    chr1 100001753  rs866745787 chr1_100001753
4    chr1 100001904 rs1416462966 chr1_100001904
5    chr1 100002334 rs1220478954 chr1_100002334
6    chr1 100002490  rs181634796 chr1_100002490
and the other is mashr. Its content is:
head(mashr)
           RSID1         RSID2_
1 chr1_169894240 chr1_169894240
2 chr1_169894240 chr1_169891332
3 chr1_169891332 chr1_169891332
4 chr1_169661963 chr1_169661963
5 chr1_169661963 chr1_169697456
6 chr1_169697456 chr1_169697456
I want to count the number of CHROM_POS values that match between these two data frames, and the number of CHROM_POS values in the vcf data frame that are missing from mashr.
I wrote this command:
which(vcf$CHROM_POS == mashr$RSID1)
but it returns an empty result with a warning:
integer(0)
Warning message:
In vcf$CHROM_POS == mashr$RSID1 :
  longer object length is not a multiple of shorter object length
I know that this warning is related to the fact that the two columns have different lengths.
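To illustrate why the lengths matter, here is a minimal sketch with made-up toy vectors (`a` and `b` are hypothetical, not the real columns). As far as I understand it, `==` compares the two vectors position by position and recycles the shorter one, whereas `%in%` tests whether each value occurs anywhere in the other vector:

```r
# Toy vectors of different lengths (made-up values, not the real data)
a <- c("chr1_1", "chr1_2", "chr1_3")
b <- c("chr1_2", "chr1_1")

# `==` compares position by position and recycles the shorter vector:
# a[1] vs b[1], a[2] vs b[2], a[3] vs b[1] (this is what triggers the warning)
positional <- suppressWarnings(a == b)
which(positional)   # no value sits at the same position in both vectors

# `%in%` asks whether each element of `a` occurs anywhere in `b`
membership <- a %in% b
which(membership)   # positions of `a` whose value appears somewhere in `b`
```

So even when two columns share many values, `==` can report no matches at all if those values sit in different rows.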
Can anyone tell me how to do this?
I want to find:
the number of CHROM_POS values shared between the two data frames, and
the number of CHROM_POS values missed between the two data frames.
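One possible approach is sketched below with `%in%`, `intersect()` and `setdiff()` from base R. The small data frames are toy stand-ins built for illustration (only some values are taken from the output above), and comparing against `RSID1` alone is an assumption; the names `vcf_pos`, `mashr_pos`, `n_matched` and `n_missed` are hypothetical helpers.

```r
# Toy stand-ins for the two data frames (illustrative values only)
vcf <- data.frame(
  CHROM_POS = c("chr1_100000421", "chr1_169894240", "chr1_169891332"),
  stringsAsFactors = FALSE
)
mashr <- data.frame(
  RSID1  = c("chr1_169894240", "chr1_169894240", "chr1_169891332", "chr1_169661963"),
  RSID2_ = c("chr1_169894240", "chr1_169891332", "chr1_169891332", "chr1_169661963"),
  stringsAsFactors = FALSE
)

# Work on unique positions so repeated rows in mashr are not counted twice
vcf_pos   <- unique(vcf$CHROM_POS)
mashr_pos <- unique(mashr$RSID1)   # assumption: match against RSID1 only

# Counts: positions present in both, and vcf positions absent from mashr
n_matched <- sum(vcf_pos %in% mashr_pos)
n_missed  <- sum(!(vcf_pos %in% mashr_pos))

# The actual values behind those counts
shared  <- intersect(vcf_pos, mashr_pos)
missing <- setdiff(vcf_pos, mashr_pos)
```

If positions in `RSID2_` should count as well, `mashr_pos` could instead be built as `unique(c(mashr$RSID1, mashr$RSID2_))`. Swapping the arguments, `setdiff(mashr_pos, vcf_pos)`, would give the positions in mashr that are absent from vcf.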