Extracting variations in gene regions and within 100 bp of gene boundaries from multiple VCF files

Hi,

I sincerely hope that I am not repeating an already answered question. I couldn’t find the answer to my exact problem.

I have three VCF files produced with bcftools isec. The three files contain similar variations relative to the reference sequence. In the end, I have:

  • Three VCF files representing three varieties (include only the common variations)
  • Reference FASTA file
  • Annotation (gff3) file for reference.

What I want to do is extract the variations found in:

  1. Gene regions
  2. Within 100 bp of the TSS (+1) and of the stop codon

Please note that this is a 5 Mb region (not a whole genome, so there are no chromosomes).
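
To make the goal concrete, here is a rough sketch of the interval logic I have in mind. It is not a tested pipeline. It assumes that the `gene` features in the gff3 mark the boundaries of interest (if the stop codon rather than the gene end is needed, CDS coordinates would have to be used instead), that the 100 bp is added on both sides, and that the sequence name in the gff3 matches the CHROM column of the VCF files. The function names `padded_gene_intervals` and `filter_vcf` are made up for illustration.

```python
FLANK = 100  # bp added on each side of every gene (assumed symmetric)


def padded_gene_intervals(gff3_lines, flank=FLANK):
    """Return (seqid, start, end) for each `gene` feature, padded by `flank`.

    Coordinates are 1-based and inclusive, as in GFF3 and the VCF POS column.
    """
    intervals = []
    for line in gff3_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue  # keep only gene-level features
        start, end = int(cols[3]), int(cols[4])
        # Clamp at 1 so a gene near the start of the region stays valid.
        intervals.append((cols[0], max(1, start - flank), end + flank))
    return intervals


def filter_vcf(vcf_lines, intervals):
    """Keep header lines plus records whose POS falls inside any interval.

    Only POS is checked; an indel that starts outside an interval but
    extends into it is not kept by this simple version.
    """
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)  # meta-information and column header lines
            continue
        if not line.strip():
            continue
        chrom, pos = line.split("\t", 2)[:2]
        pos = int(pos)
        if any(c == chrom and s <= pos <= e for c, s, e in intervals):
            kept.append(line)
    return kept
```

Both functions take any iterable of text lines, so they could be called with open file handles for the gff3 and for each of the three uncompressed VCF files in turn. For bgzipped and indexed VCF files, the same padded intervals could presumably be written to a regions file and passed to bcftools view with -R instead.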

I would appreciate it if someone could help me with this.
Thank you!


