Are subtelomeric and pericentromeric regions defined in the human genome?
I’ve been trying to find coordinates for these but haven’t had much luck. I’ve seen a number of people define them as ±2 Mb around the centromere gap and 30 kb away from the telomere. I was wondering if anyone here knows more about this. Thanks!
Centromeres and telomeres are highly repetitive regions that, as far as I know, still defy assembly for each chromosome. Another issue is that their lengths vary from chromosome to chromosome and from cell to cell.
One paper I found very useful for information about subtelomeric regions is this one. It provides a FASTA file as supplementary material, in which subtelomeric regions are defined as 500 kb regions on both the p and q arms of each chromosome in the hg19 assembly.
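To make the "500 kb on each arm" idea concrete, here is a minimal sketch that writes the first and last 500 kb of a chromosome as BED-style intervals (0-based, half-open). This is only a naive approximation and not the paper's method: the paper's regions come from its own FASTA file and need not match the literal terminal 500 kb of hg19. The function name `terminal_windows` and the chromosome `chrToy` with its length are made up for illustration.

```python
def terminal_windows(chrom, length, width=500_000):
    """Return BED-style (chrom, start, end) tuples for both chromosome ends.

    Coordinates are 0-based and half-open, as in BED files.
    `width` defaults to 500 kb; windows are clipped to the chromosome.
    """
    if length <= 0 or width <= 0:
        raise ValueError("length and width must be positive")
    p_arm = (chrom, 0, min(width, length))           # start of the chromosome
    q_arm = (chrom, max(0, length - width), length)  # end of the chromosome
    return [p_arm, q_arm]


if __name__ == "__main__":
    # "chrToy" and its 10 Mb length are hypothetical, not real hg19 values.
    for chrom, start, end in terminal_windows("chrToy", 10_000_000):
        print(f"{chrom}\t{start}\t{end}")
```

With real data, the lengths would come from a chromosome sizes file for the assembly you are using.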
I recently wondered about the same question and looked for an answer, as I could not find any attempt to provide an “annotation” of subtelomeric regions in the hg19 genome assembly.
Starting from the data in the paper I cited above, I tried to create an annotation of subtelomeres in hg19.
I produced two BED files with the coordinates of subtelomeres in hg19:
hg19_subtelomeres.bed: contains the “strict” coordinates of subtelomeres based on the paper’s FASTA file, and telomere regions from the UCSC Genome Browser.
hg19_extended_subtelomeres.bed: here I removed the gaps between the subtelomere coordinates and the telomeres by extending each subtelomere to where the telomere ends or starts (depending on which chromosome arm you consider).
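The extension step for the second file can be sketched roughly as follows. This is an illustration of the idea under simple assumptions, not the exact script used to build the files: each interval is a BED-style (chrom, start, end) tuple, 0-based and half-open, and the telomere does not overlap the subtelomere. The function name `extend_to_telomere` and the `chrToy` coordinates are hypothetical.

```python
def extend_to_telomere(subtelomere, telomere):
    """Stretch a subtelomere so that it abuts the telomere on the same arm.

    Both arguments are (chrom, start, end) tuples, 0-based and half-open.
    """
    chrom, sub_start, sub_end = subtelomere
    tel_chrom, tel_start, tel_end = telomere
    if chrom != tel_chrom:
        raise ValueError("intervals are on different chromosomes")
    if tel_end <= sub_start:
        # Telomere lies before the subtelomere (p arm): move the start back.
        return (chrom, tel_end, sub_end)
    if tel_start >= sub_end:
        # Telomere lies after the subtelomere (q arm): move the end forward.
        return (chrom, sub_start, tel_start)
    raise ValueError("telomere and subtelomere overlap")


if __name__ == "__main__":
    # Hypothetical coordinates on a made-up 10 Mb chromosome.
    p_arm = extend_to_telomere(("chrToy", 15_000, 500_000), ("chrToy", 0, 10_000))
    q_arm = extend_to_telomere(("chrToy", 9_500_000, 9_985_000),
                               ("chrToy", 9_990_000, 10_000_000))
    for chrom, start, end in (p_arm, q_arm):
        print(f"{chrom}\t{start}\t{end}")
```

Intervals that already touch the telomere are returned unchanged, so the function can be applied to every arm without checking for a gap first.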
I would be very interested if others could contribute, give feedback, or suggest an alternative approach to defining subtelomeric regions. I am currently working on a method to define the coordinates of pericentromeric regions, this time based on the composition of repetitive elements along chromosomes. I may have the opportunity to share it here too once it is available.