UCSC Gene Table Exon Frames Generating Stop Codons

Hi,

I’m using UCSC gene tables, and I am running into trouble with interpreting exon frames. In some cases, using the exon frame from the tables creates stop codons, which shouldn’t be happening in coding regions.

As an example, from the hg19 gene NM_001369291 on chromosome 22, I have this line from the gene table:

733 NM_001369291    chr22   +   19466988    19508131    19467079    19506431    19  19466988,19467680,19468475,19470212,19471384,19481849,19483503,19484908,19486623,19492884,19494908,19495288,19496052,19502271,19502487,19504049,19504339,19506366,19508003, 19467094,19467740,19468568,19470350,19471528,19481905,19483552,19484970,19486674,19493004,19495040,19495387,19496214,19502410,19502571,19504168,19504416,19506432,19508131, 0   CDC45   cmpl    cmpl    0,0,0,0,0,0,2,0,2,2,2,2,2,2,0,0,2,1,-1,

Here, the first comma-separated list of positions is the exon starts, and the last list of numbers is the exon frames. The exon starting at 19495288 has a frame of 2 in the table, but when I take that exon's sequence from UCSC, only a frame of 1 gives a reading frame with no stop codons:

>hg19_ncbiRefSeqCurated_NM_001369291.1_22 range=chr22:19495289-19495387 5'pad=0 3'pad=0 strand=+ repeatMasking=none
TCTTCCCCTGAAGCAGGTGAAGCAGAAGTTCCAGGCCATGGACATCTCCT
TGAAGGAGAATTTGCGGGAAATGATTGAAGAGTCTGCAAATAAATTTGG
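
The check described above can be sketched in Python as follows. This is only a minimal illustration and not necessarily the exact procedure used for the question: the helper name `stop_positions` is hypothetical, and "frame" is treated here as the number of leading bases dropped before reading codons, which is itself an assumption about what the table value means.

```python
# Exon sequence copied from the FASTA record above (chr22:19495289-19495387, + strand).
EXON = (
    "TCTTCCCCTGAAGCAGGTGAAGCAGAAGTTCCAGGCCATGGACATCTCCT"
    "TGAAGGAGAATTTGCGGGAAATGATTGAAGAGTCTGCAAATAAATTTGG"
)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions(seq, skip):
    """Hypothetical helper: 0-based positions of stop codons found when
    reading complete codons after dropping `skip` leading bases."""
    return [i for i in range(skip, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


# Try all three possible offsets into the exon.
for skip in range(3):
    print(f"skip {skip}: stop codons at {stop_positions(EXON, skip)}")
```

With this sequence, only dropping one leading base gives a read-through with no stop codons; dropping zero or two bases both hit a TGA.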

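Is there something I am missing in how I interpret the exon frames of the gene table? Unless I am mistaken, the exon starts in the gene table are 0-based, and the range in the FASTA header for the exon is 1-based, so 19495288 in the table and 19495289 in the FASTA header refer to the same base.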
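
One way to test an interpretation of the frame column is to recompute it from the coordinates in the row. The sketch below assumes that, on the + strand, an exon's frame is the number of coding bases in the preceding exons modulo 3, and that -1 marks an exon with no coding bases. That definition is an assumption to be checked against the UCSC genePred documentation, and the function name `plus_strand_frames` is hypothetical. The coordinates are copied from the table row above and treated as 0-based, half-open intervals.

```python
# Fields copied from the gene table row above.
cds_start, cds_end = 19467079, 19506431
exon_starts = [19466988, 19467680, 19468475, 19470212, 19471384, 19481849,
               19483503, 19484908, 19486623, 19492884, 19494908, 19495288,
               19496052, 19502271, 19502487, 19504049, 19504339, 19506366,
               19508003]
exon_ends = [19467094, 19467740, 19468568, 19470350, 19471528, 19481905,
             19483552, 19484970, 19486674, 19493004, 19495040, 19495387,
             19496214, 19502410, 19502571, 19504168, 19504416, 19506432,
             19508131]
table_frames = [0, 0, 0, 0, 0, 0, 2, 0, 2, 2, 2, 2, 2, 2, 0, 0, 2, 1, -1]


def plus_strand_frames(starts, ends, cds_start, cds_end):
    """Hypothetical helper: frame = coding bases seen so far, modulo 3
    (assumed definition); -1 for exons with no coding bases."""
    frames, coding_so_far = [], 0
    for start, end in zip(starts, ends):
        lo, hi = max(start, cds_start), min(end, cds_end)  # clip exon to CDS
        if lo >= hi:
            frames.append(-1)
            continue
        frames.append(coding_so_far % 3)
        coding_so_far += hi - lo
    return frames


computed = plus_strand_frames(exon_starts, exon_ends, cds_start, cds_end)
print("matches table:", computed == table_frames)

# Under the assumed definition, the number of leading bases to drop so that
# the next base starts a codon is (3 - frame) % 3, not the frame itself.
idx = exon_starts.index(19495288)
print("frame:", computed[idx], "bases to drop:", (3 - computed[idx]) % 3)
```

Under that assumption, the recomputed frames reproduce the table's frame column for this row. A frame of 2 would then mean the exon begins on the third base of a codon, so one base is dropped before reading whole codons.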

Thanks in advance!
