Transcription Start Site


What are the best databases for looking up the transcription start sites (TSS) of specific genes in the human genome?



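 wget -q -O - "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/wgEncodeGencodeBasicV19.txt.gz" |
   gunzip -c |
   awk '
     # genePred columns: $2=name $3=chrom $4=strand $5=txStart $6=txEnd $7=cdsStart $8=cdsEnd
     # $7 < $8 keeps coding transcripts; $7/$8 mark the start codon, use $5/$6 for the TSS itself
     (int($7) < int($8)) {
       if ($4 == "+") printf("%s\t%d\t%d\t%s\t%s\n", $3, $7, int($7) + 1, $2, $4);
       else           printf("%s\t%d\t%d\t%s\t%s\n", $3, int($8) - 3, $8, $2, $4);
     }'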


chr1    69090   69091   ENST00000335137.3   +
chr1    139306  139309  ENST00000423372.3   -
chr1    367658  367659  ENST00000426406.1   +
chr1    622031  622034  ENST00000332831.2   -
chr1    739134  739137  ENST00000599533.1   -
chr1    818042  818043  ENST00000594233.1   +
chr1    861321  861322  ENST00000342066.3   +
chr1    866442  866445  ENST00000598827.1   -
chr1    894617  894620  ENST00000327044.6   -
chr1    896073  896074  ENST00000338591.3   +
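
Note that in this UCSC table, columns 7 and 8 are cdsStart and cdsEnd, so the positions above mark the start codon of coding transcripts. If you want the transcription start itself, a minimal sketch along the same lines would use columns 5 and 6 instead. It assumes the standard genePred layout with a leading bin column (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, ...). The function name genepred_tss and the two sample rows are made up for illustration and are not real annotation:

```shell
# Hypothetical helper: read genePred rows (with a leading bin column) on stdin
# and print a 1-bp BED interval at the transcription start of each transcript.
genepred_tss() {
  awk -F '\t' 'BEGIN { OFS = "\t" }
    $4 == "+" { print $3, $5, $5 + 1, $2, $4 }   # plus strand: TSS is txStart
    $4 == "-" { print $3, $6 - 1, $6, $2, $4 }   # minus strand: TSS is the last base before txEnd
  '
}

# Made-up, tab-separated example rows (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd)
printf '1\ttx_plus\tchr1\t+\t100\t500\t150\t450\n2\ttx_minus\tchr1\t-\t1000\t2000\t1100\t1900\n' | genepred_tss
```

To run it on the real table, pipe the same wget and gunzip -c output into genepred_tss. Unlike the command above, this keeps non-coding transcripts too, because it does not filter on cdsStart < cdsEnd.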

Basically any GTF file will do, whether from RefSeq, Ensembl, or GENCODE. The TSS is the start coordinate of the entries with feature type transcript. Be aware that for genes on the bottom (minus) strand it is the end coordinate instead. Some GTFs also carry a dedicated TSS entry that you can use directly.
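
As a rough sketch of the rule above, the following awk filter picks the TSS from the transcript lines of a GTF. It assumes the usual nine tab-separated GTF columns (feature type in column 3, 1-based inclusive start and end in columns 4 and 5, strand in column 7) and a transcript_id "..." attribute in column 9. The function name gtf_tss and the sample records are invented for illustration:

```shell
# Hypothetical helper: read GTF on stdin and print one 1-bp BED interval per transcript TSS.
gtf_tss() {
  awk -F '\t' 'BEGIN { OFS = "\t" }
    /^#/ { next }                                  # skip header comment lines
    $3 == "transcript" && ($7 == "+" || $7 == "-") {
      tss = ($7 == "+") ? $4 : $5                  # plus strand: start; minus strand: end
      id = "."
      if (match($9, /transcript_id "[^"]+"/))
        id = substr($9, RSTART + 15, RLENGTH - 16) # text between the quotes
      print $1, tss - 1, tss, id, $7               # 1-based GTF position to 0-based, half-open BED
    }
  '
}

# Made-up example records (not real annotation)
printf 'chr1\tsrc\ttranscript\t101\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\nchr1\tsrc\texon\t101\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\nchr1\tsrc\ttranscript\t1001\t2000\t.\t-\t.\tgene_id "G2"; transcript_id "T2";\n' | gtf_tss
```

For a real annotation you would decompress the GTF and pipe it in, for example with gunzip -c on the downloaded file. Exon and other feature lines are ignored, so each transcript yields exactly one row.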

