GENCODE – Human Release 32 Statistics

Statistics about GENCODE Release 32

The statistics are derived from the GTF file that contains only the annotation of the main chromosomes.

For details about the calculation of these statistics please see the README_stats.txt file.
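As a rough illustration of how per-biotype counts can be read out of a GTF file, here is a minimal sketch. It assumes the GENCODE GTF layout (nine tab-separated columns, with quoted `gene_type` and `transcript_type` attributes in the last column). The function name `count_biotypes` and the toy records are hypothetical, and the sketch does not claim to reproduce the exact rules described in README_stats.txt.

```python
import re
from collections import Counter

# Matches quoted attributes such as: gene_type "protein_coding";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def count_biotypes(lines):
    """Count gene and transcript records per biotype in an iterable of GTF lines."""
    genes, transcripts = Counter(), Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # blank or malformed line
        feature = fields[2]
        attrs = dict(ATTR_RE.findall(fields[8]))
        if feature == "gene":
            genes[attrs.get("gene_type", "unknown")] += 1
        elif feature == "transcript":
            transcripts[attrs.get("transcript_type", "unknown")] += 1
    return genes, transcripts


# Hypothetical toy records for illustration only, not real annotation.
sample = [
    "##description: toy example\n",
    'chr1\tHAVANA\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding";\n',
    'chr1\tHAVANA\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; '
    'gene_type "protein_coding"; transcript_type "protein_coding";\n',
    'chr1\tHAVANA\ttranscript\t1\t90\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; '
    'gene_type "protein_coding"; transcript_type "retained_intron";\n',
    'chr1\tHAVANA\tgene\t200\t300\t.\t-\t.\tgene_id "G2"; gene_type "lncRNA";\n',
]
genes, transcripts = count_biotypes(sample)
print(dict(genes))
print(dict(transcripts))
```

For a real release file, the same function can be fed an open text handle, for example one returned by `gzip.open(path, "rt")`. In the toy data, `retained_intron` appears only as a transcript type, which is one way a biotype can have transcripts but no genes of its own.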

General stats

Total No of Genes 60609
Protein-coding genes 19965
Long non-coding RNA genes 17910
Small non-coding RNA genes 7576
Pseudogenes 14749
– processed pseudogenes 10668
– unprocessed pseudogenes 3556
– unitary pseudogenes 228
– polymorphic pseudogenes 42
– pseudogenes 18
Immunoglobulin/T-cell receptor gene segments
– protein coding segments 408
– pseudogenes 237
Total No of Transcripts 227462
Protein-coding transcripts 83986
– full length protein-coding 57935
– partial length protein-coding 26051
Nonsense mediated decay transcripts 15811
Long non-coding RNA loci transcripts 48351
 
Total No of distinct translations 62256
Genes that have more than one distinct translation 13734

Further details on this version’s gene and transcript types

biotype                              genes  transcripts
IG_C_gene                               14           23
IG_C_pseudogene                          9            9
IG_D_gene                               37           37
IG_J_gene                               18           18
IG_J_pseudogene                          3            3
IG_pseudogene                            1            1
IG_V_gene                              144          144
IG_V_pseudogene                        188          188
lncRNA                               16849        75141
miRNA                                 1881         1881
misc_RNA                              2212         2227
Mt_rRNA                                  2            2
Mt_tRNA                                 22           22
non_stop_decay                           0           91
nonsense_mediated_decay                  0        15811
polymorphic_pseudogene                  42           64
processed_pseudogene                 10171        10176
protein_coding                       19965        83986
pseudogene                              18           38
retained_intron                          0        28437
ribozyme                                 8            8
rRNA                                    52           52
rRNA_pseudogene                        500          500
scaRNA                                  49           49
scRNA                                    1            1
snoRNA                                 942          954
snRNA                                 1901         1901
sRNA                                     5            5
TEC                                   1061         1155
TR_C_gene                                6            6
TR_D_gene                                4            4
TR_J_gene                               79           79
TR_J_pseudogene                          4            4
TR_V_gene                              106          106
TR_V_pseudogene                         33           33
transcribed_processed_pseudogene       495          495
transcribed_unitary_pseudogene         130          138
transcribed_unprocessed_pseudogene     923          938
translated_processed_pseudogene          2            2
translated_unprocessed_pseudogene        2            2
unitary_pseudogene                      98           98
unprocessed_pseudogene                2631         2632
vaultRNA                                 1            1
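
The pseudogene subtotals in the general stats appear to be sums over related rows of the biotype table. The short check below illustrates this using gene counts copied from the table. The grouping rule (a base biotype plus its `transcribed_` and `translated_` variants) is an assumption inferred from the numbers, not taken from README_stats.txt, and the helper name `subtotal` is hypothetical.

```python
# Gene counts copied from the biotype table for the pseudogene classes.
biotype_genes = {
    "processed_pseudogene": 10171,
    "transcribed_processed_pseudogene": 495,
    "translated_processed_pseudogene": 2,
    "unprocessed_pseudogene": 2631,
    "transcribed_unprocessed_pseudogene": 923,
    "translated_unprocessed_pseudogene": 2,
    "unitary_pseudogene": 98,
    "transcribed_unitary_pseudogene": 130,
}


def subtotal(base):
    """Sum the base biotype and every variant named '<prefix>_<base>'."""
    return sum(
        count
        for name, count in biotype_genes.items()
        if name == base or name.endswith("_" + base)
    )


# Summary figures as listed in the general stats.
summary = {
    "processed_pseudogene": 10668,
    "unprocessed_pseudogene": 3556,
    "unitary_pseudogene": 228,
}

for base, expected in summary.items():
    print(base, subtotal(base), subtotal(base) == expected)
```

Under this assumed grouping, all three subtotals agree with the general stats.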
