genes)
structure content, mean hydrophobicity, percentage of residues exposed to
solvent, sequence compositional entropy, number of homologues, alignment
entropy
non-coding genes)
of BLASTX hits, hit score, frame score
nucleotide triplets, sequence score, codon-bias, most-like CDS (MLCDS),
length-percentage, score-distance
transcript length ratio, Fickett score, hexamer usage bias
(GC, CT, TAG, TGT, ACG, TCG), conservation score, ORF length and
proportion
k = [1,5])
mean exon length, standard deviation of stop
codon frequency, txCdsPredict
protein-coding and lncRNA sequences
and (ii) only dissimilar sequences.
Accuracy on set A for human: 91.54%; mouse: 92.21%. On set B for human: 91.45%; mouse: 92.2%.
MCC on set A for human: 83.17%; mouse: 84.59%. On set B for human: 82.99%; mouse: 84.69%.
AUC on set A for human: 96.39%; mouse: 96.62%. On set B for human: 96.39%; mouse: 96.64%.
features, ribosomal interaction related features, protein conservation
scores
fruit fly, Arabidopsis
conservation, polyA abundance, RNA secondary structure conservation, ORF
score, expression specificity score
combinations (for k = [2,5])
(coverage, length), sequence length, coding potential score, k-mer
score based on frequency
Sensitivity for human: 92.3%; mouse: 93.8%.
Specificity for human: 91.5%; mouse: 94.1%.
F score for human: 91.9%; mouse: 95.6%.
MCC for human: 83.8%; mouse: 85.6%.
tested on animals and plants (both protein-coding and non-coding genes)
point
score, BLASTX: hits, significance, total bit
score, frame entropy
neural network
indicator)
features (length, coverage, hexamer score of longest ORF, entropy density
profile), UTR coverage, GC content of UTRs, Fickett score, HMMER index
mouse, wheat, zebrafish, chicken
distance to protein-coding transcript, distance ratio, EIIP value
from PLEK and CPC2
centrality, average degree, assortativity, maximum degree, minimum
degree, clustering coefficient, motif frequency
identity, ratio of alignment length to mRNA
length, ratio of alignment length to ORF
length, transposable elements, sequence divergence from transposable element, ORF length,
Fickett score
lengths, frequency of 64 codons
rice, tomato, sorghum, vine grape, maize
length, GC content, Fickett score, hexamer score, maximum ORF length, ORF
coverage, mean ORF coverage, codon bias
nematode, rice, tomato
zebrafish (88.4%),
nematode (93.3%), tomato (93.3%), rice (96.3%)