What is KEGG Orthology and what is it used for?

KEGG Orthology (KO) is a database of molecular functions, organized as groups of orthologous genes with known identity and function. You can submit genomic data to be annotated against that database. I love using KEGG to produce genome summaries, such as which metabolic pathways are complete or incomplete and what percentage of genes in the genome are dedicated to a specific task.

www.genome.jp/kegg/kaas/

I use this (KAAS, the KEGG Automatic Annotation Server) to determine which pathways are complete and which are incomplete. The outputs are visual maps of metabolic pathways and are relatively easy to interpret. For example, I could use this information to say that my organism of study possesses the genes necessary to metabolize glucose and fructose, but lacks the genes necessary to metabolize mannose. This could all be done manually with tedious BLAST searches and cross-checking, but KEGG is much more efficient.
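If you want the same complete/incomplete call as a table rather than a picture, here is a minimal Python sketch of the idea. It assumes you already have the set of KO identifiers found in your genome and a list of KOs required for each pathway. The function name `pathway_completeness`, the pathway names, and the `K_A`-style identifiers are all made-up placeholders, not real KEGG data, and real KEGG pathways often allow alternative enzymes for a step, which this simple version ignores.

```python
def pathway_completeness(genome_kos, pathway_steps):
    """Report, per pathway, which required KOs are present or missing.

    genome_kos: set of KO identifiers annotated in the genome.
    pathway_steps: dict mapping a pathway name to the KOs it requires
                   (hypothetical lists supplied by the user).
    """
    report = {}
    for name, required in pathway_steps.items():
        required = set(required)
        if not required:
            continue  # nothing to check for an empty definition
        missing = sorted(required - genome_kos)
        found = len(required) - len(missing)
        report[name] = {
            "fraction": found / len(required),
            "missing": missing,
            "complete": not missing,
        }
    return report


# Placeholder data, NOT real KEGG identifiers or pathway definitions.
genome_kos = {"K_A", "K_B", "K_C"}
pathways = {
    "sugar_1_utilization": ["K_A", "K_B"],
    "sugar_2_utilization": ["K_C", "K_D"],
}

report = pathway_completeness(genome_kos, pathways)
for name, info in report.items():
    status = "complete" if info["complete"] else "incomplete"
    print(name, status, "missing:", info["missing"])
```

A pathway with an empty "missing" list is the text version of a fully colored map; anything listed as missing is a candidate gap to check by hand.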

www.kegg.jp/blastkoala/

I use this (BlastKOALA) to see what percentage of genes in the genome are dedicated to a specific task (e.g. 10% of genes are for transcription, 8% for translation, 18% for carbohydrate metabolism, etc.). The output is an easy-to-interpret pie chart.
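The numbers behind a pie chart like that are just counts per functional category divided by the total. A rough Python sketch, assuming you have a gene-to-KO table and some KO-to-category lookup: the function name `category_percentages`, the gene names, and the small hand-written mapping below are for illustration only and are not pulled from KEGG, so check any real mapping against KEGG itself.

```python
from collections import Counter


def category_percentages(gene_to_ko, ko_to_category):
    """Return the percentage of genes falling into each category.

    gene_to_ko: dict of gene ID -> KO identifier (None if unannotated).
    ko_to_category: dict of KO identifier -> category label
                    (an illustrative, user-supplied lookup).
    """
    counts = Counter()
    for ko in gene_to_ko.values():
        if ko is None:
            counts["Unannotated"] += 1
        else:
            counts[ko_to_category.get(ko, "Other")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Made-up genes with an illustrative mapping.
gene_to_ko = {"gene1": "K00844", "gene2": "K03040", "gene3": "K02886", "gene4": None}
ko_to_category = {
    "K00844": "Carbohydrate metabolism",
    "K03040": "Transcription",
    "K02886": "Translation",
}

percentages = category_percentages(gene_to_ko, ko_to_category)
for category, pct in sorted(percentages.items()):
    print(f"{category}: {pct:.1f}%")
```

This version counts unannotated genes in the denominator, so the percentages are fractions of the whole genome. Drop the `None` entries first if you want fractions of annotated genes only.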

KEGG will also output a list of KO identifiers (e.g. K00131 refers to glyceraldehyde-3-phosphate dehydrogenase) for each genome you submit. With this data, you can compare two genomes to see which genes they share and which they don’t. You can submit the lists to something like bioinfogp.cnb.csic.es/tools/venny/ to visualize the overlap as a Venn diagram. You could also submit the list of KO identifiers to something like pathways.embl.de/ to produce good-quality metabolic pathway maps.
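If you just want the shared and unique KO lists without a web tool, a few lines of Python set arithmetic will do it. This sketch assumes the annotation output is tab-separated, with a gene ID in the first column and a KO identifier in the second (left empty or missing for unannotated genes). The function names and the example genes are made up for illustration.

```python
def read_ko_set(lines):
    """Collect the unique KO identifiers from 'gene<TAB>KO' lines."""
    kos = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Genes without an assigned KO have no (or an empty) second column.
        if len(fields) >= 2 and fields[1].strip():
            kos.add(fields[1].strip())
    return kos


def compare_ko_sets(a, b):
    """Split two KO sets into shared and genome-specific identifiers."""
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Made-up annotation lines; with real data, pass an open file object instead.
genome_a = ["geneA1\tK00131", "geneA2\tK00844", "geneA3"]
genome_b = ["geneB1\tK00844", "geneB2\tK01803", "geneB3\t"]

result = compare_ko_sets(read_ko_set(genome_a), read_ko_set(genome_b))
for name, kos in result.items():
    print(name, sorted(kos))
```

Because the comparison is on KO identifiers rather than gene IDs, two copies of the same function in one genome count once. That is usually what you want for a Venn diagram of shared functions.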

These are just the things I use KEGG for and I’m sure there are many more applications!
