[BiO BB] KEGG vs GO

Michael Ashburner (Genetics) ma11 at gen.cam.ac.uk
Thu Apr 6 12:45:51 EDT 2006


I think that there is some confusion in this thread.

1. There is the Gene Ontology.  Its terms are used (primarily)
for the annotation of gene products.  Both the Ontology and the
annotations contributed by the members of the GO Consortium database
are available from the GO site.

2. There is the KEGG Orthology, available from the KEGG site.
This is _both_ an ontology, seen, for example, by
opening KO up to its 3rd level: http://www.genome.ad.jp/dbget-bin/get_htext?KO+-s+F+-f+F+C
_and_ annotations of classes of gene product, seen if it is opened up
to level 4:
http://www.genome.ad.jp/dbget-bin/get_htext?KO+-s+F+-f+F+D


It would be easy for us to make a mapping between the Gene Ontology
and KO (level 3), except that the KO includes domains outwith the GO
(e.g.  01500 Human Diseases, and its child terms).  In fact we will
do that and make it available as a ko2go mapping file on GO. We do not
need the "SwissProt Relational Database" to do this. Indeed, KEGG already
provide many of these mappings to the GO.

Mapping to level 4 is more problematic.  The KO presents three kinds of entry:

Ontology terms ("Levels 1-3")
	e.g.: 00010 Glycolysis / Gluconeogenesis [PATH:ko00010] [GO:0006096 0006094]
Families of proteins ("Level 4")
	e.g.  K00845 E2.7.1.2, glk; glucokinase [EC:2.7.1.2] [COG:COG0837] [GO:0004340]
Genes, whose products are members of this family
	e.g. Genes HSA: 2645(GCK)
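
As a rough illustration of how the bracketed cross-references in lines like
those above could be turned into KO-to-GO pairs, here is a minimal Python
sketch.  It assumes only the line layout shown in the two examples; the names
parse_ko_line, ko2go_pairs and EXAMPLE_LINES are hypothetical and are not part
of any KEGG or GO tool, and the real ko2go file format is not specified here.

```python
import re

# Bracketed cross-reference groups such as [GO:0006096 0006094] or [EC:2.7.1.2].
XREF_GROUP = re.compile(r"\[([A-Za-z]+):([^\]]*)\]")

# The two example lines from this message (hypothetical input, not a KEGG dump).
EXAMPLE_LINES = [
    "00010 Glycolysis / Gluconeogenesis [PATH:ko00010] [GO:0006096 0006094]",
    "K00845 E2.7.1.2, glk; glucokinase [EC:2.7.1.2] [COG:COG0837] [GO:0004340]",
]


def parse_ko_line(line):
    """Split one KO line into (identifier, description, cross-references)."""
    line = line.strip()
    first_bracket = line.find("[")
    head = line if first_bracket == -1 else line[:first_bracket]
    identifier, _, description = head.strip().partition(" ")
    xrefs = {}
    for database, body in XREF_GROUP.findall(line):
        # A group may hold several space-separated identifiers.
        xrefs.setdefault(database, []).extend(body.split())
    return identifier, description.strip(), xrefs


def ko2go_pairs(lines):
    """Yield (KO identifier, GO identifier) for every GO cross-reference."""
    for line in lines:
        identifier, _, xrefs = parse_ko_line(line)
        for go_number in xrefs.get("GO", []):
            yield identifier, "GO:" + go_number


if __name__ == "__main__":
    for ko_id, go_id in ko2go_pairs(EXAMPLE_LINES):
        print(ko_id, go_id)
```

This only harvests cross-references that KEGG has already written into the
line; it does not attempt any mapping of its own.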

While for those Level 4 terms that are enzymes a 'mapping' of KO to the GO
would not be hard, it gets more difficult further down.  Consider the term:
K06051 DLL; delta
This is a child of (among others)
Notch signaling pathway [PATH:ko04330] (which would map to the GO)
and has children:
HSA: 10683(DLL3) 28514(DLL1) 54567(DLL4)
MMU: 13388(Dll1) 13389(Dll3) 54485(Dll4)
RNO: 114125(Dll3) 311332(Dll4_predicted) 84010(Dll1)
XLA: 379238(MGC52561)
DRE: 30120(dlc) 30131(dla) 30138(dld) 30141(dlb)
DME: CG3619-PA(Dmel_CG3619)
Which are clearly individual gene products.
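
To make the structure concrete, the listing above can be read into a small
data structure in which the KO entry directly contains its member genes.
This is only a sketch in Python, assuming the "ORG: id(symbol) ..." layout
shown above; KOFamily and parse_gene_line are hypothetical names chosen for
illustration, not anything provided by KEGG.

```python
import re
from dataclasses import dataclass, field

# One gene entry, e.g. "10683(DLL3)" or "CG3619-PA(Dmel_CG3619)".
GENE_ENTRY = re.compile(r"(\S+?)\(([^)]*)\)")

# The gene lines listed under K06051 in this message.
DELTA_LINES = [
    "HSA: 10683(DLL3) 28514(DLL1) 54567(DLL4)",
    "MMU: 13388(Dll1) 13389(Dll3) 54485(Dll4)",
    "RNO: 114125(Dll3) 311332(Dll4_predicted) 84010(Dll1)",
    "XLA: 379238(MGC52561)",
    "DRE: 30120(dlc) 30131(dla) 30138(dld) 30141(dlb)",
    "DME: CG3619-PA(Dmel_CG3619)",
]


@dataclass
class KOFamily:
    """A level-4 KO entry: a family whose children are individual genes."""
    ko_id: str
    name: str
    # organism code -> list of (gene identifier, gene symbol)
    members: dict = field(default_factory=dict)


def parse_gene_line(line):
    """Return (organism code, [(gene identifier, symbol), ...]) for one line."""
    organism, _, rest = line.partition(":")
    return organism.strip(), GENE_ENTRY.findall(rest)


delta = KOFamily(ko_id="K06051", name="DLL; delta")
for gene_line in DELTA_LINES:
    organism, genes = parse_gene_line(gene_line)
    delta.members[organism] = genes

if __name__ == "__main__":
    for organism, genes in delta.members.items():
        print(organism, [symbol for _, symbol in genes])
```

In this sketch the genes sit inside the KO entry itself.  A GO term has no
such member list: gene products are attached to it from outside, through
separate annotations.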

Thus I conclude that KO's K06051 (DLL; delta) is a _genus_
of gene products.  This is conceptually very different from the GO,
despite what may seem to be superficial similarities.

So, contra Lucy, the difference between the GO and KO has nothing to
do with manual vs automatic annotation, or with the 'focus' of the KO;
rather, the two differ in their underlying structure.

Michael 


=====
Date: Wed, 05 Apr 2006 11:13:14 +0100
From: Dan Bolser <dmb at mrc-dunn.cam.ac.uk>

lucifer at slimy.greenend.org.uk wrote:
> "Samantha Fox" <bioinfosm at gmail.com> writes:
> 
>> I was wondering how KEGG and GO differ from a broad perspective of 
>> grouping functionally related genes.  So a KEGG pathway lists all 
>> genes that kind of work together, and a similar GO term would also 
>> contain such a gene list.
> 
> 
> IIRC, KEGG is manually created from the literature whilst GO also 
> contains automatic/electronic annotation based on sequence homology.  
> KEGG also focuses more on metabolic pathways, whilst GO covers a more 
> comprehensive set of cellular processes and molecular functions.
> 
> Hope that helps,

It should be possible to 'cross correlate' KEGG and GO in a number of 
different ways using one of the SWISSPROT relational databases. However 
you should know that generally 'ontology mapping' is an open problem :)

Good luck!


> Lucy
> -- 
> Lucy McWilliam
> http://www.chiark.greenend.org.uk/~lucifer/
> _______________________________________________
> Bioinformatics.Org general forum  -  BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
