ADOMETA - ADoption of Orphan METabolic Activities |
Currently, predictions are available for three organisms: S. cerevisiae, E. coli and B. subtilis. Predictions for other organisms will be added in the near future. To see predictions for one of these organisms, select it from the "Select an organism" drop-down menu. |
Getting predictions |
Queries can be made in several ways. One option is to query by "EC #" (Enzyme Commission number), which identifies an enzymatic activity and consists of four numbers separated by periods, for example "2.6.1.19". For more information on EC numbers, please see the Enzyme Nomenclature website; several examples are also given below. Alternatively, you can query by reaction name: select the query type "By Reaction Name" from the "Select Type of Query" drop-down menu, enter the name (for instance, "ketol-acid reductoisomerase") in the query box and click "submit". However, unlike standard EC numbers, reaction names are often heterogeneous and/or ambiguous, so querying by reaction name is not recommended. |
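As a rough illustration of the "four numbers separated by periods" format, the following sketch checks a query string before it is typed into the query box. It is not part of ADOMETA; the function name `normalize_ec` is hypothetical, and it assumes that only fully specified EC numbers (no "-" placeholders) are of interest.

```python
import re

# Four numeric fields separated by periods, e.g. "2.6.1.19".
# Assumption: partially specified EC numbers such as "2.6.1.-" are rejected.
EC_PATTERN = re.compile(r"\d+\.\d+\.\d+\.\d+")


def normalize_ec(query: str) -> str:
    """Return the bare EC number, or raise ValueError if it is malformed."""
    cleaned = query.strip()
    # Drop an optional "EC" prefix, since the query box expects the bare number.
    if cleaned.upper().startswith("EC"):
        cleaned = cleaned[2:].lstrip(" :")
    if EC_PATTERN.fullmatch(cleaned) is None:
        raise ValueError(f"not a complete EC number: {query!r}")
    return cleaned
```

For example, `normalize_ec(" EC 5.1.3.13 ")` yields the bare string `5.1.3.13`, while an incomplete value such as `2.6.1` raises a `ValueError`.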
Examples |
Example 1. To see predictions for E. coli for EC number 2.6.1.19 (4-aminobutyrate transaminase), choose "By EC #" from the "Select a Type of Query" drop-down menu, enter the EC # without any prefix or suffix in the query box and click "submit".
Example 2. Choose B. subtilis from the organism list --> choose "By EC #" as the type of query --> choose any candidate gene set --> input "5.1.3.13" in the query box --> click "submit".
Example 3. Choose S. cerevisiae from the organism list --> choose "By EC #" as the type of query --> choose a candidate gene set --> input "2.1.1.10" in the query box --> click "submit". |
Candidate set |
Only genes from a candidate set are tested for any given orphan activity of interest. For each organism, the top 20 predictions are available based on:
1) Genes with no known metabolic function. This set includes all genes of unknown function and genes with no known metabolic function; |
Determining the status of metabolic activities |
For various reasons, the major metabolic databases and metabolic models do not always agree on the status of an activity (orphan or assigned to genes) in a specific organism. On the prediction page, we list the status of each metabolic activity in KEGG, Swissprot and well-established metabolic models (iJR904 for E. coli and iLL672 for S. cerevisiae) for the corresponding organism. If the status shows "Assigned to genes" for a database or model, the activity has been assigned to genes in that source. In contrast, if the status shows "Orphan", the activity is a local orphan for the organism of interest (i.e., not assigned to genes in that organism) according to that database or model. A reaction is listed as "Global orphan" if no responsible sequence could be identified, not only in the three organisms considered but in any known organism (as of Feb 2006).
We provide organism-specific lists of reactions to facilitate browsing. For B. subtilis, the list of reactions that have genes assigned is obtained from KEGG. For E. coli and S. cerevisiae, the lists are obtained from the well-curated metabolic models iJR904 and iLL672, respectively. Therefore, a reaction that has genes assigned in other sources may appear as "orphan" in these lists, or vice versa.
It is even trickier to determine which orphan activities exist in an organism. Ideally, the existence of an activity in a specific organism would be established by biochemical experiments. In practice, however, it is nearly impossible to test exhaustively the presence or absence of all activities in all organisms. We have therefore taken KEGG reference pathways and, if some of a pathway's ECs are assigned in an organism, assumed that all reactions of that pathway without known enzymes are orphans in that organism. This assumption is obviously simplistic and may sometimes lead to false positives. However, we have observed that if an activity is in fact absent from an organism, its neighborhood usually consists of a large percentage of gaps, meaning that the branch is possibly missing in that organism. These gaps usually lead to poor predictions, as indicated by the p-value. |
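The pathway-based assumption described above can be sketched in a few lines. This is only an illustration of the stated rule, not the code behind the server: the function names are hypothetical, the inputs are plain collections of EC number strings, and `gap_fraction` is a pathway-level stand-in (an assumption) for the "percentage of gaps" in an activity's neighborhood.

```python
def infer_local_orphans(pathway_ecs, assigned_ecs):
    """Apply the pathway heuristic to one reference pathway.

    pathway_ecs  -- all EC numbers in the reference pathway
    assigned_ecs -- EC numbers that have genes assigned in the organism
    Returns the set of EC numbers assumed to be local orphans.
    """
    pathway = set(pathway_ecs)
    assigned = pathway & set(assigned_ecs)
    if not assigned:
        # No part of the pathway is assigned in this organism,
        # so nothing in it is assumed to be present.
        return set()
    # Part of the pathway is assigned: every remaining reaction
    # without a known enzyme is assumed to be an orphan.
    return pathway - assigned


def gap_fraction(pathway_ecs, assigned_ecs):
    """Fraction of the pathway's EC numbers with no gene assigned."""
    pathway = set(pathway_ecs)
    if not pathway:
        raise ValueError("empty pathway")
    return len(pathway - set(assigned_ecs)) / len(pathway)
```

A pathway in which most reactions are unassigned gives a high gap fraction, which corresponds to the situation where the branch may be missing altogether and the inferred orphans are more likely to be false positives.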
Why are predictions sometimes not made? |
Sometimes no predictions will be displayed for an orphan activity of interest. There are several possible reasons:
For similar reasons, our algorithm does not always output exactly 20 top predictions: a user may see only five or six, because the remaining candidates are too non-specific to be picked up by the algorithm. When no predictions are displayed, try a different candidate gene set and/or query another EC # to see whether predictions are available. |
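One way to picture why fewer than 20 predictions can appear is a ranked list that is first filtered for specificity and only then truncated. The sketch below is an assumption made for illustration: the page does not state the actual criterion, so the p-value cutoff of 0.05, the function name `top_predictions` and the `(gene_id, p_value)` input format are all hypothetical.

```python
def top_predictions(scored_genes, max_results=20, p_cutoff=0.05):
    """Return at most max_results candidates that pass the cutoff.

    scored_genes -- iterable of (gene_id, p_value) pairs
    p_cutoff     -- hypothetical specificity threshold (an assumption)
    """
    # Discard candidates that are too non-specific.
    specific = [(gene, p) for gene, p in scored_genes if p <= p_cutoff]
    # Rank the remaining candidates, most significant first.
    specific.sort(key=lambda item: item[1])
    # Truncate to the requested list length.
    return specific[:max_results]
```

With such a scheme, a query for which only a handful of candidate genes pass the cutoff returns a short list, and a query for which none pass returns no predictions at all.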