DaGO-Fun - Database for GO-based Functional Annotation Analysis

APFP-FN: Automated Protein Function Prediction Tool

The development of fast and relatively inexpensive sequencing technology has yielded complete genome sequences for human, mouse and many other organisms, including important microbial pathogens of humans, animals and plants. It has also left most newly sequenced genes and proteins without functional annotations. Determining the function of these proteins experimentally is likely to be difficult for several reasons:

  • The function may depend on the native environment in which the organism lives.
  • The genome may include many genes that secure survival in a particular environment but have no use in the environment created in the laboratory.
  • In many cases it may be almost impossible to imitate the natural host, with its myriad other micro-organisms, and so to determine the exact function of a gene or gene product by experiment alone.

The only effective route toward elucidating the function of uncharacterized proteins may be to combine experimental approaches with predictions from computational analysis. Our framework follows two steps:

  • Generate functional interaction networks, enhanced by integrating data from different sources (homology-based, genomic-context and high-throughput data).
  • Use the Gene Ontology (GO) and prediction algorithms to predict the functions of uncharacterized proteins from these functional interaction networks.

Predictions made along these lines give a first indication of function, which can later be verified experimentally.
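
The first step above, integrating evidence from several sources into one network, can be illustrated with a minimal sketch. It assumes that each source assigns a confidence score between 0 and 1 to a protein pair and that scores are merged with a noisy-OR rule; this combination rule, the function name integrate_networks and the example scores are illustrative assumptions, not necessarily what this tool uses.

```python
from collections import defaultdict


def integrate_networks(sources):
    """Merge per-source edge confidences into one functional network.

    `sources` maps a source name to {(protein_a, protein_b): score}, with
    scores in [0, 1]. Scores for the same pair are combined with a noisy-OR
    rule (an assumption of this sketch): combined = 1 - prod(1 - s_i).
    """
    remaining = defaultdict(lambda: 1.0)  # running product of (1 - s_i)
    for edges in sources.values():
        for (a, b), score in edges.items():
            pair = tuple(sorted((a, b)))  # treat edges as undirected
            remaining[pair] *= 1.0 - score
    return {pair: 1.0 - rest for pair, rest in remaining.items()}


# Hypothetical scores from two evidence sources.
example_sources = {
    "homology": {("P1", "P2"): 0.5},
    "genomic_context": {("P2", "P1"): 0.5, ("P2", "P3"): 0.4},
}
network = integrate_networks(example_sources)
print(network)
```

In this example the pair P1 and P2 is supported by both sources, so its combined score (0.75) is higher than either individual score, while P2 and P3 keeps the score of its single source.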

Important Notes:
  • We are still assembling a set of Python scripts with a graphical user interface (GUI) for predicting protein functions. Users will be able to run them locally on their own computers, because running times can become long depending on the data under consideration.
  • Our annotation prediction model combines directly interacting neighbors with second-level interacting neighbors to achieve an efficient trade-off between scalability, prediction quality and genome coverage. We therefore predict the functions of uncharacterized MTB proteins from the functional organization of the level-1 and level-2 interacting partners of the protein under consideration in the functional network.
  • Throughout the prediction process, relationships between GO terms in the GO DAG structure are taken into account through the semantic similarity of GO terms, instead of relying on exact matches only. Together, these choices improved the prediction quality and the genome coverage. Click here to download the related paper for more details.
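
The notes above describe scoring candidate GO terms from level-1 and level-2 interacting partners while using semantic similarity rather than exact term matches. The sketch below shows one way this could look. Everything specific in it is an assumption made for illustration: the toy similarity measure (Jaccard index of ancestor sets), the level2_weight value, the function names and the example data. The measure and weighting used by the actual tool are described in the related paper.

```python
def ancestors(term, parents):
    """Return `term` together with all of its ancestors in the GO DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(t1, t2, parents):
    """Toy semantic similarity: Jaccard index of the two ancestor sets."""
    a1, a2 = ancestors(t1, parents), ancestors(t2, parents)
    return len(a1 & a2) / len(a1 | a2)


def predict_terms(protein, network, annotations, parents, level2_weight=0.5):
    """Rank candidate GO terms for `protein` from its network neighborhood.

    Level-1 partners vote with weight 1.0 and level-2 partners with
    `level2_weight` (a hypothetical value). A partner supports a candidate
    term by its best similarity to any of the partner's own terms, so
    related but non-identical terms still contribute.
    """
    level1 = set(network.get(protein, ())) - {protein}
    level2 = set()
    for partner in level1:
        level2 |= set(network.get(partner, ()))
    level2 -= level1 | {protein}

    weighted = [(p, 1.0) for p in level1] + [(p, level2_weight) for p in level2]
    candidates = {t for p, _ in weighted for t in annotations.get(p, ())}

    scores = {}
    for term in candidates:
        total = 0.0
        for partner, weight in weighted:
            terms = annotations.get(partner, ())
            if terms:
                total += weight * max(
                    term_similarity(term, t, parents) for t in terms
                )
        scores[term] = total
    # Highest score first; ties broken by term identifier.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: a four-term DAG and a four-protein network.
go_parents = {"GO:B": ["GO:A"], "GO:C": ["GO:A"], "GO:D": ["GO:B"]}
ppi = {"X": {"P1", "P2"}, "P1": {"X", "P3"}, "P2": {"X"}, "P3": {"P1"}}
known = {"P1": {"GO:D"}, "P2": {"GO:B"}, "P3": {"GO:C"}}
print(predict_terms("X", ppi, known, go_parents))
```

In this toy example the uncharacterized protein X has level-1 partners P1 and P2 and one level-2 partner, P3. GO:B ranks first because it is an exact match for P2 and closely related to P1's term GO:D, which shows how similarity-based scoring lets related annotations reinforce each other.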