Showing posts with label 2003. Show all posts

Saturday, February 13, 2010

Ontologies: formalising biological knowledge for bioinformatics

Citation: Jonathan Bard. Ontologies: formalising biological knowledge for bioinformatics. Bioessays, May 2003, 25(5):501-506.
Link: NCBI PubMed

Summary

Ontologies are becoming increasingly important in bioinformatics because they can be linked to the information in databases, and the knowledge they encode can then be used to query those databases. This direct connection allows faster searching and less ambiguity than string-based searches. In addition, much biological data contains hierarchical relationships, which relational databases do not handle well. The result is rich ontologies that are independent of their associated databases and linked to them through term IDs.
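To make the term-ID link concrete, here is a minimal Python sketch. It is not taken from the paper: the term IDs, term names, and gene names are all hypothetical, and the "database" is just a list of rows. It shows how a query for one ontology term can also return records annotated with any of its more specific child terms, which a plain string search would miss.

```python
from collections import defaultdict

# Hypothetical ontology: child term ID -> parent term IDs (is_a links).
PARENTS = {
    "T:0002": ["T:0001"],  # "organelle" is_a "cell part"
    "T:0003": ["T:0002"],  # "nucleus" is_a "organelle"
    "T:0004": ["T:0002"],  # "mitochondrion" is_a "organelle"
}

# Hypothetical database rows, linked to the ontology only by term ID.
ANNOTATIONS = [
    ("geneA", "T:0003"),
    ("geneB", "T:0004"),
    ("geneC", "T:0001"),
]


def descendants(term_id):
    """Return every term ID that lies below term_id in the hierarchy."""
    children = defaultdict(list)
    for child, parents in PARENTS.items():
        for parent in parents:
            children[parent].append(child)
    seen, stack = set(), [term_id]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def query(term_id):
    """Return records annotated with term_id or any more specific term."""
    wanted = descendants(term_id) | {term_id}
    return sorted(name for name, term in ANNOTATIONS if term in wanted)
```

Under these assumptions, `query("T:0002")` finds the records annotated with the two organelle subtypes even though neither row mentions the queried ID itself. The ontology and the rows stay separate and share only the IDs.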

The Gene Ontology (GO) is used to integrate genetic data about gene products with our knowledge of their properties. GO catalogues this knowledge in three essentially non-overlapping ways: the location of gene products within cells, the processes to which they contribute, and the functions they fulfill.

Tuesday, April 21, 2009

Requirements of Phylogenetic Databases

Citation: Luay Nakhleh, Daniel Miranker, Francois Barbancon, William H. Piel, Michael Donoghue. Requirements of Phylogenetic Databases. Third IEEE International Symposium on Bioinformatics and Bioengineering, vol. 0, no. 0, pp. 141, 2003.
Link: IEEE CS Digital Library

Summary

This work examines how phylogenetic databases affect the need for, and use of, phylogenetic data. It evaluates the drawbacks of the unnormalized Newick format used in existing databases such as TreeBASE, and suggests using a normalized data model, supported by a list of potential applications and queries that a biologist may wish to see integrated into a phylogenetic DBMS.

There are two major drawbacks of the unnormalized Newick format:
  • The database cannot directly support queries concerning the relationships between the taxa and the structure of the phylogeny.
  • Some processes (e.g. hybridization, horizontal gene transfer) result in graph structures, which are not supported by the Newick format.
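The first drawback can be illustrated with a small Python sketch using the standard sqlite3 module. The schema is an assumption made for illustration, not the data model proposed in the paper: the tree that Newick would store as the single string `((A,B)X,C)R;` is instead stored as one parent/child row per edge, so the database itself can answer a structural question such as "which taxa fall under node X?".

```python
import sqlite3

# The tree ((A,B)X,C)R; as parent/child rows (hypothetical schema).
EDGES = [("R", "X"), ("R", "C"), ("X", "A"), ("X", "B")]


def build_db():
    """Create an in-memory database holding one row per tree edge."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (parent TEXT, child TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", EDGES)
    return conn


def leaves_under(conn, node):
    """Return the leaf taxa in the subtree rooted at node."""
    sql = """
    WITH RECURSIVE sub(n) AS (
        SELECT ?
        UNION
        SELECT child FROM edge JOIN sub ON edge.parent = sub.n
    )
    SELECT n FROM sub
    WHERE n NOT IN (SELECT parent FROM edge)
    ORDER BY n
    """
    return [row[0] for row in conn.execute(sql, (node,))]
```

With the Newick string stored as an opaque text field, the same question would require parsing the string outside the database. An edge table also allows a child to appear in rows with two different parents, which is one possible way to record the graph structures named in the second drawback.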

The authors identify six categories of users of phylogenetic databases: (1) casual users, (2) visualization, (3) study development, (4) super-tree algorithms, (5) simulation studies, and (6) comparative genomics.

Definitions

Phylogeny: A phylogeny is a rooted, leaf-labeled tree whose leaves represent a set of operational taxa and whose internal nodes represent the (hypothetical) ancestral taxa. A phylogeny on a set of taxa represents the evolutionary history of the taxa from their most recent common ancestor (at the root of the tree).
Tree of Life: A phylogenetic tree that represents the evolutionary history of all species in the world. It is expected that, when finished, the Tree of Life will contain millions of species.
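As a rough illustration of the phylogeny definition, the following Python sketch stores a rooted, leaf-labeled tree as child-to-parent links and finds the most recent common ancestor of two taxa. The taxon and node names are hypothetical, and this is only one simple way to represent such a tree.

```python
# Hypothetical rooted tree: leaves A, B, C; internal node X; root R.
# Every node except the root maps to its parent.
PARENT = {"X": "R", "C": "R", "A": "X", "B": "X"}


def path_to_root(node):
    """Return the list of nodes from node up to the root, inclusive."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def mrca(a, b):
    """Return the most recent common ancestor of nodes a and b."""
    ancestors = set(path_to_root(a))
    # Walk upward from b; the first shared node is the deepest one.
    for node in path_to_root(b):
        if node in ancestors:
            return node
    return None
```

Here the leaves are the operational taxa, X and R stand for hypothetical ancestral taxa, and `mrca("A", "C")` climbs to the root because A and C share no ancestor below it.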