Phenoscape data repository

Revision as of 16:24, 12 March 2009 by Crk18

The Phenoscape data repository is a relational database that holds phenotypic data from the model organism Danio rerio (zebrafish) and from other organisms belonging to the clade Ostariophysi, which are of evolutionary interest. This page describes the schema of the repository and outlines some of the data transformation techniques used to integrate, in this repository, data captured in different formats at different locations.

Data Repository

The Phenoscape data repository has been implemented as a PostgreSQL relational database and is at present housed on the development database server at NESCent.

Schema

The schema of the Phenoscape data repository is based on the Open Biomedical Database (OBD) data format developed at the Berkeley Bioinformatics Open-source Projects (BBOP). OBD is based upon the Resource Description Framework (RDF) format for capturing metadata about Web (and Semantic Web) resources such as Web pages and Web services.

The philosophy of OBD is to represent every conceptual entity, be it a type or a token (synonymously, a class or an object, or a concept or an instance), as a Node. Binary relations between these Nodes are represented as Statements, specifically Link Statements. OBD also allows for reification, that is, for making statements about other statements, which is vital to the life sciences with their emphasis on evidence codes and attributions. For this purpose, OBD provides Literal Statements to capture metadata about Nodes and Link Statements, such as the source publication, evidence codes, specimens used, and so forth.
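
The Node and Statement model described above can be illustrated with a minimal Python sketch. This is not the actual OBD schema or API: the class names, field names, relation names, and the "EX:" identifiers are all hypothetical and chosen only for illustration. The sketch shows a Link Statement relating two Nodes, and Literal Statements attaching metadata to that Link Statement (reification).

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Node:
    """Any conceptual entity: a type (class) or a token (instance)."""
    uid: str
    label: str
    is_class: bool


@dataclass(frozen=True)
class LinkStatement:
    """A binary relation between two Nodes."""
    subject: Node
    relation: str
    target: Node


@dataclass(frozen=True)
class LiteralStatement:
    """A literal value attached to a Node or, via reification, to a Link Statement."""
    about: Union[Node, LinkStatement]
    relation: str
    value: str


def metadata_for(item, literals: List[LiteralStatement]) -> Dict[str, str]:
    """Collect the literal metadata recorded about a given Node or Link Statement."""
    return {s.relation: s.value for s in literals if s.about == item}


# Hypothetical example data; identifiers and relation names are placeholders.
taxon = Node("EX:taxon1", "example taxon", is_class=True)
phenotype = Node("EX:phenotype1", "example phenotype", is_class=True)

# A Link Statement: the taxon exhibits the phenotype.
link = LinkStatement(taxon, "exhibits", phenotype)

# Literal Statements about the Link Statement itself (reification).
literals = [
    LiteralStatement(link, "has_source", "placeholder publication"),
    LiteralStatement(link, "has_evidence_code", "placeholder evidence code"),
]

print(metadata_for(link, literals))
```

Because the metadata is attached to the Link Statement rather than to either Node, the same two Nodes could be linked by other statements carrying different sources or evidence codes.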