Difference between revisions of "Phenoscape data loader"
Revision as of 15:36, 12 March 2009
The Phenoscape Data Loader is being developed as a Perl module. This section gives an overview of how it works.
Contents
- 1 Functioning of the Phenoscape Data Loader
- 1.1 Downloads curated NeXML files from the Phenoscape SVN repositories
- 1.2 Loads the requisite ontologies into the database
- 1.3 Loads the data from ZFIN model organism database
- 1.4 Loads the data from the curated NeXML files into the database
- 1.5 Logs incomplete annotations into a log file
- 1.6 Reasons with the data
- 2 OBD API Documentation
Functioning of the Phenoscape Data Loader
The Phenoscape Data Loader performs the following steps in sequence:
Downloads curated NeXML files from the Phenoscape SVN repositories
Ichthyologists curate scientific publications using the Phenex character matrix annotator, and these curations are stored in an SVN repository. The data loader downloads these data files daily so that they can subsequently be loaded into the relational database.
Loads the requisite ontologies into the database
The annotations of character matrices use terms that are defined in life science ontologies. These ontologies include the Teleost Taxonomy Ontology (TTO), the Relations Ontology, the Teleost Anatomy Ontology (TAO), and the Phenotype and Trait Ontology (PATO). The data loader loads the term definitions from all of these ontologies into the relational database.
Loads the data from ZFIN model organism database
ZFIN hosts data relating mutant phenotypes of the zebrafish (Danio rerio) to specific genes and genotypes. The data loader transcribes these data into OBD format and loads them into the database.
Loads the data from the curated NeXML files into the database
The data loader transforms the curated data from NeXML syntax into a set of relational tuples (records), which are then inserted sequentially into the database.
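The transformation from markup to tuples can be sketched as follows. This is only an illustration in Python, not the loader's own Perl code: the XML layout, the element and attribute names, the placeholder term identifiers, and the `annotation` table are all assumptions made for the example, and real NeXML files use a richer, namespaced schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a curated file.
# The identifiers are placeholders, not real ontology terms.
SAMPLE = """
<annotations>
  <annotation taxon="TTO:taxon_a" entity="TAO:entity_a" quality="PATO:quality_a"/>
  <annotation taxon="TTO:taxon_b" entity="TAO:entity_b" quality="PATO:quality_b"/>
</annotations>
"""

def to_tuples(xml_text):
    """Turn each <annotation> element into a (taxon, entity, quality) tuple."""
    root = ET.fromstring(xml_text)
    return [
        (a.get("taxon"), a.get("entity"), a.get("quality"))
        for a in root.iter("annotation")
    ]

def load(conn, rows):
    """Insert the tuples one at a time into an assumed 'annotation' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotation "
        "(taxon TEXT, entity TEXT, quality TEXT)"
    )
    for row in rows:
        conn.execute("INSERT INTO annotation VALUES (?, ?, ?)", row)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
rows = to_tuples(SAMPLE)
load(conn, rows)
```

An in-memory SQLite database is used here only to keep the sketch self-contained; it says nothing about the database engine behind OBD.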
Logs incomplete annotations into a log file
The data loader does not load incomplete annotations from the NeXML files into the database. An annotation is incomplete if its taxon, its phenotype, or both are null. Instead, the loader logs these incomplete annotations on a file-specific basis in the Problem Log Format. Curators can then complete these annotations, which will be loaded into the database the next time the data loader runs.
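The completeness check described above might look like the following sketch. It is written in Python for illustration and is not the loader's own code; the dictionary keys, the file names, and the wording of the log lines are assumptions, and the actual layout of the log is the one defined on the Problem Log Format page.

```python
def split_annotations(annotations):
    """Separate annotations that have both a taxon and a phenotype from
    those missing either one."""
    complete, incomplete = [], []
    for ann in annotations:
        if ann.get("taxon") is None or ann.get("phenotype") is None:
            incomplete.append(ann)
        else:
            complete.append(ann)
    return complete, incomplete

def problem_lines(file_name, incomplete):
    """Build one log line per incomplete annotation, tagged with its file.
    The line layout is hypothetical, not the real Problem Log Format."""
    lines = []
    for ann in incomplete:
        missing = [key for key in ("taxon", "phenotype") if ann.get(key) is None]
        lines.append(f"{file_name}: annotation {ann['id']} is missing {', '.join(missing)}")
    return lines

# Placeholder records standing in for annotations parsed from one file.
annotations = [
    {"id": 1, "taxon": "TTO:taxon_a", "phenotype": "PATO:quality_a"},
    {"id": 2, "taxon": None, "phenotype": "PATO:quality_b"},
    {"id": 3, "taxon": None, "phenotype": None},
]
complete, incomplete = split_annotations(annotations)
log = problem_lines("example_file.xml", incomplete)
```

Only the records in `complete` would go on to the database; the lines in `log` would be written to the log for that file.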
Reasons with the data
Lastly, the data loader invokes the OBD Reasoner to infer implicit knowledge from the assertions, in the form of new assertions. These inferred assertions are also added to the database.
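To give a sense of what inferring new assertions from existing ones can mean, here is a small Python sketch of one common kind of inference, the transitivity of an is_a relation. It is an assumption chosen for illustration: this page does not say which rules the OBD Reasoner applies, and the term names below are made up.

```python
def infer_transitive(assertions, relation="is_a"):
    """Return the assertions implied by transitivity of `relation`
    that are not already present in `assertions`."""
    known = set(assertions)
    inferred = set()
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        current = known | inferred
        for (a, r1, b) in current:
            if r1 != relation:
                continue
            for (c, r2, d) in current:
                if r2 == relation and b == c and a != d:
                    new = (a, relation, d)
                    if new not in current and new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

# Made-up assertions as (subject, relation, object) triples.
asserted = {
    ("dorsal fin", "is_a", "fin"),
    ("fin", "is_a", "appendage"),
    ("appendage", "is_a", "anatomical structure"),
}
new_assertions = infer_transitive(asserted)
```

Here `new_assertions` holds the three implied triples, for example that a dorsal fin is an appendage. In the loader's workflow, inferred assertions of this sort are what gets added to the database alongside the asserted ones.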
OBD API Documentation
Code-specific details of the OBD-related classes and interfaces, as documented by Cartik Kothari. These will be updated very often and are meant to be used as an addendum to the OBOEdit Javadoc.