Difference between revisions of "KB-OWL"
Jim Balhoff (talk | contribs) (→Data model) |
Jim Balhoff (talk | contribs) (→Phenotypes) |
Revision as of 19:19, 5 January 2012
This page describes the data model, data loading process, and query methods being developed for a new version of the Phenoscape Knowledgebase built on top of RDF and making use of standard OWL reasoners.
Data model
Phenotypes
Phenotypes in the new data model are modeled as OWL class expressions. They are constructed in the inverse direction from the form used in the original KB: rather than describing classes of particular qualities that inhere in particular structures, we describe parts that bear particular qualities. At the level of instance data the model is the same, because inheres_in is the inverse of bearer_of.
For example, "serrated dorsal fin":
- old: exhibits some (serrated and inheres_in some dorsal_fin)
- new: has_part some (dorsal_fin and bearer_of some serrated)
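The two forms above differ only in which term is the head of the expression. As a minimal sketch, the helpers below assemble both as plain strings; the function names `old_style` and `new_style` are hypothetical and not part of the KB code, and only the property and class names come from the example.

```python
def old_style(quality: str, structure: str) -> str:
    """Original KB form: a quality that inheres in a structure."""
    return f"exhibits some ({quality} and inheres_in some {structure})"


def new_style(structure: str, quality: str) -> str:
    """New form: a part (the structure) that bears a quality."""
    return f"has_part some ({structure} and bearer_of some {quality})"


# "serrated dorsal fin" in both forms
old_expr = old_style("serrated", "dorsal_fin")
new_expr = new_style("dorsal_fin", "serrated")
print(old_expr)
print(new_expr)
```

The strings are for illustration only; a real loader would presumably build the expressions through an OWL library rather than by string formatting.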
We can now model absence correctly. The previous version of the KB used a quality class called 'absent' in a way that produced unintended inferences. In the new model we use universal quantification with negation. For "dorsal fin, absent":
- has_part only (not dorsal_fin)
For a localized absence we would use a nested has_part:
- has_part some (head and has_part only (not scale))
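The localized form is the whole-organism absence expression nested inside a has_part restriction on the location. A small sketch of that composition follows; the helper names `absent` and `absent_in` are hypothetical and used only for illustration.

```python
def absent(structure: str) -> str:
    """Absence: every part is something other than the structure."""
    return f"has_part only (not {structure})"


def absent_in(location: str, structure: str) -> str:
    """Localized absence: some part is the location, and that part
    itself has no part of the given structure."""
    return f"has_part some ({location} and {absent(structure)})"


dorsal_fin_absent = absent("dorsal_fin")
scales_absent_on_head = absent_in("head", "scale")
print(dorsal_fin_absent)
print(scales_absent_on_head)
```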
The has_part formulation also provides the benefit of automatic "present" inferences for structures used in any phenotype, not just "present" phenotypes.
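In description logic notation, the "present" inference can be written out as follows. This assumes that a "present" phenotype for the dorsal fin is the expression has_part some dorsal_fin, which the page does not state explicitly. The first line is the subsumption a reasoner derives for the serrated dorsal fin example; the second shows that presence and absence of the same structure cannot hold together.

```latex
\[
\exists\,\mathit{has\_part}.(\mathit{dorsal\_fin} \sqcap \exists\,\mathit{bearer\_of}.\mathit{serrated})
\;\sqsubseteq\;
\exists\,\mathit{has\_part}.\mathit{dorsal\_fin}
\]
\[
\exists\,\mathit{has\_part}.\mathit{dorsal\_fin}
\;\sqcap\;
\forall\,\mathit{has\_part}.\neg\mathit{dorsal\_fin}
\;\sqsubseteq\; \bot
\]
```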
Taxa and phenotype annotations
Taxa in the new data model are modeled as OWL individuals, rather than OWL classes. Thus, the taxonomic hierarchy is represented directly in RDF as a tree of subclade_of property relationships between taxa, rather than as OWL subclass axioms. This model allows for richer modeling of how phenotypes are propagated due to taxon membership and ancestral states than a class-based taxonomy does ([1]).
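To illustrate the tree of subclade_of relationships, the sketch below stores a few taxa as subject, property, object triples and walks up from one taxon to its enclosing clades. The triples are illustrative rather than taken from the KB data, the direction (child subclade_of parent) is assumed from the property name, and the function `ancestors` is a hypothetical helper.

```python
SUBCLADE_OF = "subclade_of"

# Illustrative triples: each taxon is an individual, linked to its parent clade.
triples = [
    ("Ictalurus_punctatus", SUBCLADE_OF, "Ictalurus"),
    ("Ictalurus", SUBCLADE_OF, "Ictaluridae"),
    ("Ictaluridae", SUBCLADE_OF, "Siluriformes"),
]


def ancestors(taxon: str, triples: list) -> list:
    """Follow subclade_of links upward; assumes the links form a tree."""
    parent = {s: o for s, p, o in triples if p == SUBCLADE_OF}
    chain = []
    while taxon in parent:
        taxon = parent[taxon]
        chain.append(taxon)
    return chain


lineage = ancestors("Ictalurus_punctatus", triples)
print(lineage)
```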
A given phenotype expression (has_part some ...) describes the condition of an individual organism, not a whole taxon. So to create a phenotype annotation for a taxon, the relationship of this organism to the taxon must be captured. We use two forms, depending on the intended meaning:
- Observation annotation: has_member some (has_part some (dorsal_fin and bearer_of some serrated))
- Ancestral state annotation: has_progenitor some (has_part some (dorsal_fin and bearer_of some serrated))
The actual phenotype annotation can be created as a class assertion axiom for the given taxon:
- Ictalurus_punctatus Type (has_member some (has_part some (dorsal_fin and bearer_of some serrated)))
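The annotation is built by wrapping the phenotype expression in one of the two taxon-level restrictions and then asserting the result as a type of the taxon individual. A minimal sketch of that layering, using hypothetical helper names and plain strings:

```python
def observation(phenotype: str) -> str:
    """Some member of the taxon exhibits the phenotype."""
    return f"has_member some ({phenotype})"


def ancestral_state(phenotype: str) -> str:
    """The progenitor of the taxon exhibited the phenotype."""
    return f"has_progenitor some ({phenotype})"


def class_assertion(taxon: str, expression: str) -> str:
    """Class assertion axiom: the taxon individual is an instance of the expression."""
    return f"{taxon} Type ({expression})"


phenotype = "has_part some (dorsal_fin and bearer_of some serrated)"
observed = class_assertion("Ictalurus_punctatus", observation(phenotype))
ancestral = class_assertion("Ictalurus_punctatus", ancestral_state(phenotype))
print(observed)
print(ancestral)
```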