Difference between revisions of "Data Repository and Data Services"

From phenoscape
 
  
 
==OBD Reasoner==
 
This section documents the inherited reasoner code, written in Perl with embedded SQL, which extracts implicit inferences from the downloaded ontologies and from the ZFIN and Phenoscape phenotype annotations.
 
 
===Relationships Used===
 
 
'''Transitive Relationships (A R B, B R C => A R C)'''
 
 
Transitive relationships are the simplest inferences to be extracted and comprise the majority of new assertions added by the reasoner. Transitive relationships include (ontology in brackets):
 
* is_a (OBO Relations)
 
* has_part (OBO Relations)
 
* part_of (OBO Relations)
 
* integral_part_of (OBO Relations)
 
* has_integral_part (OBO Relations)
 
* proper_part_of (OBO Relations)
 
* has_proper_part (OBO Relations)
 
* improper_part_of (OBO Relations)
 
* has_improper_part (OBO Relations)
 
* location_of (OBO Relations)
 
* located_in (OBO Relations)
 
* derives_from (OBO Relations)
 
* derived_into (OBO Relations)
 
* precedes (OBO Relations)
 
* preceded_by (OBO Relations)
 
* develops_from (Zebrafish Anatomy)
 
* anterior_to (Spatial Ontology)
 
* posterior_to (Spatial Ontology)
 
* proximal_to (Spatial Ontology)
 
* distal_to (Spatial Ontology)
 
* dorsal_to (Spatial Ontology)
 
* ventral_to (Spatial Ontology)
 
* surrounds (Spatial Ontology)
 
* surrounded_by (Spatial Ontology)
 
* superficial_to (Spatial Ontology)
 
* deep_to (Spatial Ontology)
 
* left_of (Spatial Ontology)
 
* right_of (Spatial Ontology)
 
* complete_evidence_for_feature (Sequence Ontology)
 
* evidence_for_feature (Sequence Ontology)
 
* derives_from (Sequence Ontology)
 
* member_of (Sequence Ontology)
 
* exhibits (Phenoscape Ontology)
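The transitivity rule above can be illustrated with a minimal Python sketch. This is not the inherited Perl/SQL code; the function name <code>transitive_step</code>, the in-memory set of triples, and the shortened relation list are assumptions made for illustration. Only the first asserted triple comes from the term example on this page; the other two are invented.

```python
from collections import defaultdict

# Hypothetical subset of the transitive relations listed above.
TRANSITIVE = {"is_a", "part_of", "develops_from"}

def transitive_step(triples):
    """Apply A R B, B R C => A R C once and return only the new triples."""
    objects_of = defaultdict(set)
    for subject, relation, obj in triples:
        if relation in TRANSITIVE:
            objects_of[(subject, relation)].add(obj)
    inferred = set()
    for (a, relation), middles in objects_of.items():
        for b in middles:
            # .get() does not insert missing keys into the defaultdict.
            for c in objects_of.get((b, relation), ()):
                inferred.add((a, relation, c))
    return inferred - set(triples)

asserted = {
    ("caudal-fin stay", "part_of", "caudal fin skeleton"),
    ("caudal fin skeleton", "part_of", "caudal fin"),  # invented for the example
    ("caudal fin", "is_a", "median fin"),              # invented for the example
}
new_triples = transitive_step(asserted)
```

Only chains that use the same relation twice are joined here; mixing <code>is_a</code> with another relation is covered by the composition templates in the next subsection.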
 
 
'''Relation (role) compositions'''
 
 
Relation (role) compositions are of the form A R1 B, B R2 C => A (R1|R2) C. For example, given A is_a B and B part_of C then A part_of C. The reasoner extracts such inferences and adds them to the database. Specifically, the following relation composition templates are used:
 
* A is_a B, B R C => A R C, where R can be any relation
 
* A R B, B is_a C => A R C, where R can be any relation
 
* Reflexive relations (A is_a A)
 
* Sub relations A R B, R is_a R2 => A R2 B
 
** An example: If A father_of B and father_of is_a parent_of, then A parent_of B
 
* Relation chains
 
** Relation chains are a special case of relation composition. Component relations are accumulated into an assembly relation. Specifically, instances of the relation ''inheres_in_part_of'' are accumulated from instances of the relations ''inheres_in'' and ''part_of''.
 
** IF A inheres_in B and B part_of C, THEN A inheres_in_part_of C
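As a rough illustration of the composition templates in this subsection, the following Python sketch applies them once to a small set of triples. The function name <code>compose_step</code>, the <code>subrelations</code> mapping, and the placeholder terms A, B, C, Q, X, Y are assumptions for the example and are not taken from the reasoner code. The reflexive template is left out.

```python
def compose_step(triples, subrelations):
    """Apply the composition templates once and return only the new triples.

    triples      -- set of (subject, relation, object)
    subrelations -- dict mapping a relation R to its parent R2 (R is_a R2)
    """
    inferred = set()
    for a, r1, b in triples:
        # Sub relations: A R B, R is_a R2 => A R2 B
        if r1 in subrelations:
            inferred.add((a, subrelations[r1], b))
        for b2, r2, c in triples:
            if b2 != b:
                continue
            if r1 == "is_a":
                inferred.add((a, r2, c))   # A is_a B, B R C => A R C
            elif r2 == "is_a":
                inferred.add((a, r1, c))   # A R B, B is_a C => A R C
            if r1 == "inheres_in" and r2 == "part_of":
                # Relation chain: inheres_in followed by part_of
                inferred.add((a, "inheres_in_part_of", c))
    return inferred - triples

facts = {
    ("A", "is_a", "B"),
    ("B", "part_of", "C"),
    ("Q", "inheres_in", "B"),
    ("X", "father_of", "Y"),
}
derived = compose_step(facts, {"father_of": "parent_of"})
```

The nested loop is quadratic in the number of triples, which is fine for a toy example; a database-backed reasoner would normally express each template as a join instead.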
 
 
'''Intersection Relations'''
 
 
Phenotype annotations are typically "post-composed", where an entity and a quality are combined into a Compositional Description. For example, an annotation about the quality ''decreased size (PATO:0000587)'' of the entity ''Dorsal Fin (TAO:0001173)'' may be post-composed into a Compositional Description that looks like ''PATO:0000587^OBO_REL:inheres_in(TAO:0001173)''. Instances of the ''is_a'' and ''inheres_in'' relations are extracted from post-compositions like this. In the above example, the reasoner extracts:
 
 
# PATO:0000587^OBO_REL:inheres_in(TAO:0001173)              OBO_REL:inheres_in              TAO:0001173, and
 
# PATO:0000587^OBO_REL:inheres_in(TAO:0001173)              OBO_REL:is_a                    PATO:0000587
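The extraction above can be sketched in Python as follows. The regular expression assumes the simplest shape, <code>genus^relation(filler)</code> with a single differentia and no nesting; the names <code>POST_COMP</code> and <code>extract_triples</code> are hypothetical, and the real reasoner may handle richer expressions.

```python
import re

# Assumed shape: genus^relation(filler), one differentia, no nesting.
POST_COMP = re.compile(
    r"^(?P<genus>[^\^()]+)\^(?P<relation>[\w:]+)\((?P<filler>[^()]+)\)$"
)

def extract_triples(expression):
    """Return the two triples implied by a simple post-composition."""
    match = POST_COMP.match(expression)
    if match is None:
        raise ValueError(f"unsupported post-composition: {expression!r}")
    return [
        # The differentia: expression <relation> filler
        (expression, match["relation"], match["filler"]),
        # The genus: expression is_a genus
        (expression, "OBO_REL:is_a", match["genus"]),
    ]

triples = extract_triples("PATO:0000587^OBO_REL:inheres_in(TAO:0001173)")
```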
 
 
=== Sweeps ===
 
 
The reasoner runs in several sweeps. In each sweep, new implicit inferences are derived from the assertions already in the database (using the rules described in the previous sections) and added to it. The following sweep then uses the inferences added by the previous sweep to extract further inferences. This process continues until a sweep adds no new inferences. At that point the ''deductive closure of the inference procedure'' has been reached: no further inferences are possible, and the reasoner exits.
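The sweep loop amounts to a fixed-point iteration. The Python sketch below shows the idea with a single rule (transitivity of <code>part_of</code>) over placeholder terms a to e; <code>run_sweeps</code> and <code>part_of_transitivity</code> are hypothetical names, and the actual reasoner works against database tables rather than in-memory sets.

```python
def part_of_transitivity(facts):
    """One rule: A part_of B, B part_of C => A part_of C."""
    return {
        (a, "part_of", c)
        for a, r1, b in facts if r1 == "part_of"
        for b2, r2, c in facts if r2 == "part_of" and b2 == b
    }

def run_sweeps(asserted, rules):
    """Apply all rules sweep by sweep until a sweep adds nothing new."""
    facts = set(asserted)
    sweeps = 0
    while True:
        sweeps += 1
        new = set()
        for rule in rules:
            new |= rule(facts)
        new -= facts
        if not new:
            # Deductive closure reached: this sweep added no inferences.
            return facts, sweeps
        facts |= new

chain = {(x, "part_of", y) for x, y in zip("abcd", "bcde")}
closure, sweeps = run_sweeps(chain, [part_of_transitivity])
```

For the four-link chain in the example, two sweeps add new triples and a third confirms that nothing more can be derived.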
 
 
==Web services==
 
Each service may support multiple media types.  The desired media type can be specified by appending <code>?media=json</code> or similar to the request URL.  URI specifications are defined (loosely) using [http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html URI Templates].
 
===Term info===
 
'''URI'''
 
 
<BASE URI>/term/{term_id}
 
 
'''Returns'''
 
 
JSON:
 
<javascript>
 
{
 
    "id" : "TAO:0001700",
 
    "name" : "caudal-fin stay",
 
    "definition" : "Bone that is located anterior to the caudal procurrent rays. Caudal fin stays are unpaired bone.",
 
    "parents" :
 
    [
 
        {
 
            "relation" : {
 
                "id" : "OBO_REL:is_a",
 
                "name" : "is_a"
 
            },
 
            "target" : {
 
                "id" : "TAO:0001514",
 
                "name" : "bone"
 
            }
 
 
        },
 
        {
 
            "relation" : {
 
                "id" : "OBO_REL:part_of",
 
                "name" : "part_of"
 
            },
 
            "target" : {
 
                "id" : "TAO:0000862",
 
                "name" : "caudal fin skeleton"
 
            }
 
        }
 
    ],
 
    "children" : [] // if there are children, this content should be in the same format as the parents list
 
}
 
// how should xrefs, etc. be represented, property_value definitions?
 
</javascript>
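As a small consumer-side illustration of the JSON layout above, the following Python sketch reads a term document and lists its parent relationships. The function name <code>summarize_parents</code> is an assumption, and the sample omits the <code>//</code> comments shown above, since they are not valid JSON.

```python
import json

def summarize_parents(term_json):
    """Return one 'term relation target' string per entry in "parents"."""
    term = json.loads(term_json)
    return [
        f'{term["name"]} {parent["relation"]["name"]} {parent["target"]["name"]}'
        for parent in term.get("parents", [])
    ]

sample = """
{
    "id": "TAO:0001700",
    "name": "caudal-fin stay",
    "parents": [
        {"relation": {"id": "OBO_REL:is_a", "name": "is_a"},
         "target": {"id": "TAO:0001514", "name": "bone"}},
        {"relation": {"id": "OBO_REL:part_of", "name": "part_of"},
         "target": {"id": "TAO:0000862", "name": "caudal fin skeleton"}}
    ],
    "children": []
}
"""
parent_lines = summarize_parents(sample)
```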
 
 
OWL-RDF:
 
 
Todo...
 
 
'''Error'''
 
 
If there is no term with the given ID, the service should return "404 Not Found".
 
 
====Handling of anonymous post-compositions====
 
 
===Autocomplete===
 
'''URI'''
 
 
<BASE URI>/term/search?text=[input]&name=[true|false]&syn=[true|false]&def=[true|false]&ontology=[ont1,ont2,...]&limit=[count]
 
 
All URI parameters are optional except for <code>text</code>.  Default values are name=true, syn=false, def=false.  The "ontology" parameter should be a comma-separated list of ontology prefixes to search within.  If it is not given, the default is to search all ontologies.  Specifying "ZFIN" as the ontology should search gene nodes by gene name.  The "limit" parameter limits the number of results to the given integer.
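A client could assemble the request URL along the following lines. This Python sketch is only an illustration: the function name <code>build_search_url</code> and the base URI <code>http://example.org/OBD-WS</code> are placeholders, not the deployed service address.

```python
from urllib.parse import urlencode

def build_search_url(base_uri, text, name=True, syn=False, definition=False,
                     ontologies=None, limit=None):
    """Build an autocomplete URL; only `text` is required."""
    params = {
        "text": text,
        "name": str(name).lower(),
        "syn": str(syn).lower(),
        "def": str(definition).lower(),
    }
    if ontologies:
        # Comma-separated list of ontology prefixes.
        params["ontology"] = ",".join(ontologies)
    if limit is not None:
        params["limit"] = int(limit)
    # safe="," keeps the comma separator readable instead of %2C.
    return f"{base_uri}/term/search?{urlencode(params, safe=',')}"

url = build_search_url("http://example.org/OBD-WS", "bone",
                       syn=True, ontologies=["TAO", "ZFIN"], limit=10)
```

The sketch always sends the three boolean flags explicitly, using the documented defaults, so the request does not depend on server-side defaults.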
 
 
'''Returns'''
 
 
JSON:
 
<javascript>
 
{
 
    "matches" : [
 
        {  // overall format
 
            "id" : "TAO:0001514",
 
            "name" : "bone",
 
            "match_type" : "name" | "syn" | "def",
 
            "match_text" : "this is the term name, synonym name, or definition that matched"
 
        },
 
        {  // a name example
 
            "id" : "TAO:0001514",
 
            "name" : "bone",
 
            "match_type" : "name",
 
            "match_text" : "bone"
 
        },
 
        {  // a synonym example
 
            "id" : "TAO:0001795",
 
            "name" : "ceratohyal foramen",
 
            "match_type" : "syn",
 
            "match_text" : "bericiform foramen"
 
        },
 
        {  // a definition example
 
            "id" : "TAO:0000488",
 
            "name" : "ceratobranchial bone",
 
            "match_type" : "def",
 
            "match_text" : "Ceratobranchials are bilaterally paired cartilage bones that form part of the ventral branchial arches. They articulate medially with the hypobranchials and laterally and dorsally with the epibranchials.  Ceratobranchials 1-5 ossify in the ceratobranchial cartilages."
 
        }
 
    ],
 
    "search_term" : "bone",
 
    "total" : 1859
 
}
 
</javascript>
 
 
'''Error'''
 
 
If there are no terms matching the given input, a document should still be returned, containing an empty "matches" list.
 
 
===Anatomy search summary===
 
'''URI'''
 
 
<BASE URI>/phenotypes/summary/anatomy/{term_id}
 
 
<code>term_id</code> is an anatomy search term.  This service returns a summary of all the phenotype annotations involving that anatomy term (or its descendants, via reasoning).  The summary is grouped by quality attribute term - all annotations with qualities descending from the same closest attribute should be grouped into the same count.
 
 
'''Returns'''
 
 
JSON:
 
<javascript>
 
{
 
    "term" : { "id" : "TAO:xxxxx", "name" : "some anatomical part"},
 
    "qualities" : [
 
        {
 
            "id" : "PATO:000xxxxx",
 
            "name" : "shape",
 
            "taxon_annotations" : {
 
                "annotation_count": 5,
 
                "taxon_count" : 3
 
            },
 
            "genotype_annotations" : {
 
                "annotation_count" : 3,
 
                "genotype_count" : 2
 
            }
 
        },
 
        {
 
            // another quality attribute
 
        } // etc.
 
    ]
 
}
 
</javascript>
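The grouping behind the taxon counts can be sketched in Python as below. It assumes each annotation has already been reduced to a (taxon, quality) pair and that a quality-to-attribute lookup is available; <code>summarize_by_attribute</code>, <code>attribute_of</code>, and all identifiers in the sample data are placeholders.

```python
from collections import defaultdict

def summarize_by_attribute(annotations, attribute_of):
    """Group (taxon_id, quality_id) pairs by the quality's attribute."""
    counts = defaultdict(int)
    taxa = defaultdict(set)
    for taxon_id, quality_id in annotations:
        attribute = attribute_of[quality_id]
        counts[attribute] += 1
        taxa[attribute].add(taxon_id)
    return [
        {
            "id": attribute,
            "taxon_annotations": {
                "annotation_count": counts[attribute],
                "taxon_count": len(taxa[attribute]),
            },
        }
        for attribute in sorted(counts)
    ]

# Placeholder identifiers, not real ontology terms.
attribute_of = {"PATO:curved": "PATO:shape", "PATO:bent": "PATO:shape",
                "PATO:large": "PATO:size"}
annotations = [("TTO:1", "PATO:curved"), ("TTO:2", "PATO:curved"),
               ("TTO:1", "PATO:bent"), ("TTO:3", "PATO:large")]
summary = summarize_by_attribute(annotations, attribute_of)
```

Genotype annotations would be counted the same way, with genotype identifiers in place of taxon identifiers.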
 
 
===Anatomy taxon annotations results===
 
'''URI'''
 
 
<BASE URI>/phenotypes/anatomy/{anatomy_term_id}/taxa/{quality_term_id}?attribute=[true|false]
 
 
<code>anatomy_term_id</code> is an anatomy search term.  This service returns all of the phenotype annotations involving that anatomy term (or its descendants, via reasoning) and the given quality term (or its descendants, via reasoning).  In the result, the values of the outer "entity" and "quality" keys are the search inputs.
 
 
The "attribute" query parameter is optional - if "true", the service should first find the corresponding attribute for the given quality, and use that in the search instead of the given quality.  The default should be "false".
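This page does not specify how the corresponding attribute is found. One plausible reading, sketched here in Python, is to walk up the ''is_a'' hierarchy from the given quality until a term in a known set of attributes is reached. The function name <code>closest_attribute</code>, the single-parent <code>is_a_parent</code> mapping, and the sample terms are all assumptions.

```python
def closest_attribute(quality, is_a_parent, attributes):
    """Walk up is_a parents until an attribute term is found.

    Assumes each term has at most one is_a parent; returns None if no
    attribute is reached.
    """
    current = quality
    seen = set()
    while current is not None and current not in seen:
        if current in attributes:
            return current
        seen.add(current)  # guards against accidental cycles
        current = is_a_parent.get(current)
    return None

# Placeholder hierarchy, not taken from PATO itself.
is_a_parent = {"decreased size": "size", "size": "quality",
               "curved": "shape", "shape": "quality"}
attributes = {"size", "shape"}
found = closest_attribute("decreased size", is_a_parent, attributes)
```

With <code>attribute=false</code> (the default), this lookup would be skipped and the given quality used directly.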
 
 
'''Returns'''
 
 
JSON:
 
<javascript>
 
{
 
    "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
    "quality" : { "id" : "PATO:1234", "name" : "some attribute quality"},
 
    "annotations" : [
 
    {
 
        "taxon" : { "id" : "TTO:34242", "name" : "some species" },
 
        "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
        "quality" : { "id" : "PATO:34242", "name" : "some quality" }
 
    },
 
    {
 
        "taxon" : { "id" : "TTO:34242", "name" : "some species" },
 
        "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
        "quality" : { "id" : "PATO:34242", "name" : "some quality" }
 
    },
 
    {
 
        "taxon" : { "id" : "TTO:34242", "name" : "some species" },
 
        "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
        "quality" : { "id" : "PATO:34242", "name" : "some quality" }
 
    }
 
    ]
 
}
 
</javascript>
 
 
===Anatomy genotypes annotations results===
 
'''URI'''
 
 
<BASE URI>/phenotypes/anatomy/{anatomy_term_id}/genes/{quality_term_id}?attribute=[true|false]
 
 
<code>anatomy_term_id</code> is an anatomy search term.  This service returns all of the phenotype annotations for genotypes involving that anatomy term (or its descendants, via reasoning) and the given quality term (or its descendants, via reasoning).  In the result, the values of the outer "entity" and "quality" keys are the search inputs.
 
 
The "attribute" query parameter is optional - if "true", the service should first find the corresponding attribute for the given quality, and use that in the search instead of the given quality.  The default should be "false".
 
 
'''Returns'''
 
 
JSON:
 
<javascript>
 
{
 
    "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
    "quality" : { "id" : "PATO:1234", "name" : "some attribute quality"},
 
    "annotations" : [
 
        {
 
            "genotype" : { "id" : "ZFIN:34242", "name" : "some genotype" },
 
            "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
            "quality" : { "id" : "PATO:34242", "name" : "some quality" }
 
        },
 
        {
 
            "genotype" : { "id" : "ZFIN:34242", "name" : "some genotype" },
 
            "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
            "quality" : { "id" : "PATO:34242", "name" : "some quality" }
 
        },
 
        {
 
            "genotype" : { "id" : "ZFIN:34242", "name" : "some genotype" },
 
            "entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
            "quality" : { "id" : "PATO:34242", "name" : "some quality" }
 
        }
 
    ]
 
}
 
</javascript>
 
 
===Gene search summary===
 
'''URI'''
 
 
<BASE URI>/phenotypes/summary/gene/{term_id}
 
 
<code>term_id</code> is a gene identifier, as returned by a gene symbol search.  This service returns all the phenotype annotations involving that gene.  Identical annotations from different genotypes are lumped together, and each annotation lists all the genotypes that exhibited it.  Each annotation also includes, under "teleost_entity", the TAO term that matches its ZFA entity term.
 
 
'''Returns'''
 
 
JSON:
 
<javascript>
 
{
 
    "gene" : { "id" : "ZFIN:xxxxx", "name" : "some gene name"},
 
    "annotations" : [
 
        {
 
            "entity" : { "id" : "ZFA:34242", "name" : "some anatomical part" },
 
            "quality" : { "id" : "PATO:34242", "name" : "some quality" },
 
            "teleost_entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
            "genotypes" : [
 
                { "id" : "ZFIN:xxxxx", "name" : "some genotype" },
 
                { "id" : "ZFIN:xxxxx", "name" : "some genotype" },
 
                { "id" : "ZFIN:xxxxx", "name" : "some genotype" }
 
            ]
 
        },
 
        {
 
            "entity" : { "id" : "ZFA:34242", "name" : "some anatomical part" },
 
            "quality" : { "id" : "PATO:34242", "name" : "some quality" },
 
            "teleost_entity" : { "id" : "TAO:34242", "name" : "some anatomical part" },
 
            "genotypes" : [
 
                { "id" : "ZFIN:xxxxx", "name" : "some genotype" },
 
                { "id" : "ZFIN:xxxxx", "name" : "some genotype" },
 
                { "id" : "ZFIN:xxxxx", "name" : "some genotype" }
 
            ]
 
        }
 
    ]
 
}
 
</javascript>
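The lumping of identical annotations across genotypes can be illustrated with a short Python sketch. It assumes the raw data is a sequence of (genotype, entity, quality) identifier triples; <code>lump_by_phenotype</code> and the sample identifiers are placeholders, and the lookup of names and of "teleost_entity" is left out.

```python
def lump_by_phenotype(rows):
    """Merge rows with the same (entity, quality) and collect their genotypes."""
    lumped = {}
    for genotype_id, entity_id, quality_id in rows:
        genotypes = lumped.setdefault((entity_id, quality_id), [])
        if genotype_id not in genotypes:  # list each genotype only once
            genotypes.append(genotype_id)
    return [
        {
            "entity": {"id": entity_id},
            "quality": {"id": quality_id},
            "genotypes": [{"id": g} for g in genotypes],
        }
        for (entity_id, quality_id), genotypes in lumped.items()
    ]

# Placeholder identifiers.
rows = [
    ("ZFIN:g1", "ZFA:1", "PATO:1"),
    ("ZFIN:g2", "ZFA:1", "PATO:1"),
    ("ZFIN:g1", "ZFA:2", "PATO:1"),
    ("ZFIN:g1", "ZFA:1", "PATO:1"),
]
annotations = lump_by_phenotype(rows)
```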
 
  
 
[[Category:Informatics]]
 

Latest revision as of 00:44, 9 January 2009