Difference between revisions of "Phenoscape Grant Renewal Workshop/Notes"
Revision as of 14:11, 30 April 2009
Tuesday afternoon
- Will be extending the taxonomic scope to extant and extinct vertebrates
- Create an annotation database spanning the taxonomic scope and their anatomy ontologies
- Three multi-species anatomy ontologies are being developed: teleost, amphibian, and mammalian anatomy
- Tree mapping: show when an evolutionary phenotype first appeared
- EQ support in all involved model organism databases
- Structure of MP is not consistent with anatomy, or with PATO
- Mapping from MP to EQ syntax is being worked on
- MGI isn't in a position to use EQ internally, but for example Phenoscape could provide an EQ-view on mouse phenotypes, using the decomposition of MP into anatomy (or entity) and quality term cross-product
- Full EQ annotation for OMIM is being planned but not yet funded
- MGI, Xenbase, and ZFIN all have links from their phenotype data to OMIM
- Development of anatomy
- Different approaches between MGI and ZFIN:
- MGI uses complete anatomy at different developmental stages
- ZFIN uses one anatomy ontology and adds start and end dates to indicate the developmental period during which it appears (adult structures don't have an end date)
- Xenbase uses the ZFIN approach
- Candidate genes for evolutionary change could be derived from anatomy-annotated gene expression studies during development
- genes responsible for or associated with morphological changes during individual development could be candidates for evolutionary change
- evolutionary phenotype changes could be used to query morphological changes during development
- Can we enable queries to generate hypotheses based on ontogeny reflecting evolutionary history?
- Evolutionary developmental data could come from medaka, stickleback
- Sequence of steps:
- Ontology building
- Adult phenotype annotations for mutants in Xenbase
- Presently only 3 species for amphibians: Xenopus, Dermophis, Salamandra
- Development decoupling from anatomy ontology in mouse
- Expanding mammalian anatomy to include extinct species
- Anatomy for amniotes (the clade including mouse and dinosaurs), including extinct taxa (such as dinosaurs)
- Scope of this could be overwhelming
- Should be strongly driven (or staged) according to the character matrices to be annotated
- Chicken anatomy could provide a starting point?
- There is work on bird anatomy that uses a Latin naming scheme
- Work on any smaller-scope anatomy (or multi-species anatomy) ontology will contribute terms that apply more broadly
- There may not be very many neomorphs between birds and mammals, though there are a few areas such as the digits in birds where the exact homology relationships aren't as clearly agreed upon.
- Mammalian phenotype decomposition
- Need support for QC'ing the results of automated decomposition
- Alignment of MP with mouse anatomy
- Annotation
- Which published character matrices are there for dinosaurs?
- Ontology building
- Ontology mapping and alignment
- Need to align teleost and amphibian anatomy to mouse.
- Mouse and chicken need to be mapped to amniote anatomy. Mouse is a better start because there are genetic data.
- Reconciling character-derived trees and character definition and use between different matrices and trees that share taxa (Paul)
- TaxonSearch
- Formalizing and generating grammar of character coding and character state definition
- Analyzing character usage (shared, not shared, rejected) between trees that share taxa
- Conflicting phylogenies for the same set of taxa often turn out to be based on character codings that are not consistent or not compatible
Wednesday morning
- Nomenclatural history of terms in ontologies
- Current exchange format standards don't really support this beyond obsoletion
- ZFIN tracks within the database the complete nomenclatural history of gene names
- RDBOM tracks literature and author attribution for anatomy terms but not yet nomenclatural history
- Character quality profiles for data matrices
- Addresses the question of "Where in the skeleton are the changes occurring that drive the phylogeny?"
- Distribution of characters across the skeletal anatomy, at different levels of the hierarchy
- Distribution of missing data across the anatomy, at different hierarchy levels
- How many taxa have how much missing data, distributed on which anatomical parts
- Use case: Character redundancy
- Deriving a character profile across the anatomy could help visualize the redundancy between annotations
- Use case: Mapping the evolutionary characters that lead to (and distinguish) a model organism. Subsequently, see whether these evolutionary changes, and the order in which they appear, correspond to the ontogeny of the organism.
- Will need to deal with character gaps and with character redundancy for this.
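One way to picture the character profile described above is as a count of characters per anatomical term, rolled up through the part_of hierarchy so that the distribution can be read at any level. The following is only a minimal sketch under stated assumptions: the hierarchy, the term names, and the function names `ancestors_and_self` and `character_profile` are hypothetical and are not taken from any Phenoscape ontology or code.

```python
from collections import Counter

# Hypothetical part_of hierarchy (child -> parent); terms are illustrative only.
PART_OF = {
    "frontal bone": "cranium",
    "parietal bone": "cranium",
    "cranium": "skeleton",
    "pectoral fin ray": "pectoral fin",
    "pectoral fin": "skeleton",
}


def ancestors_and_self(term, part_of=PART_OF):
    """Yield the term itself and every structure it is part of, up to the root."""
    while term is not None:
        yield term
        term = part_of.get(term)


def character_profile(character_entities, part_of=PART_OF):
    """Count characters per anatomical term, rolled up the hierarchy.

    character_entities holds one anatomical entity per character in a matrix.
    """
    profile = Counter()
    for entity in character_entities:
        for term in ancestors_and_self(entity, part_of):
            profile[term] += 1
    return profile


# Assumed example: four characters from one matrix, by annotated entity.
characters = ["frontal bone", "frontal bone", "parietal bone", "pectoral fin ray"]
profile = character_profile(characters)
```

The same roll-up could be applied to missing-data cells instead of characters to obtain the distribution of missing data across the anatomy; several characters landing on the same term (here, two on "frontal bone") is one crude signal of possible redundancy.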
Specific goals:
- Expand taxonomic coverage of Phenoscape to Vertebrata
- Mouse is done but needs to be rearranged (MP decomposition)
- Amniotes need to be done from the ground up
Wednesday afternoon
- Problem of lossy transformation of legacy character data
- EQ annotation is often at a lesser granularity (or level of detail) than the original free text description
- This isn't due to any technological obstacles but rather due to the limited resources available for annotation.
- Annotation effort is best focused on the level of granularity that suffices to solve a use-case of interest.
- Annotating at very granular levels is entirely possible but can take a lot of time because many of the ontology terms needed are likely not yet present in the ontology.
- How does this impact phylogeny and data matrix reconciliation, or phylogenetic reconstruction on the basis of the EQ annotations?
- Use-case for development
- Heterochrony would be interesting if one can pull it out of the database
- Developmental ordering relationships currently used are confined to develops_from and transforms-into.
- This is still missing for mouse.
- Wouldn't really allow inferring heterochrony.
- Would, though, allow placing phenotypes into a developmental chain of transformations and comparing their placement.
- Demonstration of Phenoscape KB
- Some of the features we are developing (the 3 principal queries) would be very useful to MOD users
- Informatics goals for Phenoscape 2:
- Supporting units for measurements
- Allowing others to compare their phenotype data to the knowledge base
- Generic ontological queries, such as a SPARQL endpoint
- Private data overlay
- Linked Data and semweb integration
- Tree visualization
- Triple store technology evaluation
- Species ID
- Phylogenetically informative distance metric based on EQ assertions
- NLP-based text processing, mass curation
Idea bin
- Intermine connection (multiple model organisms, AJAX-based widget for displaying protein family tree)
- Some phenotypes imply developmental abnormalities, differences, or variation.
- For example, a "poorly ossified cranium" in an adult amphibian implies that the developmental process was delayed or did not complete.
- How does logical inference of homology propagate over develops-from relationships?
- E.g., if A and B are asserted homologous, and C develops from A and D develops from B, are then C and D inferred as homologous, and are C and B inferred as homologous?
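The two candidate inferences in the last question can be made concrete with a small sketch. It does not claim that either rule is biologically or logically valid, which is the open question in the notes; it only shows what each rule would propose from the same assertions. The function names `parallel_rule` and `cross_rule` and the data layout are assumptions made for illustration.

```python
def symmetric(pairs):
    """Store homology assertions as unordered pairs."""
    return {frozenset(p) for p in pairs}


def parallel_rule(homologous, develops_from):
    """Candidate rule 1: if A~B, C develops from A and D develops from B,
    propose C~D. develops_from holds (derived, precursor) tuples."""
    proposed = set()
    for c, a in develops_from:
        for d, b in develops_from:
            if c != d and frozenset((a, b)) in homologous:
                proposed.add(frozenset((c, d)))
    return proposed - homologous


def cross_rule(homologous, develops_from):
    """Candidate rule 2: if A~B and C develops from A, propose C~B."""
    proposed = set()
    for c, a in develops_from:
        for pair in homologous:
            if a in pair and len(pair) == 2:
                (b,) = pair - {a}
                if b != c:
                    proposed.add(frozenset((c, b)))
    return proposed - homologous


# The example from the notes: A~B asserted, C develops from A, D develops from B.
homologous = symmetric([("A", "B")])
develops_from = {("C", "A"), ("D", "B")}

parallel = parallel_rule(homologous, develops_from)
cross = cross_rule(homologous, develops_from)
```

Under these assumptions the first rule proposes only the pair C and D, while the second proposes C with B and D with A. Keeping the rules as separate functions makes it easy to compare their consequences on real annotations before deciding whether either should be adopted.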