Difference between revisions of "Talk:Data Jamboree 2"

From phenoscape
Revision as of 20:05, 3 October 2008

September 28, 2008 - Discussion on Query Prototype (Proto Alpha Version) of Phenoscape (Presented by Jim Balhoff)


  • Visualization
  1. Jim Balhoff suggested enabling drill-down from higher taxa to lower ones. A query for annotations typically yields more results at higher levels of the taxonomy; drilling down to lower levels prunes the results and narrows them to the user's exact requirements. Monte Westerfield and Paula Mabee seconded.
  2. Mark Sabaj suggested using a fish landscape with different predefined areas for visualization of results and guiding the search. Paula Mabee and John Lundberg approved.
  3. Todd Vision suggested displaying only those nodes of a taxonomy that have been annotated
  4. Judith Blake suggested using Cytoscape to browse through various nodes in the tree
  5. Monte Westerfield suggested linking phenotypes to genes
  6. Todd Vision suggested character correlations
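
Todd's suggestion of showing only annotated nodes (together with the ability to prune taxa with no data) could be sketched roughly as below. This is only an illustration, not the Phenoscape implementation: the function name `prune_to_annotated`, the child-list representation of the taxonomy, and the toy taxa are all assumptions made for the example.

```python
def prune_to_annotated(children, annotated, root):
    """Keep only taxa that are annotated or have an annotated descendant.

    children:  dict mapping a taxon to a list of its child taxa (assumed layout)
    annotated: set of taxa that carry at least one annotation
    Returns a new child-list dict containing only the retained taxa.
    """
    pruned = {}

    def visit(taxon):
        # Recurse first so a parent is kept whenever any descendant is kept.
        kept = [child for child in children.get(taxon, []) if visit(child)]
        if kept or taxon in annotated:
            pruned[taxon] = kept
            return True
        return False

    visit(root)
    return pruned


# Hypothetical toy taxonomy; only one species has annotations.
children = {
    "Ostariophysi": ["Cypriniformes", "Siluriformes"],
    "Cypriniformes": ["Danio rerio"],
    "Siluriformes": ["Ictalurus punctatus"],
}
annotated = {"Danio rerio"}
pruned = prune_to_annotated(children, annotated, "Ostariophysi")
```

Here the Siluriformes branch drops out because nothing under it is annotated, while Cypriniformes is retained as the path to an annotated species.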


  • Querying
  1. Todd Vision suggested using auto composition of search terms
  2. Monte Westerfield suggested using Boolean combinations of query parameters. Seconded by Judith Blake
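
As a rough sketch of what Boolean combinations of query parameters might look like, the example below composes small predicates over taxon/entity/quality fields with and, or, and not. The helper names (`field_is`, `all_of`, `any_of`, `negate`), the dict layout of an annotation, and the sample annotations are hypothetical and are not taken from the prototype.

```python
def field_is(field, value):
    """Predicate: the annotation's field equals the given value."""
    return lambda annotation: annotation.get(field) == value


def all_of(*predicates):
    """Boolean AND over predicates."""
    return lambda annotation: all(p(annotation) for p in predicates)


def any_of(*predicates):
    """Boolean OR over predicates."""
    return lambda annotation: any(p(annotation) for p in predicates)


def negate(predicate):
    """Boolean NOT of a predicate."""
    return lambda annotation: not predicate(annotation)


# Hypothetical annotations, invented only to exercise the combinators.
annotations = [
    {"taxon": "Danio rerio", "entity": "dorsal fin", "quality": "red"},
    {"taxon": "Danio rerio", "entity": "skull", "quality": "broad"},
    {"taxon": "Ictalurus punctatus", "entity": "dorsal fin", "quality": "blue"},
]

# entity = dorsal fin AND NOT quality = blue
query = all_of(field_is("entity", "dorsal fin"),
               negate(field_is("quality", "blue")))
hits = [a for a in annotations if query(a)]
```

Because each combinator returns another predicate, arbitrarily nested queries can be built from the same three operations.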


Hilmar's notes

  • option for hierarchical indexing of results (taxonomy, phylogeny, but also anatomy ontology)
  • mapping characters on a tree: multiple phenotypes may match any particular query
    • map to different colors for indicators? use numbers as indexes?
    • mapping phenotypes onto trees cannot typically reconstruct character state changes, and hence traditional visualizations may be misleading?
  • ability to prune species with no data (values) for export
  • search interface: ability to combine taxon/entity/quality specifications (and, or, not)
  • graph navigation: Dbgraphnav, Cytoscape
  • clickable fish image for starting navigation
  • most common entry point is likely to be a simple one-field form for entering terms
  • phenotype query prototype: how do I get from here to the genes?
  • ability to see correlations between phenotypes

MGI batch query demo

  • users don't use complex query forms
  • auto-detect type of input tokens
  • allow download in different formats
  • computationally savvy users
  • pre-written SQL as available from GO website


September 29, 2008 (Monday) - Data Curation Session at Sylvan Lake Lodge Meeting Room


Wasila & Paula's notes

Comparative phenotypes: discussion of relative size, shape characters

Problem: Descriptors comparing size or shape among taxa within a publication cannot be extended to taxa outside the study (e.g., Rick's size-of-bone example with three character states: large (0), small (1), extremely small (2)).

How to do this with ontologies? Judy pointed out that there is a dynamic line in annotation between the depth of a structured vocabulary and free text, i.e., where do you stop using an ontology and begin using free text? Data specific to a study should be free text.

Judy Blake initially suggested simply annotating our complex anatomical characters to "shape". This first-pass indexing is useful because it lets users aggregate the data (vs. full curation). Through further discussion, we agreed that annotating to a more granular level, i.e., "shape: width", would be better and more informative. The weakness of the current PATO comes in here: there need to be more nodes between shape and all the terminal nodes (descriptors such as "narrow", "broad", etc.). This might allow more depth than just "shape: width".

We need to index our comparative systematic studies at a level that is useful for the field. It is like a library: a "binning" index to multiple things.

John's idea for recording size comparison within a study: Use shape: width and ALSO apply an internal grading system for these characters such that the least/smallest value is given 1.

  • graded series of lengths, widths, etc.: give 1, 2, 3, ...
  • least/smallest value is given 1
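
John's internal grading idea might be recorded along these lines. This is a minimal sketch under stated assumptions: the function name `grade_character`, the record layout, and the requirement that the curator supplies the states ordered from least to greatest are all assumptions for illustration. The states are taken from Rick's size-of-bone example above.

```python
def grade_character(quality, states_smallest_first):
    """Attach a within-study grade to each character state.

    quality: the ontology-level quality the character is annotated to
             (e.g., "shape: width"); the string form is an assumption.
    states_smallest_first: state labels ordered by the curator from
             least/smallest to greatest/largest.
    The smallest state receives grade 1, the next 2, and so on.
    """
    return [
        {"quality": quality, "state": state, "grade": grade}
        for grade, state in enumerate(states_smallest_first, start=1)
    ]


# States from Rick's example, reordered from smallest to largest.
graded = grade_character("shape: width",
                         ["extremely small", "small", "large"])
for record in graded:
    print(record["grade"], record["state"])
```

The grades are only meaningful within one study; the ontology-level quality is what allows aggregation across studies.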

Eric's example of incomplete/complete scute series:

  • E: scute series; Q: in contact RE: skull
  • E: scute series; Q: in contact RE: dorsal fin
  • E: scute series; Q: separated from RE: skull
  • E: scute series; Q: separated from RE: dorsal fin
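
The four annotations above are the combinations of two relational qualities with two related entities. A small sketch of that structure follows; the class name `Phenotype` and its field names are illustrative assumptions, not the Phenex data model.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Phenotype:
    """Entity (E), relational quality (Q), and related entity (RE)."""
    entity: str
    quality: str
    related_entity: str

    def __str__(self):
        return f"E: {self.entity}; Q: {self.quality} RE: {self.related_entity}"


qualities = ["in contact", "separated from"]
related_entities = ["skull", "dorsal fin"]

# Every quality paired with every related entity for the same entity.
phenotypes = [
    Phenotype("scute series", quality, related)
    for quality, related in product(qualities, related_entities)
]

for phenotype in phenotypes:
    print(phenotype)
```

One taxon would then be described by one annotation per related entity (one for the skull end and one for the dorsal fin end of the series).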

Judy: important to separate tasks of 1) anatomy ontology development and 2) annotation of publications

  • suggested having ontology development workshops and curation workshops
  • her suggestion: curators should enter terms ahead of time and have experts fix the annotations (the problem at our workshop currently is that people are not finding entities in the ontology; ontology work is needed)

Mark submitted TAO request on Weberian vertebra - relationships that don't hold for all taxa

Wasila: batching vs. one term request: ok to submit related terms in one request

Suzi: suggested having a small PATO workshop with ichthyologists, like anatomy workshop

curators understand details of system

JB: put 2-3 people together to do anatomical subtree: send out invite to ontology development

Judy's observation notes

  • Wasila and John talking about anatomical terms and what they should be
  • Paula showing Mark how to log into SourceForge
  • Rick looking at characters for his paper
  • Wasila from Peter: question on batching or one term per request
  • Jeff helps Eric remember how to log into SourceForge
  • Terry working on her paper
  • looking for "right" first ... low hanging fruit
  • Rick annotating with Wasila's help
  • Mark entering successful s.f. proposal
  • Eric looking for term in his paper
  • Terry asking Jeff how to enter post-composed terms


September 30, 2008 (Tuesday) - Project Personnel Meeting


Wasila's notes:

Advisor's feedback

  • Suzi: curators need to be aware that if they don't find the most appropriate term, they should not settle for the closest term but request what they need
  • Judy: curators should be aware of 2 jobs: annotation and ontology improvement
  • Suzi: add Wasila as ontology writer to PATO


October 1, 2008 (Wed morning) Wrap-up discussion with curators

Wasila's notes

Paula: suggestions for improving curation process?

  • Terry: would like to go through a paper first to deal with needed terms
    • also liked annotating easy terms to become familiar with ontology structure
  • John: some things, like joints, can be added in bulk
  • All: visualization issues - how to know what is there, and their relationships
    • Need large poster of all terms
    • Paula: PDF mark-up tool that highlights terms matching to TAO; characters not highlighted would likely require new term
    • Todd and Paula: use Phenex to highlight ontology terms because character and state descriptions from pub are pre-entered as free text
  • Mark: how does a curator decide to precompose vs. postcompose
    • Wasila: if entity will be used repeatedly (in single or multiple pubs) then add to TAO; if not, post-compose
    • Mark: when post-composing, would be helpful if Phenex would autocomplete to pre-composed cross-product term (if present in TAO) so that curator can be aware of similar term in ontology
  • Suzi: tough anatomy stuff: you can discuss as a group of curators then submit and enter in ontology
  • Paula: need to set up svn for Phenex files - WD will send instructions to new curators
  • Rick: can we break down single character into multiple characters?
    • Paula: make into multiple annotations - don't break down character
  • Pubs with same characters
    • Paula: curate it all separately
    • Jim: duplicates will come together in database
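
Wasila's rule of thumb for precomposing vs. post-composing (add the entity to TAO if it will be used repeatedly, otherwise post-compose) could be sketched as below. The function name `suggest_composition`, the threshold of two uses as a reading of "repeatedly", and the placeholder entity labels are assumptions for illustration only.

```python
from collections import Counter


def suggest_composition(entity_uses, min_uses=2):
    """Suggest how to handle each entity that is missing from the ontology.

    entity_uses: one entry per annotation that needs the entity,
                 across a single publication or several.
    min_uses:    how many uses count as "repeatedly" (an assumed cutoff).
    """
    counts = Counter(entity_uses)
    return {
        entity: "request TAO term" if n >= min_uses else "post-compose"
        for entity, n in counts.items()
    }


# Placeholder labels standing in for entities a curator needs.
needed = ["entity X", "entity Y", "entity X", "entity X"]
suggestions = suggest_composition(needed)
```

In practice the decision is a curator's judgment about expected reuse; the count here only stands in for that judgment.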

Paula: non-curation of present entities

  • curate for basal taxa or annotate presence for a higher level group with weak evidence code
  • propagate by evolutionary inference
  • Jim: phenotypes are exhibits; congregating exhibits to a higher level seems to be OK to do (e.g., Cypriniformes exhibit red and blue dorsal fin)
  • inheritance vs. propagation
  • might need more than one relationship between taxon and a phenotype; right now we have exhibits as the relationship
  • Paula: This relates to character mapping
  • Paula: Will follow up with curators regarding completeness of literature surveyed
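
Jim's point about congregating exhibits at a higher level might look roughly like the sketch below, in which each ancestor taxon collects the phenotypes exhibited by its descendants. The function name `aggregate_exhibits`, the parent-pointer representation, and the placeholder species are assumptions. This shows only upward aggregation; it does not address the inheritance vs. propagation question or evidence codes.

```python
def aggregate_exhibits(parent, exhibits):
    """Collect, for every taxon, the phenotypes it or its descendants exhibit.

    parent:   dict mapping a taxon to its parent taxon (roots are absent)
    exhibits: dict mapping a taxon to the set of phenotypes annotated to it
    """
    aggregated = {taxon: set(phenos) for taxon, phenos in exhibits.items()}
    for taxon, phenos in exhibits.items():
        ancestor = parent.get(taxon)
        while ancestor is not None:
            # Push this taxon's phenotypes up to every ancestor in turn.
            aggregated.setdefault(ancestor, set()).update(phenos)
            ancestor = parent.get(ancestor)
    return aggregated


# Placeholder species under Cypriniformes, echoing the example above.
parent = {"species A": "Cypriniformes", "species B": "Cypriniformes"}
exhibits = {
    "species A": {"dorsal fin: red"},
    "species B": {"dorsal fin: blue"},
}
aggregated = aggregate_exhibits(parent, exhibits)
```

With these inputs, Cypriniformes ends up exhibiting both the red and the blue dorsal fin, which is the sense of "exhibits" that a second, stricter relationship would need to be distinguished from.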