Difference between revisions of "Talk:Data Jamboree 2"
Paula Mabee
Revision as of 20:00, 3 October 2008
September 28, 2008 - Discussion on Query Prototype (Proto Alpha Version) of Phenoscape (Presented by Jim Balhoff)
- Visualization
- Jim Balhoff suggested enabling drill-down from higher taxa to lower ones. A query for annotations typically yields more results at higher levels of the taxonomy; drilling down to lower levels prunes the results and narrows them to the user's exact requirements. Monte Westerfield and Paula Mabee seconded.
- Mark Sabaj suggested using a fish landscape with different predefined areas for visualization of results and guiding the search. Paula Mabee and John Lundberg approved.
- Todd Vision suggested displaying only those nodes of a taxonomy that have been annotated
- Judith Blake suggested using Cytoscape to browse through various nodes in the tree
- Monte Westerfield suggested linking phenotypes to genes
- Todd Vision suggested character correlations
- Querying
- Todd Vision suggested using auto composition of search terms
- Monte Westerfield suggested using Boolean combinations of query parameters. Seconded by Judith Blake
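The Boolean combination of query parameters suggested above could be sketched roughly as follows. This is only an illustration, not the prototype's implementation: the `Annotation` record, the helper names (`field_is`, `all_of`, `any_of`, `negate`, `search`), and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Annotation:
    """Hypothetical flat record: one taxon exhibiting one entity/quality pair."""
    taxon: str
    entity: str
    quality: str


Predicate = Callable[[Annotation], bool]


def field_is(field: str, value: str) -> Predicate:
    """Match annotations whose named field equals the given value."""
    return lambda a: getattr(a, field) == value


def all_of(*preds: Predicate) -> Predicate:
    """Boolean AND over any number of predicates."""
    return lambda a: all(p(a) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Boolean OR over any number of predicates."""
    return lambda a: any(p(a) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Boolean NOT of a predicate."""
    return lambda a: not pred(a)


def search(annotations: Iterable[Annotation], pred: Predicate) -> List[Annotation]:
    """Return the annotations that satisfy the predicate, in input order."""
    return [a for a in annotations if pred(a)]


# Made-up annotations, for illustration only.
ANNOTATIONS = [
    Annotation("Taxon A", "dorsal fin", "red"),
    Annotation("Taxon A", "scute series", "in contact with skull"),
    Annotation("Taxon B", "dorsal fin", "blue"),
]

# dorsal fin AND (red OR blue) AND NOT Taxon B
query = all_of(
    field_is("entity", "dorsal fin"),
    any_of(field_is("quality", "red"), field_is("quality", "blue")),
    negate(field_is("taxon", "Taxon B")),
)
hits = search(ANNOTATIONS, query)
```

Building queries from small composable predicates keeps the one-field entry form and the advanced and/or/not form on the same code path.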
Hilmar's notes
- option for hierarchical indexing of results (taxonomy, phylogeny, but also anatomy ontology)
- mapping characters on a tree: multiple phenotypes may match any particular query
- map to different colors for indicators? use numbers as indexes?
- mapping phenotypes onto trees cannot typically reconstruct character state changes, and hence traditional visualizations may be misleading?
- ability to prune species with no data (values) for export
- search interface: ability to combine taxon/entity/quality specifications (and, or, not)
- graph navigation: Dbgraphnav, Cytoscape
- clickable fish image for starting navigation
- most common entry point is likely to be a simple one-field form for entering terms
- phenotype query prototype: how do I get from here to the genes?
- ability to see correlations between phenotypes
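One possible reading of the "prune species with no data" item above, as a minimal sketch. It assumes the taxonomy is available as a plain parent-to-children mapping and that the set of taxa with data is already known; the function name `prune_taxonomy` and the sample tree are hypothetical.

```python
from typing import Dict, List, Set


def prune_taxonomy(
    children: Dict[str, List[str]], root: str, has_data: Set[str]
) -> Dict[str, List[str]]:
    """Keep a node only if it has data itself or has a kept descendant.

    Returns a new parent-to-children mapping; the input is not modified.
    If nothing under the root has data, the result is an empty dict.
    """
    kept: Dict[str, List[str]] = {}

    def visit(node: str) -> bool:
        # Every child is visited so that each subtree is pruned in turn.
        kept_children = [c for c in children.get(node, []) if visit(c)]
        if node in has_data or kept_children:
            kept[node] = kept_children
            return True
        return False

    visit(root)
    return kept


# Illustrative tree; which taxa "have data" here is made up.
TREE = {
    "Ostariophysi": ["Cypriniformes", "Siluriformes"],
    "Cypriniformes": ["Danio rerio"],
    "Siluriformes": ["Ictalurus punctatus"],
}
pruned = prune_taxonomy(TREE, "Ostariophysi", has_data={"Danio rerio"})
```

The same traversal would also cover the earlier suggestion to display only annotated nodes of a taxonomy.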
MGI batch query demo
- users don't use complex query forms
- auto-detect type of input tokens
- allow download in different formats
- computationally savvy users
- pre-written SQL as available from GO website
September 29, 2008 (Monday) - Data Curation Session at Sylvan Lake Lodge Meeting Room
Wasila & Paula's notes
Comparative phenotypes: discussion of relative size and shape characters
- Problem: Descriptors comparing size or shape among taxa within a publication cannot be extended to taxa outside the study (e.g., Rick's size-of-bone example with three character states: large (0), small (1), extremely small (2)).
- How to do this with ontologies? Judy pointed out that there is a dynamic line in annotation between the depth of a structured vocabulary and free text, i.e., where do you stop using an ontology and begin using free text? Data specific to a study should be free text.
Judy Blake initially suggested simply annotating our complex anatomical characters to "shape". Through further discussion, we agreed that annotating at a more granular level, i.e. "shape: width", would be better and more informative. The weakness of the current PATO shows here: there need to be more nodes between "shape" and the terminal nodes (descriptors such as "narrow", "broad", etc.). This might allow more depth than just "shape: width".
John's idea for recording size comparison within a study: Use shape: width and ALSO apply an internal grading system for these characters such that the least/smallest value is given 1.
- graded series of lengths, widths, etc.: assign 1, 2, 3, ...
- least/smallest value is given 1
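John's within-study grading could look something like the sketch below. It assumes the curator supplies the states of one character ordered from smallest to largest; the function names are hypothetical, and the example reuses the three states from Rick's bone-size character, whose published codes (0, 1, 2) run in the opposite direction.

```python
from typing import Dict, List


def grade_states(states_smallest_first: List[str]) -> Dict[str, int]:
    """Assign grade 1 to the smallest state, 2 to the next, and so on."""
    return {state: grade for grade, state in enumerate(states_smallest_first, start=1)}


def recode(published: Dict[str, int], grades: Dict[str, int]) -> Dict[int, int]:
    """Map each published state code to its within-study grade."""
    return {code: grades[state] for state, code in published.items()}


# Published character states and codes from the example in the notes.
published_codes = {"large": 0, "small": 1, "extremely small": 2}

# Curator-supplied ordering, smallest first.
grades = grade_states(["extremely small", "small", "large"])
code_to_grade = recode(published_codes, grades)
```

The grades are only meaningful inside the one study; the ontology annotation (e.g. "shape: width") remains the part that is comparable across studies.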
Eric's example of incomplete/complete scute series:
- E: scute series; Q: in contact RE: skull
- E: scute series; Q: in contact RE: dorsal fin
- E: scute series; Q: separated from RE: skull
- E: scute series; Q: separated from RE: dorsal fin
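The four statements above share one shape: an entity (E), a quality (Q), and a related entity (RE). A minimal sketch of that structure follows; the `EQ` class and its `describe` method are hypothetical and are not Phenex's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EQ:
    """One entity-quality statement, optionally with a related entity."""
    entity: str
    quality: str
    related_entity: Optional[str] = None

    def describe(self) -> str:
        """Render the statement in the notation used in these notes."""
        text = f"E: {self.entity}; Q: {self.quality}"
        if self.related_entity is not None:
            text += f" RE: {self.related_entity}"
        return text


# The four statements from Eric's scute series example.
scute_statements = [
    EQ("scute series", "in contact", "skull"),
    EQ("scute series", "in contact", "dorsal fin"),
    EQ("scute series", "separated from", "skull"),
    EQ("scute series", "separated from", "dorsal fin"),
]
rendered = [s.describe() for s in scute_statements]
```

Keeping the related entity as its own field lets a single published state be recorded as several statements, one per related structure.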
Judy: important to separate tasks of anatomy ontology development and annotation
- suggested having ontology dev. workshops and curation workshops
- idea: enter terms ahead of time and have experts fix them
Mark submitted TAO request on Weberian vertebra - relationships that don't hold for all taxa
Wasila: batching vs. one term request: ok to submit related terms in one request
Suzi: suggested having a small PATO workshop with ichthyologists, like anatomy workshop
- but - need to index study at a level that is useful for comparative studies
- like library - "binning" index to multiple things
Judy: Indexing first pass useful for users to be able to aggregate the data
- vs. full curation
- curators understand details of system
JB: put 2-3 people together to do anatomical subtree: send out invite to ontology development
annotation workshop (vs. ontology development)
- vs. our problem: aren't finding entities
- bringing in experts
Judy's observation notes
- Wasila and John talking about anatomical terms and what they should be
- Paula showing Mark how to log into sourceforge
- Rick looking at characters for this paper
- Wasila from Peter: question on batching or one term per request
- Jeff helps Eric remember how to log into sourceforge
- Terry working on her paper
- looking for "right" first ... low hanging fruit
- Rick annotating with Wasila's help
- Mark entering successful s.f. proposal
- Eric looking for term in his paper
- Terry asking Jeff how to enter post-composed terms
September 30, 2008 (Tuesday) - Project Personnel Meeting
Wasila's notes:
Advisor's feedback
- Suzi: curators need to be aware that if they don't find the most appropriate term, they should not settle for the closest term but request what they need
- Judy: curators should be aware of 2 jobs: annotation and ontology improvement
- Suzi: add Wasila as ontology writer to PATO
October 1, 2008 (Wed morning) Wrap-up discussion with curators
Wasila's notes
Paula: suggestions for improving curation process?
- Terry: would like to go through a paper first to deal with needed terms
- also liked annotating easy terms to become familiar with ontology structure
- John: some things, like joints, can be added in bulk
- All: visualization issues - how to know what is there, and their relationships
- Need large poster of all terms
- Paula: PDF mark-up tool that highlights terms matching to TAO; characters not highlighted would likely require new term
- Todd and Paula: use Phenex to highlight ontology terms because character and state descriptions from pub are pre-entered as free text
- Mark: how does a curator decide whether to pre-compose or post-compose?
- Wasila: if entity will be used repeatedly (in single or multiple pubs) then add to TAO; if not, post-compose
- Mark: when post-composing, would be helpful if Phenex would autocomplete to pre-composed cross-product term (if present in TAO) so that curator can be aware of similar term in ontology
- Suzi: tough anatomy stuff: you can discuss as a group of curators then submit and enter in ontology
- Paula: need to set up svn for Phenex files - WD will send instructions to new curators
- Rick: can we break down single character into multiple characters?
- Paula: make into multiple annotations - don't break down character
- Pubs with same characters
- Paula: curate it all separately
- Jim: duplicates will come together in database
Paula: non-curation of present entities
- curate for basal taxa or annotate presence for a higher level group with weak evidence code
- propagate by evolutionary inference
- Jim: phenotypes are exhibits; congregate exhibits to higher level seems to be ok to do (e.g., cypriniformes exhibit red and blue dorsal fin)
- inheritance vs. propagation
- might need more than one relationship between a taxon and a phenotype; right now we have "exhibits" as the only relationship
- Paula: This relates to character mapping
- Paula: Will follow up with curators regarding completeness of literature surveyed
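Jim's point about congregating "exhibits" to a higher level can be illustrated with the sketch below, in which a higher taxon exhibits the union of the phenotypes recorded for the taxa beneath it. This is one reading of the discussion, not a description of the Phenoscape database; the function name, the child-to-parent map, and the two species placeholders are hypothetical, and the map is assumed to contain no cycles.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


def aggregate_exhibits(
    parent_of: Dict[str, Optional[str]], exhibits: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Roll each taxon's phenotypes up to all of its ancestors.

    parent_of maps a taxon to its parent (None or missing at the root).
    """
    result: Dict[str, Set[str]] = defaultdict(set)
    for taxon, phenotypes in exhibits.items():
        node: Optional[str] = taxon
        while node is not None:
            result[node] |= phenotypes
            node = parent_of.get(node)
    return dict(result)


# Placeholder species under Cypriniformes; the phenotype data is made up.
PARENT_OF = {
    "Species 1": "Cypriniformes",
    "Species 2": "Cypriniformes",
    "Cypriniformes": None,
}
EXHIBITS = {
    "Species 1": {"dorsal fin: red"},
    "Species 2": {"dorsal fin: blue"},
}
rolled_up = aggregate_exhibits(PARENT_OF, EXHIBITS)
```

Note that this upward aggregation is distinct from the inheritance and propagation by evolutionary inference discussed above, which is why more than one taxon-phenotype relationship may be needed.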