Collaborative Phenotype Annotation

Revision as of 20:06, 18 September 2011 by Jim Balhoff (Requirements from Phenoscape I curation workflow (that are expected to carry forward))

Requirements

A requirements and priorities document is in development and being reviewed by stakeholders.

Technology options

MX

Requirements from Phenoscape I curation workflow (that are expected to carry forward)

  • import matrix data from file source (usually NEXUS)
    • Supported. MX can import character labels, state labels, and matrix data from a NEXUS file. This is a one-time import performed before editing in MX.
  • hold a reference to a publication as the source for the matrix
    • Supported. MX has a module for maintaining references (publications), which can be used throughout the system to tag several different types of data.
  • view and edit matrix
    • Supported. MX offers several different matrix viewing and editing modes. Subsets of the matrix can be given a label and viewed or edited. In MX, all characters and OTUs constitute one global matrix; a labelled matrix is an arbitrary view into this global matrix, defined by a specific list of characters and OTUs.
  • free text entry for characters and character states for a matrix
    • Supported.
  • free text entry for OTUs
    • Supported.
  • annotate OTUs with taxonomy ontology ID
    • Not directly supported. There is a taxonomy module that OTUs can reference. However, for Phenoscape's needs we would probably implement a separate ontology/controlled vocabulary field for OTUs.
  • add specimens for each OTU, select museum code from ontology
    • Supported but complex. MX has a comprehensive specimen management module, but it is much more complex than the one in Phenex.
  • ontology term autocomplete for term input
    • Supported, via BioPortal data services. However, the EQ entry form is currently being updated and is somewhat in flux.
  • term info panel for terms selected in autocomplete
    • Not supported - would require implementation. A trivial, though limited, implementation could provide a link in MX that opens the BioPortal term info page.
  • ontology tree browsing panel or linkout to term in BioPortal tree view
    • Not supported - would require implementation. A trivial implementation could provide a link in MX that opens the BioPortal term info page.
  • output to Excel report - consistency review; author page
    • Not supported - would require implementation. Could be nicely implemented as a publicly viewable web page in MX.
  • output to KB - currently requires NeXML
    • Not completely supported. An OWL output format exists, which could serve as the basis for future OWL-based revisions of the KB.
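The "view and edit matrix" item above describes MX as holding one global matrix, with each labelled matrix being a view defined by a list of characters and OTUs. The following is a minimal sketch of that idea only, not MX's actual data model: the class names GlobalMatrix and LabelledMatrix, their methods, and the example taxa and states are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalMatrix:
    """Hypothetical store of every coded cell, keyed by (otu, character)."""
    cells: dict = field(default_factory=dict)

    def set_state(self, otu: str, character: str, state: str) -> None:
        self.cells[(otu, character)] = state


@dataclass
class LabelledMatrix:
    """Hypothetical view: a label plus ordered lists of OTUs and characters."""
    label: str
    otus: list
    characters: list

    def rows(self, source: GlobalMatrix) -> list:
        # Cells with no coded state are shown as "?", the usual NEXUS
        # missing-data symbol.
        return [
            [source.cells.get((otu, ch), "?") for ch in self.characters]
            for otu in self.otus
        ]

    def set_state(self, source: GlobalMatrix, otu: str,
                  character: str, state: str) -> None:
        # Editing through a view writes to the shared global matrix,
        # but only for cells that belong to this view.
        if otu not in self.otus or character not in self.characters:
            raise KeyError((otu, character))
        source.set_state(otu, character, state)


# Illustrative data only; not taken from any Phenoscape matrix.
store = GlobalMatrix()
store.set_state("Danio rerio", "dorsal fin", "present")
store.set_state("Danio rerio", "adipose fin", "absent")
store.set_state("Ictalurus punctatus", "adipose fin", "present")

fins = LabelledMatrix(
    label="fins",
    otus=["Danio rerio", "Ictalurus punctatus"],
    characters=["dorsal fin", "adipose fin"],
)
for otu, row in zip(fins.otus, fins.rows(store)):
    print(otu, row)
```

Because a view in this sketch stores no cell data of its own, an edit made through one labelled matrix is immediately visible in any other labelled matrix that includes the same OTU and character.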

Requirements expected to newly arise (or arise at a more demanding level) in Phenoscape II

  • Integrate with ORB:
    • request temporary term
    • check temporary terms for official ID
    • use previously requested temporary terms in data
  • Improved UI usability:
    • UI (data entry) and data model (OWL output) support for pre-configured frequently occurring types of characters (such as presence/absence (neomorphic), qualitative, count, relative phenotype)
    • as few clicks as possible for reaching features for composing annotations
    • avoid right-clicks where possible
    • ability to attach images to character states or entities
    • interface that unifies access to PDFs, SVN, matrix editor, ORB, etc.
  • Support collaborative phenotype annotation
    • real-time teaching of the curation tool, practices, and results to project curators
    • simultaneous editing of different parts of a single data matrix
    • ability to edit a data matrix without regard to current activities of other editors
    • ability to tie into real-time collaborative editing frameworks (such as Google’s upcoming one, codenamed BRIX)
    • ability to share PDFs
  • Support annotation of homology
    • evidence codes
    • attribution
  • Facilitate wider use and adoption
    • easy tool deployment to users, including software updates
    • easy deposition of annotation output to a shared repository
    • easy digitization of the published matrices
    • minimize or, ideally, obviate the need for maintaining third-party software dependencies (such as Mesquite or SVN tools)
    • support for deposition into TreeBASE

Phenex

Requirements from Phenoscape I curation workflow (that are expected to carry forward)

  • import matrix data from file source (usually NEXUS)
  • hold a reference to a publication as the source for the matrix
  • view and edit matrix
  • free text entry for characters and character states for a matrix
  • free text entry for OTUs
  • annotate OTUs with taxonomy ontology ID
  • add specimens for each OTU, select museum code from ontology
  • ontology term autocomplete for term input
  • term info panel for terms selected in autocomplete
  • ontology tree browsing panel or linkout to term in BioPortal tree view
  • output to Excel report - consistency review; author page
  • output to KB - currently requires NeXML

Requirements expected to newly arise (or arise at a more demanding level) in Phenoscape II

  • Integrate with ORB:
    • request temporary term
    • check temporary terms for official ID
    • use previously requested temporary terms in data
  • Improved UI usability:
    • UI (data entry) and data model (OWL output) support for pre-configured frequently occurring types of characters (such as presence/absence (neomorphic), qualitative, count, relative phenotype)
    • as few clicks as possible for reaching features for composing annotations
    • avoid right-clicks where possible
    • ability to attach images to character states or entities
    • interface that unifies access to PDFs, SVN, matrix editor, ORB, etc.
  • Support collaborative phenotype annotation
    • real-time teaching of the curation tool, practices, and results to project curators
    • simultaneous editing of different parts of a single data matrix
    • ability to edit a data matrix without regard to current activities of other editors
    • ability to tie into real-time collaborative editing frameworks (such as Google’s upcoming one, codenamed BRIX)
    • ability to share PDFs
  • Support annotation of homology
    • evidence codes
    • attribution
  • Facilitate wider use and adoption
    • easy tool deployment to users, including software updates
    • easy deposition of annotation output to a shared repository
    • easy digitization of the published matrices
    • minimize or, ideally, obviate the need for maintaining third-party software dependencies (such as Mesquite or SVN tools)
    • support for deposition into TreeBASE