Collaborative Phenotype Annotation

Revision as of 21:17, 18 September 2011 by Jim Balhoff (talk | contribs) (Requirements from Phenoscape I curation workflow (that are expected to carry forward))

Requirements

A requirements and priorities document is in development and being reviewed by stakeholders.

Technology options

MX

Requirements from Phenoscape I curation workflow (that are expected to carry forward)

  • import matrix data from file source (usually NEXUS)
    • Supported. Can import character labels, state labels, and matrix data from a NEXUS file. This is a one-time import before editing in MX.
  • hold a reference to a publication as the source for the matrix
    • Supported. MX has a module for maintaining references (publications), and references can be used to tag several different types of data throughout the system.
  • view and edit matrix
    • Supported. MX offers several matrix viewing and editing modes. Subsets of the matrix can be given a label and viewed or edited. In MX, all the characters and OTUs constitute one global matrix; a labelled matrix is an arbitrary view into this global matrix, defined by a specific list of characters and OTUs.
  • free text entry for characters and character states for a matrix
    • Supported.
  • free text entry for OTUs
    • Supported.
  • annotate OTUs with taxonomy ontology ID
    • Not directly supported. There is a taxonomy module which OTUs can reference. However, for Phenoscape's needs we would probably implement a separate ontology/controlled vocabulary field for OTUs.
  • add specimens for each OTU, select museum code from ontology
    • Supported but complex. MX has a fairly comprehensive specimen management module, but it is much more complex than the one in Phenex.
  • ontology term autocomplete for term input
    • Supported, via BioPortal data services. However, the EQ entry form is currently being updated and is in flux.
  • term info panel for terms selected in autocomplete
    • Not supported - would require implementation. A minimal, if limited, implementation could provide a link in MX that opens the BioPortal term info page.
  • ontology tree browsing panel or linkout to term in BioPortal tree view
    • Not supported - would require implementation. A trivial implementation could provide a link in MX that opens the BioPortal term info page.
  • output to Excel report - consistency review; author page
    • Not supported - would require implementation. Could be implemented cleanly as a publicly viewable web page in MX.
  • output to KB - currently requires NeXML
    • Not completely supported. An OWL output format exists which could be the basis for output to future revisions of the KB based on OWL technology.
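
The labelled-matrix model described above (one global matrix, with each labelled matrix acting as a view defined by a list of characters and OTUs) can be illustrated with a minimal sketch. This is not MX code: the class names `GlobalMatrix` and `LabelledMatrix`, the method names, and the example OTU and character names are all hypothetical, and the only borrowed convention is the NEXUS "?" symbol for missing data.

```python
from dataclasses import dataclass, field

MISSING = "?"  # conventional NEXUS symbol for missing data


@dataclass
class GlobalMatrix:
    """Hypothetical store holding every coding, keyed by (OTU, character)."""
    cells: dict = field(default_factory=dict)

    def code(self, otu, character, state):
        # Record (or overwrite) the state coded for one OTU/character pair.
        self.cells[(otu, character)] = state

    def state(self, otu, character):
        # Uncoded cells are reported as missing data.
        return self.cells.get((otu, character), MISSING)


@dataclass
class LabelledMatrix:
    """Hypothetical named view: just a label plus lists of OTUs and characters."""
    label: str
    otus: list
    characters: list

    def rows(self, store):
        # Build one row per OTU by reading cells from the shared global store.
        return [[store.state(o, c) for c in self.characters] for o in self.otus]


store = GlobalMatrix()
store.code("OTU A", "character 1", "absent")
store.code("OTU B", "character 1", "present")
store.code("OTU B", "character 2", "present")

view = LabelledMatrix("example view", ["OTU A", "OTU B"], ["character 1"])
print(view.rows(store))
```

Because a view stores no cell data of its own, two labelled matrices that share a character would see the same coding, which is the property that makes the shared-database approach attractive for collaborative editing.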

Requirements expected to newly arise (or arise at a more demanding level) in Phenoscape II

  • Integrate with ORB:
    • request temporary term
      • Needs implementation - straightforward.
    • check temporary terms for official ID
      • Needs implementation - a check could be run periodically in the background, updating the database directly.
    • use previously requested temporary terms in data
      • Needs implementation.
  • Improved UI usability:
    • UI (data entry) and data model (OWL output) support for pre-configured frequently occurring types of characters (such as presence/absence (neomorphic), qualitative, count, relative phenotype)
      • Support for character types has been implemented. However, based on user experience with the initial version, a complete revision is planned and is expected to begin very soon.
    • as few clicks as possible for reaching features for composing annotations
      • The current implementation provides a link next to a term field which pops open a post-composition window.
    • avoid right-clicks where possible
      • Right-clicks are not used in the interface.
    • ability to attach images to character states or entities
      • MX supports tagging of almost anything with images/figures.
    • interface that unifies access to pdfs, svn, matrix editor, orb, etc.
      • Currently supports storage of PDFs with references, and matrix editing. ORB integration can be implemented. SVN would not be necessary with a shared database.
  • Support collaborative phenotype annotation
    • real-time teaching of the curation tool, practices, and results to project curators
      • Web-based implementation would facilitate remote collaboration.
    • simultaneous editing of different parts of single data matrix
      • Supported well due to web-based implementation.
    • ability to edit a data matrix without regard to current activities of other editors
      • Supported well due to web-based implementation.
    • ability to tie into real-time collaborative editing frameworks (such as Google’s upcoming one, codenamed BRIX)
      • Unlikely to be practical.
    • ability to share pdfs
      • Supported in reference module.
  • Support annotation of homology
    • Not supported - could be developed fairly easily but would be its own new module.
    • evidence codes
      • Not supported but straightforward.
    • attribution
      • Not supported but straightforward.
  • Facilitate wider use and adoption
    • easy tool deployment to users, including software updates
      • Web-based application does not need to be installed by users; always up-to-date.
    • easy deposition of annotation output to a shared repository
      • Would need to be developed for whatever format is required.
    • easy digitization of the published matrices
      • Unsure of meaning.
    • minimize or ideally obviate the need for maintaining 3rd party software dependencies (such as Mesquite, or SVN tools)
      • SVN no longer needed. Mesquite might be needed for initial massaging of original NEXUS files.
    • support for deposition into TreeBASE
      • Not currently supported - an output format would need development.
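
The background check of temporary terms mentioned under ORB integration could look roughly like the sketch below. It is only an illustration under stated assumptions: the `TEMP:` prefix, the dictionary-based annotation records with `entity` and `quality` slots, and the `lookup_official_id` callable are all hypothetical, and no actual ORB or MX interface is implied.

```python
TEMP_PREFIX = "TEMP:"  # hypothetical marker for temporary term IDs


def resolve_temporary_terms(annotations, lookup_official_id):
    """Replace temporary term IDs with official ones where available.

    `annotations` is a list of dicts with optional "entity" and "quality"
    keys (an assumed record shape). `lookup_official_id` is any callable
    that returns the official ID for a temporary ID, or None if the term
    has not been assigned one yet. Returns the number of IDs replaced.
    """
    replaced = 0
    for annotation in annotations:
        for slot in ("entity", "quality"):
            term_id = annotation.get(slot)
            if term_id is None or not term_id.startswith(TEMP_PREFIX):
                continue  # nothing to resolve in this slot
            official = lookup_official_id(term_id)
            if official is not None:
                annotation[slot] = official
                replaced += 1
    return replaced


# Stand-in for a real lookup service: a plain dict of resolved terms.
resolved = {"TEMP:1": "EXAMPLE:0000001"}
records = [
    {"entity": "TEMP:1", "quality": "EXAMPLE:0000002"},
    {"entity": "TEMP:2"},  # still awaiting an official ID
]
count = resolve_temporary_terms(records, resolved.get)
print(count, records)
```

Passing the lookup in as a callable keeps the sketch independent of how official IDs are actually obtained; a scheduled job could supply whatever service call is eventually chosen.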

Phenex

Requirements from Phenoscape I curation workflow (that are expected to carry forward)

  • import matrix data from file source (usually NEXUS)
  • hold a reference to a publication as the source for the matrix
  • view and edit matrix
  • free text entry for characters and character states for a matrix
  • free text entry for OTUs
  • annotate OTUs with taxonomy ontology ID
  • add specimens for each OTU, select museum code from ontology
  • ontology term autocomplete for term input
  • term info panel for terms selected in autocomplete
  • ontology tree browsing panel or linkout to term in BioPortal tree view
  • output to Excel report - consistency review; author page
  • output to KB - currently requires NeXML

Requirements expected to newly arise (or arise at a more demanding level) in Phenoscape II

  • Integrate with ORB:
    • request temporary term
    • check temporary terms for official ID
    • use previously requested temporary terms in data
  • Improved UI usability:
    • UI (data entry) and data model (OWL output) support for pre-configured frequently occurring types of characters (such as presence/absence (neomorphic), qualitative, count, relative phenotype)
    • as few clicks as possible for reaching features for composing annotations
    • avoid right-clicks where possible
    • ability to attach images to character states or entities
    • interface that unifies access to pdfs, svn, matrix editor, orb, etc.
  • Support collaborative phenotype annotation
    • real-time teaching of the curation tool, practices, and results to project curators
    • simultaneous editing of different parts of single data matrix
    • ability to edit a data matrix without regard to current activities of other editors
    • ability to tie into real-time collaborative editing frameworks (such as Google’s upcoming one, codenamed BRIX)
    • ability to share pdfs
  • Support annotation of homology
    • evidence codes
    • attribution
  • Facilitate wider use and adoption
    • easy tool deployment to users, including software updates
    • easy deposition of annotation output to a shared repository
    • easy digitization of the published matrices
    • minimize or ideally obviate the need for maintaining 3rd party software dependencies (such as Mesquite, or SVN tools)
    • support for deposition into TreeBASE