
EQ Editor

Overview

Currently, the EQ Editor is being developed by customizing Phenote to our needs. We are collaborating with the developers of Phenote at NCBO by contributing improvements to the core Phenote code, and we are also developing specialized components and configuration required for our annotation workflow.

Requirements

These are the requirements as currently understood. An earlier draft can be viewed at EQ Editor requirements draft 1.

The EQ Editor will be used by curators to annotate phenotypic descriptions of Ostariophysi fish using EQ syntax and ontologies.

The curators will use the EQ Editor to code phenotypic data from an existing publication into the EQ format. These data consist of descriptions of character state values for corresponding species (or, more precisely, specimens). A published character state description may contain the species or higher taxonomic specification, a textual description of the character state value, a reference to a voucher specimen for this description, and an image showing the character.
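As a rough illustration of the source material described above, one published character state description could be held in a small record like the one below. This is only a sketch: the class name, field names, and example values are assumptions made for illustration and are not part of Phenote or of the EQ Editor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CharacterStateDescription:
    """One published character state description (hypothetical field names)."""
    taxon: str                              # species or higher taxonomic specification
    state_text: str                         # textual description of the character state value
    voucher_specimen: Optional[str] = None  # reference to a voucher specimen, if given
    image_ref: Optional[str] = None         # figure showing the character, if given


# Illustrative values only; the text is a placeholder, not real data.
example = CharacterStateDescription(
    taxon="Danio rerio",
    state_text="placeholder text copied from the publication",
)
```

The voucher and image fields are optional because the text says a description "may contain" them.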

Data Model

The following are essential data elements to be captured by the EQ Editor for each character state description (format is in parentheses). EQ coding will be performed at the level of character states, but there may be a facility for dynamically viewing entries in a character matrix format. At the June 5 PI meeting we decided to work with only character states; see “EQ for character matrices” for a discussion. Phenotype annotations should generally be made at the species taxonomic level.

Phenotype Annotation

Additional data per publication

Some data from the publications may be useful to store in our database, independent of phenotype annotations.

Workflow

Sources of data

The publication being coded may contain data in one of a few different formats. Each format may suggest its own style of workflow. These publication types include four main forms:

  1. Description of many characters for many species and higher taxonomic levels; no character-by-taxon matrix published.
  2. A data matrix with multiple species and multiple characters. There is a character state value for each cell in this matrix.
  3. A single species description of values for many characters.
  4. Description of a single character for many species (perhaps less common than the other formats?). If focusing on a single character the data may come from multiple publications.

Scenarios 2 and 3 appear to be special cases of scenario 1. For each character, an interface is required for choosing the Entity and Quality from their respective ontologies, and for entering free text such as the original character descriptions, as well as other relevant data.
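One way to read the reduction above is that every cell of a published matrix, or every character in a single-species description, becomes one (taxon, character, state) record, which is the same shape as the data in scenario 1. The sketch below assumes a hypothetical in-memory matrix keyed by taxon and then by character; the function name and all values are placeholders.

```python
from typing import Dict, List, Tuple

# Hypothetical character-by-taxon matrix: matrix[taxon][character] = state value.
Matrix = Dict[str, Dict[str, str]]


def matrix_to_descriptions(matrix: Matrix) -> List[Tuple[str, str, str]]:
    """Flatten a matrix into (taxon, character, state) records, one per cell."""
    records = []
    for taxon, states in matrix.items():
        for character, state in states.items():
            records.append((taxon, character, state))
    return records


# Placeholder names; a single-species description is a matrix with one row.
matrix = {
    "Taxon A": {"character 1": "state 0", "character 2": "state 1"},
    "Taxon B": {"character 1": "state 1", "character 2": "state 1"},
}
records = matrix_to_descriptions(matrix)
```

Each resulting record can then be annotated with an Entity and Quality in the same way, regardless of how the publication laid out the data.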

Detailed steps

In the standard workflow, a curator works with a single publication during an EQ Editor session. Steps might be:

  1. Create a new EQ Editor document.
  2. Enter document-wide data:
    1. publication information (author, title, journal)
    2. list of species - choose taxon from taxonomic ontology for each one
      • For each species, input list of specimen catalog numbers used in the publication, choosing museum institutions from a pick list
  3. Begin making character state annotations:
    1. Select a species or multiple species to which this character state applies (there should be facilities for efficiently choosing sets of taxa)
      • If the author makes a statement about a higher taxon (e.g. a family), make a phenotype annotation for every species in the materials list for that publication which is a member of the taxon.
    2. Create a new EQ statement
    3. Choose Entity from anatomy ontology
    4. Choose Quality from PATO (usually a Value term)
    5. If the Quality is an Attribute term (such as “length”), enter a measurement and its units
    6. If the Quality is relational, enter a second Entity from the anatomy ontology
    7. Enter the evidence code appropriate for the published statement. If the phenotype is clearly based on one or more specific specimen numbers, enter them into the catalog number(s) field to support the appropriate evidence code.
    8. Enter any information for the figure or section within the paper showing the character.
  4. At any point, the curator can save the current work to a document (or database).
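Two of the rules in step 3 can be sketched in code: expanding a statement about a higher taxon to the member species in the materials list, and checking that Attribute and relational Qualities carry the extra fields they need. Everything below is a hypothetical illustration. The class, field names, and the `ancestors` lookup (standing in for the taxonomic ontology) are assumptions, not the EQ Editor's actual data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class EQStatement:
    """Hypothetical record for one EQ annotation; field names are assumptions."""
    taxon: str
    entity: str                           # term from the anatomy ontology
    quality: str                          # term from PATO
    quality_kind: str = "value"           # "value", "attribute", or "relational"
    measurement: Optional[float] = None   # needed when the Quality is an Attribute
    unit: Optional[str] = None
    related_entity: Optional[str] = None  # needed when the Quality is relational

    def problems(self) -> List[str]:
        """List the fields still missing for this kind of Quality."""
        issues = []
        if self.quality_kind == "attribute" and (self.measurement is None or not self.unit):
            issues.append("attribute quality needs a measurement and units")
        if self.quality_kind == "relational" and not self.related_entity:
            issues.append("relational quality needs a second entity")
        return issues


def expand_taxon(taxon: str, materials: List[str], ancestors: Dict[str, Set[str]]) -> List[str]:
    """Species in the materials list that are the taxon itself or members of it."""
    return [s for s in materials if s == taxon or taxon in ancestors.get(s, set())]


# Placeholder taxa; `ancestors` maps each species to its higher taxa.
materials = ["Species A", "Species B", "Species C"]
ancestors = {
    "Species A": {"Family X"},
    "Species B": {"Family X"},
    "Species C": {"Family Y"},
}
targets = expand_taxon("Family X", materials, ancestors)
statements = [
    EQStatement(taxon=t, entity="entity term", quality="length", quality_kind="attribute")
    for t in targets
]
```

Here a statement about "Family X" yields one annotation each for "Species A" and "Species B", and `problems()` reports that each still needs a measurement and units before it is complete.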

Working with EQ statements

Questions

Roadmap

See the software roadmap for further plans.