Phenex

Revision as of 15:39, 4 April 2013 by Jim Balhoff (talk | contribs) (Reporting Bugs and Feature Requests)

Phenex is an application for annotating character matrix files with ontology terms. Character states can be annotated using the Entity-Quality syntax for ontologically describing phenotypes. In addition, taxon entries can be annotated with identifiers from a taxonomy ontology. Phenex saves ontology annotations alongside traditional character matrix data using the new NeXML format standard for evolutionary data. Phenex builds on Phenote and OBO-Edit from the OBO project. Jim Balhoff is the lead developer of Phenex.

Download & Installation

System Requirements

Phenex runs on any system with Java 5 or newer (Java 5 is also called version 1.5). Java 5 comes pre-installed on Mac OS X 10.4 and later. Phenex can't be used on older releases of Mac OS X (check your version of Mac OS X by selecting "About This Mac" under the Apple menu). For Windows, you can check your version of Java, and install if necessary, at http://www.java.com/. Phenex runs on Linux - just make sure you have Java 5 installed.

Direct downloads for Phenex 1.8

Choose the download appropriate for your platform:

Installation on Mac OS X

Double-click the downloaded file to unzip it. Copy Phenex.app anywhere you would like to install it. Double-click Phenex.app to launch the application.

Installation on Windows

On Windows you must be sure to extract the Phenex folder from the zip archive - Phenex will not function properly if run from within the zip archive. Right-click on the downloaded zip file and choose "Extract All...". Use the Extraction Wizard to copy Phenex to a folder on your desktop. You can then move the Phenex folder anywhere you like. Within the Phenex folder, double-click the file Phenex.bat to launch the application.

Installation on Linux/Unix

Unzip the downloaded archive using a command such as tar zxvf phenex-1.0-unix.tgz. Change directories into the Phenex folder and run the shell script to launch Phenex. You will need to have Java in your executable PATH.

License and Attribution

Phenex is open source software, released under the MIT license.

If you use Phenex for annotation or extend it for your project, please cite the following publication in your documentation and resulting publications:
Balhoff JP, Dahdul WM, Kothari CR, Lapp H, Lundberg JG, Mabee P, Midford PE, Westerfield ME, Vision TJ. 2010. Phenex: Ontological Annotation of Phenotypic Diversity. PLoS ONE 5(5): e10500. DOI

Source code and support

Please join the Phenex users mailing list for software support and discussion of Phenex features:

The Phenex source for a particular release can be found in the "src" archive at the download site. To make contributions to the project or test the latest in-development code, you should check out a working copy from the Git repository:

The Phenex source includes both an Eclipse project (used for most active development) and an Ant build file (used for making releases). You can browse Phenex source code on the web.

Running Phenex

After launching Phenex, Java can sometimes be a little slow to start up. If you're running Java 6 or later, you should see a splash screen image shortly after launching. This is followed by a panel informing you that Phenex is checking for ontology updates. If you have not run Phenex before, your computer needs to be online in order for Phenex to download the required ontologies. If you have run Phenex before, Phenex will check for the availability of a newer version of each ontology and download it if necessary. It is okay to work offline if Phenex has previously had a chance to download the ontologies. Phenex performs a check for ontology updates each time it is launched.

Reporting Bugs and Feature Requests

Report any bugs and make feature requests using the Phenex issue tracker. You will need to log in with a GitHub account to submit an issue (just create an account if you don't already have one).

Documentation

Ontology configuration

By default, Phenex comes pre-configured with the ontologies used for the Phenoscape project, but it can be configured to load terms from any OBO ontology from the web or a local file. Configuration of ontologies in Phenex has two components: (1) adding term sources - URLs representing OBO files; and (2) specifying the set of terms which should be available within each kind of entry field - an entry field can allow terms from a subset of a given ontology, or from more than one ontology.

Term sources

To edit the list of term sources, open the Ontology Sources panel by selecting View > Config > Ontology Sources from the menu. Add a new source by pressing the '+' button. Enter an HTTP or local file URL in the URL column, and an optional label of your choice in the Label column. Press "Apply" to save your list of ontology sources. You will need to relaunch Phenex in order for it to download the given files and load all the terms into its ontology session.
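If the ontology lives in a local file, the URL column needs a file URL rather than a plain path. As a small illustration, and not something Phenex itself provides, the following Python sketch converts a path to that form. The function name and the example path are hypothetical.

```python
from pathlib import Path


def to_file_url(path: str) -> str:
    """Return a file:// URL for a local ontology file path."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(path).expanduser().resolve().as_uri()


# Hypothetical path, for illustration only.
print(to_file_url("~/ontologies/my_terms.obo"))
```

The printed value can be pasted into the URL column of the Ontology Sources panel.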

Entry field filters

The set of terms available in a given entry field is determined by term filters. Term filters are saved search specifications which can be created using the Search Panel. To apply a term filter to a particular type of entry field, save it in one of the standard locations (below).

Creating a term filter

Open the Search Panel from View > Ontology > Search Panel. Configure the search to specify the needed terms - you can test the result by performing the search with the Search button. Save the filter by pressing the disk icon. An example of a term filter used for the entity field in the Phenoscape configuration is shown here.

Configuring entry fields

To configure a particular entry field, place a filter file with the appropriate name in the Filters folder within the Phenex settings folder. You may need to create the Filters folder. You will need to relaunch Phenex for the new filters to take effect. The following types of entry fields are available:

  • Taxa
    • <Phenex settings folder>/Filters/taxa.xml
    • Used in the Taxa panel: Valid Taxon column
  • Museum collections
    • <Phenex settings folder>/Filters/museums.xml
    • Used in the Specimens panel: Collection column
  • Entities
    • <Phenex settings folder>/Filters/entities.xml
    • Used in the Phenotypes panel: Entity column, Related Entity column
  • Qualities
    • <Phenex settings folder>/Filters/qualities.xml
    • Used in the Phenotypes panel: Quality column
  • Units
    • <Phenex settings folder>/Filters/units.xml
    • Used in the Phenotypes panel: Unit column
  • Relations
    • <Phenex settings folder>/Filters/relations.xml
    • Specifies the relations available when creating post-compositions
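To check which of the filter files listed above are actually in place, a short script can help. The following Python sketch is only an illustration: the file names come from the list above, while the function name and the idea of checking from a script are assumptions, not a Phenex feature.

```python
from pathlib import Path

# Entry field type -> filter file name, as listed above.
FILTER_FILES = {
    "Taxa": "taxa.xml",
    "Museum collections": "museums.xml",
    "Entities": "entities.xml",
    "Qualities": "qualities.xml",
    "Units": "units.xml",
    "Relations": "relations.xml",
}


def filter_status(settings_folder):
    """Map each entry field type to (expected path, whether the file exists)."""
    filters_dir = Path(settings_folder) / "Filters"
    status = {}
    for field, name in FILTER_FILES.items():
        path = filters_dir / name
        status[field] = (path, path.is_file())
    return status
```

Calling filter_status with the path of your Phenex settings folder returns, for each entry field, where Phenex would look for the filter and whether a file is present there.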

Locating the Phenex settings folder

The Phenex settings folder contains cached copies of downloaded ontology files, entry field filter files, and other settings files. Its location is dependent on the platform on which you're running Phenex.

  • Mac OS X: <user's home>/Library/Application Support/Phenex
    • In Mac OS X 10.7+, this folder is hidden but can be made visible by going to Finder > Go > Go to Folder, and typing in "~/Library"
  • Windows: C:\Documents and Settings\<user's name>\Phenex
  • Unix: <user's home>/.phenex
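The platform rules above can be written down as a small lookup. This Python sketch simply restates the three locations; the function name and its parameters are hypothetical, and it does not query Phenex in any way.

```python
from pathlib import PurePosixPath, PureWindowsPath


def phenex_settings_folder(platform, home, user):
    """Return the settings folder location described above.

    platform: a value such as sys.platform ("darwin", "win32", "linux")
    home:     the user's home directory (used on Mac OS X and Unix)
    user:     the user's name (used on Windows)
    """
    if platform == "darwin":
        return PurePosixPath(home) / "Library" / "Application Support" / "Phenex"
    if platform.startswith("win"):
        return PureWindowsPath("C:/Documents and Settings") / user / "Phenex"
    # Anything else is treated as Unix in this sketch.
    return PurePosixPath(home) / ".phenex"


print(phenex_settings_folder("linux", "/home/ann", "ann"))
```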

Menu choices for opening files

  • File > Open...
    • Open an existing NeXML file (native Phenex format), replacing all existing data.
  • File > Merge > Tab-delimited Taxa...
    • Merge data from a Phenote taxon list file into your existing data.
  • File > Merge > Tab-delimited Characters...
    • Merge data from a Phenote character EQ list file into your existing data.
  • File > Merge > NEXUS Data...
    • Merge data from an existing NEXUS file into your existing data.
  • File > Merge > NeXML Data...
    • Merge data from an existing NeXML file into your existing data.

Merging data from Phenote files and NEXUS matrices (for Phenoscape curators)

Double-checking Taxon Lists for omissions and errors

  1. Open the taxon file in Excel and print the taxon list. Compare the printed copy against the publication's materials list, and use Phenote+ to add missing taxa or make other corrections to the file. Remember to upload the corrected file to the fileshare, and use the corrected file to merge in Phenex.

Importing matrix files and taxon lists into Phenex

  1. In order to combine a taxon list created in Phenote with matrix data stored in a NEXUS file, both files need to use identical taxon names. For the Phenote file, this is the value in "Publication Taxon" (NOT "Valid Taxon"). For the NEXUS file, this is the ordinary taxon name you can see and edit in Mesquite. It is much easier if all the taxon names are made identical before you do the merge.
  2. First, open the taxon list in Phenote and the NEXUS file in Mesquite. Make sure all the taxon names you want to match are identical. It is okay if the files don't have the same number of taxa - if there are more in the taxon list, the extra taxa will be added. If there are fewer, you will just have some that don't get matched.
  3. Next, make sure that each character description field has some content beyond a number. A blank description leads to an import error in the Phenex release current at the time of writing (1.0-beta4; 25 Aug 08). Copying the free-text character name from the publication PDF works well. The character state free text, however, is easier to paste in Phenex than in Mesquite, because Mesquite doesn't return to the next character in the list after you enter the character states for the previous character.
  4. While the NEXUS file is open in Mesquite, choose "File > Save File As..." and save a new copy of the NEXUS file. Make sure you have the most recent copy of Mesquite. This will ensure that Mesquite re-writes the NEXUS file using a modern format which is more likely to be successfully opened by Phenex. Use this new copy of the NEXUS file for all subsequent steps.
  5. Launch Phenex. Choose "File > Import NEXUS...". Select your NEXUS file and open it. Verify that all the data is imported as expected, including the matrix.
  6. Now choose "File > Merge > Tab-delimited Taxa...". Select your taxon list file and open it. The import will seem a little slow. Verify that the taxa you loaded from your NEXUS file now have TTO terms associated with them under "Valid Taxon", and specimens in the specimens panel (assuming you had this data in your taxon list).
  7. If some taxa from the Phenote taxon list were not matched to existing taxa in Phenex, you will see them at the end of the list. Also, these taxa would have only empty cell values in the matrix panel. If you expected these taxa to match existing taxa, check the taxon names and start over.
  8. Save your Phenex data to a new file using "File > Save". It is recommended to append ".xml" to the file name.
  9. NOTE: If your publication does not include a character-by-taxon matrix, you can use "File > Merge > Tab-delimited Taxa..." to import the taxon list and then save the result as a Phenex .xml file. A matrix can be added at some later point, either by entering it in Phenex or by using the matrix import feature Jim is developing (the first version of this is available now in 1.0-beta5).
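Because the merge depends on taxon names being identical in both files, it can save time to compare the two name lists before starting. The Python sketch below assumes you have already extracted the "Publication Taxon" values from the Phenote file and the taxon names from the NEXUS file into plain lists; the function name is hypothetical and the comparison is a simple exact match.

```python
def compare_taxon_names(phenote_names, nexus_names):
    """Report which taxon names match exactly and which appear in only one file."""
    phenote = set(phenote_names)
    nexus = set(nexus_names)
    return {
        "matched": sorted(phenote & nexus),
        "only_in_taxon_list": sorted(phenote - nexus),
        "only_in_nexus": sorted(nexus - phenote),
    }


report = compare_taxon_names(
    ["Acestrorhynchus lacustris", "Taxon B"],
    ["Acestrorhynchus lacustris", "Taxon b"],
)
print(report["only_in_taxon_list"], report["only_in_nexus"])
```

Names that differ only in case or spacing, like the second pair here, show up in both "only" lists, which is a hint that one of the files needs correcting before the merge.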

Merging EQ annotations from a tab-delimited file

Using the menu item "File > Merge > Tab-delimited Characters...", you can merge EQ annotations from a tab file into an existing data set. The "Character Number" and "State Number" columns are used to match a character (by index) and state (by symbol) in the existing data set. If the index falls outside the current range of characters, a new character is appended to the existing data set. If a state with the given symbol does not exist, a new state is appended to the given character. The character and state labels in the tab file will overwrite any labels in the existing data set. If a "Count" value is not a basic number, it will be appended to the Notes field instead of Count.
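The matching rules in this section can be sketched in a few lines of Python. This is an illustration of the described behavior, not Phenex's actual code. The data structures are invented for the example, the column names "Character", "State", "Entity", "Quality" and "Notes" are hypothetical, and the character index is assumed to be 1-based.

```python
def merge_eq_rows(characters, rows):
    """Merge tab-file rows into `characters` in place and return it."""
    for row in rows:
        # Match the character by index (assumed to be 1-based here).
        index = int(row["Character Number"]) - 1
        if 0 <= index < len(characters):
            character = characters[index]
        else:
            # Out-of-range index: append a new character.
            character = {"label": "", "states": []}
            characters.append(character)
        character["label"] = row["Character"]  # tab-file label overwrites

        # Match the state by symbol, appending a new state if needed.
        symbol = str(row["State Number"])
        state = next((s for s in character["states"] if s["symbol"] == symbol), None)
        if state is None:
            state = {"symbol": symbol, "label": "", "annotations": []}
            character["states"].append(state)
        state["label"] = row["State"]  # tab-file label overwrites

        annotation = {
            "entity": row.get("Entity"),
            "quality": row.get("Quality"),
            "count": None,
            "notes": row.get("Notes") or "",
        }
        # A Count that is not a plain number goes to the notes instead.
        count = str(row.get("Count") or "").strip()
        if count.isdigit():
            annotation["count"] = int(count)
        elif count:
            annotation["notes"] = (annotation["notes"] + " " + count).strip()
        state["annotations"].append(annotation)
    return characters
```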

Merging matrix data (taxon by character state associations) from a NEXUS file

Using the menu item "File > Merge NEXUS Matrix...", you can merge matrix values into an existing data set. Characters are matched via their index. Extra characters are appended to the existing data. Values are matched by comparing the symbol - if a state with that symbol is not available for the character in the existing data set, a new state with that symbol is added to the character. Taxa are matched via their Publication Name. Matrix values for unmatched taxa are unaltered.
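The matrix merge rules can be sketched in the same spirit. Again this is only an illustration under stated assumptions, not the Phenex implementation: the dictionary layout is invented, incoming taxa without a match are assumed to be skipped, and "?" and "-" are assumed to be missing-data symbols that do not create new states.

```python
MISSING = {"?", "-"}  # assumption: these symbols never become states


def merge_matrix(dataset, incoming_rows):
    """Merge {taxon publication name: [state symbols]} into `dataset` in place.

    dataset has the (hypothetical) shape:
      {"taxa": [names], "characters": [{"label": str, "states": [symbols]}],
       "cells": {(taxon name, character index): symbol}}
    """
    characters = dataset["characters"]
    # Characters are matched by index; extra ones are appended.
    width = max((len(symbols) for symbols in incoming_rows.values()), default=0)
    while len(characters) < width:
        characters.append({"label": "", "states": []})

    for taxon, symbols in incoming_rows.items():
        if taxon not in dataset["taxa"]:
            continue  # assumption: unmatched incoming taxa are ignored
        for index, symbol in enumerate(symbols):
            states = characters[index]["states"]
            # Add a state when the symbol is new for this character.
            if symbol not in MISSING and symbol not in states:
                states.append(symbol)
            dataset["cells"][(taxon, index)] = symbol
    return dataset
```

Cells belonging to taxa that are absent from the incoming matrix are never touched, which corresponds to the statement that matrix values for unmatched taxa are unaltered.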

NeXML file format extensions

This is out of date and will be updated with the new NeXML metadata scheme soon!

The NeXML schema provides an XML format for saving evolutionary character data. It provides for customization by allowing most elements to be annotated by including a "dict" element containing key-value data. Keys are text while the values can be textual, numeric, or arbitrary XML. Phenex stores ontological annotations and application-specific data in a number of proprietary dict elements. While these dict elements are specific to Phenex, all standard data in the file can still be read and used by other applications that support NeXML.

While these data are currently stored within proprietary Phenex dicts, eventually common needs for such data across applications should be identified and public standards should be established and implemented. While reading and writing files, Phenex attempts to preserve and "round-trip" any unsupported NeXML data or unknown annotation elements.

Document metadata

Document metadata is stored within the root "nexml" element in a dict using the key "phenex-metadata". An "any" element contains custom XML with the elements listed below.

    <dict>
      <key>phenex-metadata</key>
      <any>
        <curators>W. Dahdul</curators>
        <publication>Buckup, 1998</publication>
        <publicationNotes>Evidence codes for matrix is IVS</publicationNotes>
      </any>
    </dict>

Taxon and specimen data

An OBO ontology identifier for a taxon is stored within the taxon's "otu" element in a dict using the key "OBO_ID". The value is a string representing the OBO identifier. Specimens for this taxon are also stored within the taxon's "otu" element, within a dict using the key "OBO_specimens". The value for this dict is an "any" element containing custom "specimen" XML elements. The OBO identifier for the museum collection is stored in a "collection" attribute while the specimen accession number is in an "accession" attribute.

    <otu id="4341922c-a479-4088-b02e-42a0e248a825" label="Acestrorhynchus lacustris">
      <dict>
        <key>OBO_ID</key>
        <string>TTO:1030219</string>
      </dict>
      <dict>
        <key>OBO_specimens</key>
        <any>
          <specimen collection="COLLECTION:0000403" accession="205830"/>
          <specimen collection="COLLECTION:0000403" accession="206997"/>
          <specimen collection="COLLECTION:0000403" accession="207388"/>
          <specimen collection="COLLECTION:0000403" accession="207768"/>
          <specimen collection="COLLECTION:0000403" accession="207850"/>
        </any>
      </dict>
    </otu>
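For readers who want to pull this data out of a file with a script, the following Python sketch reads the taxon identifier and specimens from an "otu" element laid out as in the example above (shortened to two specimens). It is a minimal sketch that assumes the element carries no XML namespace, as in the fragment shown. Real NeXML files typically declare namespaces, in which case the element names would need to be qualified. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

OTU_XML = """
<otu id="4341922c-a479-4088-b02e-42a0e248a825" label="Acestrorhynchus lacustris">
  <dict>
    <key>OBO_ID</key>
    <string>TTO:1030219</string>
  </dict>
  <dict>
    <key>OBO_specimens</key>
    <any>
      <specimen collection="COLLECTION:0000403" accession="205830"/>
      <specimen collection="COLLECTION:0000403" accession="206997"/>
    </any>
  </dict>
</otu>
"""


def read_otu(otu):
    """Return the label, OBO taxon ID, and (collection, accession) pairs."""
    result = {"label": otu.get("label"), "obo_id": None, "specimens": []}
    for d in otu.findall("dict"):
        key = d.findtext("key")
        if key == "OBO_ID":
            result["obo_id"] = d.findtext("string")
        elif key == "OBO_specimens":
            result["specimens"] = [
                (s.get("collection"), s.get("accession"))
                for s in d.findall("any/specimen")
            ]
    return result


info = read_otu(ET.fromstring(OTU_XML))
print(info["obo_id"], info["specimens"])
```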

Phenotype annotations

Phenotype annotations are stored within the "state" element to which they correspond, within a dict using the key "OBO_phenotype". The value for this dict is an "any" element containing a "phenotype" element. The "phenotype" element and its children are taken from the PhenoXML schema.

    <state id="889131fa-7e4d-45f0-b918-14d3ba806c65" label="present" symbol="1">
      <dict>
        <key>OBO_phenotype</key>
        <any>
          <phen:phenotype>
            <phen:phenotype_character>
              <phen:description/>
              <phen:bearer>
                <phen:typeref about="TAO:0000203"/>
              </phen:bearer>
              <phen:quality>
                <phen:typeref about="PATO:0000467"/>
              </phen:quality>
            </phen:phenotype_character>
          </phen:phenotype>
        </any>
      </dict>
    </state>
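The entity and quality identifiers can be read back out of such a "state" element with a namespace-aware parser. The Python sketch below is an illustration only. The fragment above does not show where the "phen" prefix is declared, so the namespace URI used here is a placeholder and must be replaced with the one declared in your file. The function name is hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Placeholder URI: substitute the namespace your NeXML file binds to "phen".
NS = {"phen": "http://example.org/phenoxml-placeholder"}

STATE_XML = """
<state xmlns:phen="http://example.org/phenoxml-placeholder"
       id="889131fa-7e4d-45f0-b918-14d3ba806c65" label="present" symbol="1">
  <dict>
    <key>OBO_phenotype</key>
    <any>
      <phen:phenotype>
        <phen:phenotype_character>
          <phen:description/>
          <phen:bearer><phen:typeref about="TAO:0000203"/></phen:bearer>
          <phen:quality><phen:typeref about="PATO:0000467"/></phen:quality>
        </phen:phenotype_character>
      </phen:phenotype>
    </any>
  </dict>
</state>
"""


def read_eq(state):
    """Return a list of (entity ID, quality ID) pairs for one state."""
    pairs = []
    for d in state.findall("dict"):
        if d.findtext("key") != "OBO_phenotype":
            continue
        for pc in d.findall("any/phen:phenotype/phen:phenotype_character", NS):
            entity = pc.find("phen:bearer/phen:typeref", NS)
            quality = pc.find("phen:quality/phen:typeref", NS)
            pairs.append((
                entity.get("about") if entity is not None else None,
                quality.get("about") if quality is not None else None,
            ))
    return pairs


print(read_eq(ET.fromstring(STATE_XML)))
```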

Entering polymorphic or uncertain state values

The Phenex matrix editor can handle entry of polymorphic or uncertain state values within cells. In order to enter multiple states in a single cell, you must use the matrix "quick editor". Just beneath the matrix view, there is a checkbox titled "Use quick editor". After enabling this setting, when you click to edit a cell, instead of popping up a states menu, the cell will become an editable text field, which operates very much like the matrix cell editor in Mesquite. Here you can simply type the symbol of the state you want to enter. If you want to enter a polymorphism of states 0 and 1, simply enter "0&1" (no spaces). For an uncertainty, use a slash: "0/1".
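The cell syntax is simple enough to describe with a short parser. The following Python sketch shows one way to interpret the text typed into a cell. It is not the quick editor's actual code, the function name is hypothetical, and rejecting a mix of "&" and "/" is an assumption of this sketch.

```python
def parse_cell(text):
    """Interpret quick-editor cell text as (kind, list of state symbols)."""
    text = text.strip()
    if "&" in text and "/" in text:
        raise ValueError("mixing '&' and '/' is not handled in this sketch")
    if "&" in text:
        kind, symbols = "polymorphic", text.split("&")
    elif "/" in text:
        kind, symbols = "uncertain", text.split("/")
    else:
        kind, symbols = "single", [text]
    # Symbols must be non-empty and contain no spaces ("0&1", not "0 & 1").
    if any(s == "" or " " in s for s in symbols):
        raise ValueError(f"malformed cell value: {text!r}")
    return kind, symbols


print(parse_cell("0&1"))
print(parse_cell("0/1"))
```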

Consistency Review panel

Phenex includes a Consistency Review panel which reports problematic or missing annotations. The consistency issues currently evaluated are:

  • Unannotated state.
  • Empty entity field.
  • Empty quality field.
  • Post-composition consisting of more than one differentia (may be okay).
  • Relational quality used without a related entity.
  • Related entity entered without a relational quality.
  • Biological process entity not used with a process quality.
  • Qualities descending from different attributes used in states for a given character.
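To make a few of these checks concrete, here is a Python sketch covering the field-level issues from the list above. It is an illustration, not the panel's implementation. The Annotation class is invented for the example, and the quality_is_relational flag is a hypothetical stand-in for a lookup that Phenex would perform against the quality ontology. The ontology-dependent checks (process entities, differing attributes, post-compositions) are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annotation:
    entity: Optional[str] = None
    quality: Optional[str] = None
    related_entity: Optional[str] = None
    # Hypothetical flag; in practice this would come from the ontology.
    quality_is_relational: bool = False


def review_state(annotations: List[Annotation]) -> List[str]:
    """Return the consistency issues found for one character state."""
    if not annotations:
        return ["Unannotated state."]
    issues = []
    for a in annotations:
        if not a.entity:
            issues.append("Empty entity field.")
        if not a.quality:
            issues.append("Empty quality field.")
        if a.quality_is_relational and not a.related_entity:
            issues.append("Relational quality used without a related entity.")
        if a.related_entity and not a.quality_is_relational:
            issues.append("Related entity entered without a relational quality.")
    return issues


print(review_state([Annotation(entity="TAO:0000203")]))
```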

Troubleshooting Problems

Removing Corrupted Ontology Files

Local copies of ontology files can become corrupted, causing Phenex to display a warning about "dangling" terms on start-up. Note that the warning about danglers can also indicate a valid ontology change related to merging of terms from a recent ontology update.

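To remove local copies of ontologies from your Phenex directory, delete all files within the "Ontology Cache" folder on your computer. On Mac OS X, this folder is at <user's home>/Library/Application Support/Phenex/Ontology Cache/. On other platforms, look inside the Phenex settings folder for your platform.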

Phenex will then download new copies of the ontology files on startup.
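If you prefer to clear the cache from a script rather than by hand, something along these lines would do it. This Python sketch is an illustration with a hypothetical function name. It deletes files permanently, so quit Phenex first and double-check the folder path before running it.

```python
from pathlib import Path


def clear_ontology_cache(cache_dir):
    """Delete every file directly inside cache_dir; return the removed names."""
    cache = Path(cache_dir).expanduser()
    removed = []
    if not cache.is_dir():
        return removed  # nothing to do if the folder does not exist
    for entry in cache.iterdir():
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)


# Example for Mac OS X (uncomment to run):
# clear_ontology_cache("~/Library/Application Support/Phenex/Ontology Cache")
```

The folder itself is left in place, and only the files inside it are removed.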