Difference between revisions of "Phenex"

From phenoscape

Revision as of 14:50, 4 May 2009


Phenex is an application for annotating character matrix files with ontology terms. Character states can be annotated using the Entity-Quality syntax for ontologically describing phenotypes. In addition, taxon entries can be annotated with identifiers from a taxonomy ontology. Phenex saves ontology annotations alongside traditional character matrix data using the new NeXML format standard for evolutionary data. Phenex is hosted by the OBO project and builds on Phenote and OBO-Edit. Jim Balhoff is the lead developer of Phenex.

System Requirements

Phenex runs on any system with Java 5 or newer (Java 5 is also called version 1.5). Java 5 comes pre-installed on Mac OS X 10.4 and later. Phenex can't be used on older releases of Mac OS X (check your version of Mac OS X by selecting "About This Mac" under the Apple menu). For Windows, you can check your version of Java, and install it if necessary, at http://www.java.com/. Phenex also runs on Linux, provided Java 5 or newer is installed.

Download & Installation

The latest version of Phenex can be obtained from OBO's Sourceforge project site. Until the release of Phenex 1.0, only in-development "beta" downloads are available. Download the file appropriate for your system ("mac", "win", or "unix"). The "src" download contains the source code for the current release of Phenex.

Installation on Mac OS X

Double-click the downloaded file to unzip it. Copy Phenex.app anywhere you would like to install it. Double-click Phenex.app to launch the application.

Installation on Windows

On Windows you must be sure to extract the Phenex folder from the zip archive - Phenex will not function properly if run from within the zip archive. Right-click on the downloaded zip file and choose "Extract All...". Use the Extraction Wizard to copy Phenex to a folder on your desktop. You can then move the Phenex folder anywhere you like. Within the Phenex folder, double-click the file Phenex.bat to launch the application.

Installation on Linux/Unix

Unzip the downloaded archive using a command such as tar zxvf phenex-1.0-unix.tgz. Change directories into the Phenex folder and run the shell script to launch Phenex. You will need to have Java in your executable PATH.

Obtaining Phenex source code (for developers)

The Phenex source for a particular release can be found in the "src" archive at the download site. To make contributions to the project or test the latest in-development code, you should check out a working copy from the Subversion repository.

The Phenex source includes both an Eclipse project (used for most active development) and an Ant build file (used for making releases). You can browse Phenex SVN revisions on the web at http://obo.svn.sourceforge.net/viewvc/obo/phenex/.

Running Phenex

After launching Phenex, Java can sometimes be a little slow to start up. If you're running Java 6 or later, you should see a splash screen image shortly after launching. This is followed by a panel informing you that Phenex is checking for ontology updates. If you have not run Phenex before, your computer needs to be online in order for Phenex to download the required ontologies. If you have run Phenex before, Phenex will check for the availability of a newer version of each ontology and download it if necessary. It is okay to work offline if Phenex has previously had a chance to download the ontologies. Phenex performs a check for ontology updates each time it is launched.

Reporting Bugs and Feature Requests

Report any bugs and make feature requests using the Phenex issue tracker. You will need to log in with a Sourceforge account to submit an issue (just create an account if you don't already have one). Be sure to choose a category for your new issue, either Bug Report or Feature Request.

Assigning request priorities

Prioritizing tracker issues is essential for guiding Phenex development. Depending on your project role, you may be able to assign a priority to issues you submit. Otherwise feel free to suggest a priority in your issue description. Use the following guidelines in assigning priorities, from 1 (lowest) to 9 (highest). Not every number has a definition - this allows some flexibility in ordering what tasks should be addressed first.

  • 9 - Showstopper bug or missing feature; makes Phenex unusable. Items of this priority take precedence over all other work.
  • 8 - Serious problem, quite urgent, but there is a very tedious workaround. Should be addressed before any other Phenex work.
  • 7 - Serious issue, but workaround is not bad.
  • 6
  • 5
  • 4 - Standard priority. Normal bug or feature request. Use 4, 5, and 6 to schedule some tasks before others (the tracker creates new requests with priority 5 by default).
  • 3 - I would really like this, but I appreciate that it is not a high priority.
  • 2
  • 1 - Nice idea to keep in mind; wishlist. Not scheduled for implementation.

Documentation

Menu choices for opening files

  • File > Open...
    • Open an existing NeXML file (native Phenex format), replacing all existing data.
  • File > Merge > Tab-delimited Taxa...
    • Merge data from a Phenote taxon list file into your existing data.
  • File > Merge > Tab-delimited Characters...
    • Merge data from a Phenote character EQ list file into your existing data.
  • File > Merge > NEXUS Data...
    • Merge data from an existing NEXUS file into your existing data.
  • File > Merge > NeXML Data...
    • Merge data from an existing NeXML file into your existing data.

Merging data from Phenote files and NEXUS matrices (for Phenoscape curators)

Double-checking Taxon Lists for omissions and errors

  1. Open the taxon file in Excel and print the taxon list. Compare the printed copy against the publication's materials list, and use Phenote+ to add missing taxa or make other corrections to the file. Remember to upload the corrected file to the fileshare, and use the corrected file to merge in Phenex.

Importing matrix files and taxon lists into Phenex

  1. In order to combine a taxon list created in Phenote with matrix data stored in a NEXUS file, both files need to use identical taxon names. For the Phenote file, this is the value in "Publication Taxon" (NOT "Valid Taxon"). For the NEXUS file, this is the ordinary taxon name you can see and edit in Mesquite. It is much easier if all the taxon names are made identical before you do the merge.
  2. First, open the taxon list in Phenote and the NEXUS file in Mesquite. Make sure all the taxon names you want to match are identical. It is okay if the files don't have the same number of taxa - if there are more in the taxon list, the extra taxa will be added. If there are fewer, you will just have some that don't get matched.
  3. Next, make sure that the character description field has some content (more than just a number). If it is blank, the import fails in the current Phenex release (1.0-beta4; 25 Aug 08). Copying the free-text character name from the publication PDF works well. The free-text character state descriptions, however, are easier to paste in Phenex than in Mesquite, because Mesquite does not return to the next character in the list after you enter the states for the previous character.
  4. Make sure you have the most recent copy of Mesquite. While the NEXUS file is open in Mesquite, choose "File > Save File As..." and save a new copy of the NEXUS file. Saving from a current copy of Mesquite ensures that the NEXUS file is rewritten in a modern format, which is more likely to be opened successfully by Phenex. Use this new copy of the NEXUS file for all subsequent steps.
  5. Launch Phenex. Choose "File > Import NEXUS...". Select your NEXUS file and open it. Verify that all the data is imported as expected, including the matrix.
  6. Now choose "File > Merge > Tab-delimited Taxa...". Select your taxon list file and open it. The import will seem a little slow. Verify that the taxa you loaded from your NEXUS file now have TTO terms associated with them under "Valid Taxon", and specimens in the specimens panel (assuming you had this data in your taxon list).
  7. If some taxa from the Phenote taxon list were not matched to existing taxa in Phenex, you will see them at the end of the list. These taxa will also have only empty cell values in the matrix panel. If you expected these taxa to match existing taxa, check the taxon names and start over.
  8. Save your Phenex data to a new file using "File > Save". It is recommended to append ".xml" to the file name.
  9. NOTE: If your publication does not include a character-by-taxon matrix, you can use "File > Merge > Tab-delimited Taxa..." to import the taxon list and then save the result as a Phenex .xml file. A matrix can be added later, either by entering it in Phenex or by using the matrix import feature that Jim is developing (a first version is available in 1.0-beta5).
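The name comparison described in steps 1, 2 and 7 can be checked before merging. The following Python sketch is not part of Phenex; it assumes the two name lists have already been extracted by hand (from the "Publication Taxon" column and from the NEXUS file), and the function name and sample names are hypothetical.

```python
def compare_taxon_names(publication_taxa, nexus_taxa):
    """Report how a taxon list would line up with NEXUS taxon names.

    Matching is by exact string equality, as the steps above require.
    """
    list_set = set(publication_taxa)
    nexus_set = set(nexus_taxa)
    return {
        "matched": sorted(list_set & nexus_set),
        # In the taxon list only: these would be added as extra taxa.
        "only_in_taxon_list": sorted(list_set - nexus_set),
        # In the NEXUS file only: these would stay unmatched.
        "only_in_nexus": sorted(nexus_set - list_set),
    }


report = compare_taxon_names(
    ["Acestrorhynchus lacustris", "Hoplias malabaricus"],
    ["Acestrorhynchus lacustris", "Hoplias  malabaricus"],  # note the double space
)
for key, names in report.items():
    print(key, names)
```

Any name that appears in both "only" lists with a near-identical spelling is a likely candidate for correction before the merge.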

Merging EQ annotations from a tab-delimited file

Using the menu item "File > Merge > Tab-delimited Characters...", you can merge EQ annotations from a tab file into an existing data set. The "Character Number" and "State Number" columns are used to match a character (by index) and state (by symbol) in the existing data set. If the index falls outside the current range of characters, a new character is appended to the existing data set. If a state with the given symbol does not exist, a new state is appended to the given character. The character and state labels in the tab file will overwrite any labels in the existing data set. If a "Count" value is not a basic number, it will be appended to the Notes field instead of Count.
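The matching rules above can be sketched in Python. This is an illustration of the described behaviour, not Phenex's actual code: the `Character` and `State` classes and the `merge_row` function are hypothetical, the data model is simplified (Count and Notes are kept directly on the state), "Character Number" is assumed to be 1-based, and "basic number" is assumed to mean a string of digits.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    symbol: str
    label: str = ""
    count: str = ""
    notes: str = ""


@dataclass
class Character:
    label: str = ""
    states: list = field(default_factory=list)


def merge_row(characters, char_number, char_label, state_symbol, state_label, count):
    """Merge one tab-file row into `characters` (a list of Character)."""
    index = char_number - 1  # assumption: "Character Number" is 1-based
    if 0 <= index < len(characters):
        character = characters[index]
    else:
        # Index outside the current range: append a new character.
        character = Character()
        characters.append(character)
    character.label = char_label  # tab-file labels overwrite existing ones

    state = next((s for s in character.states if s.symbol == state_symbol), None)
    if state is None:
        # No state with this symbol yet: append one to the character.
        state = State(symbol=state_symbol)
        character.states.append(state)
    state.label = state_label

    value = count.strip()
    if value.isdigit():
        state.count = value
    elif value:
        # Not a basic number: keep the text in Notes instead of Count.
        state.notes = (state.notes + " " + value).strip()
    return characters


chars = [Character(label="old", states=[State(symbol="0", label="absent")])]
merge_row(chars, 1, "Dorsal fin", "1", "present", "2")   # matches character 1, adds state "1"
merge_row(chars, 5, "Anal fin", "0", "absent", "few")    # out of range, appended as a new character
```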

Merging matrix data (taxon by character state associations) from a NEXUS file

Using the menu item "File > Merge NEXUS Matrix...", you can merge matrix values into an existing data set. Characters are matched via their index. Extra characters are appended to the existing data. Values are matched by comparing the symbol - if a state with that symbol is not available for the character in the existing data set, a new state with that symbol is added to the character. Taxa are matched via their Publication Name. Matrix values for unmatched taxa are unaltered.

NeXML file format extensions

The NeXML schema provides an XML format for saving evolutionary character data. It provides for customization by allowing most elements to be annotated by including a "dict" element containing key-value data. Keys are text while the values can be textual, numeric, or arbitrary XML. Phenex stores ontological annotations and application-specific data in a number of proprietary dict elements. While these dict elements are specific to Phenex, all standard data in the file can still be read and used by other applications that support NeXML.

While these data are currently stored within proprietary Phenex dicts, eventually common needs for such data across applications should be identified and public standards should be established and implemented. While reading and writing files, Phenex attempts to preserve and "round-trip" any unsupported NeXML data or unknown annotation elements.

Document metadata

Document metadata is stored within the root "nexml" element in a dict using the key "phenex-metadata". An "any" element contains custom XML with the elements listed below.

<xml>
<dict>
   <key>phenex-metadata</key>
   <any>
     <curators>W. Dahdul</curators>
     <publication>Buckup, 1998</publication>
     <publicationNotes>Evidence codes for matrix is IVS</publicationNotes>
   </any>
</dict>
</xml>
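As a rough illustration of how another program might read this dict, here is a minimal Python sketch using the standard library. It assumes the elements carry no XML namespace, exactly as in the snippet above; a real NeXML file is likely to declare namespaces, in which case the tag names would need to be qualified. The function name `read_metadata` is hypothetical.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<dict>
  <key>phenex-metadata</key>
  <any>
    <curators>W. Dahdul</curators>
    <publication>Buckup, 1998</publication>
    <publicationNotes>Evidence codes for matrix is IVS</publicationNotes>
  </any>
</dict>
"""


def read_metadata(dict_element):
    """Return the phenex-metadata fields as a plain dict, or {} if absent."""
    if dict_element.findtext("key") != "phenex-metadata":
        return {}
    any_element = dict_element.find("any")
    if any_element is None:
        return {}
    # Each child of <any> becomes one entry: tag name -> text content.
    return {child.tag: (child.text or "") for child in any_element}


metadata = read_metadata(ET.fromstring(SNIPPET))
print(metadata["curators"])
```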

Taxon and specimen data

An OBO ontology identifier for a taxon is stored within the taxon's "otu" element in a dict using the key "OBO_ID". The value is a string representing the OBO identifier. Specimens for this taxon are also stored within the taxon's "otu" element, within a dict using the key "OBO_specimens". The value for this dict is an "any" element containing custom "specimen" XML elements. The OBO identifier for the museum collection is stored in a "collection" attribute while the specimen accession number is in an "accession" attribute.

<xml>
<otu id="4341922c-a479-4088-b02e-42a0e248a825" label="Acestrorhynchus lacustris">
     <dict>
       <key>OBO_ID</key>
       <string>TTO:1030219</string>
     </dict>
     <dict>
       <key>OBO_specimens</key>
       <any>
         <specimen collection="COLLECTION:0000403" accession="205830"/>
         <specimen collection="COLLECTION:0000403" accession="206997"/>
         <specimen collection="COLLECTION:0000403" accession="207388"/>
         <specimen collection="COLLECTION:0000403" accession="207768"/>
         <specimen collection="COLLECTION:0000403" accession="207850"/>
       </any>
     </dict>
</otu>
</xml>
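The layout above can be read back with a few lines of Python. This is only a sketch under the assumption that the elements are un-namespaced, as shown in the example; the function name `read_otu` is hypothetical and the sample is shortened to two specimens.

```python
import xml.etree.ElementTree as ET

OTU = """
<otu id="4341922c-a479-4088-b02e-42a0e248a825" label="Acestrorhynchus lacustris">
  <dict>
    <key>OBO_ID</key>
    <string>TTO:1030219</string>
  </dict>
  <dict>
    <key>OBO_specimens</key>
    <any>
      <specimen collection="COLLECTION:0000403" accession="205830"/>
      <specimen collection="COLLECTION:0000403" accession="206997"/>
    </any>
  </dict>
</otu>
"""


def read_otu(otu):
    """Return (taxon_id, [(collection, accession), ...]) for one otu element."""
    taxon_id = None
    specimens = []
    for d in otu.findall("dict"):
        key = d.findtext("key")
        if key == "OBO_ID":
            taxon_id = d.findtext("string")
        elif key == "OBO_specimens":
            specimens = [
                (s.get("collection"), s.get("accession"))
                for s in d.findall("any/specimen")
            ]
    return taxon_id, specimens


taxon_id, specimens = read_otu(ET.fromstring(OTU))
print(taxon_id, len(specimens))
```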

Phenotype annotations

Phenotype annotations are stored within the "state" element to which they correspond, within a dict using the key "OBO_phenotype". The value for this dict is an "any" element containing a "phenotype" element. The "phenotype" element and its children are taken from the PhenoXML schema.

<xml>
<state id="889131fa-7e4d-45f0-b918-14d3ba806c65" label="present" symbol="1">
         <dict>
           <key>OBO_phenotype</key>
           <any>
             <phen:phenotype>
               <phen:phenotype_character>
                 <phen:description/>
                 <phen:bearer>
                   <phen:typeref about="TAO:0000203"/>
                 </phen:bearer>
                 <phen:quality>
                   <phen:typeref about="PATO:0000467"/>
                 </phen:quality>
               </phen:phenotype_character>
             </phen:phenotype>
           </any>
         </dict>
</state>
</xml>
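To show how the Entity-Quality pair can be recovered from such a state, here is a minimal Python sketch. The example above does not show where the "phen" prefix is declared, so the namespace URI below is a placeholder invented purely so the snippet parses; substitute the URI declared in your own file. The function name `read_eq` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption); use the one declared in the real file.
PHEN = "http://example.org/phenoxml"
NS = {"phen": PHEN}

STATE = f"""
<state xmlns:phen="{PHEN}" id="889131fa-7e4d-45f0-b918-14d3ba806c65" label="present" symbol="1">
  <dict>
    <key>OBO_phenotype</key>
    <any>
      <phen:phenotype>
        <phen:phenotype_character>
          <phen:description/>
          <phen:bearer>
            <phen:typeref about="TAO:0000203"/>
          </phen:bearer>
          <phen:quality>
            <phen:typeref about="PATO:0000467"/>
          </phen:quality>
        </phen:phenotype_character>
      </phen:phenotype>
    </any>
  </dict>
</state>
"""


def read_eq(state):
    """Return a list of (entity_id, quality_id) pairs annotated on a state."""
    pairs = []
    for d in state.findall("dict"):
        if d.findtext("key") != "OBO_phenotype":
            continue
        for pc in d.findall("any/phen:phenotype/phen:phenotype_character", NS):
            entity = pc.find("phen:bearer/phen:typeref", NS)
            quality = pc.find("phen:quality/phen:typeref", NS)
            pairs.append((
                entity.get("about") if entity is not None else None,
                quality.get("about") if quality is not None else None,
            ))
    return pairs


pairs = read_eq(ET.fromstring(STATE))
print(pairs)
```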

Troubleshooting Problems

Removing Corrupted Ontology Files

Local copies of ontology files can become corrupted, causing Phenex to display a warning about "dangling" terms on start-up. Note that the warning about danglers can also indicate a valid ontology change related to merging of terms from a recent ontology update.

To remove local copies of ontologies from your Phenex directory, delete all files within the “Ontology Cache” folder on your computer: User’s Home/Library/Application Support/Phenex/Ontology Cache/

Phenex will then download new copies of the ontology files on startup.
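If you prefer to script the clean-up, the following Python sketch deletes the files in the cache folder. It is not part of Phenex; the function name `clear_ontology_cache` is hypothetical, and the default path simply spells out the location given above relative to your home folder. Subfolders, if any, are left untouched.

```python
from pathlib import Path


def clear_ontology_cache(cache_dir):
    """Delete every file directly inside cache_dir; return the names removed."""
    cache_dir = Path(cache_dir)
    removed = []
    if not cache_dir.is_dir():
        return removed  # nothing to do if the folder does not exist
    for entry in sorted(cache_dir.iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed


# The cache location described above, relative to the user's home folder.
DEFAULT_CACHE = Path.home() / "Library" / "Application Support" / "Phenex" / "Ontology Cache"

if __name__ == "__main__":
    for name in clear_ontology_cache(DEFAULT_CACHE):
        print("removed", name)
```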

Development Roadmap

January 2009

  • Support alt_id term references in NeXML files [Done]
  • Copy text from tables [Taxon table done]

March 2009

  • Improved autocomplete interface [Done] - April 9

April 2009

  • Panel for matrix cell-specific information, e.g. evidence codes, image references

May 2009 - Phenex 1.0

  • Ontology configuration interface