Difference between revisions of "VTO Taxonomy Resources"

From phenoscape
 
(5 intermediate revisions by one other user not shown)
 
= Resources for Constructing a Vertebrate Taxonomy Ontology =
 
  
Two types of resources are useful for building taxonomic ontologies: taxonomic resources, which provide hierarchy, and nomenclature resources, which provide information about names.  Both types are used in taxonomic ontologies and are listed here.
Note: The material in this section has been moved and incorporated into the main [[Ontologies]] page.
 
 
== Taxonomic Resources ==
 
=== Fish ===
 
 
 
The TTO is derived from the Catalog of Fishes, which provides some taxonomy information and a great deal of nomenclature in the form of synonyms.  The catalog is updated several times a year; to rebuild the TTO, request the appropriate files from Stan Blum.
 
 
 
The TTO also incorporates some common names from Fishbase and cross references (xrefs) from the Global Names Index.
 
 
 
Information for extinct taxa has been added as needed during curation; as a rule, the taxonomy follows the publication being curated.
 
 
 
=== Amphibians ===
 
 
 
The ATO is derived from the [http://amphibiaweb.org/ AmphibiaWeb] list, which provides both taxonomy and some synonyms from ITIS and the [http://www.iucnredlist.org/initiatives/amphibians IUCN Red List].
 
 
 
=== Birds ===
 
 
 
The [http://www.worldbirdnames.org/index.html IOC checklist] provides a current taxonomy and common names, but no taxonomic synonyms.
 
 
 
=== Mammals ===
 
 
 
Wilson and Reeder (1993) is the standard, but the available electronic version of their taxonomy is IP-encumbered. NCBI and/or ITIS may be the best starting point.
 
 
 
=== All Vertebrates ===
 
 
 
*[http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/ NCBI] This was used to fill in the gaps (non-avian amniotes) in the current proposed VTO. This provides taxonomy for GenBank submissions (including fossil taxa), but does not claim to be an authoritative source (and generally doesn't cover taxa that have not been submitted).
 
 
 
*[http://www.itis.gov/ ITIS] Claims to be authoritative, but its coverage is neither complete nor uncontroversial.
 
 
 
== Nomenclature Resources ==
 
 
 
These can provide links to additional synonyms and resources (e.g., TTO uses Fishbase to provide common names and links to their pages).  Taxonomic synonyms are particularly useful as aids to data curation, while common names can help users browse the website.  Some nomenclature resources act as aggregators of names from other sources (e.g., Catalogue of Life incorporates names from ITIS).  Taxonomic resources (above) can be used as sources of names as well.
 
 
 
* [http://www.fishbase.org/ Fishbase] - their taxonomy is close to TTO (both based on Catalog of Fishes), but TTO uses it strictly as a name resource.
 
* [http://gni.globalnames.org/ Global Names Index (GNI)]
 
* [http://www.catalogueoflife.org/ Catalogue of Life]
 
  
 
= Tools =
 
  
 
The tool source is available at [https://github.com/NESCent/Taxonomy-Ontology-Tool GitHub].
 
 
== Generated Taxonomic Ontology ==
 
 
This taxonomy covers vertebrates and was built by starting with the NCBI taxonomy for vertebrates and splicing in TTO (except hagfish), ATO, and the IOC taxonomy of living birds.  Synonyms from ITIS and Catalogue of Life were attached if the primary name matched a name in the existing taxonomy.  Subspecies names were added to their parent species as synonyms (not subclasses).  The taxonomy is currently in the OBO format used for TTO and ATO, which includes the use of the Taxonomic Rank Vocabulary to tag taxa with a specified rank.  The current version of the VTO is [http://phenoscape.svn.sourceforge.net/viewvc/phenoscape/trunk/vocab/vertebrate_taxonomy.obo here].  The file is large, so be patient when browsing or downloading.
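The two synonym rules described above (attach a synonym only when its primary name already exists in the taxonomy, and record subspecies as synonyms of the parent species) can be sketched roughly as follows. This is an illustration only, not the tool's code: the dictionary layout, the function names <code>attach_synonyms</code> and <code>fold_subspecies</code>, and the sample names are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical layout: primary name -> list of synonyms.
Taxonomy = Dict[str, List[str]]


def attach_synonyms(taxonomy: Taxonomy, source: Taxonomy) -> None:
    """Copy synonyms from `source` only for primary names already in `taxonomy`."""
    for primary, synonyms in source.items():
        if primary not in taxonomy:
            continue  # no matching primary name, so nothing is attached
        for synonym in synonyms:
            if synonym != primary and synonym not in taxonomy[primary]:
                taxonomy[primary].append(synonym)


def fold_subspecies(taxonomy: Taxonomy, subspecies: List[str]) -> None:
    """Record each trinomial as a synonym of its parent species, not as a child."""
    for trinomial in subspecies:
        parts = trinomial.split()
        if len(parts) != 3:
            continue  # not a genus/species/subspecies name
        parent = " ".join(parts[:2])
        if parent in taxonomy and trinomial not in taxonomy[parent]:
            taxonomy[parent].append(trinomial)


# Toy data for illustration only.
taxonomy: Taxonomy = {"Danio rerio": [], "Xenopus laevis": []}
name_source: Taxonomy = {
    "Danio rerio": ["Brachydanio rerio"],
    "Salmo trutta": ["Salmo fario"],  # primary name absent from taxonomy
}

attach_synonyms(taxonomy, name_source)
fold_subspecies(taxonomy, ["Xenopus laevis laevis"])
```

In this toy run, the entry for "Salmo trutta" is skipped because that primary name is not in the starting taxonomy, and the subspecies ends up in the synonym list of "Xenopus laevis" rather than as a separate term.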
 
  
 
== TTOUpdate ==
 
  
 
This tool merges an existing TTO with a Catalog of Fishes update file, which consists of either a single Microsoft Access database or three Excel (2003) files (one each for lineages, genera, and species).  It does not use CSV or tab-delimited text files because the free-text comments, which include extractable synonyms, contain commas, tabs, and line breaks, rendering the common text formats unusable.  TTOUpdate includes libraries that read the pre-2007 Excel formats and properly handle the various breaking characters.
 
[[Category:Taxonomy]]
[[Category:Ontology]]

Latest revision as of 20:58, 1 November 2011

Resources for Constructing a Vertebrate Taxonomy Ontology

Note: The material in this section has been moved and incorporated into the main Ontologies page.

Tools

VTO Construction Tool

This tool uses a script to construct a taxonomic ontology by specifying a starting taxonomy, then modifying it by removing branches and splicing corresponding pieces of alternate taxonomies (e.g., start with the NCBI taxonomy and replace the teleost part of the tree with the tree from the TTO). It also allows taxonomic synonyms to be extracted from taxonomies or name lists and attached to terms in the taxonomy. Currently, the tool generates a taxonomy in the OBO format, though support for an individual-based OWL format is in progress.
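The remove-and-splice step described above can be pictured with a small sketch. This is not the tool's implementation: the <code>Taxon</code> class, the <code>splice</code> and <code>all_names</code> helpers, and the toy tree are hypothetical, and the sketch assumes the replacement subtree is matched to the base taxonomy by the name of its root.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Taxon:
    """A node in a toy taxonomy tree (hypothetical, for illustration)."""
    name: str
    children: List["Taxon"] = field(default_factory=list)


def splice(base: Taxon, replacement: Taxon) -> bool:
    """Swap the branch of `base` whose root matches replacement.name.

    Returns True if a matching branch was found and replaced.
    """
    for i, child in enumerate(base.children):
        if child.name == replacement.name:
            base.children[i] = replacement  # drop old branch, insert new one
            return True
        if splice(child, replacement):
            return True
    return False


def all_names(root: Taxon) -> List[str]:
    """List names in depth-first (pre-order) traversal."""
    names = [root.name]
    for child in root.children:
        names.extend(all_names(child))
    return names


# Toy starting taxonomy and an alternate taxonomy for one branch.
base = Taxon("Vertebrata", [
    Taxon("Teleostei", [Taxon("Danio rerio")]),
    Taxon("Amphibia", [Taxon("Xenopus laevis")]),
])
alternate = Taxon("Teleostei", [
    Taxon("Cypriniformes", [Taxon("Danio rerio")]),
])

replaced = splice(base, alternate)
names_after = all_names(base)
```

After the call, the teleost branch of the starting tree carries the alternate hierarchy while the rest of the tree is untouched. A real build would also need to reconcile identifiers and ranks, which this sketch leaves out.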

The tool source is available at GitHub.

TTOUpdate

This tool merges an existing TTO with a Catalog of Fishes update file, which consists of either a single Microsoft Access database or three Excel (2003) files (one each for lineages, genera, and species). It does not use CSV or tab-delimited text files because the free-text comments, which include extractable synonyms, contain commas, tabs, and line breaks, rendering the common text formats unusable. TTOUpdate includes libraries that read the pre-2007 Excel formats and properly handle the various breaking characters.
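The problem with delimited text can be seen in a small sketch. It assumes a plain tab-delimited export that writes fields without quoting and a reader that treats every line break as a record boundary; the record contents are invented placeholders, not Catalog of Fishes data.

```python
# Hypothetical record: genus, species, free-text comment.
record = ["Aus", "bus", "synonym: Aus cus,\tsee note\ncontinued on next line"]

# Assumption: fields are joined with tabs and written without any quoting.
exported = "\t".join(record) + "\n"

# Naive reader: one record per line, one field per tab.
naive_rows = [line.split("\t") for line in exported.splitlines()]
field_counts = [len(row) for row in naive_rows]
```

Under these assumptions the single three-field record comes back as two rows, one with four fields and one with a single field, so the comment (and any synonym inside it) can no longer be recovered reliably.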