Difference between revisions of "Main Page"

From phenoscape
(Fish Morphology)
(Added SCATE)
(153 intermediate revisions by 7 users not shown)
Line 1: Line 1:
 
{{EventBox1|
 
{{EventBox1|
*Check out the new Phenoscape [http://blog.phenoscape.org/ blog]!
+
* Try the [http://kb.phenoscape.org Phenoscape Knowledgebase].  Your feedback is welcome!
*Together with the National Center for Biomedical Ontologies (NCBO) we held an [http://blog.phenoscape.org/2008/07/11/evolutionary-biology-ontologies-workshop-report/ Evolutionary Biology and Ontologies Workshop] in conjunction with the Evolution 2008 Meetings in Minneapolis, MN.
+
* Check out the latest news on the [http://blog.phenoscape.org/ Phenoscape blog]
*A report from the recent [[Data_Jamboree_1|Data Jamboree]] at NESCent (April 18-20, 2008) is now available.
+
* We have posted vertebrate skeletal images from the Junior Biocurator Program to our [https://www.flickr.com/photos/109282112@N02/with/14085448954/ Phenoscape flickr account], all CC-BY licensed.
 
}}
 
}}
  
==Linking Evolution to Genomics Using Phenotype Ontologies==
+
== Enabling Machine-actionable Semantics for Comparative Analysis of Trait Evolution (SCATE) ==
  
[[Image:NESCent Logo.png|right]]
+
Our objective with this project is to create infrastructure that will provide comparative trait analysis tools easy access to algorithms powered by machine reasoning with the semantics of trait descriptions. Similar to how Google, IBM Watson, and others have enabled developers of smartphone apps to incorporate, with only a few lines of code, complex machine-learning and artificial intelligence capabilities such as sentiment analysis, we aim to demonstrate how easy access to knowledge computing opens up new opportunities for analysis, tools, and research in comparative trait analysis. As driving biological research questions, we focus on addressing three long-standing limitations in comparative studies of trait evolution: recombining trait data, modeling trait evolution, and generating testable hypotheses for the drivers of trait adaptation.
===About this project===
 
  
What are the developmental and genetic bases of evolutionary differences in morphology across species?  Currently it is difficult to approach this question due to a lack of computational tools that allow researchers to integrate developmental genetic and comparative morphological/anatomical data.
+
More information about this project, including participating PIs and funding, can be found at its [http://scate.phenoscape.org project website].
  
[[Image:Ctol Logo.jpg|right]] [[Image:Zfinlogo.png|left]] We are addressing this by developing a database of evolutionarily variable morphological characters for a large clade of fishes (the Ostariophysi) and connecting this database to the large collection of mutant phenotypes in the [http://zfin.org ZFIN database], the central database of the zebrafish model organism community. The evolutionary and mutant phenotypes are being described using common [[#The_Role_of_Ontologies|ontologies]].  The database with its web-interface, called EQSYTE (Entity-Quality System for Trait Evolution), together with the extended ontologies and data curation tools, will allow researchers to ask novel questions about the genetic and developmental regulation of evolutionary morphological transitions. Tool and database development are being guided by [http://en.wikipedia.org/wiki/Use_case use cases], or driving research questions, defined by the devo-evo community.  These tools are being developed under an open-source, open-development model, and in such a way that they can be used for additional biological systems in the future.
+
== Ontology-enabled reasoning across phenotypes from evolution and model organisms ==
  
[[Image:Deepfin Logo.gif|right]] This project is a unique collaboration between evolutionary and model organism biologists including two national centers ([http://www.nescent.org NESCent] and [http://www.bioontology.org NCBO]), the [http://zfin.org ZFIN model organism database], the [http://bio.slu.edu/mayden/cypriniformes/home.html Cypriniformes Tree of Life] project, the [http://www.deepfin.org/ DeepFin Research Coordination Network], and the morphological image databases used by the evolutionary biology community ([http://www.morphbank.net/ Morphbank], [http://morphobank.geongrid.org/ MorphoBank], [http://digimorph.org/ DigiMorph], [http://www.digitalfishlibrary.org/ Digital Fish Library]).
+
NSF funding for this project ends June 30, 2018.
  
===The Role of Ontologies===
+
=== About this project ===
[[Image:Ncbo logo.gif|right]]
 
====Background====
 
  
Ontologies are constrained, structured vocabularies with well defined relationships among terms. Ontologies represent the knowledge-base of a particular discipline, and provide not only a mechanism for consistent annotation of data, but also greater interoperability among people and machines. The most widely used biological ontology is the [http://www.geneontology.org Gene Ontology], which is utilized to annotate molecular function, biological processes and subcellular localization to gene products from different organisms.
+
Our overall objective is to create a scalable infrastructure that enables linking descriptive phenotype observations across different fields of biology by the semantic similarity of their free-text descriptions.  In other words, we are trying to make descriptive observations amenable to large-scale computation so that they can be subjected to computational data integration and knowledge discovery techniques in ways similarly powerful as the techniques we are used to for numeric, quantitative observations.
  
====Phenotype ontologies====
+
Our approach to accomplish this centers on transforming descriptive observations from the natural language text form in which they are typically reported, to fully computable logic expressions that utilize terms from shared ontologies. [[Guide to Character Annotation| We create these expressions]] (which we also call "annotations") for evolutionary phenotypes reported in the systematics literature, typically in the form of character state matrices. We use the [[EQ for character matrices| Entity-Quality (EQ) formalism]] to compose these expressions, which was initially conceived for making biomedical and mutant model organism phenotype observations interoperable.
  
[[Image:EAV4_layers_flat2.png|right|320px]] For model organisms : Approximately 500 mutant zebrafish lines (alleles) with over 660 annotated phenotypic characters from the jaw or gill arches (n=250), fins (n=210), axial skeleton (n=190) and other features (n=10) of the skeleton have been described. Researchers in the [http://www.zfin.org Zebrafish Information Network] (ZFIN) are annotating mutant phenotypes using the [http://obofoundry.org/cgi-bin/detail.cgi?id=zebrafish_anatomy&title=Zebrafish%20anatomy%20and%20development zebrafish anatomy ontology] and the [http://www.bioontology.org/wiki/index.php/PATO:Main_Page Phenotype And Trait Ontology] (PATO). PATO is a “universal” ontology of terms describing qualities (e.g. shape, color, size) that may be applied to any organism.
+
We combine the EQ annotations we create for evolutionary phenotypes with the EQ annotations created for the myriad of phenotypes observed for mutant model organisms in an [http://kb.phenoscape.org integrated knowledgebase] (essentially a triple-store).  We then apply [[:Category:Reasoning| Description Logic-reasoning]] to evaluate which evolutionary phenotype transitions can be inferred as semantically similar to which mutant model organism phenotypes, and vice versa.  Since the genetic cause of a mutant phenotype is usually known, the links between evolutionary and mutant phenotypes identified in this way can be used to construct testable hypotheses about the genetic correlates or causes of evolutionary transitions.
  
====Anatomical ontologies====
+
[[Image:Phenoscape II tree view.jpg|right|380px]] In a previous project, titled [[Linking Evolution to Genomics Using Phenotype Ontologies]], we developed a working prototype as a successful proof-of-concept, using [[Phenoscape 1 Curated Publications| teleost fishes for evolutionary phenotypes]] and the [http://zfin.org zebrafish model organism] as a source of mutant phenotypes. Here, we aim to make the components of the prototype, including tools and workflows, sufficiently scalable so that they are adequate for the much more extensive volume and more diverse nature of skeletal phenotypes across all vertebrates, fossil and modern.  Specifically, our aims encompass the following:
 +
# Develop a fast semantic similarity engine so that the integrated knowledgebase can be searched on-the-fly for biological taxa or genotypes bearing a profile of phenotypes that is similar, but not necessarily identical, to a query profile.
 +
# Develop an [[Reasoning over homology statements| ontological framework for reasoning over homology]] that can be scaled to a large number of anatomically diverse evolutionary lineages.
 +
# Reduce the time and cost of obtaining EQ statements from the literature, while at the same time improving the quality and consistency of those statements, by incorporating natural language processing tools and by improving curation software to allow for on-demand augmentation of community ontologies. [[Image:Phenoscape II architecture.png|right|450px]]
 +
# Build umbrella [[Ontologies#Vertebrate Taxonomy Ontology| taxonomic]] and [[Ontologies| anatomical ontologies]] for the vertebrates, the latter to be supplemented by explicit homology relations among anatomical structures.
 +
# Create a knowledgebase that integrates evolutionary phenotypes for vertebrate fin and limb characters with genetic and phenotype data from three vertebrate model organisms: [http://zfin.org zebrafish] (''Danio rerio''), [http://xenbase.org frog] (''Xenopus laevis''), and [http://www.informatics.jax.org/ mouse] (''Mus musculus'').
 +
# As a capstone, we will assess the results of our work by how well we can apply machine reasoning to retrieve candidate genes for the well-studied vertebrate fin-limb transition and other major events in skeletal evolution of vertebrates.
 +
In addition to a web-based interface, we will make all data, including the integrated knowledgebase, available in the Web Ontology Language (OWL), so that other researchers can reuse the data in as many ways as possible.
  
We have initiated a multi-species ontology for ostariophysan fishes, the Teleost Anatomy Ontology (TAO), which was initialized with the terms in the zebrafish anatomical ontology.  The development of the TAO  is currently focused on the skeletal system because it varies significantly across the Ostariophysi, is well-preserved in fossil specimens, and it is often the focus of morphologically-based evolutionary studies in ichthyology.
+
=== The vertebrate fin/limb transition: the test system ===
 +
The evolution of limbs from fins is arguably one of the most well studied transitions in vertebrate history.  The genes involved in positioning, growth, and patterning of the fin and limb at various stages are well-known and documented in the vertebrate model organism databases [http://zfin.org ZFIN], [http://xenbase.org Xenbase], and [http://www.informatics.jax.org/ MGI]; changes in skeletal morphology and corresponding assertions of homology are well-documented in the comparative morphological literature.  Bringing together the genetic, developmental, morphological and evolutionary data in the Phenoscape Knowledgebase will provide an ideal test bed for judging the reliability of candidate gene predictions and the application of homology logic.
  
This multi-species anatomy ontology is being used in combination with the PATO ontology (see EQ format) to describe the comparative morphological characters.  We have also developed a separate catalog of homology statements for entities within the TAO, so that individual investigators may select particular relationships based on evidence.
+
=== Outreach===
 +
Outreach activities include:
 +
* Summer internships in bio-ontologies for undergraduate/graduate students, in partnership with the [http://www.deepfin.org/ DeepFin Research Coordination Network]
 +
* A "Junior Biocurator" program for advanced Chicago public high school students to be implemented by [http://www.projectexploration.org/ Project Exploration].
 +
* Undergraduate internship and community outreach to the Native American population through University of South Dakota programs.
  
====Taxonomic ontology====
+
=== Contacts ===
  
Together with taxonomic experts, we are developing a taxonomic ontology (based on the [http://www.calacademy.org/RESEARCH/ichthyology/catalog/fishcatsearch.html Catalog of Fishes]) in order to relate species with particular characters and states.  The taxonomic ontology will include nodes ancestral to the Ostariophysi as far back as the Vertebrata in order to associate certain anatomical terms with more inclusive clades than the Ostariophysi.
+
Paula Mabee (University of South Dakota) and Todd Vision (University of North Carolina Chapel Hill, National Evolutionary Synthesis Center) are the Principal Investigators of this project. Co-principal investigators are David Blackburn (California Academy of Sciences), Judith Blake (Mouse Genome Informatics, Jackson Laboratories), Hilmar Lapp (National Evolutionary Synthesis Center), Paul Sereno (University of Chicago), Monte Westerfield (ZFIN, University of Oregon), and Aaron Zorn (Xenbase, Cincinnati Children's Hospital Medical Center) (see their [[Contact| contact addresses]]).
  
====Fish Morphology====
+
== Acknowledgments ==
  
Although the comparative anatomy of fishes has been documented in the literature for several hundred years, it is not available in a computable format.  With the help of taxon experts for ostariophysan fishes, we have prioritized 76 papers for immediate curation.  They are ranked as “A” papers on our publicly available [http://spreadsheets.google.com/pub?key=pTeXfTnVPxC-P1URVHbI4Qg Google spreadsheet]. Our goal is to [[Morphology|input approximately 4,000 morphological features]] in an “EQ” format ([[Media:TREE Mabee.pdf|Mabee et al. 2007a]]<!--; Mabee et al. 2007 in process-->) using a combination of ontologies.
+
{|
 
+
|-
===Contact===
+
| The [http://scate.phenoscape.org SCATE project] has been funded by NSF collaborative grants DBI-1661456 (Duke University), DBI-1661529 (Virginia Tech), DBI-1661516 (University of South Dakota), and DBI-1661356 (UNC Chapel Hill and RENCI) from Sep 1, 2017 to Aug 31, 2020. The grant proposal text with references is publicly available: ''W. Dahdul, J.P. Balhoff, H. Lapp, J. Uyeda, & T.J. Vision. (2017). Enabling machine-actionable semantics for comparative analyses of trait evolution. Zenodo. http://doi.org/10.5281/zenodo.885538''.
 
 
Paula Mabee (University of South Dakota) is the Principal Investigator. Co-principal investigators are Todd Vision (University of North Carolina, Chapel Hill), Monte Westerfield (University of Oregon, ZFIN), and Hilmar Lapp (NESCent) ([[Contact|see their contact addresses]]).
 
  
==Acknowledgments==
+
The Phenoscape II project ("Ontology-enabled reasoning across phenotypes from evolution and model organisms") was funded by NSF collaborative grants DBI-1062404 and DBI-1062542 from July 1, 2011, to June 30, 2018, and supported by the National Evolutionary Synthesis Center (NESCent), NSF #EF-0905606.  The original Project Description for this grant is available [[:File:Phenoscape_Project_description_refs.pdf| here]].
  
{|
+
These projects would not have been possible without the hard work of [[Acknowledgments#Contributors| numerous contributors]] and the results obtained in the [[Linking Evolution to Genomics Using Phenotype Ontologies]] project, which was funded by NSF grant BDI-0641025 from June 1, 2007, to Jun 30, 2011, and was supported by NESCent, NSF #EF-0423641. This earlier project in turn arose from a NESCent <span class="plainlinks">[http://www.nescent.org/science/workinggroup.php Working Group]</span> led by Paula Mabee and Monte Westerfield, "[[Fish Evolution Working Group|Towards an Integrated Database for Fish Evolution]]."
|-
+
| https://www.nescent.org/about/images/nsf_logo.jpg
| This project is funded by NSF grant BDI<nowiki>-</nowiki>0641025, and supported by the National Evolutionary Synthesis Center (NESCent), NSF #EF-0423641. <br/><br/>This project arose from a NESCent <span class="plainlinks">[http://www.nescent.org/science/workinggroup.php Working Group]</span> led by Paula Mabee and Monte Westerfield, "Towards an Integrated Database for Fish Evolution." [[Fish Evolution Working Group|Goals and summaries of the group]] are archived on this wiki.
 
| http://www.nescent.org/about/images/nsf_logo.jpg
 
 
|}
 
|}
  
 
==Pages of public interest==
 
==Pages of public interest==
  
* [[Ontology_Data_Service_API|Ontology Data Service API Description]]
+
* [[Training and Workshops]]

Revision as of 17:03, 24 May 2018

Enabling Machine-actionable Semantics for Comparative Analysis of Trait Evolution (SCATE)

Our objective with this project is to create infrastructure that gives comparative trait analysis tools easy access to algorithms powered by machine reasoning with the semantics of trait descriptions. Just as Google, IBM Watson, and others have enabled developers of smartphone apps to incorporate, with only a few lines of code, complex machine-learning and artificial intelligence capabilities such as sentiment analysis, we aim to demonstrate how easy access to knowledge computing opens up new opportunities for analysis, tools, and research in comparative trait analysis. As driving biological research questions, we focus on three long-standing limitations in comparative studies of trait evolution: recombining trait data, modeling trait evolution, and generating testable hypotheses for the drivers of trait adaptation.

More information about this project, including participating PIs and funding, can be found at its project website.

Ontology-enabled reasoning across phenotypes from evolution and model organisms

NSF funding for this project ends June 30, 2018.

About this project

Our overall objective is to create a scalable infrastructure that links descriptive phenotype observations across different fields of biology by the semantic similarity of their free-text descriptions. In other words, we are trying to make descriptive observations amenable to large-scale computation, so that computational data integration and knowledge discovery techniques can be applied to them as powerfully as they already are to numeric, quantitative observations.

Our approach centers on transforming descriptive observations from the natural-language text in which they are typically reported into fully computable logic expressions that use terms from shared ontologies. We create these expressions (which we also call "annotations") for evolutionary phenotypes reported in the systematics literature, typically in the form of character state matrices. To compose them we use the Entity-Quality (EQ) formalism, which was initially conceived for making biomedical and mutant model organism phenotype observations interoperable.
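As a rough illustration of what an EQ annotation might look like as a data structure, the sketch below pairs an entity term with a quality term and renders the pair as a class-expression-like string. The term IDs (EX_ANAT:0001, EX_QUAL:0001), the class names, and the use of an "inheres_in" relation in the output string are assumptions made for this example; they are not taken from the project's ontologies or software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term. The IDs used below are made-up placeholders."""
    term_id: str
    label: str


@dataclass(frozen=True)
class EQAnnotation:
    """One Entity-Quality statement: a quality attributed to an anatomical entity."""
    entity: Term
    quality: Term

    def as_expression(self) -> str:
        # Render the annotation as a class-expression-like string.
        # The relation name "inheres_in" is an assumption for this sketch.
        return f"{self.quality.term_id} and inheres_in some {self.entity.term_id}"


# Hypothetical terms standing in for anatomy and quality ontology classes.
pectoral_fin = Term("EX_ANAT:0001", "pectoral fin")
absent = Term("EX_QUAL:0001", "absent")

annotation = EQAnnotation(entity=pectoral_fin, quality=absent)
print(annotation.as_expression())
```

Because both parts of the annotation are ontology terms rather than free text, two annotations written by different curators can be compared term by term.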

We combine the EQ annotations we create for evolutionary phenotypes with the EQ annotations created for the many phenotypes observed in mutant model organisms in an integrated knowledgebase (essentially a triple-store). We then apply Description Logic reasoning to evaluate which evolutionary phenotype transitions can be inferred to be semantically similar to which mutant model organism phenotypes, and vice versa. Since the genetic cause of a mutant phenotype is usually known, the links between evolutionary and mutant phenotypes identified in this way can be used to construct testable hypotheses about the genetic correlates or causes of evolutionary transitions.
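The idea of linking evolutionary and mutant phenotypes can be sketched with a toy example. The code below is not a Description Logic reasoner; it swaps in a plain is_a subsumption check over a small hand-written anatomy hierarchy. All term names, taxa, and genes are hypothetical.

```python
# Hypothetical anatomy hierarchy: child term -> parent term.
IS_A = {
    "pectoral fin ray": "fin ray",
    "fin ray": "fin skeleton",
}


def ancestors(term):
    """Return the term followed by all of its is_a ancestors."""
    result = [term]
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


def related(entity_a, entity_b):
    """True if one entity is the same as, or subsumes, the other."""
    return entity_a in ancestors(entity_b) or entity_b in ancestors(entity_a)


# Annotations as (subject, entity, quality) tuples; all values are invented.
evolutionary = [("Taxon A", "pectoral fin ray", "absent")]
mutant = [
    ("gene x", "fin ray", "absent"),
    ("gene y", "fin skeleton", "elongated"),
]


def candidate_genes(evo, mut):
    """Pair each taxon with genes whose mutant phenotype matches its phenotype."""
    hits = []
    for taxon, entity_1, quality_1 in evo:
        for gene, entity_2, quality_2 in mut:
            if quality_1 == quality_2 and related(entity_1, entity_2):
                hits.append((taxon, gene))
    return hits


print(candidate_genes(evolutionary, mutant))  # [('Taxon A', 'gene x')]
```

Here "gene x" is returned because its mutant phenotype affects a structure that subsumes the one lost in "Taxon A", which is the kind of link that could then be framed as a testable hypothesis.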

Phenoscape II tree view.jpg

In a previous project, titled Linking Evolution to Genomics Using Phenotype Ontologies, we developed a working prototype as a successful proof-of-concept, using teleost fishes for evolutionary phenotypes and the zebrafish model organism as a source of mutant phenotypes. Here, we aim to make the components of the prototype, including tools and workflows, scalable enough to handle the much larger volume and greater diversity of skeletal phenotypes across all vertebrates, fossil and modern. Specifically, our aims encompass the following:

  1. Develop a fast semantic similarity engine so that the integrated knowledgebase can be searched on-the-fly for biological taxa or genotypes bearing a profile of phenotypes that is similar, but not necessarily identical, to a query profile.
  2. Develop an ontological framework for reasoning over homology that can be scaled to a large number of anatomically diverse evolutionary lineages.
  3. Reduce the time and cost of obtaining EQ statements from the literature, while at the same time improving the quality and consistency of those statements, by incorporating natural language processing tools and by improving curation software to allow for on-demand augmentation of community ontologies.
    Phenoscape II architecture.png
  4. Build umbrella taxonomic and anatomical ontologies for the vertebrates, the latter to be supplemented by explicit homology relations among anatomical structures.
  5. Create a knowledgebase that integrates evolutionary phenotypes for vertebrate fin and limb characters with genetic and phenotype data from three vertebrate model organisms: zebrafish (Danio rerio), frog (Xenopus laevis), and mouse (Mus musculus).
  6. As a capstone, we will assess the results of our work by how well we can apply machine reasoning to retrieve candidate genes for the well-studied vertebrate fin-limb transition and other major events in skeletal evolution of vertebrates.
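To make the first aim above more concrete, here is a minimal sketch of profile-based semantic similarity. It assumes a Jaccard index over the is_a closure of each phenotype profile, which is only one of several possible measures and is not necessarily the one used in the Phenoscape Knowledgebase. The hierarchy, phenotype terms, and genotype names are all hypothetical.

```python
# Hypothetical phenotype hierarchy: child term -> parent term.
IS_A = {
    "pectoral fin absent": "fin absent",
    "pelvic fin absent": "fin absent",
    "fin absent": "appendage phenotype",
    "scale shape altered": "integument phenotype",
}


def closure(profile):
    """Expand a profile (a set of phenotype terms) with all is_a ancestors."""
    expanded = set()
    for term in profile:
        while term is not None:
            expanded.add(term)
            term = IS_A.get(term)
    return expanded


def jaccard(profile_a, profile_b):
    """Jaccard index of the two expanded profiles (0.0 if both are empty)."""
    a, b = closure(profile_a), closure(profile_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank(query, candidates):
    """Score every candidate profile against the query, best match first."""
    scored = [(name, jaccard(query, profile)) for name, profile in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


query = {"pectoral fin absent"}
candidates = {
    "genotype 1": {"pelvic fin absent"},
    "genotype 2": {"scale shape altered"},
}
ranking = rank(query, candidates)
print(ranking)  # [('genotype 1', 0.5), ('genotype 2', 0.0)]
```

Expanding each profile with its ancestors is what lets "genotype 1" score above zero even though it shares no term with the query directly: the two profiles are similar, but not identical.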

In addition to a web-based interface, we will make all data, including the integrated knowledgebase, available in the Web Ontology Language (OWL), so that other researchers can reuse the data in as many ways as possible.

The vertebrate fin/limb transition: the test system

The evolution of limbs from fins is arguably one of the best-studied transitions in vertebrate history. The genes involved in positioning, growth, and patterning of the fin and limb at various stages are well known and documented in the vertebrate model organism databases ZFIN, Xenbase, and MGI; changes in skeletal morphology and corresponding assertions of homology are well documented in the comparative morphological literature. Bringing together the genetic, developmental, morphological, and evolutionary data in the Phenoscape Knowledgebase will provide an ideal test bed for judging the reliability of candidate gene predictions and the application of homology logic.

Outreach

Outreach activities include:

  • Summer internships in bio-ontologies for undergraduate/graduate students, in partnership with the DeepFin Research Coordination Network.
  • A "Junior Biocurator" program for advanced Chicago public high school students to be implemented by Project Exploration.
  • Undergraduate internship and community outreach to the Native American population through University of South Dakota programs.

Contacts

Paula Mabee (University of South Dakota) and Todd Vision (University of North Carolina Chapel Hill, National Evolutionary Synthesis Center) are the Principal Investigators of this project. Co-principal investigators are David Blackburn (California Academy of Sciences), Judith Blake (Mouse Genome Informatics, Jackson Laboratories), Hilmar Lapp (National Evolutionary Synthesis Center), Paul Sereno (University of Chicago), Monte Westerfield (ZFIN, University of Oregon), and Aaron Zorn (Xenbase, Cincinnati Children's Hospital Medical Center) (see their contact addresses).

Acknowledgments

The SCATE project has been funded by NSF collaborative grants DBI-1661456 (Duke University), DBI-1661529 (Virginia Tech), DBI-1661516 (University of South Dakota), and DBI-1661356 (UNC Chapel Hill and RENCI) from Sep 1, 2017 to Aug 31, 2020. The grant proposal text with references is publicly available: W. Dahdul, J.P. Balhoff, H. Lapp, J. Uyeda, & T.J. Vision. (2017). Enabling machine-actionable semantics for comparative analyses of trait evolution. Zenodo. http://doi.org/10.5281/zenodo.885538.

The Phenoscape II project ("Ontology-enabled reasoning across phenotypes from evolution and model organisms") was funded by NSF collaborative grants DBI-1062404 and DBI-1062542 from July 1, 2011, to June 30, 2018, and supported by the National Evolutionary Synthesis Center (NESCent), NSF #EF-0905606. The original Project Description for this grant is available here.

These projects would not have been possible without the hard work of numerous contributors and the results obtained in the Linking Evolution to Genomics Using Phenotype Ontologies project, which was funded by NSF grant BDI-0641025 from June 1, 2007, to June 30, 2011, and was supported by NESCent, NSF #EF-0423641. This earlier project in turn arose from a NESCent Working Group led by Paula Mabee and Monte Westerfield, "Towards an Integrated Database for Fish Evolution."


Pages of public interest