A key activity for life scientists is the exploration of the relatedness of a set of genes in order to differentiate genes performing coherently related functions from random grouped genes. This paper considers exploring the relatedness within two popular bio-organizations, namely gene families and pathways. This exploration is carried out by integrating different resources (ontologies, texts, expert classifications) and aims to suggest patterns that facilitate the biologists in obtaining a more comprehensive vision of differences in gene behaviour. Our approach is based on the annotation of a specialized corpus of texts (the gene summaries) that condense the description of functions/processes in which genes are involved. By annotating these summaries with different ontologies a set of descriptor terms is derived and compared in order to obtain a measure of relatedness within the bio-organizations we considered. Finally, the most important annotations within each family are extracted using a text categorization method.
Exploring the relatedness of gene sets
DESSI, NICOLETTA;DESSI', STEFANIA;PES, BARBARA
2015-01-01
Abstract
A key activity for life scientists is the exploration of the relatedness of a set of genes in order to differentiate genes performing coherently related functions from random grouped genes. This paper considers exploring the relatedness within two popular bio-organizations, namely gene families and pathways. This exploration is carried out by integrating different resources (ontologies, texts, expert classifications) and aims to suggest patterns that facilitate the biologists in obtaining a more comprehensive vision of differences in gene behaviour. Our approach is based on the annotation of a specialized corpus of texts (the gene summaries) that condense the description of functions/processes in which genes are involved. By annotating these summaries with different ontologies a set of descriptor terms is derived and compared in order to obtain a measure of relatedness within the bio-organizations we considered. Finally, the most important annotations within each family are extracted using a text categorization method.File | Dimensione | Formato | |
---|---|---|---|
CIBB2014_post.pdf
Solo gestori archivio
Descrizione: Articolo principale
Tipologia:
versione editoriale (VoR)
Dimensione
399.84 kB
Formato
Adobe PDF
|
399.84 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.