Exploring the relatedness of gene sets

DESSI, NICOLETTA; DESSI', STEFANIA; PES, BARBARA
2015-01-01

Abstract

A key activity for life scientists is exploring the relatedness of a set of genes in order to distinguish genes that perform coherently related functions from randomly grouped genes. This paper explores relatedness within two popular bio-organizations, namely gene families and pathways. The exploration integrates different resources (ontologies, texts, expert classifications) and aims to suggest patterns that help biologists obtain a more comprehensive view of differences in gene behaviour. Our approach is based on annotating a specialized corpus of texts (the gene summaries) that condense the description of the functions and processes in which genes are involved. Annotating these summaries with different ontologies yields a set of descriptor terms, which are compared to obtain a measure of relatedness within the bio-organizations considered. Finally, the most important annotations within each family are extracted using a text categorization method.
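
The abstract does not say which measure is used to compare the descriptor terms. As a rough illustration only, the sketch below assumes each gene summary has already been annotated with a set of ontology terms and scores a group of genes by the mean pairwise Jaccard similarity of those sets. The Jaccard choice, the function names, and the toy gene and term identifiers are all assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_relatedness(annotations: dict, genes: list) -> float:
    """Mean pairwise Jaccard similarity over the genes in one group.

    Assumption: this stands in for the paper's relatedness measure,
    which the abstract does not specify.
    """
    pairs = list(combinations(genes, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard(annotations[g], annotations[h]) for g, h in pairs)
    return total / len(pairs)


# Toy, made-up data: gene -> descriptor terms derived from its summary.
annotations = {
    "geneA": {"t1", "t2", "t3"},
    "geneB": {"t2", "t3", "t4"},
    "geneC": {"t1", "t2", "t3", "t4"},
    "geneX": {"t7", "t8"},
}

family = ["geneA", "geneB", "geneC"]  # hypothetical gene family
mixed = ["geneA", "geneB", "geneX"]   # hypothetical random grouping

print(group_relatedness(annotations, family))
print(group_relatedness(annotations, mixed))
```

On this toy input the hypothetical family scores higher than the mixed group, which is the kind of contrast between coherent and randomly grouped genes that the abstract describes.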
2015
9783319244617
Gene relatedness; Ontology annotation; Semantic similarity; Text mining

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/185991