Mine, yours, ours? Sharing data on human genetic variation

MILIA, NICOLA; SANNA, EMANUELE
2012-01-01

Abstract

Robust, effective and responsible data sharing is currently regarded as a priority for biological and biomedical research. Empirical evaluations of data sharing can be regarded as an indispensable first step in identifying critical aspects and developing strategies to increase the availability of research data for the scientific community as a whole. Research on human genetic variation is a potential forerunner in establishing widespread sharing of primary datasets. However, no specific analysis has yet been conducted to ascertain whether sharing primary datasets is common practice in this field. To this end, we analyzed a total of 543 mitochondrial and Y chromosomal datasets reported in 508 papers indexed in the PubMed database from 2008 to 2011. A substantial portion of datasets (21.9%) was found to have been withheld, while neither strong editorial policies nor a high impact factor proved effective in increasing the sharing rate beyond the current figure of 80.5%. Disaggregating datasets by research field, we observed substantially lower sharing in medical genetics than in evolutionary and forensic genetics, a gap that was more evident for whole mtDNA sequences (15.0% vs 99.6%). The low rate of positive responses to e-mail requests sent to the corresponding authors of withheld datasets (28.6%) suggests that sharing should be a prerequisite for final paper acceptance, while requiring authors to deposit their results in open online databases that provide data quality control appears to be the best-practice standard. Finally, we estimated that 29.8% to 32.9% of total resources are used to generate withheld datasets, implying that an important portion of research funding does not produce shared knowledge. By making the scientific community and the public aware of this issue, we may help popularize a more effective culture of data sharing.

File in this record: Milia et al. 2012.pdf (open access; publisher's version; Adobe PDF, 334.13 kB)

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/95833