Characterization of the human endogenous retrovirus HERV-K(HML-6) and HERV-K(HML-10) sequences and analysis of their expression

CADEDDU, MARTA
2016-03-11

Abstract

It has been estimated that 8% of the human genome is composed of sequences originating from exogenous retroviruses that infected germ line cells over millions of years; these sequences are termed human endogenous retroviruses (HERVs). After germ line infection, the retroviral DNA was integrated into the genome and transmitted vertically to the progeny according to Mendelian laws. ERVs have been found in all vertebrates, indicating that the process of endogenization was quite common, and they are considered retroviral fossils whose phylogenetic analysis provides important information on evolution. During the millions of years since their integration, ERV sequences have accumulated abundant mutations (deletions, insertions, duplications, and rearrangements) that have caused loss of virulence and shaped the present-day composition of HERVs. Several studies have suggested that HERVs can significantly interfere with human biology, in both physiological and pathological scenarios. However, apart from a few cases such as Syncytin-1, the physiological role of HERVs and their involvement in the development of cancer, autoimmune diseases, neuronal diseases and other disorders are still controversial. Among the most studied HERVs is the betaretrovirus-like HERV-K(HML1-10) clade, composed of 10 groups whose correlation with diseases has been proposed in a number of cases. In particular, it has been reported that the env gene, termed HERV-K-MEL, is expressed in melanoma cells and not in healthy controls. In addition, it has been demonstrated that a low copy number of HML-10 sequences within the complement C4 gene of the major histocompatibility complex is related to a higher frequency of Type 1 Diabetes, leading to the hypothesis that these elements could act as antisense regulators or be controlled by other HML-10 sequences that could be involved in Type 1 Diabetes.
Given the many accumulated mutations and their high copy number, HERV studies have been hampered by the lack of precise information on their chromosomal localization and composition. Recently, the use of a novel bioinformatic approach allowed us to precisely identify a total of 3173 sequences in the human genome assembly GRCh37/hg19. Hence, in the present work we aimed to fully characterize two HERV-K subgroups: the HML-10 subgroup, which includes 9 proviruses with 5 haplotypes, and the HML-6 subgroup, which includes 63 proviruses. The HML-10 and HML-6 analysis allowed us to i) confirm their classification by an innovative methodology of Similarity image (Simage) analysis; ii) precisely define the retroviral structure in each locus; iii) determine the presence of betaretrovirus features in all the identified sequences; iv) assess the putative time of integration of each retroviral sequence; and v) verify the presence of some of the sequences in non-human primates as well. In addition, we assessed the expression of HML-10 and HML-6 sequences by analyzing three public RNA-seq databases comprising more than 30 different tissues isolated from healthy individuals, as well as mRNA from patients with autoimmune diseases such as Type 1 Diabetes, Systemic Lupus Erythematosus and Multiple Sclerosis. The data showed that some HML-6 sequences are expressed in a number of healthy tissues, while no HML-10 expression was specifically observed in these datasets. Overall, these results increase the knowledge of the composition of the human genome and lay the foundation for a better understanding of the potential physiological and pathological role of HML-6 and HML-10 retroviruses.
HERVs
HML-10
HML-6
autoimmune diseases
endogenous retroviruses
Files in this item:
PhD_Thesis_Cadeddu.pdf (open access; type: doctoral thesis; format: Adobe PDF; size: 1.96 MB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/266897