Charting the Landscape of Digital Health: Towards A Knowledge Graph Approach to News Media Analysis
Zavarella V.; Reforgiato Recupero D.
2024-01-01
Abstract
In this paper, we present our ongoing work on a method for analyzing the digital health transformation of society by constructing a Knowledge Graph from a large corpus of 7.8 million English news articles dating from 1987 through 2023. We first identified around 95k articles relevant to the Digital Health topic by training and deploying a Deep Learning binary classifier obtained by fine-tuning BERT. Subsequently, using NLP techniques, we extracted triples from these articles to form a Digital Health News Knowledge Graph, which consists of 431k distinct triples connecting 186k entities through 1866 relations. The constructed Knowledge Graph provides insights into the evolution of Digital Health in news media and serves as a resource for further research in the field. Our analysis reveals significant trends in Digital Health as reflected in the news, with notable peaks coinciding with key events such as the COVID-19 pandemic.