Exploring Digital Health Trends in the Headlines via Knowledge Graph Analysis

Zavarella V.; Reforgiato Recupero Diego; Fenu G.
2025-01-01

Abstract

We introduce a method for analyzing digital transformation in the health domain by constructing a Knowledge Graph from a corpus of 7.8 million English-language news articles, published between 1987 and 2023, from the Dow Jones Data, News, and Analytics platform. We first selected around 97k articles relevant to the Digital Health topic using a deep learning binary classifier obtained by fine-tuning BERT. We then applied Natural Language Processing techniques to extract triples from the selected articles and assembled them into a Digital Health News Knowledge Graph, which consists of 431k distinct triples connecting 186k entities through 1866 relations. This graph provides insights into the evolution of Digital Health in news media and serves as a resource for further research in the field. Our analysis reveals significant trends in Digital Health coverage, with notable peaks coinciding with key events such as the COVID-19 pandemic. We split the analysis geographically between the United States and European countries and tracked the predominant entities and relations in each macro-region over time. The classifier, the knowledge graph, and the data analytics visualizations are publicly available for future work.
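As a rough illustration of the kind of statistics quoted in the abstract (distinct triples, entities, relations, and predominant entities over time), the following minimal sketch computes them on a toy set of triples. The data, the four-field triple layout, and the function names `graph_stats` and `top_entities_by_year` are hypothetical and are not taken from the authors' released resources.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (subject, relation, object, publication year).
# These rows are invented for illustration and do not come from the paper.
triples = [
    ("hospital", "adopts", "telemedicine", 2020),
    ("hospital", "adopts", "telemedicine", 2020),  # repeated mention
    ("startup", "develops", "health app", 2019),
    ("regulator", "approves", "health app", 2021),
    ("hospital", "uses", "health app", 2021),
]


def graph_stats(rows):
    """Return (distinct triples, distinct entities, distinct relations)."""
    distinct = {(s, r, o) for s, r, o, _ in rows}
    entities = {e for s, _, o in distinct for e in (s, o)}
    relations = {r for _, r, _ in distinct}
    return len(distinct), len(entities), len(relations)


def top_entities_by_year(rows, k=1):
    """Return the k most frequently mentioned entities for each year."""
    counts = defaultdict(Counter)
    for s, _, o, year in rows:
        counts[year].update([s, o])
    return {
        year: [entity for entity, _ in counter.most_common(k)]
        for year, counter in sorted(counts.items())
    }


stats = graph_stats(triples)
top = top_entities_by_year(triples)
print(stats)
print(top[2021])
```

Repeated mentions are collapsed when counting distinct triples but kept when ranking entities per year, on the assumption that mention frequency is what signals prominence in news coverage.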
2025
9783031824807
9783031824814
Digital Health
Information Extraction
Knowledge Graphs
Named Entity Recognition
News Analysis
Transformers

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/480167