Iterative threshold-based Naïve Bayes classifier

Romano M; Zammarchi G; Conversano C
2024-01-01

Abstract

The iterative Threshold-based Naïve Bayes (iTb-NB) classifier is introduced as a simple improvement of the previously introduced non-iterative Threshold-based Naïve Bayes (Tb-NB) classifier. Starting from a natural-language text corpus, iTb-NB lets the user quantify, with a numeric value, the sentiment (positive or negative) of a specific text. Unlike Tb-NB, iTb-NB estimates multiple threshold values that together refine Tb-NB’s decision rules when classifying a text as positive or negative based on its content. Observations with sentiment scores close to the threshold are marked for reclassification, and a new decision rule is defined for them. This iterative process improves the quality of predictions with respect to Tb-NB while retaining the possibility of using its results as the input to useful post-hoc analyses. The effectiveness of iTb-NB is evaluated by analyzing hotel guests’ reviews of all hotels located in the Sardinia region and available on Booking.com. Furthermore, iTb-NB is compared with Tb-NB in terms of model accuracy, resistance to noise, and computational efficiency.
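
The idea of re-scoring observations that fall close to the threshold can be illustrated with a small sketch. This is not the authors' procedure: the abstract does not state how the thresholds are estimated, so the accuracy-maximizing threshold search, the fixed `margin` around the threshold, the cap of three stages, and all function names below are assumptions made purely for illustration.

```python
import math
from collections import Counter

def word_log_odds(docs, labels, alpha=1.0):
    """Laplace-smoothed log-odds of a word occurring in positive vs. negative documents."""
    pos, neg = Counter(), Counter()
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    for doc, y in zip(docs, labels):
        (pos if y == 1 else neg).update(set(doc.split()))
    return {
        w: math.log((pos[w] + alpha) / (n_pos + 2 * alpha))
           - math.log((neg[w] + alpha) / (n_neg + 2 * alpha))
        for w in set(pos) | set(neg)
    }

def score(doc, log_odds):
    """Sentiment score: sum of the log-odds of the distinct known words in the document."""
    return sum(log_odds.get(w, 0.0) for w in set(doc.split()))

def best_threshold(scores, labels):
    """Assumed criterion: pick the candidate threshold with the highest training accuracy."""
    s = sorted(set(scores))
    candidates = [s[0] - 1.0] + [(a + b) / 2 for a, b in zip(s, s[1:])] + [s[-1] + 1.0]
    def n_correct(t):
        return sum((sc > t) == (y == 1) for sc, y in zip(scores, labels))
    return max(candidates, key=n_correct)

def fit_iterative(docs, labels, margin=0.5, max_iter=3):
    """Fit a sequence of (log_odds, threshold) stages; each later stage is
    re-estimated only on the documents scored within `margin` of the threshold."""
    stages = []
    for _ in range(max_iter):
        if len(set(labels)) < 2:
            break  # a threshold cannot be estimated from a single class
        lo = word_log_odds(docs, labels)
        scores = [score(d, lo) for d in docs]
        t = best_threshold(scores, labels)
        stages.append((lo, t))
        near = [i for i, sc in enumerate(scores) if abs(sc - t) <= margin]
        if not near or len(near) == len(docs):
            break  # nothing left to reclassify, or no progress possible
        docs = [docs[i] for i in near]
        labels = [labels[i] for i in near]
    return stages

def predict(doc, stages, margin=0.5):
    """Classify with the first stage whose score is clearly away from its threshold."""
    for k, (lo, t) in enumerate(stages):
        sc = score(doc, lo)
        if abs(sc - t) > margin or k == len(stages) - 1:
            return 1 if sc > t else 0

# Toy data, invented for illustration only (1 = positive, 0 = negative).
docs = ["great clean room", "great friendly staff", "dirty noisy room", "rude staff dirty"]
labels = [1, 1, 0, 0]
stages = fit_iterative(docs, labels)
print(predict("great clean staff", stages))  # 1
print(predict("dirty rude room", stages))    # 0
```

On this toy corpus the two classes are well separated, so a single stage suffices; further stages are fitted only when some training documents score within `margin` of the estimated threshold.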
Files in this item:
File: s10260-023-00721-1.pdf
Access: open access
Type: publisher's version (VoR)
Size: 2.5 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/374445