The iterative Threshold-based Naïve Bayes (iTb-NB) classifier is introduced as a (simple) improved version of the previously introduced non-iterative Threshold-based Naïve Bayes (Tb-NB) classifier. iTb-NB starts from a Natural Language text-corpus and allows the user to quantify with a numeric value a sentiment (positive or negative) from a specific test. Differently from Tb-NB, iTb-NB is an algorithm aimed at estimating multiple threshold values that concur to refine Tb-NB’s decision rules when classifying a text into positive (negative) based on its content. Observations with sentiment scores close to the threshold are marked to be reclassified, hence a new decision rule is defined for them. Such “iterative” process improves the quality of predictions w.r.t. Tb-NB but keeping the possibility to utilize its results as the input of useful post-hoc analyses. The effectiveness of iTb-NB is evaluated analyzing hotel guests’ reviews from all hotels located in the Sardinia region and available on Booking.com. Furthermore, iTb-NB is compared with Tb-NB in terms of model accuracy, resistance to noise, and computational efficiency.
Iterative threshold‑based Naïve bayes classifer
Romano M
;Zammarchi G;Conversano C
2024-01-01
Abstract
The iterative Threshold-based Naïve Bayes (iTb-NB) classifier is introduced as a (simple) improved version of the previously introduced non-iterative Threshold-based Naïve Bayes (Tb-NB) classifier. iTb-NB starts from a Natural Language text-corpus and allows the user to quantify with a numeric value a sentiment (positive or negative) from a specific test. Differently from Tb-NB, iTb-NB is an algorithm aimed at estimating multiple threshold values that concur to refine Tb-NB’s decision rules when classifying a text into positive (negative) based on its content. Observations with sentiment scores close to the threshold are marked to be reclassified, hence a new decision rule is defined for them. Such “iterative” process improves the quality of predictions w.r.t. Tb-NB but keeping the possibility to utilize its results as the input of useful post-hoc analyses. The effectiveness of iTb-NB is evaluated analyzing hotel guests’ reviews from all hotels located in the Sardinia region and available on Booking.com. Furthermore, iTb-NB is compared with Tb-NB in terms of model accuracy, resistance to noise, and computational efficiency.File | Dimensione | Formato | |
---|---|---|---|
s10260-023-00721-1.pdf
accesso aperto
Tipologia:
versione editoriale (VoR)
Dimensione
2.5 MB
Formato
Adobe PDF
|
2.5 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.