The Threshold-based Naive Bayes (Tb-NB) classifier is introduced as a (simple) improved version of the original Naive Bayes classifier. Tb-NB extracts the sentiment from a Natural Language text corpus and allows the user not only to predict how much a sentence is positive (negative) but also to quantify a sentiment with a numeric value. It is based on the estimation of a single threshold value that concurs to define a decision rule that classifies a text into a positive (negative) opinion based on its content. One of the main advantage deriving from Tb-NB is the possibility to utilize its results as the input of post-hoc analysis aimed at observing how the quality associated to the different dimensions of a product or a service or, in a mirrored fashion, the different dimensions of customer satisfaction evolve in time or change with respect to different locations. The effectiveness of Tb-NB is evaluated analyzing data concerning the tourism industry and, specifically, hotel guests' reviews from all hotels located in the Sardinian region and available on Booking.com. Moreover, Tb-NB is compared with other popular classifiers used in sentiment analysis in terms of model accuracy, resistance to noise and computational efficiency.

Threshold-based Naïve Bayes classifier

Romano, M.;Contu, G.;Mola, F.;Conversano, C.
2024-01-01

Abstract

The Threshold-based Naive Bayes (Tb-NB) classifier is introduced as a (simple) improved version of the original Naive Bayes classifier. Tb-NB extracts the sentiment from a Natural Language text corpus and allows the user not only to predict how much a sentence is positive (negative) but also to quantify a sentiment with a numeric value. It is based on the estimation of a single threshold value that concurs to define a decision rule that classifies a text into a positive (negative) opinion based on its content. One of the main advantage deriving from Tb-NB is the possibility to utilize its results as the input of post-hoc analysis aimed at observing how the quality associated to the different dimensions of a product or a service or, in a mirrored fashion, the different dimensions of customer satisfaction evolve in time or change with respect to different locations. The effectiveness of Tb-NB is evaluated analyzing data concerning the tourism industry and, specifically, hotel guests' reviews from all hotels located in the Sardinian region and available on Booking.com. Moreover, Tb-NB is compared with other popular classifiers used in sentiment analysis in terms of model accuracy, resistance to noise and computational efficiency.
2024
Naive Bayes; Booking.com; Customer satisfaction; Sentiment analysis; Natural language processing; Word of mouth
File in questo prodotto:
File Dimensione Formato  
s11634-023-00536-8.pdf

accesso aperto

Tipologia: versione editoriale (VoR)
Dimensione 1.78 MB
Formato Adobe PDF
1.78 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11584/356498
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 4
  • ???jsp.display-item.citation.isi??? 6
social impact