General Sentiment Decomposition: opinion mining based on raw Natural Language text

ROMANO, MAURIZIO
2021-04-26

Abstract

Person-to-person communication about a given topic (Word of Mouth) is becoming more important by the day, especially for decision-makers. This phenomenon can be observed directly in online social networks, for example in the rise of influencers and social media managers. The more people talk about a specific product, the more people are encouraged to buy it, and vice versa. Moreover, those buyers usually leave a review. Such a review directly affects the product, and the effect grows in proportion to how trustworthy the potential new customer considers the reviewer to be. Given also the negative reporting bias, it is easy to see why customer satisfaction is of paramount interest to a company (just as citizens' trust is to a politician). Textual data have therefore proved extremely useful, but they are as complex as language itself. For this reason, many approaches concentrate on producing well-performing classifiers and neglect the interpretability of their models, which is hard to achieve. We instead propose a framework that produces a good sentiment classifier with a particular focus on model interpretability. After analyzing the impact of Word of Mouth on earnings and the related psychological aspects, we propose an algorithm to extract sentiment from a Natural Language (NL) text corpus. Combining Neural Networks, which offer high predictive power at the cost of complex interpretation (they are usually regarded as black boxes), with more straightforward and informative models makes it possible not only to predict how positive (or negative) a sentence is but also to quantify its sentiment with a numeric value.
Specifically, the General Sentiment Decomposition (GSD) framework that we propose combines Threshold-based Naive Bayes (an improved version of the original algorithm), SentiWordNet (an enriched lexical database for Sentiment Analysis tasks), and Word Embedding features (a high-dimensional representation of words) that come directly from the use of Neural Networks. Moreover, with the GSD framework we obtain an objective sentiment score that improves the interpretation of results in many fields. For example, it is possible to identify specific critical sectors that require intervention to improve the services offered, to find a company's strengths (useful for advertising campaigns), and, if time information is available, to analyze trends on macro/micro topics. In addition, NL text data may or may not be associated with a sentiment label, for example 'positive' or 'negative'. To support further decision-making, we apply the proposed method to labeled (Booking.com, TripAdvisor.com) and unlabeled (Twitter.com) data, analyzing the sentiment of people who discuss a particular issue. In this way, we identify the aspects that people perceive as critical in the "feedback" they publish on the web and quantify how happy (or not) they are about a specific problem. In particular, for Booking.com and TripAdvisor.com we focus on customer satisfaction, whereas for Twitter.com the main topic is climate change.
Files in this item:
File: ociamthesismain.pdf
Open Access from 27/10/2021
Description: tesi_di_dottorato_Maurizio_Romano
Type: Doctoral thesis
Size: 4.97 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11584/313197