Generative AI as Sentiment Analysis Tool in Hospitality
Keywords: GenAI, ChatGPT, research methods, tourism, hospitality
Abstract
Objectives | Customer reviews on social media and dedicated websites are an essential source of information for hospitality companies that allows them to learn what customers think about their services and those of their competitors (Olorunsola et al., 2023; Perez-Aranda et al., 2021; Veloso & Gomez-Suarez, 2023). By analysing online reviews, hospitality managers can get insights into the opinions of customers and what they liked or disliked about the services they used. Moreover, online reviews are a helpful source of information about changing trends in customer perceptions and preferences. The evaluation of customer reviews can be implemented manually or with the help of specialised software (Tetzlaff et al., 2019).
Sentiment analysis is a process of analysing language to interpret subjective evaluations, emotions and points of view (Taboada, 2016). Humans associate their opinions and emotions with specific linguistic structures in daily life. Sentiment analysis, therefore, addresses three tasks: establishing whether the content represents a fact or a subjective opinion, determining its polarity (i.e. positive vs negative), and analysing the degree of polarity, or intensity, of a sentiment (Cambria et al., 2017).
The importance of sentiment analysis for research and practice motivated the proliferation of multiple analytical methods and tools, including the automation of the analysis with AI. The advantage of manual sentiment analysis is its high achievable validity and reliability. However, special linguistic skills, substantial time and cross-validation are required to prevent subjectivity bias (Sotiriadou et al., 2014).
A range of tools that utilise supervised learning are trained to differentiate automatically between negative, positive and, if required, neutral sentiment. The advantage of supervised learning algorithms is the achievable speed of analysis alongside the relatively high accuracy of the results. Their limitation is that the validity of the analysis depends on the context: a model trained on one content source may need to be retrained for another (Taboada, 2016).
Recently, ChatGPT introduced a plugin for sentiment analysis (OpenAI.com, 2023). It relies on Natural Language Processing (NLP), in which machine learning models are trained to interpret language autonomously. Owing to its low cost, speed and intuitive interface, a range of studies has already applied ChatGPT for sentiment analysis of user-generated content (e.g. Adeshola & Adepoju, 2023). The launch of ChatGPT in November 2022 opened a new opportunity for textual analysis that does not require the significant digital skills needed to use other text analysis software effectively and efficiently. However, its performance in comparison to other methods remains largely underexplored (Fatouros et al., 2023).
This ongoing study aims to evaluate GenAI as a tool for sentiment analysis in the context of Tourism & Hospitality. It tests the effectiveness of ChatGPT vs NVivo auto coding vs manual analysis.
Methodology | To analyse the effectiveness of GenAI as a tool for sentiment analysis, the study compares the sentiment analysis performed by ChatGPT 3.5, NVivo 20 auto coding and manual coding. The entire dataset consists of 160 reviews posted on Booking.com, chosen to ensure that the sentiment is derived from subjective tourist opinions. First, the codes that characterise hotel service robots were manually extracted to ensure data validity. The sentiment was then analysed to determine its polarity as well as its degree. The manual coding, replicated by two trained researchers, was used as a baseline. To automate the analysis, NVivo auto coding was run, and ChatGPT was given the prompt “Run a sentiment analysis for each row of the table. The sentiment would range from "very positive", "moderately positive", "neutral", "moderately negative", to "very negative".” The number of errors of each automated analysis relative to the manual coding was then calculated.
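The error count described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names, the toy labels and, in particular, the definition of a degree error (same polarity class as the manual code but a different intensity) are assumptions made for illustration only.

```python
from typing import Sequence

# Five-point scale taken from the ChatGPT prompt quoted in the methodology.
SCALE = (
    "very negative",
    "moderately negative",
    "neutral",
    "moderately positive",
    "very positive",
)


def polarity(label: str) -> str:
    """Collapse a five-point label to 'negative', 'neutral' or 'positive'."""
    if label not in SCALE:
        raise ValueError(f"unknown label: {label!r}")
    return label.split()[-1]


def error_rates(manual: Sequence[str], automated: Sequence[str]) -> dict:
    """Share of codes on which an automated method departs from the manual baseline.

    Assumption (not stated in the study): a polarity error is a mismatch in the
    polarity class; a degree error is a matching polarity class with a
    different intensity.
    """
    if len(manual) != len(automated):
        raise ValueError("both codings must cover the same reviews")
    if not manual:
        raise ValueError("no codes to compare")
    polarity_errors = 0
    degree_errors = 0
    for m, a in zip(manual, automated):
        if polarity(m) != polarity(a):
            polarity_errors += 1
        elif m != a:
            degree_errors += 1
    n = len(manual)
    return {"polarity": polarity_errors / n, "degree": degree_errors / n}


# Hypothetical toy labels, not data from the study.
manual = ["very positive", "moderately negative", "neutral", "very negative"]
automated = ["moderately positive", "moderately negative", "moderately positive", "very negative"]
print(error_rates(manual, automated))  # {'polarity': 0.25, 'degree': 0.25}
```

Under this definition the two rates are mutually exclusive, so a code with the wrong polarity is not also counted as a degree error. A different convention, such as counting every mismatch on the five-point scale, would yield different figures.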
Main Results and Contributions | The preliminary findings demonstrate that ChatGPT performs better than NVivo auto coding but worse than the manual analysis. In comparison to manual coding, ChatGPT provided identical results in determining the polarity of a sentiment (i.e. negative vs neutral vs positive), while 12% of the codes generated by NVivo contained a wrongly identified sentiment. Regarding the degree of polarity (e.g. very negative vs moderately negative), 24% of the codes generated by ChatGPT were erroneous. The NVivo-generated results were comparable, with 28% of the codes assigned a wrong degree of polarity relative to the manual coding.
Limitations | The key limitations of the study relate to the validity of each method and their comparability. The manual sentiment analysis requires cross-validation, supervised learning models vary in their relevance to the context, and GenAI represents a “black box” whose model is unavailable for researchers’ evaluation. Therefore, more studies in various contexts and with the application of different sentiment analysis methods are required before GenAI can be claimed to be a reliable tool for the analysis.
Conclusions | The preliminary findings indicate that ChatGPT 3.5 performs substantially better than NVivo auto coding. However, manual evaluation of the sentiment still provides more valid results. While research automation can substantially decrease the time and effort required of researchers, research on a broader scale is needed to understand how to ensure the validity and reliability of GenAI tools.