Sentiments on the Map: A Comparative Analysis of Online Tourist Reviews in Bragança's Attractions
Abstract
Objective | A comparative sentiment analysis based on reviews of tourist attractions on platforms such as TripAdvisor and Google Maps can provide valuable insights into tourist satisfaction and the quality of attractions in a specific destination (Viñán-Ludeña & Campos, 2022; Pineda-Jaramillo et al., 2023). This research aims to undertake a comparative sentiment analysis through reviews on the online platforms TripAdvisor and Google Maps regarding the tourist attractions in Bragança, located in the northern region of Portugal.
Methodology | TripAdvisor is a popular platform for travel reviews (Nowacki & Niezgoda, 2020; Scalabrini et al., 2023). Google Maps also allows such evaluations and likewise ranks attractions and other points of interest (Alharbi et al., 2022; Alzboun et al., 2023; Khairina Mohd Haris et al., 2023). Apify is a web automation and scraping platform that enables users to extract data from websites, automate various web-related tasks, and access structured data (Nallakaruppan et al., 2023). Using Apify, two review databases were compiled by extracting all comments on 16 attractions in the Bragança destination posted between 2020 and 2023. Together, the databases comprise 250 TripAdvisor reviews and 242 Google Maps reviews, each consisting of a rating and a text comment in Portuguese. Reviews were categorised according to the rating systems of both platforms (ranging from 1 to 5 points) and compared with algorithmic sentiment analysis. In line with similar studies (e.g. Scalabrini et al., 2023), the ratings were further categorised as negative (1-2 points), neutral (3 points), or positive (4-5 points). A Multinomial Naïve Bayes classifier was used for the initial text classification and was then employed again to re-evaluate the model.
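The rating categorisation described above can be illustrated with a minimal Python sketch. It only encodes the thresholds stated in the text (1-2 negative, 3 neutral, 4-5 positive); the function name `rating_to_sentiment` and the sample ratings are hypothetical and are not taken from the study's code or data.

```python
def rating_to_sentiment(rating: int) -> str:
    """Map a 1-5 point rating to a sentiment class.

    Thresholds follow the text: 1-2 negative, 3 neutral, 4-5 positive.
    The function name is an illustrative assumption.
    """
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


# Hypothetical ratings, used only to show the mapping.
sample_ratings = [5, 3, 1, 4, 2]
sample_labels = [rating_to_sentiment(r) for r in sample_ratings]
print(sample_labels)
```

Labels derived this way from the platform ratings can then be set against the classes predicted from the review text.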
Principal results and contribution | The results show that some attractions received consistently positive evaluations, such as Castelo de Bragança, Parque Natural de Montesinho, Centro Ciência Viva, Galeria George Dussaud, and Museu Militar de Bragança. In some cases, for instance Domus Municipalis and Cidadela de Bragança, the TripAdvisor ratings remained neutral while the Google Maps ratings shifted to positive. Conversely, the Centro de Interpretação da Cultura Sefardita do Nordeste Transmontano received opposite assessments, with TripAdvisor showing a positive rating while Google Maps indicated a neutral rating.
A confusion matrix was employed to assess the classification of tourist reviews (e.g. Kulkarni et al., 2020), with accuracy, F1-score, and recall calculated to gauge the model's performance. Specific features, such as bigrams, were used to enhance model evaluation. For TripAdvisor, the model classified 46 reviews as negative, while only 8 were genuinely negative, suggesting a tendency to overestimate negative reviews. The model categorised 14 reviews as neutral, of which 4 were genuinely neutral, indicating reasonable performance. Regarding positive reviews, the model classified 34 reviews as positive, while 68 were genuinely positive, indicating an overestimation of positive reviews.
For Google Maps, the model classified 18 reviews as negative, with only 2 being genuinely negative. This also indicated a tendency to overestimate negative reviews. The model categorised 5 reviews as neutral, with none being truly neutral, suggesting good model performance in this category. As for positive reviews, the model labelled 34 reviews as positive when 123 were genuinely positive, indicating an overestimation of positive reviews.
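As a hedged illustration of how accuracy, recall, and F1-score follow from a three-class confusion matrix, the sketch below computes them in plain Python. The matrix values are hypothetical and do not reproduce the study's results, and the helper names `per_class_metrics` and `accuracy` are assumptions made for this example. Read as precision, the TripAdvisor negative class reported above (8 genuinely negative out of 46 predicted) would come to roughly 0.17.

```python
from typing import Dict, List

LABELS = ["negative", "neutral", "positive"]


def per_class_metrics(matrix: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall and F1 per class.

    matrix[i][j] is the number of reviews whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics = {}
    for k, label in enumerate(labels):
        true_positives = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column total
        actually_k = sum(matrix[k])                     # row total
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(matrix: List[List[int]]) -> float:
    """Share of all reviews that fall on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
toy_matrix = [
    [8, 1, 1],    # truly negative
    [2, 4, 4],    # truly neutral
    [10, 5, 65],  # truly positive
]

toy_metrics = per_class_metrics(toy_matrix, LABELS)
toy_accuracy = accuracy(toy_matrix)

for label in LABELS:
    print(label, toy_metrics[label])
print("accuracy", toy_accuracy)
```

In this toy matrix the negative class has high recall but low precision, which is the pattern described as overestimating negative reviews: many reviews are predicted as negative that are not.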
Limitations | The low accuracy value in both datasets suggests that the quantity of data collected from the web needs to be increased. Furthermore, not all multi-label classifiers and algorithms were tested with the study's model.
Conclusions | The sentiment distribution varied significantly among different attractions in both datasets, with Castelo de Bragança consistently receiving positive reviews. Conversely, Domus Municipalis, Cidadela de Bragança, and Igreja de Santa Maria exhibited a disproportionately high frequency of negative comments on TripAdvisor, despite having relatively high average ratings. On the other hand, the Centro de Interpretação da Cultura Sefardita do Nordeste Transmontano had a disproportionately high frequency of negative comments on Google Maps. The confusion matrix analysis provided additional insights, revealing the model's tendency to overestimate specific sentiments, particularly negative and positive ratings. This highlights the need to fine-tune the model's parameters or to better evaluate the quality of the training data.