Topic modelling of spa visitor reviews using the example of Gellért Spa and Swimming Pool

Authors

  • Mátyás Hinek, Budapest Metropolitan University

DOI:

https://doi.org/10.15170/MM.2021.55.04.03

Keywords:

natural language processing, topic modelling, latent Dirichlet allocation, Gellért Spa and Swimming Pool

Abstract

THE AIMS OF THE PAPER

The study presents the results of computer-based topic modelling of guest reviews written by visitors to the Gellért Spa and Swimming Pool between 2004 and 2021. From a tourism marketing point of view, analysing guest reviews is particularly important for attractions with high visitor numbers. The Gellért is an iconic historic bath of Budapest and Hungary, a health tourism attraction that many visitors to Budapest regard as an unmissable experience. This is reflected in the more than 10,000 guest reviews written in almost 30 languages on Tripadvisor over the last decade and a half, a volume too large to be analysed comprehensively without computational methods.

METHODOLOGY

All reviews written on Tripadvisor between 2005 and 2021 were downloaded using a dedicated application. Reviews written in languages other than English were translated into English using Google Translate. The resulting corpus was analysed using structural topic modelling, which builds on latent Dirichlet allocation and is a rapidly developing method in text mining, to identify the topics that recur across guest reviews. The analysis was carried out in the statistical software R, using the stm package for structural topic models.
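To illustrate the kind of model the methodology relies on, the following is a minimal sketch of plain latent Dirichlet allocation fitted by collapsed Gibbs sampling. It is not the study's pipeline: the study used the stm package in R, whereas this sketch uses standard-library Python and omits the covariates that a structural topic model adds. The toy reviews, the function name `fit_lda`, and the hyperparameters `alpha` and `beta` are all assumptions made for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit plain LDA to tokenised documents with a collapsed Gibbs sampler.

    Returns (theta, top_words): per-document topic proportions and the
    three highest-count words of each topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document; each row sums to 1.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [vocab[v] for v in sorted(range(n_words), key=lambda v: -topic_word[k][v])[:3]]
        for k in range(n_topics)
    ]
    return theta, top_words


# Hypothetical, already tokenised toy reviews (not taken from the corpus).
reviews = [
    "dirty changing rooms dirty showers".split(),
    "showers dirty floor dirty lockers".split(),
    "beautiful building beautiful pool architecture".split(),
    "architecture beautiful thermal pool building".split(),
]
theta, top_words = fit_lda(reviews, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {', '.join(words)}")
```

On a real corpus the number of topics is a modelling choice (the study reports 12), and preprocessing such as stop-word removal would normally precede fitting.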

MOST IMPORTANT RESULTS

The modelling identified 12 typical themes across the guest reviews. These were compared with an earlier word frequency analysis of the same corpus, which showed that both methodologies identify similar themes. We also separately analysed how the prevalence of opinions on service features that guests rated largely negatively changed over the period studied. The proportion of themes related to hygiene and cleanliness increased, while the proportion of themes related to guest communication, also largely negatively rated, decreased in the written guest reviews of the Gellért Spa.
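One simple way to trace how a theme's share changes over time is to average each review's estimated topic proportion by year of writing. The sketch below assumes per-review topic proportions are already available from a fitted model. The data, the function name `yearly_topic_share`, and the topic index are hypothetical, and this is not necessarily how the study estimated its trends.

```python
from collections import defaultdict


def yearly_topic_share(years, doc_topic_shares, topic):
    """Return the mean proportion of `topic` among the reviews of each year."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for year, row in zip(years, doc_topic_shares):
        totals[year] += row[topic]
        counts[year] += 1
    return {year: totals[year] / counts[year] for year in sorted(totals)}


# Hypothetical review years and two-topic proportions (illustrative only).
years = [2010, 2010, 2015, 2015, 2020, 2020]
doc_topic_shares = [
    [0.9, 0.1], [0.8, 0.2],
    [0.6, 0.4], [0.5, 0.5],
    [0.3, 0.7], [0.2, 0.8],
]
shares = yearly_topic_share(years, doc_topic_shares, topic=1)
for year, share in shares.items():
    print(year, round(share, 2))
```

In this toy data the share of topic 1 rises from year to year, which is the kind of pattern the abstract describes for hygiene-related themes.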

Author Biography

Mátyás Hinek, Budapest Metropolitan University

College Professor

Published

2022-02-25

How to Cite

Hinek, M. (2022) “Topic modelling of spa visitor reviews using the example of Gellért Spa and Swimming Pool”, The Hungarian Journal of Marketing and Management, 55(4), pp. 27–38. doi: 10.15170/MM.2021.55.04.03.

Section

Papers