Abstract
In this paper we propose the Recurrent Embedded Topic Model (RETM), a modification of the Embedded Topic Model (ETM) that reuses the Continuous Bag of Words (CBOW) embeddings already implemented in the model and feeds them to a recurrent neural network (LSTM), using the order of the words of the text, represented in the CBOW space, as the recurrence of the LSTM while computing the topic-document distribution of the model. This approach is novel because neither the ETM nor Latent Dirichlet Allocation (LDA) uses the order of the words when computing the topic proportions for each text, which degrades their predictions. The RETM is a topic-modelling technique that substantially improves the quality of the topics computed for the datasets used in this paper, with perplexity more than 15 times better on training data and between 10% and 90% better on test data. The model is explained in detail throughout the paper, and results are presented on different use cases showing how it performs against the ETM and LDA. The RETM can be applied with better accuracy to any topic-modelling problem.