Abstract
A survey published by Nature in 2016 revealed that more than 70% of researchers had failed to reproduce another researcher's experiments, and over 50% had failed to reproduce one of their own, a state of affairs that has been termed the "reproducibility crisis" in science. The purpose of this work is to contribute to the field by presenting a reproducibility study of a Natural Language Processing paper, "Language Representation Models for Fine-Grained Sentiment Classification". A thorough analysis of the methodology, experimental setting, and experimental results is presented, leading to a discussion of the issues and the necessary steps involved in this kind of study.