Abstract
In this manuscript, we present a prediction model based on the behavior of each customer, built using data mining techniques. The model draws on a supermarket database and an additional database from Amazon, both containing information about customers’ purchases. It analyzes these data to classify customers as well as products, and it is trained and validated with real data. The model classifies customers according to their consumption behavior and then proposes new products that they are more likely to purchase. It is intended as a tool for marketers, giving them an analytically grounded and specific view of consumer behavior. Our algorithmic framework and its implementation run on cloud infrastructure and use the MapReduce programming environment, a model for processing large data sets in parallel with a distributed algorithm on computer clusters, as well as Apache Spark, a newer framework built on the same principles as Hadoop. Applying a MapReduce model at each step of the proposed method improves text processing speed and scalability compared with traditional methods. Our results show that the proposed method predicts supermarket purchases with high accuracy.