Abstract
This paper offers an overview of evaluation in automatic document classification. We analyse both the evaluation methods and the test collections on which they are used, focusing on the latter, and especially on the use of documentary languages within these collections. We have identified a set of shortcomings that could undermine trust in the results of the evaluation process, and we propose some ways of addressing them.