Abstract
This paper describes a methodology for diabetic retinopathy detection from eye fundus images using a generalization of the bag-of-visual-words (BoVW) method. We formulate the BoVW as two neural networks that can be trained jointly. Unlike the standard BoVW, our model learns to perform feature extraction, feature encoding, and classification guided by the classification error. The model achieves an area under the curve (AUC) of 0.97 on the DR2 dataset, whereas the standard BoVW approach achieves an AUC of 0.94. It also performs on par with the state of the art on the Messidor dataset, with an AUC of 0.90.
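To make the two-network formulation concrete, the following is a minimal forward-pass sketch, not the architecture used in this paper. It assumes the first network is a single linear layer with a softmax that softly assigns each patch descriptor to one of K visual words, that the assignments are average-pooled into a histogram, and that the second network is a single logistic unit. All names (`encode_patch`, `image_histogram`, `classify`), the dimensions, and the random inputs are hypothetical. Because every step is differentiable, the classification error could in principle be backpropagated through both stages, which is the sense in which the two networks can be trained jointly.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_patch(patch, W, b):
    """First network (assumed: linear layer + softmax).

    Maps one patch descriptor to a soft assignment over K visual words.
    """
    scores = [sum(w * x for w, x in zip(row, patch)) + bk
              for row, bk in zip(W, b)]
    return softmax(scores)


def image_histogram(patches, W, b):
    """Average-pool the patch assignments into a K-bin BoVW histogram."""
    hist = [0.0] * len(W)
    for patch in patches:
        for k, a in enumerate(encode_patch(patch, W, b)):
            hist[k] += a
    return [h / len(patches) for h in hist]


def classify(hist, v, c):
    """Second network (assumed: one logistic unit) applied to the histogram."""
    z = sum(vk * hk for vk, hk in zip(v, hist)) + c
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical sizes: D-dimensional descriptors, K visual words, N patches.
rng = random.Random(0)
D, K, N = 8, 4, 16
W = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]  # codebook weights
b = [0.0] * K
v = [rng.gauss(0, 1) for _ in range(K)]                      # classifier weights
c = 0.0

# Random stand-ins for patch descriptors extracted from one fundus image.
patches = [[rng.random() for _ in range(D)] for _ in range(N)]

hist = image_histogram(patches, W, b)
prob = classify(hist, v, c)
```

In this sketch `hist` plays the role of the BoVW image representation and `prob` is the predicted probability for the positive class. A standard BoVW pipeline would instead fix the codebook (for example by clustering) before training the classifier.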