Multi-label image retrieval using deep hashing

Abstract

Content-based image retrieval (CBIR) is the task of retrieving, in response to a query image, images with the same visual content as the query. This problem has attracted increasing attention in computer vision. Learning-based hashing techniques are among the most studied approaches to approximate nearest-neighbor search for large-scale image retrieval. With the advancement of deep neural networks in image representation, hashing-based methods for CBIR have adopted deep learning to produce binary hash codes. Such strategies are known generically as deep hashing techniques. Although a variety of deep hashing methods have been proposed for CBIR, most of them deal with single-label images. In visual search, however, it is natural for images to cover several topics, each represented by a different label that may relate, for example, to objects of various categories or to different concepts associated with the image. Furthermore, many of these models focus exclusively on the quality of the generated rankings, ignoring issues such as search efficiency and the use of the available space, which are important aspects of image retrieval. We therefore investigate deep hashing techniques that enable efficient image retrieval while achieving high-quality rankings. In addition, we focus on the multi-label scenario so that the generated hash codes capture the various levels of similarity among images. More specifically, throughout this research, we propose and study deep generative architectures trained on pairs and triples of images for the task of multi-label image retrieval. To this end, we adopt variational autoencoders based on discrete distributions. These models can generate compact image representations that are directly applicable to hashing, without intermediate processes unrelated to training.
When evaluated on two collections of multi-label images, the proposed methods proved capable of generating effective binary hash codes. Such codes can be used to produce high-quality rankings while making efficient use of the hashing space.

Citation

SILVA, Josiane Rodrigues da. Recuperação de imagem com múltiplos rótulos usando hashing profundo. 2022. 122 f. Tese (Doutorado em Informática) Universidade Federal do Amazonas, Manaus (AM), 2022.
