Combinando busca binária exata e busca aproximada em sistemas de complementação automática de consultas

Resumo

Query autocompletion is an important feature for moderen search engines that stands for suggesting queries at each user keystroke. The most efficient solutions in the past years have been using Tries to index the query suggestions. However, this structure may lead to more memory usage, which is a costly and limited resource. In this work, we introduce a two-level structure that uses the ICPAN error-tolerant search algorithm at the first level and combines sequential and binary search at the second level, with the aim of verify whether this combination can increase the efficiency of the second level without affecting the results accuracy. Our initial hypothesis was that it was possible to perform a binary search instead of a sequential when all the typing errors had already “ran out” at the first level. Nonetheless, we conclude in our research that using binary search at the second level prevents the method of retrieving some query suggestions, and also makes it retrieve some suggestions that exceed the limit 𝜏 of typing errors tolerated. Facing this problem, we also proposed a heuristic to selectively activate the binary search. We have experimented with three versions of two-level structure, with each one having a modification at the second level. We compared their accuracy levels, their performances, and their memory usage. Both versions that use the binary search with and without the heuristic showed some lack of accuracy for 𝜏 ≤ 2. However, they had good accuracy for 𝜏 = 3. In this specific case the model that uses binary search without the heuristic had the best performance when compared with other versions, and also had performed better than other methods found in the literature in some scenarios. Besides that, all the three proposed models have shown a significant reduction of memory needed to run error-tolerant query autocompletion.

Descrição

Citação

BELÉM, Ruben Jozafá Silva. Combinando busca binária exata e busca aproximada em sistemas de complementação automática de consultas. 2022. 94 f. Dissertação (Mestrado em Informática) - Universidade Federal do Amazonas, Manaus (AM), 2022.

Avaliação

Revisão

Suplementado Por

Referenciado Por

Licença Creative Commons

Exceto quando indicado de outra forma, a licença deste item é descrita como Acesso Aberto