Caracterização teórica e limites assintóticos para o problema da geração de redes candidatas na busca por palavras-chave em bancos de dados relacionais

Carregando...
Imagem de Miniatura

Título da Revista

ISSN da Revista

Título de Volume

Editor

Universidade Federal do Amazonas

Resumo

The search in collections of data using keyword queries (KwS) is an important field of study within Computer Science, which facilitates and accelerates the obtaining of rele- vant information. However, if on the one hand there was a great advance in the keywords search in web pages, which are heterogeneous bases mostly semi-structured, in relational databases still dominates the structured query elaborated with query languages such as SQL. Such databases do not allow queries for keywords directly, because it is necessary to know the schema that organizes the tables and to have technical knowledge in such lan- guages, factors that make the search process difficult. However, there are already systems that use query keywords and, transparently to the user, the relational database schema, to then internally mount appropriate SQL queries that return relevant responses to users (R-KwS). Such generated SQL queries are called Candidate Networks. The problem with such an approach is that the focus of such systems has been empirical and experimen- tal, highlighting a difficulty and high processing time in solving the Candidate Network Generation problem. Given this context, it is being proposed in this research a theoret- ical characterization of an R-KwS system, modeling it as a combination of two classic problems in graph theory, which are the Steiner Tree problem and the Edge Coloring problem, called the Steiner Tree with Colored Edges problem (STCEP). In this way, the viiiNP-completeness proof is presented for the general case where an arbitrary number of keywords is considered in the search, providing an NP algorithm and performing the polynomial reduction from the Exact Cover by 3-Sets problem (X3C) to the STCEP. Ad- ditionally, asymptotic polynomial boundaries were also determined for the practical case, when only a small fixed number of keywords were considered. Contrary to what the liter- ature up till now assumed to be true of the general case, it is demonstrated that in practice, the difficulty in terms of computational complexity does not occur. As a complementary result, the first implementation of the best approximation ratio algorithm for the problem of generation of all minimum spanning trees, the Eppstein [12] algorithm, was carried out, followed by the first implementation of the exact algorithm for the classical prob- lem of the Steiner tree in graphs, of polynomial delay for a fixed number of terminals, proposed by Dourado, Oliveira and Protti [8], and that uses the algorithm of Eppstein. Comparative analysis with the main algorithms of both problems, separately, were also performed. Finally, a new heuristic was developed for the classic problem of the Steiner tree in graphs, which presented a competitive performance in practice and that eliminates the exponential step of the algorithm of Dourado, Oliveira and Protti [8]. These results answer open questions in R-KwS systems, as well as contributions to the Steiner Tree problem in Graph Theory.

Descrição

Citação

MARTINEZ, João Guilherme Alves. Caracterização teórica e limites assintóticos para o problema da geração de redes candidatas na busca por palavras-chave em bancos de dados relacionais. 2019. 77 f. Dissertação (Mestrado em Informática) - Universidade Federal do Amazonas, Manaus, 2019.

Avaliação

Revisão

Suplementado Por

Referenciado Por