Caracterização teórica e limites assintóticos para o problema da geração de redes candidatas na busca por palavras-chave em bancos de dados relacionais
Carregando...
Data
Autores
Título da Revista
ISSN da Revista
Título de Volume
Editor
Universidade Federal do Amazonas
Resumo
The search in collections of data using keyword queries (KwS) is an important field
of study within Computer Science, which facilitates and accelerates the obtaining of rele-
vant information. However, if on the one hand there was a great advance in the keywords
search in web pages, which are heterogeneous bases mostly semi-structured, in relational
databases still dominates the structured query elaborated with query languages such as
SQL. Such databases do not allow queries for keywords directly, because it is necessary
to know the schema that organizes the tables and to have technical knowledge in such lan-
guages, factors that make the search process difficult. However, there are already systems
that use query keywords and, transparently to the user, the relational database schema,
to then internally mount appropriate SQL queries that return relevant responses to users
(R-KwS). Such generated SQL queries are called Candidate Networks. The problem with
such an approach is that the focus of such systems has been empirical and experimen-
tal, highlighting a difficulty and high processing time in solving the Candidate Network
Generation problem. Given this context, it is being proposed in this research a theoret-
ical characterization of an R-KwS system, modeling it as a combination of two classic
problems in graph theory, which are the Steiner Tree problem and the Edge Coloring
problem, called the Steiner Tree with Colored Edges problem (STCEP). In this way, the
viiiNP-completeness proof is presented for the general case where an arbitrary number of
keywords is considered in the search, providing an NP algorithm and performing the
polynomial reduction from the Exact Cover by 3-Sets problem (X3C) to the STCEP. Ad-
ditionally, asymptotic polynomial boundaries were also determined for the practical case,
when only a small fixed number of keywords were considered. Contrary to what the liter-
ature up till now assumed to be true of the general case, it is demonstrated that in practice,
the difficulty in terms of computational complexity does not occur. As a complementary
result, the first implementation of the best approximation ratio algorithm for the problem
of generation of all minimum spanning trees, the Eppstein [12] algorithm, was carried
out, followed by the first implementation of the exact algorithm for the classical prob-
lem of the Steiner tree in graphs, of polynomial delay for a fixed number of terminals,
proposed by Dourado, Oliveira and Protti [8], and that uses the algorithm of Eppstein.
Comparative analysis with the main algorithms of both problems, separately, were also
performed. Finally, a new heuristic was developed for the classic problem of the Steiner
tree in graphs, which presented a competitive performance in practice and that eliminates
the exponential step of the algorithm of Dourado, Oliveira and Protti [8]. These results
answer open questions in R-KwS systems, as well as contributions to the Steiner Tree
problem in Graph Theory.
Descrição
Citação
MARTINEZ, João Guilherme Alves. Caracterização teórica e limites assintóticos para o problema da geração de redes candidatas na busca por palavras-chave em bancos de dados relacionais. 2019. 77 f. Dissertação (Mestrado em Informática) - Universidade Federal do Amazonas, Manaus, 2019.
