A Deep Learning-Based Approach for Detecting Duplicate Bug Reports
Publisher
Universidade Federal do Amazonas
Abstract
In large-scale software development environments, defect reports are maintained in bug tracking systems and analyzed by domain experts. Because users write bug reports in non-standard ways, each user may describe a given problem with a different set of words. As a result, different reports may describe the same problem, producing duplicates. To spare the development team redundant work, an expert must inspect every new report and try to label possible duplicates. This manual approach is neither trivial nor scalable, and it has a direct impact on bug-fix time. Recent work on duplicate bug report detection tends to focus on deep neural approaches that use hybrid information from bug reports, such as textual and categorical features. However, these approaches ignore that a single bug can have multiple previously identified duplicates and, therefore, multiple textual descriptions, titles, and sets of categorical information. In this work, we propose SiameseQAT, a duplicate bug report detection method that considers not only information about individual bugs but also collective information from bug clusters. SiameseQAT combines context and semantic learning on textual and categorical features, as well as topic-based features, with a novel loss function called Quintet Loss, which considers the centroid of duplicate clusters and their contextual information. We validated our approach on the well-known open-source software repositories Eclipse, NetBeans, and OpenOffice, which together comprise more than 500 thousand bug reports. We evaluated both retrieval and classification of duplicates, reporting a mean Recall@25 of 71% for retrieval and an AUROC of 99% for classification, results that were significantly superior to those of related work.
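The abstract reports Recall@25 for the retrieval task without spelling out how it is computed. As a minimal sketch, assuming the definition commonly used in duplicate bug report retrieval (a query counts as a hit when at least one of its known duplicates appears among the top k ranked candidates), the metric could be computed as follows. The function name, the data layout, and the bug identifiers are all hypothetical and are not taken from the dissertation.

```python
from typing import Dict, List, Set


def recall_at_k(
    rankings: Dict[str, List[str]],
    duplicates: Dict[str, Set[str]],
    k: int = 25,
) -> float:
    """Fraction of queries whose top-k candidates contain a known duplicate.

    rankings:   query id -> candidate ids, most similar first (assumed layout)
    duplicates: query id -> ids of reports known to duplicate the query
    """
    if not rankings:
        return 0.0
    hits = 0
    for query_id, ranked_ids in rankings.items():
        relevant = duplicates.get(query_id, set())
        # A hit requires at least one true duplicate within the first k results.
        if relevant.intersection(ranked_ids[:k]):
            hits += 1
    return hits / len(rankings)


# Toy data with made-up identifiers, for illustration only.
rankings = {
    "bug-101": ["bug-007", "bug-042", "bug-013"],
    "bug-102": ["bug-055", "bug-013", "bug-007"],
}
duplicates = {
    "bug-101": {"bug-042"},
    "bug-102": {"bug-099"},
}

print(recall_at_k(rankings, duplicates, k=2))  # 0.5
```

In this toy case only "bug-101" has a true duplicate within its top two candidates, so one of two queries is a hit. The dissertation may define or average the metric differently, for example per repository.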
Citation
ROCHA, Thiago Marques. Uma abordagem para detectar relatórios de defeitos duplicados baseada em aprendizagem profunda. 2020. 129 f. Dissertação (Mestrado em Informática) - Universidade Federal do Amazonas, Manaus, 2020.
Creative Commons License
Except where otherwise noted, this item's license is described as Open Access

