Impact of text pre-processing techniques on intent detection and parameter extraction in task-oriented dialogue systems

Abstract

With the popularization of the internet and the falling price of mobile devices, people have changed the way they interact with each other and with companies. The internet first drove the growth of e-commerce, initially through purchases made on personal computers and more recently on mobile devices, at which point e-commerce also came to be called mobile commerce. In recent years, alongside the growth of mobile commerce, the number of active users of messaging applications has also grown. In response, companies from various sectors have invested in serving their customers through these applications. However, maintaining qualified staff for this service can be costly, and service can also be slow at peak times, causing customer dissatisfaction. In this scenario, task-oriented dialogue systems emerge as an alternative for customer service, thanks to their ability to serve a large number of customers continuously, with good response speed and low cost. The growing demand for these systems and the challenges involved in building them motivated us to study them. Their development includes a phase called natural language understanding, whose purpose is to identify the user's intent in each utterance, as well as the parameters related to that intent. This is achieved through two tasks known as intent detection and slot filling. Since these are well-known tasks in the dialogue system literature, with several works published over the years, this dissertation proposes a study of the impact of text pre-processing techniques applied to models used for these two tasks. More precisely, we chose stemming, lemmatization, stopword removal and Word Embeddings for our experiments.
Experiments carried out on reference datasets for the problem indicate that not all of the chosen pre-processing techniques had a positive impact when applied to works published in the literature. Among the compared techniques, only stemming yielded a gain, of up to 3% in recall on the parameter extraction task, at the cost of a small loss of 0.9% in the same task. Lemmatization, stopword removal and Word Embeddings, in turn, resulted in changes in recall and accuracy. Analysis of the results shows that lemmatization confuses the model by presenting different lemmas for the same word, while stopword removal discards prepositions and articles that are important for contextualizing the items to be extracted. In the case of Word Embeddings, the configuration of the compared works did not favor the use of the technique.
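The effect of stopword removal described above can be illustrated with a minimal sketch. The stopword list, the suffix rules and the example utterance below are all invented for illustration and are not taken from the dissertation or its datasets; the suffix stripping is a deliberately crude stand-in for a real stemmer.

```python
# Toy illustration only: the stopword list, the suffix rules and the utterance
# are invented for this sketch and are not taken from the dissertation.
STOPWORDS = {"a", "an", "the", "from", "to", "me", "on"}
SUFFIXES = ("ing", "es", "s")  # crude suffix stripping, not a real stemmer


def tokenize(utterance):
    """Lowercase the utterance and split it on whitespace."""
    return utterance.lower().split()


def remove_stopwords(tokens):
    """Drop every token that appears in the toy stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


utterance = "Show me the flights from Boston to Denver"
tokens = tokenize(utterance)
without_stopwords = remove_stopwords(tokens)
stemmed = [crude_stem(t) for t in tokens]

print(tokens)
print(without_stopwords)  # "from" and "to" are gone
print(stemmed)            # "flights" becomes "flight"
```

In this toy case, once "from" and "to" are removed, nothing in the remaining tokens marks which city is the origin and which is the destination, which is the kind of context a slot filling model may rely on. Stemming, by contrast, only collapses "flights" into "flight" and leaves those cues in place.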

Citation

RIBEIRO, Erick Rego. Impacto de técnicas de pré-processamento de texto na detecção de intenção e extração de parâmetros em sistemas de diálogo orientados a tarefa. 2020. 64 f. Dissertação (Mestrado em Informática) - Universidade Federal do Amazonas, Manaus, 2020.

Creative Commons License

Except where otherwise indicated, the license of this item is described as Open Access