OneTrack - Transformer-Based, Inference-Time-Efficient Models for Multiple Object Tracking

Abstract

Multiple Object Tracking (MOT) is a critical problem in computer vision, essential for understanding how objects move and interact in videos. The field faces significant challenges, as occlusions and complex environmental dynamics degrade the accuracy and efficiency of models. While traditional approaches have relied on Convolutional Neural Networks (CNNs), this work presents OneTrack-M, a transformer-based MOT model designed to improve computational efficiency and tracking accuracy. Our approach simplifies the typical transformer-based architecture by eliminating the decoder for object detection and tracking. Instead, the encoder alone serves as the basis for interpreting temporal data, which significantly reduces processing time and increases inference speed. In addition, innovative data preprocessing techniques and multitask training are employed to address multiple objectives with a single set of weights. Experimental results show that OneTrack-M achieves inference times at least 25% faster than state-of-the-art models in the literature while maintaining or improving tracking accuracy metrics. These improvements highlight the proposed solution's potential for real-time applications, such as autonomous vehicles and surveillance systems, where quick responses are crucial for system effectiveness.

Citation

ARAUJO FILHO, Luiz Carlos. OneTrack - Modelos Baseados em Transformers e Eficientes em Tempo de Inferência para Rastreamento de Múltiplos Objeto. 2024. 85 leaves. Dissertation (Master's in Informatics) - Universidade Federal do Amazonas, Manaus (AM), 2024.

Creative Commons License

Except where otherwise noted, this item's license is described as Open Access