A Hybrid Gene Selection Method Based on Outliers for Breast Cancer Classification

Publisher

Universidade Federal do Amazonas

Abstract

Breast cancer is the second most common cancer type and the leading cause of cancer-related death among women worldwide. Because it is a heterogeneous disease, subtyping plays an important role in choosing a specific treatment. Gene expression data are a viable basis for cancer subtype classification, since they represent the state of a cell at the molecular level, but such datasets generally contain few samples relative to a large number of genes. Gene selection is a promising approach to this imbalanced, high-dimensional genes-versus-samples matrix and plays a major role in developing efficient cancer subtype classifiers. This thesis proposes a hybrid gene selection method based on outliers (H-OGS) that selects relevant genes to classify breast cancer subtypes efficiently and effectively, and that identifies distinct signatures capable of characterizing those subtypes. The associations learned by the classifier used in the method are then interpreted locally with SHAP values, revealing genes that are biologically relevant to the classification of each breast cancer subtype. In general, the method selects only a few highly relevant genes, which speeds up classification and significantly improves classifier performance. Experiments show that the strategy gives the best results for the Basal and HER2 subtypes, the two breast cancer subtypes with the worst prognosis. The method also identifies three distinct signatures that characterize the Basal subtype; these signatures contain genes and pathways directly related to breast cancer subtypes. We also propose an evaluation framework that uses different machine learning techniques for a broader analysis of the PAM50 list in breast cancer subtype classification. In these experiments, the best-performing method for classifying breast cancer subtypes is an SVM with a linear kernel.
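
The abstract does not spell out how H-OGS scores genes, so the following is only an illustrative sketch of the general idea of outlier-based gene filtering, not the thesis's method. It assumes a simple rule: a gene is interesting for a subtype when many of that subtype's samples fall outside Tukey fences (1.5 × IQR) computed from all other samples. The function names, the toy gene names, and the expression values are hypothetical.

```python
from statistics import quantiles


def outlier_score(values, labels, target):
    """Fraction of `target` samples outside Tukey fences built from the other samples.

    Assumed scoring rule for illustration; not taken from the thesis.
    """
    inside = [v for v, y in zip(values, labels) if y == target]
    rest = [v for v, y in zip(values, labels) if y != target]
    if not inside or len(rest) < 2:
        raise ValueError("need at least one target sample and two reference samples")
    q1, _, q3 = quantiles(rest, n=4)  # quartiles of the reference samples
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    n_out = sum(1 for v in inside if v < low or v > high)
    return n_out / len(inside)


def select_genes(expression, labels, target, top_k):
    """Rank genes by outlier score for one subtype and keep the top_k.

    expression: dict mapping gene name -> list of values, one per sample.
    """
    scores = {g: outlier_score(vals, labels, target) for g, vals in expression.items()}
    # Highest score first; ties broken alphabetically for a stable ranking.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return ranked[:top_k], scores


# Toy data: 4 "Basal" samples followed by 6 "LumA" samples (made-up values).
labels = ["Basal"] * 4 + ["LumA"] * 6
expression = {
    "gene_a": [9.1, 8.7, 9.4, 8.9, 2.0, 2.2, 1.9, 2.1, 2.3, 2.0],
    "gene_b": [5.0, 5.2, 4.9, 5.1, 5.0, 5.1, 4.8, 5.2, 4.9, 5.0],
    "gene_c": [1.0, 7.5, 1.2, 8.0, 1.1, 0.9, 1.0, 1.2, 1.1, 1.0],
}
selected, scores = select_genes(expression, labels, "Basal", top_k=2)
print(selected, scores)
```

In a hybrid scheme, a filter like this would typically be followed by a classifier-driven step that evaluates the reduced gene set, for example with a linear-kernel SVM as mentioned in the abstract; that step is omitted here.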

Citation

MEDONÇA NETO, Rayol de. A Hybrid Gene Selection Method Based on Outliers for Breast Cancer Classification. 2023. 105 leaves. Thesis (Doctorate in Informatics) - Universidade Federal do Amazonas, Manaus (AM), 2023.

Creative Commons License

Except where otherwise noted, this item's license is described as Open Access