Dimensionality Reduction Using Principal Component Analysis and Feature Selection Using Genetic Algorithm with Support Vector Machine for Microarray Data Classification

Microarray; Principal Component Analysis; Genetic Algorithm; Classification

January 17, 2025
February 20, 2025
February 28, 2025

DNA microarrays are used to analyze gene expression on a large scale simultaneously and play a critical role in cancer detection. Creating a DNA microarray starts with isolating RNA from the sample; the RNA is then converted into cDNA and scanned to generate gene expression data. However, the data generated through this process are high-dimensional, which can degrade the performance of predictive models for cancer detection, so dimensionality reduction is required to reduce data complexity. This study analyzes the impact of applying Principal Component Analysis (PCA) for dimensionality reduction, a Genetic Algorithm (GA) for feature selection, and their combination on microarray data classification using a Support Vector Machine (SVM). The datasets used are microarray datasets for breast cancer, ovarian cancer, and leukemia. The research methodology involves preprocessing, PCA for dimensionality reduction, GA for feature selection, data splitting, SVM classification, and evaluation. In the results, PCA dimensionality reduction combined with GA feature selection and SVM classification achieved the best performance among the configurations compared. For the breast cancer dataset, the highest accuracy was 73.33%, with recall 0.74, precision 0.75, and F1 score 0.73. For the ovarian cancer dataset, the highest accuracy was 98.68%, with recall 0.98, precision 0.99, and F1 score 0.99. For the leukemia dataset, the highest accuracy was 95.45%, with recall 0.94, precision 0.97, and F1 score 0.95. The study concludes that combining PCA for dimensionality reduction with GA for feature selection can simplify microarray data and improve the accuracy of the SVM classification model, which supports the use of PCA and GA to enhance classification performance on microarray data.
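The PCA, GA, and SVM stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and NumPy, uses a synthetic dataset in place of the microarray data, and every parameter (number of components, population size, mutation rate, linear kernel) and every helper name (`fitness`, `genetic_select`) is a hypothetical choice made for illustration. Unlike the order listed in the abstract, the sketch splits the data before fitting PCA and running the GA, so that the held-out samples do not influence feature selection.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a microarray matrix: few samples, many features.
X, y = make_classification(n_samples=80, n_features=500,
                           n_informative=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Preprocessing and PCA are fitted on the training split only.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=20, random_state=0).fit(scaler.transform(X_train))
Z_train = pca.transform(scaler.transform(X_train))
Z_test = pca.transform(scaler.transform(X_test))


def fitness(mask):
    """Mean 3-fold cross-validated SVM accuracy on the selected components."""
    if not mask.any():
        return 0.0  # an empty selection cannot be classified
    svm = SVC(kernel="linear")
    return cross_val_score(svm, Z_train[:, mask], y_train, cv=3).mean()


def genetic_select(n_features, pop_size=12, generations=8, mutation_rate=0.05):
    """Evolve boolean masks over the PCA components (hypothetical settings)."""
    pop = rng.random((pop_size, n_features)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        # Truncation selection: the better half survives unchanged (elitism).
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.choice(len(parents), size=2, replace=False)]
            cut = rng.integers(1, n_features)           # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_features) < mutation_rate  # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)]


best_mask = genetic_select(Z_train.shape[1])
clf = SVC(kernel="linear").fit(Z_train[:, best_mask], y_train)
test_accuracy = clf.score(Z_test[:, best_mask], y_test)
print(f"selected {best_mask.sum()} of {best_mask.size} components, "
      f"test accuracy {test_accuracy:.2f}")
```

Using cross-validated SVM accuracy as the GA fitness ties the selection directly to the classifier being evaluated, at the cost of one SVM fit per fold for every candidate mask. Recall, precision, and F1 score could be obtained from the same fitted classifier with `sklearn.metrics.classification_report`.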

How to Cite

Kartini, D., Badali, R. A., Muliadi, M., Nugrahadi, D. T., Indriani, F., & Saputro, S. W. (2025). Dimensionality Reduction Using Principal Component Analysis and Feature Selection Using Genetic Algorithm with Support Vector Machine for Microarray Data Classification. Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics, 7(1), 154-166. https://doi.org/10.35882/mr7x9713
