An Empirical Study of Cross-Project and Within-Project Performance in Software Defect Prediction Models Using Tree-Based and Boosting Classifiers

Raidra Zeniananto; Rudy Herteno; Radityo Adi Nugroho; Andi Farmadi; Setyo Wahyu Saputro

doi:10.35882/ijeeemi.v7i3.95

Authors

Raidra Zeniananto, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia. https://orcid.org/0009-0004-0824-5368
Rudy Herteno (rudy.herteno@ulm.ac.id), Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia. https://orcid.org/0000-0003-0637-8090
Radityo Adi Nugroho, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia. https://orcid.org/0000-0002-7326-7668
Andi Farmadi, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia. https://orcid.org/0009-0009-0926-8082
Setyo Wahyu Saputro, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia. https://orcid.org/0009-0007-9250-7704

Vol. 7 No. 3 (2025): August

Articles

Submitted April 24, 2025

Accepted May 14, 2025

Published August 20, 2025


Abstract

Software Defect Prediction (SDP) is a vital process in modern software engineering that aims to identify faulty components early in development. In this study, we evaluated two widely used SDP approaches, Within-Project Software Defect Prediction (WP-SDP) and Cross-Project Software Defect Prediction (CP-SDP), using identical preprocessing steps to ensure an objective comparison. We used the NASA MDP dataset, splitting each project into 70% training and 30% testing data, and applied three resampling strategies (no sampling, oversampling, and undersampling) to address class imbalance. Five classification algorithms were examined: Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting (GB), XGBoost (XGB), and LightGBM (LGBM). Performance was measured primarily with Accuracy and Area Under the Curve (AUC), yielding 360 experimental outcomes. WP-SDP combined with oversampling and Random Forest showed the strongest predictive capability on most projects, achieving an Accuracy of 89.92% and an AUC of 0.931 on PC4. Nonetheless, CP-SDP excelled on certain small-scale projects (e.g., MW1), which indicates its potential when local historical data is scarce but inter-project characteristics remain sufficiently similar. These results underscore the importance of selecting a prediction scheme tailored to specific project attributes, class imbalance levels, and available historical data. By establishing a standardized methodological framework, our work contributes to a clearer understanding of the strengths and limitations of WP-SDP and CP-SDP, paving the way for more effective defect detection strategies and improved software quality.
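The evaluation protocol summarized in the abstract (a 70/30 split per project, three resampling strategies, and Accuracy and AUC as metrics) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation. The function names are hypothetical, the resampling shown is plain random duplication and random removal (the abstract does not say which oversampling or undersampling technique was used), and applying resampling to the training portion only is an assumption based on common practice. Labels are assumed to be binary, with 1 marking a defective module.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately
    so both parts keep roughly the same defect ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)
        train += idx[:cut]
        test += idx[cut:]
    return sorted(train), sorted(test)


def resample(indices, labels, strategy="none", seed=0):
    """Balance the given training indices.

    strategy: "none"  -> return the indices unchanged
              "over"  -> duplicate random minority samples (assumed technique)
              "under" -> keep a random subset of the majority (assumed technique)
    """
    if strategy == "none":
        return list(indices)
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    sizes = [len(group) for group in by_class.values()]
    out = []
    if strategy == "over":
        target = max(sizes)
        for group in by_class.values():
            extra = [rng.choice(group) for _ in range(target - len(group))]
            out += group + extra
    elif strategy == "under":
        target = min(sizes)
        for group in by_class.values():
            out += rng.sample(group, target)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return out


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random defective module receives a
    higher score than a random clean one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Synthetic, imbalanced label vector (illustrative data, not NASA MDP).
    labels = [1] * 20 + [0] * 80
    train_idx, test_idx = stratified_split(labels)
    for strategy in ("none", "over", "under"):
        balanced = resample(train_idx, labels, strategy)
        defective = sum(labels[i] for i in balanced)
        print(strategy, len(balanced), defective)
```

In a within-project setting, the training and test indices come from the same project, as in the demo above. In a cross-project setting, the resampled training set would instead be drawn from other projects while the test portion stays with the target project. How the study assembled its cross-project training data is not stated in the abstract.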

How to Cite

Zeniananto, R., Herteno, R., Nugroho, R. A., Farmadi, A., & Saputro, S. W. (2025). An Empirical Study of Cross-Project and Within-Project Performance in Software Defect Prediction Models Using Tree-Based and Boosting Classifiers. Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics, 7(3), 514-525. https://doi.org/10.35882/ijeeemi.v7i3.95


Author Biographies

Raidra Zeniananto, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia

Raidra Zeniananto is a student in the Computer Science study program, Faculty of Mathematics and Natural Sciences, Lambung Mangkurat University. He has a strong interest in software engineering, particularly mobile application development. With perseverance and a passion for learning, he strives to hone his technical skills through various projects and research. He seeks not only to excel in his studies but also to make a significant impact in the field of mobile technology and software development. He also explores emerging technologies and refines his practical abilities by working on challenging assignments that connect theoretical insights with real-world applications. Email: raidrazeni@gmail.com.

Radityo Adi Nugroho, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia

Radityo Adi Nugroho received his bachelor's degree in Informatics from the Islamic University of Indonesia and a master's degree in Computer Science from Gadjah Mada University. Currently, he is an assistant professor in the Department of Computer Science at Lambung Mangkurat University. His research interests include software defect prediction and computer vision. He can be contacted at email: radityo.adi@ulm.ac.id.

Andi Farmadi, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia

Andi Farmadi is a senior lecturer in the Computer Science program at Lambung Mangkurat University. He has been teaching since 2008 and has served as Head of the Data Science Lab since 2018. He completed his undergraduate studies at Hasanuddin University and his graduate studies at Bandung Institute of Technology. His research focuses on data science. One of his co-authored papers, "Hyperparameter tuning using GridsearchCV on the comparison of the activation function of the ELM method to the classification of pneumonia in toddlers," was published in 2021 at the International Conference of Computer and Informatics Engineering (IC2IE). Email: andifarmadi@ulm.ac.id.

Setyo Wahyu Saputro, Department of Computer Science, Lambung Mangkurat University, Kalimantan Selatan, Indonesia

Setyo Wahyu Saputro is a lecturer in the Computer Science Department, Faculty of Mathematics and Natural Science, Lambung Mangkurat University, in Banjarbaru. He received his bachelor's degree in Computer Science from Lambung Mangkurat University in 2011 and his master's degree in Informatics from STMIK Amikom University in 2016. He has been active as an information technology practitioner and consultant since 2017, serving as project manager or systems analyst on several projects for government and private agencies in South Kalimantan province. His research interests include software engineering, human-computer interaction, and artificial intelligence applications. He can be contacted at email: setyo.saputro@ulm.ac.id.


