The Impact of Correlated Metrics on the Interpretation of Defect Models (J1)
Defect models are analytical models for building empirical theories related to software quality. Prior studies often derive knowledge from such models using interpretation techniques, e.g., ANOVA Type-I. Recent work raises concerns that correlated metrics may impact the interpretation of defect models. Yet the impact of correlated metrics on such models has not been investigated. In this paper, we investigate the impact of correlated metrics on the interpretation of defect models and how much the interpretation improves when correlated metrics are removed. Through a case study of 14 publicly available defect datasets, we find that (1) correlated metrics have the largest impact on the consistency, the level of discrepancy, and the direction of the ranking of metrics, especially for ANOVA techniques. On the other hand, we find that removing all correlated metrics (2) improves the consistency of the produced rankings regardless of the ordering of metrics (except for ANOVA Type-I); (3) improves the consistency of the ranking of metrics among the studied interpretation techniques; and (4) impacts the model performance by less than 5 percentage points. Thus, when one wishes to derive sound interpretations from defect models, one must (1) mitigate correlated metrics, especially for ANOVA analyses; and (2) avoid using ANOVA Type-I even if all correlated metrics are removed.
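As a rough illustration of what "removing correlated metrics" can look like in practice, the sketch below greedily drops any metric whose Spearman rank correlation with an already-kept metric exceeds a threshold. This is not necessarily the procedure used in the paper: the 0.7 threshold, the greedy keep-first strategy, the function names, and the toy metric values are all assumptions made for this example.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_correlated(metrics, threshold=0.7):
    """Keep a metric only if |rho| <= threshold against every kept metric.

    The threshold and the keep-first policy are assumptions for this sketch.
    """
    kept = []
    for name in metrics:
        if all(abs(spearman(metrics[name], metrics[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-module metric values, invented for illustration only.
metrics = {
    "loc": [10, 200, 35, 400, 80, 120],
    "num_statements": [8, 180, 30, 390, 70, 100],
    "num_authors": [3, 1, 4, 2, 1, 5],
}
kept = drop_correlated(metrics)
print(kept)
```

In this toy data, `num_statements` ranks the modules identically to `loc`, so it is dropped, while `num_authors` is kept. Which metric of a correlated pair survives depends on the iteration order, so a real analysis would need a principled selection rule rather than this keep-first shortcut.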
Thu 9 Jul (times shown in UTC, Coordinated Universal Time)
08:05 - 09:05 | I16-Testing and Debugging 2 | Technical Papers / Journal First at Baekje | Chair(s): Rui Abreu (Instituto Superior Técnico, U. Lisboa & INESC-ID)
08:05 | 12m | Talk | Low-Overhead Deadlock Prediction | Technical Papers | Yan Cai (Institute of Software, Chinese Academy of Sciences), Ruijie Meng (University of Chinese Academy of Sciences), Jens Palsberg (University of California, Los Angeles)
08:17 | 8m | Talk | The Impact of Feature Reduction Techniques on Defect Prediction Models (J1) | Journal First | Masanari Kondo (Kyoto Institute of Technology), Cor-Paul Bezemer (University of Alberta, Canada), Yasutaka Kamei (Kyushu University), Ahmed E. Hassan (Queen's University), Osamu Mizuno (Kyoto Institute of Technology)
08:25 | 8m | Talk | The Impact of Correlated Metrics on the Interpretation of Defect Models (J1) | Journal First | Jirayus Jiarpakdee (Monash University, Australia), Kla Tantithamthavorn (Monash University, Australia), Ahmed E. Hassan (Queen's University)
08:33 | 8m | Talk | The Impact of Mislabeled Changes by SZZ on Just-in-Time Defect Prediction (J1) | Journal First | Yuanrui Fan (Zhejiang University), Xin Xia (Monash University), Daniel Alencar Da Costa (University of Otago), David Lo (Singapore Management University), Ahmed E. Hassan (Queen's University), Shanping Li (Zhejiang University)
08:41 | 8m | Talk | Which Variables Should I Log? (J1) | Journal First | Zhongxin Liu (Zhejiang University), Xin Xia (Monash University), David Lo (Singapore Management University), Zhenchang Xing (Australia National University), Ahmed E. Hassan (Queen's University), Shanping Li (Zhejiang University)
08:49 | 12m | Talk | Understanding the Automated Parameter Optimization on Transfer Learning for Cross-Project Defect Prediction: An Empirical Study | Technical Papers | Ke Li (University of Exeter), Zilin Xiang (University of Electronic Science and Technology of China), Tao Chen (Loughborough University), Shuo Wang, Kay Chen Tan (City University of Hong Kong)