Write a Blog >>
ICSE 2020
Wed 24 June - Thu 16 July 2020
Thu 9 Jul 2020 08:33 - 08:41 at Baekje - I16-Testing and Debugging 2 Chair(s): Rui Abreu

Just-in-Time (JIT) defect prediction—a technique which aims to predict bugs at change level—has been paid more attention. JIT defect prediction leverages the SZZ approach to identify bug-introducing changes. Recently, researchers found that the performance of SZZ (including its variants) is impacted by a large amount of noise. SZZ may considerably mislabel changes that are used to train a JIT defect prediction model, and thus impact the prediction accuracy.

In this paper, we investigate the impact of the mislabeled changes by different SZZ variants on the performance and interpretation of JIT defect prediction models. We analyze four SZZ variants (i.e., B-SZZ, AG-SZZ, MA-SZZ, and RA-SZZ) that are proposed by prior studies. We build the prediction models using the labeled data by these four SZZ variants. Among the four SZZ variants, RA-SZZ is least likely to generate mislabeled changes, and we construct the testing set by using RA-SZZ. All of the four prediction models are then evaluated on the same testing set. We choose the prediction model built on the labeled data by RA-SZZ as the baseline model, and we compare the performance and metric importance of the models trained using the labeled data by the other three SZZ variants with the baseline model. Through a large-scale empirical study on a total of 126,526 changes from ten Apache open source projects, we find that in terms of various performance measures (AUC, F1-score, G-mean and Recall@20%), the mislabeled changes by B-SZZ and MA-SZZ are not likely to cause a considerable performance reduction, while the mislabeled changes by AG-SZZ cause a statistically significant performance reduction with an average difference of 1%–5%. When considering developers’ inspection effort (measured by LOC) in practice, the mislabeled changes by B-SZZ and AG-SZZ lead to 9%–10% and 1%–15% more wasted inspection effort, respectively. And the mislabeled changes by B-SZZ lead to significantly more wasted effort. The mislabeled changes by MA-SZZ do not cause considerably more wasted effort. We also find that the top-most important metric for identifying bug-introducing changes (i.e., number of files modified in a change) is robust to the mislabeling noise generated by SZZ. But the second- and third-most important metrics are more likely to be impacted by the mislabeling noise, unless random forest is used as the underlying classifier.

Thu 9 Jul

Displayed time zone: (UTC) Coordinated Universal Time change

08:05 - 09:05
I16-Testing and Debugging 2Technical Papers / Journal First at Baekje
Chair(s): Rui Abreu Instituto Superior Técnico, U. Lisboa & INESC-ID
08:05
12m
Talk
Low-Overhead Deadlock PredictionTechnical
Technical Papers
Yan Cai Institute of Software, Chinese Academy of Sciences, Ruijie Meng University of Chinese Academy of Sciences, Jens Palsberg University of California, Los Angeles
08:17
8m
Talk
The Impact of Feature Reduction Techniques on Defect Prediction ModelsJ1
Journal First
Masanari Kondo Kyoto Institute of Technology, Cor-Paul Bezemer University of Alberta, Canada, Yasutaka Kamei Kyushu University, Ahmed E. Hassan Queen's University, Osamu Mizuno Kyoto Institute of Technology
08:25
8m
Talk
The Impact of Correlated Metrics on the Interpretation of Defect ModelsJ1
Journal First
Jirayus Jiarpakdee Monash University, Australia, Kla Tantithamthavorn Monash University, Australia, Ahmed E. Hassan Queen's University
08:33
8m
Talk
The Impact of Mislabeled Changes by SZZ on Just-in-Time Defect PredictionJ1
Journal First
Yuanrui Fan Zhejiang University, Xin Xia Monash University, Daniel Alencar Da Costa University of Otago, David Lo Singapore Management University, Ahmed E. Hassan Queen's University, Shanping Li Zhejiang University
08:41
8m
Talk
Which Variables Should I Log?J1
Journal First
Zhongxin Liu Zhejiang University, Xin Xia Monash University, David Lo Singapore Management University, Zhenchang Xing Australia National University, Ahmed E. Hassan Queen's University, Shanping Li Zhejiang University
08:49
12m
Talk
Understanding the Automated Parameter Optimization on Transfer Learning for Cross-Project Defect Prediction: An Empirical StudyTechnicalArtifact Available
Technical Papers
Ke Li University of Exeter, Zilin Xiang University of Electronic Science and Technology of China, Tao Chen Loughborough University, Shuo Wang , Kay Chen Tan City University of Hong Kong
Pre-print