An Investigation of Cross-Project Learning in Online Just-In-Time Software Defect Prediction (Technical)
Just-In-Time Software Defect Prediction (JIT-SDP) is concerned with predicting whether software changes are defect-inducing or clean, based on machine learning classifiers. Building such classifiers requires a sufficient amount of training data, which is not available at the beginning of a software project. Cross-Project (CP) JIT-SDP can overcome this issue by using data from other projects to build the classifier, achieving similar (but not better) predictive performance to classifiers trained on Within-Project (WP) data. However, such approaches have never been investigated in realistic online learning scenarios, where WP software changes arrive continuously over time and can be used to update the classifiers. It is unknown to what extent CP data can be helpful in such situations. In particular, it is unknown whether CP data are only useful during the very initial phase of the project, when there is little WP data, or whether they could be helpful for extended periods of time. This work thus provides the first investigation of when and to what extent CP data are useful for JIT-SDP in a realistic online learning scenario. For that, we develop three different CP JIT-SDP approaches that can operate in online mode. We also collect 2,048 commits from three ongoing projects being developed by a software company over the course of 9 to 10 months, and use 198,468 commits from 10 active open source GitHub projects being developed over the course of 6 to 14 years. The study shows that CP data can lead to improvements in G-mean of up to 53.90% compared to WP classifiers at the initial stage of the projects. For the open source projects, which have been running for longer periods of time, the CP data also helped the classifiers to reduce or prevent large drops in predictive performance that may occur over time, obtaining up to around 40% better G-mean during such periods.
CP data were shown to be beneficial even after a large amount of WP data had been received, leading to overall G-means up to 18.5% better than those of WP classifiers.
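The abstract reports results in terms of G-mean without defining it. As a minimal sketch, assuming the definition commonly used for imbalanced binary classification (the geometric mean of the recall on each class), the metric could be computed as follows. The function name `g_mean` and the label encoding (1 for defect-inducing, 0 for clean) are illustrative assumptions, not taken from the paper.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall.

    Assumed encoding (hypothetical): 1 = defect-inducing, 0 = clean.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")

    # Correct predictions within each class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    recall_defect = tp / pos
    recall_clean = tn / neg
    return math.sqrt(recall_defect * recall_clean)


# Six changes: two defect-inducing, four clean.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)  # recalls are 0.5 and 0.75

# A classifier that always predicts "clean" has zero recall on the
# defect-inducing class, so its G-mean is 0.
always_clean = g_mean(y_true, [0] * len(y_true))
```

Under this definition, a classifier that ignores the minority class scores 0 however accurate it is on the majority class, which is why the metric suits data where defect-inducing changes are comparatively rare.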
Fri 10 Jul (displayed time zone: UTC, Coordinated Universal Time)
15:00 - 16:00 | A21-Testing and Debugging 3 | Journal First / Technical Papers at Silla | Chair(s): Tingting Yu, University of Kentucky
15:00 | 12m | Talk | Schrödinger's Security: Opening the Box on App Developers' Security Rationale | Technical Papers | Dirk van der Linden, University of Bristol; Pauline Anthonysamy, Google Inc.; Bashar Nuseibeh, The Open University (UK) & Lero (Ireland); Thein Tun; Marian Petre, The Open University; Mark Levine, Lancaster University; John Towse, Lancaster University; Awais Rashid, University of Bristol, UK
15:12 | 8m | Talk | Smart Greybox Fuzzing | Journal First | Van-Thuan Pham, Monash University; Marcel Böhme, Monash University; Andrew Santosa, National University of Singapore; Alexandru Răzvan Căciulescu, UiPath; Abhik Roychoudhury, National University of Singapore, Singapore
15:20 | 8m | Talk | Deep Transfer Bug Localization | Journal First | Xuan Huo, Nanjing University; Ferdian Thung, Singapore Management University; Ming Li, Nanjing University; David Lo, Singapore Management University; Shu-Ting Shi, Nanjing University
15:28 | 8m | Talk | A Benchmark-Based Evaluation of Search-Based Crash Reproduction | Journal First | Mozhan Soltani, Leiden University; Pouria Derakhshanfar, Delft University of Technology; Xavier Devroey, Delft University of Technology; Arie van Deursen, Delft University of Technology
15:36 | 12m | Talk | An Investigation of Cross-Project Learning in Online Just-In-Time Software Defect Prediction | Technical Papers | Sadia Tabassum, University of Birmingham, UK; Leandro Minku, University of Birmingham, UK; Danyi Feng, XiLiu Tech; George Cabral, Universidade Federal Rural de Pernambuco; Liyan Song, University of Birmingham
15:48 | 8m | Talk | An Empirical Study of the Long Duration of Continuous Integration Builds | Journal First | Taher A Ghaleb, Queen's University; Daniel Alencar Da Costa, University of Otago; Ying Zou, Queen's University, Kingston, Ontario