Spectrum-based fault localization (SBFL) techniques are widely studied and have been shown to be effective in locating faults. Recent studies also show that developers in industry value automated SBFL techniques. However, their effectiveness is still limited for two main reasons. First, the test coverage information used to construct the spectrum does not directly reflect the root cause. Second, SBFL suffers from the tie issue, so buggy code entities cannot be well differentiated from non-buggy ones. To address these challenges, we propose to leverage version histories in fault localization, based on the following two intuitions. First, version histories record how bugs are introduced into software projects, and this information directly reflects the root cause of bugs. Second, the evolution histories of code can help differentiate suspicious code entities that SBFL ranks in a tie. Our intuitions are also inspired by observations of debugging practices in large open source projects and in industry.
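To make the tie issue concrete, the following is a minimal sketch that scores statements with the Ochiai formula, one commonly used SBFL formula. The abstract does not say which formula is used, and the coverage counts and statement names below are invented purely for illustration.

```python
import math

def ochiai(failed_cover, passed_cover, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    denom = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denom if denom else 0.0

# Hypothetical spectrum:
# statement -> (failing tests covering it, passing tests covering it)
spectrum = {
    "s1": (1, 0),
    "s2": (1, 0),
    "s3": (1, 2),
    "s4": (0, 3),
}
TOTAL_FAILED = 1  # assumed: a single failing test

scores = {s: ochiai(ef, ep, TOTAL_FAILED) for s, (ef, ep) in spectrum.items()}

# Statements sharing the highest score cannot be told apart by coverage alone.
top = max(scores.values())
tied = sorted(s for s, v in scores.items() if v == top)
print(scores, tied)
```

In this toy spectrum, `s1` and `s2` are covered by exactly the same tests, so any coverage-based formula assigns them the same score, whichever of them is actually buggy.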
Based on these intuitions, we propose a novel technique, HSFL (historical spectrum-based fault localization). Specifically, HSFL first identifies bug-inducing commits from the version history. It then constructs a historical spectrum (denoted as Histrum) from the bug-inducing commits, which is another dimension of spectrum, orthogonal to the coverage-based spectrum used in SBFL. Finally, HSFL ranks the suspicious code elements based on the proposed Histrum and the conventional spectrum. HSFL significantly outperforms state-of-the-art SBFL techniques on the Defects4J benchmark. Specifically, it locates and ranks the buggy statement at Top-1 for $77.8\%$ more bugs than SBFL, and at Top-5 for $33.9\%$ more bugs. In addition, HSFL achieves average improvements of $28.3\%$ and $40.8\%$ in MAP and MRR, respectively, over all bugs. Moreover, HSFL also outperforms six other families of fault localization techniques, and the proposed Histrum model can be integrated with different families of techniques to boost their performance.
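The final ranking step can be illustrated with a small sketch. The abstract does not specify how Histrum is computed or how it is combined with the coverage-based spectrum, so everything here is an assumption: the history scores are hypothetical placeholders (higher means more strongly associated with a suspected bug-inducing commit), the combination is a plain weighted sum, and the weight `alpha` is arbitrary. The sketch also computes the reciprocal rank of the buggy statement, the per-bug quantity that MRR averages.

```python
def combine(sbfl, histrum, alpha=0.5):
    """Weighted sum of two score maps (an assumed combination, not the paper's)."""
    return {s: (1 - alpha) * sbfl[s] + alpha * histrum.get(s, 0.0) for s in sbfl}

def rank_of(scores, target):
    """1-based worst-case rank: every statement tied with the target counts against it."""
    return sum(1 for v in scores.values() if v >= scores[target])

# Hypothetical inputs: s1 and s2 are tied under the coverage-based spectrum.
sbfl = {"s1": 1.0, "s2": 1.0, "s3": 0.5, "s4": 0.0}
histrum = {"s2": 1.0, "s3": 0.5}  # placeholder history-based scores
buggy = "s2"

rank_before = rank_of(sbfl, buggy)     # tie with s1
combined = combine(sbfl, histrum)
rank_after = rank_of(combined, buggy)  # tie broken by the history score
reciprocal_rank = 1 / rank_after
print(rank_before, rank_after, reciprocal_rank)
```

With these made-up numbers, the second score dimension separates the two tied statements; how much it helps in practice depends on the actual Histrum definition, which this sketch does not reproduce.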
Wed 8 Jul. Times are displayed in time zone: (UTC) Coordinated Universal Time.
|01:05 - 01:17|
|01:17 - 01:29|
Jinhan Kim, Valeriy Savchenko (Ivannikov Institute for System Programming of the RAS), Kihyuck Shin (Samsung Electronics), Konstantin Sorokin (Ivannikov Institute for System Programming of the RAS), Hyunseok Jeon (Samsung Electronics), Georgiy Pankratenko (Ivannikov Institute for System Programming of the RAS), Sergey Markov (Ivannikov Institute for System Programming of the RAS), Chul-Joo Kim (Samsung Electronics)
|01:29 - 01:37|
Haijun Wang (Ant Financial Services Group, China; CSSE, Shenzhen University, China), Yun Lin (National University of Singapore), Zijiang Yang (Western Michigan University), Jun Sun (Singapore Management University), Yang Liu (Nanyang Technological University, Singapore), Jin Song Dong (National University of Singapore), Qinghua Zheng (Xi'an Jiaotong University), Ting Liu (Xi'an Jiaotong University)
|01:37 - 01:45|
Ming Wen (Huazhong University of Science and Technology, China), Junjie Chen (Tianjin University, China), Yongqiang TIAN (The Hong Kong University of Science and Technology), Rongxin Wu (Department of Cyber Space Security, Xiamen University), Dan Hao (Peking University), Shi Han (Microsoft Research Asia), Shing-Chi Cheung (Department of Computer Science and Engineering, The Hong Kong University of Science and Technology)
|01:45 - 01:53|
Ivan Beschastnikh (Computer Science, University of British Columbia), Perry Liu (University of British Columbia), Albert Xing (University of British Columbia), Patty Wang (University of British Columbia), Yuriy Brun (University of Massachusetts Amherst), Michael D. Ernst (University of Washington, USA)
|01:53 - 02:01|