ICSE 2020
Wed 24 June - Thu 16 July 2020
Sat 11 Jul 2020 01:05 - 01:13 at Silla - P30-Ecosystems 2

Much research has investigated the common reasons for build breakages. However, prior research has paid little attention to builds that break for reasons unlikely to be related to development activities. For example, Continuous Integration (CI) builds may break due to timeouts or connection errors while the build is being generated. Such breakages can introduce noise into build breakage data, which can lead to misleading results in research on CI builds. In this paper, we identify three types of noisy build breakages, namely Environmental, Cascading, and Allowed breakages. Our results reveal that over 50% of build breakages can be noisy. Moreover, we measure the impact of using noisy data when modeling build breakage. We observe that findings from prior research may not hold if noisy build breakages are excluded from datasets. Therefore, researchers should be more careful about the quality of build breakage data.

