A Container-Based Infrastructure for Fuzzy-Driven Root Causing of Flaky Tests
Intermittent test failures (test flakiness) are common during continuous integration, as modern software systems have become inherently non-deterministic. Understanding the root cause of test flakiness is crucial, because intermittent test failures might be the result of real non-deterministic defects in the production code rather than mere errors in the test code. Existing techniques for root causing test flakiness compare the runtime behaviour of passing and failing executions of the same test. They achieve this by repeatedly executing a flaky test on an instrumented version of the system under test. This approach has two fundamental limitations: (i) code instrumentation might disrupt the normal behaviour of the test, preventing the manifestation of test flakiness; (ii) passively re-executing a test many times could be insufficient for triggering intermittent test outcomes when test flakiness is rare or difficult to manifest. To address these limitations, we propose a new approach for root causing test flakiness that actively explores the non-deterministic space without instrumenting code. Our novel idea is to repeatedly execute a flaky test under different execution clusters. Each cluster explores a certain non-deterministic dimension (e.g., concurrency, I/O, and networking) with dedicated software containers and fuzzy-driven resource load generators. The execution cluster that manifests the most balanced (or unbalanced) sets of passing and failing executions is likely to explain the broad type of test flakiness.
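
The cluster comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `balance_score` and `rank_clusters`, the `run_test` callback, and the scoring formula are all assumptions made for illustration. The sketch covers only the "most balanced" reading, scoring a cluster 1.0 when passing and failing executions split evenly and 0.0 when every execution has the same outcome. Launching containers and load generators is left to the caller-supplied `run_test`.

```python
from typing import Callable, List, Tuple


def balance_score(outcomes: List[bool]) -> float:
    """Hypothetical balance metric (an assumption, not taken from the paper).

    `outcomes` holds one boolean per execution (True = pass, False = fail).
    Returns 1.0 for an even pass/fail split and 0.0 when all runs agree.
    """
    if not outcomes:
        raise ValueError("need at least one execution")
    fail_rate = outcomes.count(False) / len(outcomes)
    return 1.0 - abs(2.0 * fail_rate - 1.0)


def rank_clusters(
    run_test: Callable[[str], bool],
    clusters: List[str],
    repetitions: int,
) -> List[Tuple[str, float]]:
    """Execute the test `repetitions` times per cluster and rank the clusters.

    `run_test(cluster)` is a caller-supplied hook that runs the flaky test
    once inside the named execution cluster and reports whether it passed.
    The result lists (cluster, score) pairs, most balanced cluster first.
    """
    results = []
    for cluster in clusters:
        outcomes = [run_test(cluster) for _ in range(repetitions)]
        results.append((cluster, balance_score(outcomes)))
    # Python's sort is stable, so tied clusters keep their input order.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a caller would pass cluster names such as `"concurrency"`, `"io"`, and `"network"` and read the first entry of the ranking as the candidate explanation. Ranking by the "unbalanced" criterion would need a different score, for example one measured against a baseline run without load.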