Difference Grouping and Test Suite Evaluation: Lessons from Automated Differential Testing for Adobe Analytics
Arguably the most important measure of software quality in industry is the ability to acquire and retain customers. At Adobe we have learned that a good indicator of our future ability to retain customers is the number of customer-reported regression bugs. To keep this number low, we have found that the most effective testing approach is differential testing (DT). One of the greatest challenges for DT is handling the large number of differences it often discovers: DT can tell you that behavior has changed, but it cannot tell you whether those changes are acceptable. Traditionally, a large manual effort is required to inspect each difference and judge the quality of the code change(s). This inspection is very time-consuming and severely limits the volume of testing that can be completed, which in turn reduces adoption of this powerful testing technique. One approach used at Adobe to solve this problem is association rule mining, which collects similar differences into distinct groups.
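The text does not specify how the mining is implemented, so the sketch below is one plausible realization rather than Adobe's actual pipeline. It treats each difference as the set of attributes that changed between baseline and candidate outputs, mines frequent attribute itemsets with a small Apriori-style search, and buckets each difference under the largest frequent pattern it contains. All attribute names and the support threshold are hypothetical.

```python
from collections import defaultdict

# Each "difference" is modeled as the set of attributes that changed
# between the baseline and candidate runs. These records are hypothetical.
differences = [
    {"field:visits", "op:rounding", "report:overview"},
    {"field:visits", "op:rounding", "report:pathing"},
    {"field:revenue", "op:currency", "report:overview"},
    {"field:visits", "op:rounding", "report:overview"},
    {"field:revenue", "op:currency", "report:pathing"},
]

def frequent_itemsets(transactions, min_support):
    """Apriori-style search for attribute sets that co-occur often."""
    n = len(transactions)
    items = {i for t in transactions for i in t}
    # Start from frequent single attributes, then grow candidates level by level.
    current = [frozenset([i]) for i in items
               if sum(i in t for t in transactions) / n >= min_support]
    frequent = list(current)
    k = 2
    while current:
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        current = [c for c in candidates
                   if sum(c <= t for t in transactions) / n >= min_support]
        frequent.extend(current)
        k += 1
    return frequent

def group_by_largest_itemset(transactions, itemsets):
    """Assign each difference to the largest frequent itemset it contains."""
    groups = defaultdict(list)
    for idx, t in enumerate(transactions):
        matches = [s for s in itemsets if s <= t]
        key = max(matches, key=len) if matches else frozenset(["<ungrouped>"])
        groups[key].append(idx)
    return groups

itemsets = frequent_itemsets(differences, min_support=0.4)
for pattern, members in group_by_largest_itemset(differences, itemsets).items():
    print(sorted(pattern), "->", len(members), "differences")
```

Grouping by the largest matching itemset is only one heuristic; mined association rules could equally serve as group signatures. Either way, the reviewer inspects a handful of groups instead of every individual difference.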
Grouping accomplishes two important tasks: (1) a large reduction in the number of differences that must be examined manually, and (2) a simple, practical method of test suite evaluation. By grouping the differences and then analyzing those groups, we can tell how effectively the test suite has exercised the affected code paths. If code changes are expected to create differences, and we fail to find groups that reflect those expected differences, then we know that our test suite is incomplete and needs attention.
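Under this view, test suite evaluation reduces to a set comparison between the difference groups a deliberate change was expected to produce and the groups actually mined from the test run. A minimal sketch, with hypothetical group signatures:

```python
# Hypothetical signatures: the attribute sets we expect planned code
# changes to produce, versus the groups actually found by mining.
expected_groups = {
    frozenset({"field:visits", "op:rounding"}),   # planned rounding change
    frozenset({"field:revenue", "op:currency"}),  # planned currency change
}
found_groups = {
    frozenset({"field:visits", "op:rounding"}),
}

missing = expected_groups - found_groups
if missing:
    # No test exercised the code paths behind these changes: the suite
    # has a coverage gap that needs new tests before release.
    for signature in sorted(missing, key=sorted):
        print("Coverage gap, no difference group for:", sorted(signature))
else:
    print("All expected difference groups were observed.")
```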