Improving the Effectiveness of Traceability Link Recovery using Hierarchical Bayesian Networks
Traceability is a fundamental component of the modern software development process that helps to ensure properly functioning, secure programs. However, traceability tasks incur high costs in terms of effort and time and are often prone to errors. This has prompted a wealth of research on automated approaches that draw relationships between pairs of textual software artifacts using similarity measures. Despite the progress made toward practical automation, such approaches currently have two major drawbacks that inhibit their effectiveness. Namely, current techniques typically use only a single measure of artifact similarity, and they cannot simultaneously model (implicit and explicit) relationships across groups of diverse structured and unstructured development artifacts.
In this paper, we illustrate how these limitations can be overcome through the use of a tailored probabilistic model. To this end, we design and implement a HierarchiCal PrObabilistic Model for SoftwarE Traceability (Comet) that is able to predict candidate trace links. Comet is capable of modeling relationships between artifacts by combining the complementary observational prowess of multiple measures of textual similarity. Additionally, our model can holistically incorporate information from a diverse set of sources, including developer expertise and transitive (often implicit) relationships among groups of software artifacts, to improve prediction accuracy. We conduct a comprehensive empirical evaluation of Comet, which shows that our approach is consistently more effective across datasets than existing baseline techniques and outperforms past approaches on average when considering multiple information sources. Additionally, we worked with a major telecommunication company to develop a Continuous Integration (CI) plugin implementing Comet. A survey of industry developers who used the Comet plugin illustrates its potential for practical applicability.