ICSE 2020
Wed 24 June - Thu 16 July 2020
Wed 8 Jul 2020 16:16 - 16:24 at Goguryeo - A11-Performance and Analysis Chair(s): Pooyan Jamshidi

Performance issues can have a devastating impact on the perceived quality of software. To avoid such problems, the performance of a software system needs to be thoroughly tested. Microbenchmarking is a widely used method for precisely evaluating the performance of specific units of program code.

Microbenchmarking frameworks, such as the Java Microbenchmark Harness (JMH), allow developers to write fine-grained performance test suites at the method or statement level. However, due to the complexities of the Java Virtual Machine, developers often struggle to write expressive JMH benchmarks that accurately represent the performance of such methods or statements.

In this paper, we empirically study bad practices in JMH benchmarks. We develop a tool that leverages static analysis to automatically identify 5 bad JMH practices. Using this tool, we empirically investigate the occurrence of bad JMH practices in 123 open source Java-based systems and find that each of these 5 bad practices is prevalent in open source software.
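The abstract does not name the five practices or describe how the tool is implemented, so the following is only a rough illustration of what a static check for a benchmark misconfiguration can look like. It assumes, for the sake of example, that one practice worth flagging is disabling forking with `@Fork(0)`, and it swaps in a plain regular-expression scan over source text for whatever analysis the authors' tool actually performs. The class name `ZeroForkCheck` and the method `findZeroForkLines` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Toy illustration only: a text-based scan, NOT the tool described in the paper.
public class ZeroForkCheck {
    // Matches @Fork(0) and @Fork(value = 0 ...), tolerating whitespace.
    private static final Pattern ZERO_FORK =
        Pattern.compile("@Fork\\s*\\(\\s*(?:value\\s*=\\s*)?0\\s*[,)]");

    /** Returns the 1-based numbers of the lines that contain a zero-fork annotation. */
    public static List<Integer> findZeroForkLines(String source) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (ZERO_FORK.matcher(lines[i]).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical benchmark source, passed as text; it is scanned, not compiled.
        String sample = String.join("\n",
            "@Fork(0)",
            "@Benchmark",
            "public int sum() { return a + b; }",
            "@Fork(value = 2)",
            "@Benchmark",
            "public int product() { return a * b; }");
        System.out.println(findZeroForkLines(sample)); // prints [1]
    }
}
```

A line-based scan like this misses annotations split across lines and configuration supplied outside the source file, which is one reason a production checker would work on a parsed or compiled representation instead.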

Further, we manually fix 105 benchmarks across 6 projects and quantify the impact of each bad practice in multiple case studies. Our analysis shows that bad practices often significantly impact the benchmark results, distorting the performance counters with large effect sizes.

To validate our experimental results, we constructed seven patches that fix the identified bad practices in 57 benchmarks from six of the studied open source projects; six of these patches were merged into the main branch of their project. In this paper, we show that developers struggle with accurate Java microbenchmarking, and we provide several recommendations to developers of microbenchmarking frameworks on how to improve future versions of their frameworks.

The contributions of this paper are as follows:
1. Our study is the first to investigate the prevalence of bad JMH practices in real open source projects, and it finds that bad JMH practices are common and widespread in Java projects.
2. Our study is the first to quantify the impact of bad JMH practices on real benchmark results. Our results show that bad practices often significantly impact benchmark measurements.
3. We provide a static analysis tool that identifies occurrences of bad JMH practices in JMH microbenchmarks. This tool can be executed through a batch command using Maven, Gradle or Ant and can be embedded into the CI pipeline of a software project. The tool can also be integrated with the Eclipse IDE, where warnings about bad JMH practices are shown directly in the editor view and can serve as a guideline for developers during benchmark development.

The full paper is published in IEEE Transactions on Software Engineering and can be found at https://ieeexplore.ieee.org/document/8747433

Date of Publication: June 27, 2019

Wed 8 Jul

Displayed time zone: (UTC) Coordinated Universal Time

16:05 - 17:05
A11-Performance and Analysis (New Ideas and Emerging Results / Journal First / Technical Papers / Demonstrations) at Goguryeo
Chair(s): Pooyan Jamshidi University of South Carolina
Nimbus: Improving the Developer Experience for Serverless Applications (Demo)
Robert Chatley Imperial College London, Thomas Allerton Starling Bank
Testing with Fewer Resources: An Adaptive Approach to Performance-Aware Test Case Generation
Journal First
Giovanni Grano University of Zurich, Christoph Laaber University of Zurich, Annibale Panichella Delft University of Technology, Sebastiano Panichella Zurich University of Applied Sciences
What's Wrong with My Benchmark Results? Studying Bad Practices in JMH Benchmarks
Journal First
Diego Costa Concordia University, Canada, Cor-Paul Bezemer University of Alberta, Canada, Philipp Leitner Chalmers University of Technology & University of Gothenburg, Artur Andrzejak Heidelberg University
Towards the Use of the Readily Available Tests from the Release Pipeline as Performance Tests. Are We There Yet? (ACM SIGSOFT Distinguished Paper Award)
Technical Papers
Zishuo Ding University of Waterloo, Canada, Jinfu Chen Concordia University, Canada, Weiyi Shang Concordia University
ModGuard: Identifying Integrity & Confidentiality Violations in Java Modules
Journal First
Andreas Dann Paderborn University, Ben Hermann Paderborn University, Eric Bodden Heinz Nixdorf Institut, Paderborn University and Fraunhofer IEM
Program Debloating via Stochastic Optimization
New Ideas and Emerging Results
Qi Xin Georgia Institute of Technology, Myeongsoo Kim Georgia Institute of Technology, Qirun Zhang Georgia Institute of Technology, USA, Alessandro Orso Georgia Tech
The ORIS Tool: Quantitative Evaluation of Non-Markovian Systems
Journal First
Marco Paolieri University of Southern California, Marco Biagi University of Florence, Laura Carnevali University of Florence, Enrico Vicario University of Florence