ICSE 2020
Wed 24 June - Thu 16 July 2020
Wed 8 Jul 2020 16:34 - 16:46 at Silla - A12-Testing Chair(s): Sasa Misailovic

With the growing use of DevOps tools and frameworks, there is an increased need for tools and techniques that support more than code. The current state-of-the-art in static developer assistance for tools like Docker is limited to shallow syntactic validation. We identify three core challenges in the realm of learning from, understanding, and supporting developers writing DevOps artifacts: (i) nested languages in DevOps artifacts, (ii) rule mining, and (iii) the lack of semantic rule-based analysis. To address these challenges we introduce a toolset, binnacle, that enabled us to ingest 900,000 GitHub repositories.

Focusing on Docker, we extracted approximately 219,000 Dockerfiles, and also identified a Gold Set of Dockerfiles written by Docker experts. We addressed challenge (i) by reducing the number of effectively uninterpretable nodes in our ASTs by over 80% via a technique we call phased parsing. To address challenge (ii), we introduced a novel rule-mining technique capable of recovering two-thirds of the rules in a benchmark we curated. Through this automated mining, we were able to recover 16 new rules that were not found during manual rule collection. To address challenge (iii), we manually collected a set of rules for Dockerfiles from commits to the files in the Gold Set. These rules encapsulate best practices, avoid docker build failures, and improve image size and build latency. We created an analyzer that used these rules, and found that, on average, Dockerfiles on GitHub violated the rules six times more frequently than the Dockerfiles in our Gold Set. We also found that industrial Dockerfiles fared no better than those sourced from GitHub.
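
To make the idea of parsing nested languages in phases more concrete, the following is a minimal sketch and not binnacle's implementation. It assumes three phases: the Dockerfile's own directives, the shell embedded in RUN, and the arguments of one known command (apt-get). All function names and the dictionary-based tree are hypothetical, and the shell handling is deliberately naive (it splits on `&&` and ignores quoting, pipes, and semicolons).

```python
import shlex


def parse_dockerfile(text):
    """Phase 1: split a Dockerfile into (directive, raw argument string) nodes."""
    nodes = []
    # Join backslash line continuations so each directive is one logical line.
    logical = text.replace("\\\n", " ")
    for line in logical.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        directive, _, rest = line.partition(" ")
        nodes.append({"type": directive.upper(), "raw": rest.strip()})
    return nodes


def parse_shell(node):
    """Phase 2: for RUN nodes, split the embedded shell into simple commands."""
    if node["type"] != "RUN":
        return node
    commands = [shlex.split(part) for part in node["raw"].split("&&")]
    return {**node, "commands": [c for c in commands if c]}


def parse_command(argv):
    """Phase 3: give structure to one known command (only apt-get here)."""
    if argv[:1] != ["apt-get"]:
        # Unknown commands stay opaque: the tree records them but cannot interpret them.
        return {"name": argv[0], "args": argv[1:], "opaque": True}
    rest = argv[1:]
    flags = [a for a in rest if a.startswith("-")]
    words = [a for a in rest if not a.startswith("-")]
    return {
        "name": "apt-get",
        "subcommand": words[0] if words else None,
        "flags": flags,
        "operands": words[1:],
        "opaque": False,
    }


def phased_parse(text):
    """Run the three phases in order and return the enriched tree."""
    tree = []
    for node in map(parse_shell, parse_dockerfile(text)):
        if "commands" in node:
            node = {**node, "commands": [parse_command(c) for c in node["commands"]]}
        tree.append(node)
    return tree
```

Each phase turns a string that the previous phase could only carry as raw text into structured nodes, which is the sense in which fewer nodes remain uninterpretable. Commands without a phase-3 parser are marked `opaque` in this sketch.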

The learned rules and analyzer in binnacle can be used to aid developers in the IDE when creating Dockerfiles, and in a post-hoc fashion to identify issues in, and to improve, existing Dockerfiles.
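
As an illustration of a post-hoc check, the sketch below scans Dockerfile text for a single rule: an `apt-get install` inside a RUN directive should carry `-y`, `--yes`, or `--assume-yes`, since a confirmation prompt can make a non-interactive build fail. The rule is chosen here only as an example and is not quoted from the rule set described above. The function names are hypothetical, and the shell splitting is a simplification that only understands `&&`.

```python
import shlex

YES_FLAGS = {"--yes", "--assume-yes"}


def has_yes_flag(argv):
    """Return True if any argument answers apt-get's confirmation prompt."""
    for arg in argv:
        if arg in YES_FLAGS:
            return True
        # Short options may be bundled, e.g. -yq.
        if arg.startswith("-") and not arg.startswith("--") and "y" in arg[1:]:
            return True
    return False


def find_violations(dockerfile_text):
    """Return (line_number, command) pairs for apt-get install calls lacking -y."""
    violations = []
    logical_start = None  # line number where the current directive begins
    buffer = ""
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        if logical_start is None:
            logical_start = number
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            # Line continuation: keep accumulating the same directive.
            buffer += stripped[:-1] + " "
            continue
        buffer += stripped
        directive, _, rest = buffer.strip().partition(" ")
        if directive.upper() == "RUN":
            for part in rest.split("&&"):
                argv = shlex.split(part)
                if (
                    argv[:1] == ["apt-get"]
                    and "install" in argv
                    and not has_yes_flag(argv[1:])
                ):
                    violations.append((logical_start, " ".join(argv)))
        logical_start, buffer = None, ""
    return violations
```

Reporting the line where the directive starts, rather than the continuation line, is a design choice meant to match how an editor would typically attach a diagnostic to the RUN directive as a whole.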

Wed 8 Jul
16:05 - 17:05: Paper Presentations - A12-Testing at Silla
Chair(s): Sasa Misailovic (University of Illinois at Urbana-Champaign)
icse-2020-papers 16:05 - 16:17
Thodoris Sotiropoulos (Athens University of Economics and Business), Dimitris Mitropoulos (Athens University of Economics and Business), Diomidis Spinellis (Athens University of Economics and Business)
icse-2020-Journal-First 16:17 - 16:25
Paul Temple (PReCISE, NaDi, UNamur), Mathieu Acher (Univ Rennes, Inria, IRISA), Jean-Marc Jézéquel (Univ Rennes - IRISA)
Demonstrations 16:25 - 16:28
Matias Martinez (Université Polytechnique Hauts-de-France), Anne Etien (Université de Lille, CNRS, Inria, Centrale Lille, UMR 9189 – CRIStAL), Stéphane Ducasse (INRIA Lille), Christopher Fuhrman (École de technologie supérieure)
icse-2020-New-Ideas-and-Emerging-Results 16:28 - 16:34
Valerio Terragni (Università della Svizzera Italiana), Pasquale Salza (University of Zurich), Filomena Ferrucci (University of Salerno)
icse-2020-papers 16:34 - 16:46
Jordan Henkel (University of Wisconsin–Madison), Christian Bird (Microsoft Research), Shuvendu K. Lahiri (Microsoft Research), Thomas Reps (University of Wisconsin-Madison, USA)
icse-2020-Journal-First 16:46 - 16:54
Gemma Catolino (Delft University of Technology), Fabio Palomba (University of Salerno), Francesca Arcelli Fontana (University of Milano-Bicocca), Andrea De Lucia (University of Salerno), Andy Zaidman (TU Delft), Filomena Ferrucci (University of Salerno)
Demonstrations 16:54 - 16:57
Phu X. Mai (University of Luxembourg), Arda Goknil (SnT, University of Luxembourg), Fabrizio Pastore (University of Luxembourg), Lionel C. Briand (SnT Centre/University of Luxembourg)