Where should I comment my code? A dataset and model for predicting locations that need comments (ICSE 2020 - New Ideas and Emerging Results) - ICSE 2020

Wed 24 June - Thu 16 July 2020

Who

Annie Louis, Santanu Dash, Earl T. Barr, Michael D. Ernst, Charles Sutton

Track

ICSE 2020 New Ideas and Emerging Results

Time Zone

The program is currently displayed in (UTC) Coordinated Universal Time.

When

Tue 7 Jul 2020 15:36 - 15:42 at Silla - A3-Code Summarization Chair(s): Shaohua Wang

Abstract

Programmers should write code comments, but not on every line of code. Because both too few and too many comments are undesirable, programmers must judiciously decide where to write code comments. We have created a machine learning model that suggests locations where a programmer should write a code comment. We trained it on existing commented code to learn locations that are chosen by developers. Once trained, the model can predict locations in new code. Our models achieved precision of 74% and recall of 13% in identifying comment-worthy locations. This first success opens the door to future work, both in the new where-to-comment problem and in generating the content of comments.
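To make the reported metrics concrete, here is a minimal Python sketch of how precision and recall could be computed for predicted comment locations. It is not the authors' evaluation code: the function name `precision_recall`, the representation of locations as sets of line numbers, and the toy data are all assumptions chosen for illustration.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted comment locations.

    Both arguments are iterables of line numbers (a hypothetical
    representation; the abstract does not specify the unit of prediction).
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: fraction of suggested locations that developers really commented.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of developer-commented locations that were suggested.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy data, invented for illustration only.
actual = {3, 10, 17, 25, 31, 40, 52, 60}   # lines developers commented
predicted = {3, 10, 17, 44}                # lines a model suggested

p, r = precision_recall(predicted, actual)
print(f"precision={p:.2f} recall={r:.2f}")
```

A high-precision, low-recall model like the one described makes few suggestions, but most of them coincide with places where developers actually wrote comments.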

Annie Louis

University of Edinburgh

Santanu Dash

University College London, UK

Earl T. Barr

University College London, UK

Michael D. Ernst

University of Washington, USA

Charles Sutton

Google Research

United States

Session Program

Tue 7 Jul
Displayed time zone: (UTC) Coordinated Universal Time

	15:00 - 16:00	A3-Code Summarization (Technical Papers / New Ideas and Emerging Results) at Silla. Chair(s): Shaohua Wang (New Jersey Institute of Technology, USA)

	15:00	12m Talk	Posit: Simultaneously Tagging Natural and Programming Languages (Technical Papers). Profir-Petru Pârțachi (University College London), Santanu Dash (University College London, UK), Christoph Treude (The University of Adelaide), Earl T. Barr (University College London, UK)
	15:12	12m Talk	CPC: Automatically Classifying and Propagating Natural Language Comments via Program Analysis (Technical Papers). Juan Zhai (Rutgers University), Xiangzhe Xu (Nanjing University), Yu Shi (Purdue University), Guanhong Tao (Purdue University), Minxue Pan (Nanjing University), Shiqing Ma (Rutgers University), Lei Xu (National Key Laboratory for Novel Software Technology, Nanjing University), Weifeng Zhang (Nanjing University of Posts and Telecommunications), Lin Tan (Purdue University), Xiangyu Zhang (Purdue University)
	15:24	12m Talk	Suggesting Natural Method Names to Check Name Consistencies (Technical Papers). Son Nguyen (The University of Texas at Dallas), Hung Phan, Trinh Le (University of Engineering and Technology), Tien N. Nguyen (University of Texas at Dallas)
	15:36	6m Talk	Where should I comment my code? A dataset and model for predicting locations that need comments (NIER, New Ideas and Emerging Results). Annie Louis (University of Edinburgh), Santanu Dash (University College London, UK), Earl T. Barr (University College London, UK), Michael D. Ernst (University of Washington, USA), Charles Sutton (Google Research)
	15:42	12m Talk	Retrieval-based Neural Source Code Summarization (Technical Papers). Jian Zhang (Beihang University), Xu Wang (Beihang University), Hongyu Zhang (University of Newcastle, Australia), Hailong Sun (Beihang University), Xudong Liu (Beihang University)
	15:54	6m Talk	The Dual Channel Hypothesis (NIER, New Ideas and Emerging Results). Casey Casalnuovo (University of California at Davis, USA), Earl T. Barr (University College London, UK), Santanu Dash (University College London, UK), Prem Devanbu (University of California), Emily Morgan (University of California, Davis)