The surprising predictability of source code has triggered a boom in tools using language models for code. Code is much more predictable than natural language, but the reasons are not well understood. We propose a dual channel view of code; code combines a formal channel for specifying execution and a natural language channel in the form of identifiers and comments that assists human comprehension.
Computers ignore the natural language channel, but developers read both and, when writing code for long-term use and maintenance, consider each channel's audience: computer and human. Because developers hold both channels in mind, we advance the dual channel hypothesis: the two channels interact and constrain each other. If true, this hypothesis will overturn the current standard practice of considering only the formal channel or, when both channels are considered, treating each in isolation.
We describe how the constraints of this dual audience setting can lead humans to write code in a way that is more predictable than natural language, highlight pioneering research that has implicitly or explicitly used parts of this theory, and outline new research that the hypothesis can drive, such as systematically searching for cross-channel inconsistencies. The dual channel hypothesis provides an exciting opportunity for truly multi-disciplinary research: for computer scientists it promises improvements to program analysis via a more holistic approach to code, and for psycholinguists it promises a novel environment for studying linguistic processes.
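As a rough illustration of the two channels (a sketch under stated assumptions, not the authors' method), the Python snippet below uses CPython bytecode as a stand-in for the formal channel and the function and parameter names as a stand-in for the natural language channel. The three source snippets and the helper `channels` are hypothetical examples introduced here. All three functions compile to the same instructions, so the machine cannot tell them apart; only a human reader notices that the third one's name and comment promise a perimeter while the code computes an area, which is the kind of cross-channel inconsistency the abstract mentions.

```python
# Hypothetical snippets: same formal channel, different natural language channel.
DESCRIPTIVE = """
def rectangle_area(width, height):
    # Area is the product of the two side lengths.
    return width * height
"""

OPAQUE = """
def f(a, b):
    return a * b
"""

MISLEADING = """
def rectangle_perimeter(width, height):
    # Returns the perimeter of the rectangle.
    return width * height
"""


def channels(source: str, name: str):
    """Split the function `name` defined in `source` into two rough channels.

    Assumption: CPython bytecode approximates the formal channel, and the
    function and parameter names approximate the natural language channel.
    Comments never reach the compiled code at all.
    """
    namespace: dict = {}
    exec(source, namespace)  # define the function from its source text
    code = namespace[name].__code__
    formal = code.co_code                        # instructions the machine runs
    natural = (code.co_name,) + code.co_varnames  # names a human reads
    return formal, natural


if __name__ == "__main__":
    formal_a, natural_a = channels(DESCRIPTIVE, "rectangle_area")
    formal_b, natural_b = channels(OPAQUE, "f")
    formal_c, natural_c = channels(MISLEADING, "rectangle_perimeter")

    # The formal channel is identical across all three variants.
    print(formal_a == formal_b == formal_c)
    # The natural language channel differs, and in the third case misleads.
    print(natural_a, natural_b, natural_c)
```

An analysis that looks only at the formal channel treats all three variants as equivalent, while one that looks only at the names has no way to check them against behavior; detecting the mismatch in the third variant requires reading both channels together.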
Tue 7 Jul (displayed time zone: UTC, Coordinated Universal Time)
15:00 - 16:00 | A3-Code Summarization (Technical Papers / New Ideas and Emerging Results) at Silla | Chair(s): Shaohua Wang (New Jersey Institute of Technology, USA)
Time | Length | Title | Track | Authors
15:00 | 12m talk | Posit: Simultaneously Tagging Natural and Programming Languages | Technical Papers | Profir-Petru Pârțachi (University College London), Santanu Dash (University College London, UK), Christoph Treude (The University of Adelaide), Earl T. Barr (University College London, UK)
15:12 | 12m talk | CPC: Automatically Classifying and Propagating Natural Language Comments via Program Analysis | Technical Papers | Juan Zhai (Rutgers University), Xiangzhe Xu (Nanjing University), Yu Shi (Purdue University), Guanhong Tao (Purdue University), Minxue Pan (Nanjing University), Shiqing Ma (Rutgers University), Lei Xu (National Key Laboratory for Novel Software Technology, Nanjing University), Weifeng Zhang (Nanjing University of Posts and Telecommunications), Lin Tan (Purdue University), Xiangyu Zhang (Purdue University)
15:24 | 12m talk | Suggesting Natural Method Names to Check Name Consistencies | Technical Papers | Son Nguyen (The University of Texas at Dallas), Hung Phan, Trinh Le (University of Engineering and Technology), Tien N. Nguyen (University of Texas at Dallas)
15:36 | 6m talk | Where should I comment my code? A dataset and model for predicting locations that need comments | New Ideas and Emerging Results (NIER) | Annie Louis (University of Edinburgh), Santanu Dash (University College London, UK), Earl T. Barr (University College London, UK), Michael D. Ernst (University of Washington, USA), Charles Sutton (Google Research)
15:42 | 12m talk | Retrieval-based Neural Source Code Summarization | Technical Papers | Jian Zhang (Beihang University), Xu Wang (Beihang University), Hongyu Zhang (University of Newcastle, Australia), Hailong Sun (Beihang University), Xudong Liu (Beihang University)
15:54 | 6m talk | The Dual Channel Hypothesis | New Ideas and Emerging Results (NIER) | Casey Casalnuovo (University of California at Davis, USA), Earl T. Barr (University College London, UK), Santanu Dash (University College London, UK), Prem Devanbu (University of California), Emily Morgan (University of California, Davis)