ICSE 2020
Wed 24 June - Thu 16 July 2020
Thu 9 Jul 2020 17:10 - 18:00 at Poster Special Room - A310-Posters

Statistical language modeling techniques have successfully been applied to large source code corpora, yielding a variety of new software development tools, such as tools for code suggestion, improving readability, and API migration. A major issue with these techniques is that code introduces new vocabulary at a far higher rate than natural language, as new identifier names proliferate. Both large vocabularies and out-of-vocabulary issues severely affect Neural Language Models (NLMs) of source code, degrading their performance and rendering them unable to scale.

In this paper, we address this issue by: 1) studying how various modelling choices impact the resulting vocabulary on a large-scale corpus of 13,362 projects; 2) presenting an open-vocabulary source code NLM that can scale to such a corpus, 100 times larger than in previous work, and outperforms the state of the art. To our knowledge, this is the largest NLM for code that has been reported.
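
As a rough illustration of how a modelling choice can change the vocabulary, the Python sketch below compares the out-of-vocabulary rate of a toy token stream when identifiers are kept whole and when they are split on underscores and camelCase boundaries. The splitting heuristic, the helper names (`split_identifier`, `oov_rate`), and the toy token lists are assumptions made for this example; they are not the paper's method or data.

```python
import re

def split_identifier(token):
    """Split an identifier on underscores and camelCase boundaries.

    Illustrative heuristic only; tokens that are not identifiers are kept whole.
    """
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", token):
        return [token]
    parts = []
    for piece in token.split("_"):
        # Acronym runs, capitalised or lowercase words, and digit runs.
        parts.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", piece))
    return [p.lower() for p in parts] or [token]

def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occur in the training tokens."""
    vocab = set(train_tokens)
    unseen = sum(1 for t in test_tokens if t not in vocab)
    return unseen / len(test_tokens)

# Hypothetical toy data standing in for a training and a test corpus.
train = ["getUserName", "setUserId", "parse_http_response", "user", "name"]
test = ["getUserId", "setUserName", "parseResponse", "fetchUserName"]

# Modelling choice 1: every identifier is one vocabulary entry.
whole_rate = oov_rate(train, test)

# Modelling choice 2: identifiers are split into subtokens first.
train_sub = [s for t in train for s in split_identifier(t)]
test_sub = [s for t in test for s in split_identifier(t)]
sub_rate = oov_rate(train_sub, test_sub)

print(f"whole-token OOV rate: {whole_rate:.3f}")
print(f"subtoken OOV rate:    {sub_rate:.3f}")
```

On this toy data every whole test identifier is unseen, while after splitting only `fetch` is. Heuristic splitting therefore shrinks the problem without removing it, which is the gap an open-vocabulary model is meant to close.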

Thu 9 Jul
Times are displayed in time zone: (UTC) Coordinated Universal Time

17:10 - 18:00: ICSE 2020 Posters - A310-Posters at Poster Special Room
17:10 - 18:00
Poster
Daniela Girardi (University of Bari), Nicole Novielli (University of Bari), Davide Fucci (Blekinge Institute of Technology), Filippo Lanubile (University of Bari)
17:10 - 18:00
Poster
Simos Gerasimou (University of York, UK), Hasan Ferit Eniser (MPI-SWS), Alper Sen (Bogazici University, Turkey), Alper Çakan (Bogazici University, Turkey)
17:10 - 18:00
Poster
Rafael-Michael Karampatsis (The University of Edinburgh), Hlib Babii (Free University of Bozen-Bolzano), Romain Robbes, Charles Sutton (Google Research), Andrea Janes (Free University of Bozen-Bolzano)
17:10 - 18:00
Poster
Markus Borg (RISE Research Institutes of Sweden AB)
17:10 - 18:00
Poster
Leonardo Alexandre Ferreira Leite (University of São Paulo), Fabio Kon (University of São Paulo), Gustavo Pinto (UFPA), Paulo Meirelles (Federal University of São Paulo)
17:10 - 18:00
Poster
Vartika Agrahari, Sridhar Chimalakonda (Indian Institute of Technology Tirupati)