ICSE 2020
Wed 24 June - Thu 16 July 2020
Wed 8 Jul 2020 01:13 - 01:21 at Goguryeo - P11-Natural Language Artifacts Chair(s): Jane Cleland-Huang

Software localization is the process of adapting a software product to the linguistic, cultural, and technical requirements of a target market. It allows software companies to access foreign markets that would otherwise be difficult to penetrate. Many studies have addressed locating need-to-translate strings in software and adapting the UI layout after the text is translated into the new language. However, no work has been done on the most important and time-consuming step of the software localization process, i.e., the translation of software text. Due to some unique characteristics of software text (for example, application-specific meanings, context-sensitive translation, and domain-specific rare words), general machine translation tools such as Google Translate cannot properly address the linguistic and technical nuances of translating software text for software localization. In this paper, we propose a neural-network-based translation model specifically designed and trained for mobile application text translation. We collect large-scale human-translated bilingual sentence pairs from different Android applications crawled from the Google Play store. We customize the original RNN encoder-decoder neural machine translation model by adding categorical information to address the domain-specific rare word problem, which is a common phenomenon in software text. We evaluate our approach on translating the text of test Android applications using both BLEU score and exact match rate. The results show that our method outperforms the general machine translation tool Google Translate and generates more acceptable translations for software localization, with less need for human revision. Our approach is language independent, and we show its generality between English and the other five official languages of the United Nations (UN).
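The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the abstract does not state the BLEU configuration, so the sketch assumes corpus-level BLEU with up to 4-grams, a single reference per translation, whitespace tokenization, and no smoothing. The function names `corpus_bleu` and `exact_match_rate` and the sample strings are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU, one reference per hypothesis, no smoothing (assumed setup)."""
    matched = [0] * max_n
    total = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        # Assumption: whitespace tokenization; unsuitable for unsegmented languages.
        ref_tok, hyp_tok = ref.split(), hyp.split()
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref_tok, n)
            hyp_counts = ngrams(hyp_tok, n)
            # Clipped counts: credit each n-gram at most as often as the reference has it.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def exact_match_rate(references, hypotheses):
    """Fraction of translations identical to their reference after trimming whitespace."""
    if not references:
        return 0.0
    hits = sum(r.strip() == h.strip() for r, h in zip(references, hypotheses))
    return hits / len(references)


if __name__ == "__main__":
    # Hypothetical UI strings, for illustration only.
    refs = ["Settings", "Sign in"]
    hyps = ["Settings", "Log in"]
    print(exact_match_rate(refs, hyps))
```

Exact match rate is the stricter of the two: a translation counts only if it needs no human revision at all, whereas BLEU gives partial credit for overlapping n-grams.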

Wed 8 Jul

Displayed time zone: (UTC) Coordinated Universal Time

01:05 - 02:05
P11-Natural Language Artifacts: Journal First / Technical Papers at Goguryeo
Chair(s): Jane Cleland-Huang University of Notre Dame
01:05
8m
Talk
Neural Network Based Classification of Self-admitted Technical Debt: From Performance to Explainability and Deployability (J1)
Journal First
Xiaoxue Ren Zhejiang University, Zhenchang Xing Australian National University, Xin Xia Monash University, David Lo Singapore Management University, Xinyu Wang Zhejiang University, John Grundy Monash University
01:13
8m
Talk
Domain-specific Machine Translation with Recurrent Neural Network for Software Localization (J1)
Journal First
Xu Wang College of Engineering & Computer Science, Australian National University, Canberra, Australia, Chunyang Chen Monash University, Zhenchang Xing Australian National University
01:21
12m
Talk
Mitigating Turnover with Code Review Recommendation: Balancing Expertise, Workload, and Knowledge Distribution (Technical, Artifact Available)
Technical Papers
Ehsan Mirsaeedi Concordia University, Peter Rigby Concordia University, Montreal, Canada