Why Reinventing the Wheels? An Empirical Study on Library Reuse and Re-implementation (ICSE 2020 - Journal First)

Wed 24 June - Thu 16 July 2020

Who

Bowen Xu, Le An, Ferdian Thung, Foutse Khomh, David Lo

Track

ICSE 2020 Journal First

When

Fri 10 Jul 2020 07:36 - 07:44 at Silla - I21-Version Control and Programming Chair(s): Sunghun Kim

Abstract

With the rapid growth of open source software (OSS), library reuse has become increasingly popular, since a large number of third-party libraries are available to download and reuse. A deeper understanding of why developers reuse a library (i.e., replace self-implemented code with an external library) or re-implement a library (i.e., replace an imported external library with self-implemented code) could help researchers better understand the factors that developers are concerned with when reusing code. This understanding can then be used to improve existing libraries and API recommendation tools for researchers and practitioners, by using the developers' concerns identified in this study as design criteria.
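The two definitions above can be illustrated with a small sketch. It is not the heuristic used in the study, which the abstract does not specify; it is a hypothetical, deliberately simple rule that looks only at Java import lines in a commit diff. The function name `classify_commit`, the regular expression, and the list of platform package prefixes are all assumptions made for illustration.

```python
import re

# Matches a Java import in a unified-diff line, e.g. "+import org.foo.Bar;".
# Group 1 is the diff sign, group 2 the imported name (without a trailing ".*").
IMPORT_RE = re.compile(r"^([+-])\s*import\s+(?:static\s+)?([\w.]+)(?:\.\*)?\s*;")

# Assumption: these prefixes are treated as platform code, not third-party libraries.
PLATFORM_PREFIXES = ("java.", "javax.", "android.")


def classify_commit(diff_lines):
    """Label a commit as a candidate for library reuse or re-implementation.

    Hypothetical rule: a commit that only adds third-party imports is a
    candidate for reuse; one that only removes them is a candidate for
    re-implementation. Anything else is left for manual inspection.
    """
    added, removed = set(), set()
    for line in diff_lines:
        if line.startswith(("+++", "---")):  # skip diff file headers
            continue
        match = IMPORT_RE.match(line)
        if match is None:
            continue
        sign, name = match.groups()
        if name.startswith(PLATFORM_PREFIXES):
            continue
        (added if sign == "+" else removed).add(name)

    # Imports that appear on both sides were only moved, so ignore them.
    new_imports = added - removed
    dropped_imports = removed - added
    if new_imports and not dropped_imports:
        return "candidate reuse"
    if dropped_imports and not new_imports:
        return "candidate re-implementation"
    return "unclear"
```

A rule this coarse would produce many false positives, which is consistent with the manual validation step the authors describe for each automatically found instance.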

In this work, we investigated the reasons behind library reuse and re-implementation. To achieve this goal, we first crawled data from two popular sources, F-Droid and GitHub. Then, potential instances of library reuse and re-implementation were found automatically based on certain heuristics. Next, we manually checked whether each instance was valid. For library re-implementation, we obtained 82 instances distributed across 75 repositories. We then conducted two types of surveys on library reuse and re-implementation: an individual survey sent to the developers of the validated instances, and an open survey. For the library reuse individual survey, we received 36 responses from 139 contacted developers. For the re-implementation individual survey, we received 13 responses from 71 contacted developers. In addition, we received 56 responses to the open survey. Finally, we performed qualitative and quantitative analysis on the survey responses and on the commit logs of the validated instances.
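The response counts above translate into response rates with simple arithmetic. The short sketch below uses only the figures stated in the abstract; the dictionary layout and the function name `response_rate` are illustrative choices, and the open survey is left out because the abstract does not say how many people it reached.

```python
# (responses received, developers contacted), as reported in the abstract
individual_surveys = {
    "library reuse": (36, 139),
    "re-implementation": (13, 71),
}


def response_rate(responses, contacted):
    """Return the response rate as a percentage."""
    return 100 * responses / contacted


for name, (responses, contacted) in individual_surveys.items():
    rate = response_rate(responses, contacted)
    print(f"{name}: {responses}/{contacted} = {rate:.1f}%")
# library reuse: 36/139 = 25.9%
# re-implementation: 13/71 = 18.3%

# Combined rate across both individual surveys
total_responses = sum(r for r, _ in individual_surveys.values())
total_contacted = sum(c for _, c in individual_surveys.values())
print(f"combined: {response_rate(total_responses, total_contacted):.1f}%")
# combined: 23.3%
```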

The results suggest that library reuse occurs mainly because developers were initially unaware of the library or the library had not been introduced. Re-implementation occurs mainly because the used library method is only a small part of the library, the library dependencies are too complicated, or the library method is deprecated. Finally, based on all findings obtained from analyzing the surveys and commit messages, we provided a few suggestions to improve the current library recommendation systems: tailored recommendation according to users’ preferences, detection of external code that is similar to a part of the users’ code (to avoid duplication or re-implementation), grouping similar recommendations for developers to compare and select the one they prefer, and disrecommendation of poor-quality libraries.

Bowen Xu

Singapore Management University

Le An

Polytechnique Montreal

Canada

Ferdian Thung

Singapore Management University

Foutse Khomh

Polytechnique Montréal

Canada

David Lo

Singapore Management University

Singapore

Session Program

Fri 10 Jul
Displayed time zone: (UTC) Coordinated Universal Time

07:00 - 08:00	I21-Version Control and Programming (Technical Papers / Journal First / Software Engineering in Practice) at Silla. Chair(s): Sunghun Kim (Hong Kong University of Science and Technology)

07:00 12m Talk	Towards Understanding and Fixing Upstream Merge Induced Conflicts in Divergent Forks: An Industrial Case Study [Software Engineering in Practice]. Chungha Sung (University of Southern California), Shuvendu K. Lahiri (Microsoft Research), Mike Kaufman (Microsoft Corporation), Pallavi Choudhury (Microsoft Corporation), Chao Wang (USC)
07:12 8m Talk	Version Control Systems: An Information Foraging Perspective [Journal First]. Sruti Srinivasa Ragavan (Microsoft Research; School of EECS, Oregon State University), Mihai Codoban (Microsoft), David Piorkowski (IBM Research AI), Danny Dig (University of Colorado, Boulder), Margaret Burnett (Oregon State University)
07:20 8m Talk	How different are different diff algorithms in Git? [Journal First]. Yusuf Sulistyo Nugroho (Nara Institute of Science and Technology), Hideaki Hata (Nara Institute of Science and Technology), Kenichi Matsumoto (Nara Institute of Science and Technology)
07:28 8m Talk	Characterizing the Usage, Evolution and Impact of Java Annotations in Practice [Journal First]. Zhongxing Yu (KTH Royal Institute of Technology), Chenggang Bai (Beihang University), Lionel Seinturier, Martin Monperrus (KTH Royal Institute of Technology)
07:36 8m Talk	Why Reinventing the Wheels? An Empirical Study on Library Reuse and Re-implementation [Journal First]. Bowen Xu (Singapore Management University), Le An (Polytechnique Montreal), Ferdian Thung (Singapore Management University), Foutse Khomh (Polytechnique Montréal), David Lo (Singapore Management University)
07:44 12m Talk	HeteroRefactor: Refactoring for Heterogeneous Computing with FPGA [Technical Papers]. Aishwarya Sivaraman (University of California, Los Angeles), Jason Lau (University of California, Los Angeles), Qian Zhang (University of California, Los Angeles), Muhammad Ali Gulzar (University of California, Los Angeles), Jason Cong (UCLA), Miryung Kim (University of California, Los Angeles)