What Do Programmers Discuss about Blockchain? A Case Study on the Use of Balanced LDA and the Reference Architecture of a Domain to Capture Online Discussions about Blockchain Platforms across Stack Exchange CommunitiesJ1
Blockchain-related discussions have become increasingly prevalent in programming Q&A websites, such as Stack Overflow and other Stack Exchange communities. Analyzing and understanding those discussions could provide insights about the topics of interest to practitioners, and help the software development and research communities better understand the needs and challenges facing developers as they work in this new domain. Prior studies propose the use of LDA to study the Stack Exchange discussions. However, a simplistic use of LDA would capture the topics in discussions blindly without keeping in mind the variety of the dataset and domain-specific concepts. Specifically, LDA is biased towards larger sized corpora; and LDA-derived topics are not linked to higher level domain-specific concepts. We propose an approach that combines balanced LDA (which ensures that the topics are balanced across a domain) with the reference architecture of a domain to capture and compare the popularity and impact of discussion topics across the Stack Exchange communities. Popularity measures the distribution of interest in discussions, and impact gauges the trend of popularity over time. We made a number of interesting observations, including: (1) Bitcoin, Ethereum, Hyperledger Fabric and Corda are the four most commonly-discussed blockchain platforms on the Stack Exchange communities. (2) A broad range of topics are discussed across the various platforms of distinct layers in our derived reference architecture. (3) The Application layer topics exhibit the highest popularity (33.2%) and fastest growth in topic impact since November 2015. (4) The Application, API, Consensus and Network layer topics are discussed across the studied blockchain platforms, but exhibit different distributions in popularity. (5) The impact of architectural layer topics exhibits an upward trend, but is growing at different speeds across the studied blockchain platforms. The breakdown of the topic impact across the architectural layers is relatively stable over time except for the Hyperledger Fabric platform. Based on our findings, we highlighted future directions and provided recommendations for practitioners and researchers.