Jiahao Deng
- BSc (Zhejiang University, 2020)
Topic
Library Usage Analysis in the C++ Codebase of Fedora Linux 37
Department of Electrical and Computer Engineering
Date & location
- Tuesday, October 29, 2024
- 9:00 A.M.
- Engineering Office Wing, Room 230
Examining Committee
Supervisory Committee
- Dr. Michael D. Adams, Department of Electrical and Computer Engineering, UVic (Supervisor)
- Dr. Stephen Neville, Department of Electrical and Computer Engineering, UVic (Member)
External Examiner
- Dr. Ibrahim Numanagic, Department of Computer Science, UVic
Chair of Oral Examination
- Dr. Steve Perlman, Department of Biology, UVic
Abstract
C++ source code analysis is conducted at scale. A framework is proposed for analyzing the C++ codebase of operating systems that employ the dnf package manager, such as Fedora Linux and Red Hat Enterprise Linux. The framework can run an arbitrary static analysis tool over software packages that contain C++ code from compatible operating systems. In order to evaluate the effectiveness of the framework and to better understand how the C++ language is used in practice, a C++ analysis tool is developed to study library usage with a fine level of granularity, considering instances of uses of types, type aliases, member/non-member functions, variables, and enumerators.
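To make the categories of library usage concrete, the following is a minimal illustrative sketch and is not taken from the thesis. The function count_matches is hypothetical, and the comments label which kind of standard library entity each use would plausibly fall under. How the actual tool classifies each use (for example, whether a static data member such as std::string::npos counts as a variable) is an assumption here.

```cpp
#include <algorithm>     // std::count_if
#include <cstddef>       // std::size_t
#include <string>        // std::string
#include <system_error>  // std::error_code, std::errc, std::make_error_code
#include <vector>        // std::vector

// Hypothetical function, written only to exercise each usage category.
// Counts the words that contain `needle`; an empty needle is reported as an error.
std::size_t count_matches(const std::vector<std::string>& words,  // type: std::vector; type alias: std::string
                          const std::string& needle,
                          std::error_code& ec) {                  // type: std::error_code
    ec.clear();                                                   // member function: clear
    if (needle.empty()) {                                         // member function: empty
        // non-member function: std::make_error_code
        // enumerator: std::errc::invalid_argument
        ec = std::make_error_code(std::errc::invalid_argument);
        return 0;
    }
    // non-member function: std::count_if; member functions: begin, end, find
    // variable (static data member): std::string::npos
    auto n = std::count_if(words.begin(), words.end(),
                           [&needle](const std::string& w) {
                               return w.find(needle) != std::string::npos;
                           });
    return static_cast<std::size_t>(n);                           // type alias: std::size_t
}
```

Even a function this small touches a dozen distinct standard library entities, which suggests why counting uses at this level of granularity gives a more detailed picture than counting included headers alone.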
The framework, combined with the C++ library usage analysis tool, is used to analyze 2,379 software packages from the codebase of Fedora Linux 37. A total of 398,065,762 lines of C++ code is investigated, a scale not achieved by previous C++ research. Based on the Clang compiler front-end libraries, our library usage analysis tool addresses C++ parsing accuracy issues found in many other studies. Consequently, the tool extracts a reliable collection of library usage instances from C++ software.
Numerous observations are made regarding various aspects of library usage that can facilitate improved teaching of C++, aid in the refinement of C++ libraries, and help guide the future evolution of the C++ standard. For example, our analysis reveals that C++ programmers rarely use some C++ standard library algorithms designed for specialized purposes or combined operations. These algorithms often appear in less than 1% of all C++ software packages investigated. The low adoption rate suggests possible improvements by deprecating such algorithms to simplify the standard library interface. Such observations summarize current trends in C++ library usage and provide recommendations for improving the C++ language and its libraries.