Project Stage 3 (Testing & Reflection)

Introduction

In this post, I test and validate the prune-clones pass I built last time. Along the way, I explain how I modified passes.cc and passes.def, and why I created a separate prune_clones.h because prune_clones.cc alone was not enough.

passes.cc and passes.def modification process

1. Modify passes.cc

The passes.cc file defines and implements several optimization passes for GCC, so it is a natural place to add a prune-clones pass: define and implement the prune_clones-related classes and functions, and clearly define what role this pass plays in the compiler's overall optimization process. For example, passes.cc needs to declare a prune_clones class and add code to register it with pass_manager. The main task of passes.cc is to initialize and manage each optimization pass.

// passes.cc
#include "gcc.h"
...
#include "prune_clones.h"
// function generating prune_clones pass
gimple_opt_pass *
make_pass_prune_clones (gcc::context *ctxt)
{
    retur...

Exploring FMV (Function Multi-Versioning) in Compiler Optimization

In this post, I'd like to look at FMV (Function Multi-Versioning), an optimization technique in compiler design that improves software performance by generating specialized versions of functions. This week, following a course on compiler optimizations, I became intrigued by how FMV optimizes code execution across different hardware architectures.

Introduction to FMV

FMV, or Function Multi-Versioning, is a compiler optimization technique that improves performance by generating multiple versions of a function at compile time and selecting among them at run time. Each version is tailored to specific hardware capabilities, such as CPU features and instruction sets.

Evolution and Adoption

Initially developed to optimize compiler-generated code, FMV has evolved into a key feature of modern compiler design. It addresses the challenge of adapting software performance to diverse hardware environments.

Practical Application in GCC

For instance, in GCC (GNU Compiler Collection), FMV is utilized to generate optimized...
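To make the select-at-run-time idea concrete, here is a minimal sketch that imitates FMV by hand in portable C++. It is not how GCC implements the feature (GCC can produce the versions and the resolver for you, for example through the `target_clones` function attribute). The names `cpu_has_wide_vectors`, `sum_default`, `sum_wide`, and `resolve_sum` are hypothetical, and the feature probe is hard-coded so the example builds on any machine.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical feature probe: a real build would query the CPU.
// It is hard-coded here so the sketch stays portable.
static bool cpu_has_wide_vectors() { return false; }

// Baseline version: works on any CPU.
static long sum_default(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Stand-in for a specialized clone: handles four elements per step,
// the way a version built for a wider vector unit might.
static long sum_wide(const int* data, std::size_t n) {
    long lanes[4] = {0, 0, 0, 0};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (std::size_t k = 0; k < 4; ++k) lanes[k] += data[i + k];
    long total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; ++i) total += data[i];  // leftover tail
    return total;
}

using sum_fn = long (*)(const int*, std::size_t);

// Resolver: picks the version that matches the hardware.
static sum_fn resolve_sum() {
    return cpu_has_wide_vectors() ? sum_wide : sum_default;
}

// Callers use one name; the resolver runs once, on the first call.
long sum(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    static const sum_fn chosen = resolve_sum();
    return chosen(data, n);
}
```

Every version must return the same result for the same input; only the speed is allowed to differ. That property is also what makes duplicate versions safe to remove.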

Understanding SIMD and SVE: Harnessing Parallelism for Enhanced Performance

In this post, I would like to share my research on SIMD and SVE. In modern computing, achieving optimal performance often relies on parallel processing techniques like SIMD (Single Instruction, Multiple Data) and SVE (Scalable Vector Extension), Arm's SIMD instruction set with scalable vector lengths.

Introduction

SIMD executes a single instruction across multiple data elements simultaneously, while SVE offers scalable vector lengths for dynamic adaptation to computational needs.

Evolution and Foundations

Originally designed to accelerate mathematical operations, SIMD and SVE leverage specialized instruction sets to minimize instruction overhead and maximize data throughput.

Real-World Impact

SIMD and SVE find applications in diverse fields such as image processing, data analytics, and machine learning, delivering substantial performance gains compared to traditional scalar processing.

Challenges and Future Directions

Despite their benefits, programming for SIMD and SVE requires expertise due to architectural complexities and compa...
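The scalable-vector-length idea can be modeled in plain C++ without any intrinsics. The sketch below is only an illustration: `add_vla` and its `vl` parameter are hypothetical, with `vl` standing in for the hardware vector length that real SVE code discovers at run time, and the `active` flag playing the role of a predicate that masks off lanes past the end of the data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds b into a, `vl` elements per step. `vl` is a hypothetical stand-in
// for the hardware vector length; the loop never assumes a fixed width.
void add_vla(std::vector<float>& a, const std::vector<float>& b, std::size_t vl) {
    assert(a.size() == b.size() && vl > 0);
    const std::size_t n = a.size();
    for (std::size_t base = 0; base < n; base += vl) {
        // One "vector" step: lane k is active only while base + k < n,
        // so the final partial step needs no separate scalar tail loop.
        for (std::size_t k = 0; k < vl; ++k) {
            const bool active = base + k < n;
            if (active) a[base + k] += b[base + k];
        }
    }
}
```

Calling `add_vla` with `vl` set to 4 or to 16 gives the same result on the same input, which is the point of writing the loop in a vector-length-agnostic style: one binary can run correctly on hardware with different vector widths.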

Project Stage 2 - (Implementation)

In this blog post, I'll walk through the implementation of a new pass named "prune_clones" in GCC. This pass traverses the call graph and prunes duplicate clones, enhancing the compiler's optimization capabilities.

Introduction

In previous posts, I outlined a plan to integrate a new pass into GCC for optimizing clone functions. Here, I focus on the concrete implementation details, showing how this pass is structured and integrated into GCC's build process.

1. Create the New Pass File

First, create the new source file 'prune-clones.cc' in the GCC source directory.

[sshin36@aarch64-001 gcc]$ touch prune-clones.cc

2. Write the Code for the New Pass

Define the Pass Data Structure: create the pass data structure to define the properties of the new pass.

#include "cgraph.h"
#include "tree-pass.h"
// Define the pass data structure
const pass_data pass_data_prune_clones = {
    GIMPLE_PASS,         // type
    ...
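Independently of the GCC plumbing, the pruning idea itself can be sketched as a small standalone model. This is not the pass from the post and uses no GCC types: `clone_node`, `body_hash`, and `prune_duplicates` are hypothetical, and the sketch assumes that two clones of the same function count as duplicates when some fingerprint of their bodies matches.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call graph node; not a GCC type.
struct clone_node {
    std::string base;       // function the clone was made from, e.g. "foo"
    std::string name;       // clone name, e.g. "foo.default"
    std::string body_hash;  // assumed fingerprint of the function body
};

// Walks the nodes in order and keeps the first clone seen for each
// (base function, body fingerprint) pair; later duplicates are dropped.
std::vector<clone_node> prune_duplicates(const std::vector<clone_node>& nodes) {
    std::set<std::pair<std::string, std::string>> seen;
    std::vector<clone_node> kept;
    for (const clone_node& node : nodes) {
        // insert() reports whether the key was new.
        if (seen.insert({node.base, node.body_hash}).second)
            kept.push_back(node);
    }
    return kept;
}
```

Keying on the base function as well as the fingerprint keeps clones of different functions apart even if their bodies happen to match.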