Advisor: Allen Malony (University of Oregon)
Abstract: My research focuses on developing a pipeline optimization infrastructure that automates the design and code generation of neural networks using high-performance computing. The work has three objectives: unifying neural architecture search and compilation, creating a knowledge base to inform the search process, and exploring various search methods. The search space is complex, and deciding which parameters factor into a model's overall accuracy is a non-trivial task. Once a model is trained, the next step compiles it, mapping it to the backend of a target architecture such as GPUs, embedded mobile devices, or FPGAs. The compilation phase also involves choices, including compiler flags, code transformations, and variable-length floating-point precision. Various search methods are investigated, each drawing on the knowledge base to explore the space efficiently. Our previous work reduced the search space for GPU code generation of various domain kernels by 92%, and this work investigates whether the same approach can be applied to neural architecture search and code generation.
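To make the pipeline concrete, the following is a minimal, hypothetical sketch (not the actual infrastructure) of how a joint search over architecture and compilation parameters might consult a knowledge base of past evaluations to prune candidates before expensive training and compilation. All names, parameter ranges, and the pruning heuristic are illustrative assumptions.

    import itertools
    import random

    # Hypothetical joint search space: architecture choices paired with
    # compilation choices (precision, loop unrolling), as described above.
    ARCH_SPACE = {"layers": [4, 8, 16], "width": [64, 128, 256]}
    COMPILE_SPACE = {"precision": ["fp32", "fp16"], "unroll": [1, 2, 4]}

    def candidates():
        # Enumerate every combination of architecture and compilation options.
        keys = list(ARCH_SPACE) + list(COMPILE_SPACE)
        values = list(ARCH_SPACE.values()) + list(COMPILE_SPACE.values())
        for combo in itertools.product(*values):
            yield dict(zip(keys, combo))

    def promising(config, knowledge_base, threshold=0.5):
        # Prune configurations that resemble previously evaluated poor ones;
        # "resemble" here is a toy heuristic (same layer count and precision).
        for past, score in knowledge_base:
            if (past["layers"] == config["layers"]
                    and past["precision"] == config["precision"]
                    and score < threshold):
                return False
        return True

    def evaluate(config):
        # Placeholder for the real train / compile / benchmark step;
        # returns a mock score in [0, 1).
        return random.random()

    knowledge_base = []
    for config in candidates():
        if not promising(config, knowledge_base):
            continue  # skipped without paying the evaluation cost
        score = evaluate(config)
        knowledge_base.append((config, score))

    best = max(knowledge_base, key=lambda item: item[1])
    print("best configuration:", best[0], "score:", round(best[1], 3))

In this sketch the knowledge base is simply the list of (configuration, score) pairs accumulated so far; the actual infrastructure would persist richer performance data and use more principled similarity and search methods than the exhaustive enumeration shown here.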
Thesis Canvas: pdf