Introduction
The Deep Compiler initiative aims to automate compiler construction with as little specification from compiler developers as possible. By doing so, we aim to minimize the burden of developing and maintaining compilers that generate code with state-of-the-art performance.
With the emergence of diverse instruction sets and domain-specific architectures, manually designing optimization algorithms and heuristics for each specific execution environment is burdensome and challenging. For example, even for an execution environment as ubiquitous as Intel x86, modern compilers consider only a fraction of the available instructions when generating code. Moreover, compiler heuristics, and even the underlying optimization algorithms themselves, that target older Intel microarchitectures do not necessarily generalize to newer microarchitectures, and may require re-tuning or even complete rewrites.
We alleviate these problems with machine-learning-driven automation. Our work is fundamentally different from the parameter-tuning approaches commonly used by the compiler community to tune the heuristics of hand-coded algorithms. Instead, we propose machine-learning-based techniques that learn the entire optimization algorithm end to end. This approach has several advantages over traditional hand-coded systems: by operating on the raw input (rather than a featurized representation) and measuring performance from raw output signals (rather than a hand-coded cost model), machine learning can be more accurate than hand-written models while requiring minimal effort to port to new architectures. Learning end-to-end policies, rather than merely tuning hand-coded optimization algorithms, also lets the learned system consider transformations outside those reachable by traditional hand-coded algorithms, as the toy sketch below illustrates.
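To make that contrast concrete, consider the following toy sketch. Everything here is hypothetical: the program representation, the measure function, and the heuristic are stand-ins we invented for exposition, not part of our systems. A hand-coded scheduling heuristic exposes a single tunable knob, while a direct search over the full transformation space, standing in for a learned end-to-end policy, can reach schedules that no setting of the knob ever produces.

```python
import random

# Toy setup (hypothetical): a "program" is a list of op costs, a
# "schedule" is an ordering of them, and "measuring" a schedule is a
# stand-in for timing it on real hardware.

def measure(schedule):
    # Arbitrary stand-in objective: penalize placing ops of very
    # different cost next to each other.
    return sum(abs(a - b) for a, b in zip(schedule, schedule[1:]))

def heuristic_schedule(program, threshold):
    # A fixed hand-coded algorithm with one tunable knob: schedule ops
    # costing at least `threshold` first, preserving order otherwise.
    hot = [c for c in program if c >= threshold]
    cold = [c for c in program if c < threshold]
    return hot + cold

def tune_parameter(program, thresholds):
    # Traditional autotuning: search only over the knob; the
    # underlying algorithm never changes.
    return min((heuristic_schedule(program, t) for t in thresholds),
               key=measure)

def end_to_end_search(program, trials=2000):
    # Direct search over the full transformation space (standing in
    # for a learned policy): any permutation is reachable, including
    # schedules the heuristic can never emit.
    best = list(program)
    for _ in range(trials):
        candidate = random.sample(program, len(program))
        if measure(candidate) < measure(best):
            best = candidate
    return best

program = [7, 1, 5, 2, 9, 3]
print(measure(tune_parameter(program, thresholds=range(1, 10))))  # tuned heuristic
print(measure(end_to_end_search(program)))                        # full-space search
```

On this toy, no threshold lets the heuristic interleave hot and cold ops, so the tuned heuristic plateaus at a schedule the full-space search easily beats; that structural limitation is what end-to-end learning sidesteps in a real compiler.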
Projects
Compilers make optimization decisions using three interrelated components: the optimization strategy, the transformation space, and the cost model. The optimization strategy is the core algorithm the compiler uses to decide how to optimize: its objective is to select the most profitable candidates from a large transformation space of semantically equivalent program transformations, and it typically consults a cost model to choose between mutually exclusive options. The sketch below illustrates how the three fit together.
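As a rough illustration (a minimal sketch; the interfaces and names are ours, invented for exposition, and do not come from any particular compiler), the three components and their interaction might look like this:

```python
from typing import Iterable, Protocol

class Transform(Protocol):
    def apply(self, program: "Program") -> "Program":
        """Return a semantically equivalent, rewritten program."""

class TransformationSpace(Protocol):
    def candidates(self, program: "Program") -> Iterable[Transform]:
        """Enumerate legal transformations of `program`."""

class CostModel(Protocol):
    def estimate(self, program: "Program") -> float:
        """Predict the cost (e.g., cycles) of running `program`."""

def greedy_strategy(program, space: TransformationSpace, model: CostModel):
    """One simple optimization strategy: repeatedly apply whichever
    candidate the cost model ranks best, stopping when nothing improves."""
    while True:
        scored = [(model.estimate(t.apply(program)), t)
                  for t in space.candidates(program)]
        if not scored:
            return program
        best_cost, best = min(scored, key=lambda pair: pair[0])
        if best_cost >= model.estimate(program):
            return program
        program = best.apply(program)
```

In a traditional compiler all three pieces are hand-written; each is a candidate for replacement by a learned component, such as a learned cost model or a learned strategy.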

We modernize all three components involved in compiler decision making using data-driven techniques. The projects we have developed to this end are described in the publications below.
Publications
- Compiler Auto-Vectorization with Imitation Learning
  Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe, Michael Carbin
  NeurIPS 2019
  [PDF] [Bibtex]
- BHive: A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models
  Yishen Chen, Ajay Brahmakshatriya, Charith Mendis, Alex Renda, Eric Atkinson, Ondrej Sykora, Saman Amarasinghe, Michael Carbin
  IISWC 2019
  [PDF] [Bibtex] [Dataset] [Code]
- Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks
  Charith Mendis, Alex Renda, Saman Amarasinghe, Michael Carbin
  ICML 2019
  Best Paper Award (ML for Systems @ISCA 2019)
  [PDF] [Bibtex] [Code]