Offline

Building your DSL compiler in Python

Track: Machine Learning: Research & Applications
Type: Talk
Level: advanced
Duration: 30 minutes

Abstract

Many Python libraries, such as JAX or Numba, offer just-in-time (JIT) compilation for general-purpose computations. In certain cases these solutions are not enough, and we need more specialized ones that can also incorporate domain knowledge.

Compilers for domain-specific languages (DSLs) allow us to incorporate additional information during compilation, typically tied to a specific domain such as sparse array computing.

In this talk I will introduce Finch, a language and compiler for sparse and structured multidimensional arrays that specializes its kernels for control flow and data structures. It supports common control flow (loops, conditionals, etc.) and a wide variety of data structures: dense, sparse list, triangular, coordinate, or symmetric.
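To make "specializing a kernel for a data structure" concrete, here is a minimal plain-Python sketch. It is not Finch code: `SparseList`, `dot_dense`, and `dot_sparse_dense` are hypothetical names chosen for illustration, and the storage layout is an assumption. It shows only the general idea that a kernel written for a sparse-list format can skip the entries that are not stored.

```python
from dataclasses import dataclass


@dataclass
class SparseList:
    """Hypothetical sparse-list vector: only nonzero entries are stored."""
    size: int
    idx: list[int]    # sorted positions of the stored entries
    val: list[float]  # values at those positions


def dot_dense(a: list[float], b: list[float]) -> float:
    # Generic kernel: visits every position, including zeros.
    return sum(x * y for x, y in zip(a, b))


def dot_sparse_dense(a: SparseList, b: list[float]) -> float:
    # Specialized kernel: visits only the stored entries of `a`.
    return sum(v * b[i] for i, v in zip(a.idx, a.val))


dense_a = [0.0, 2.0, 0.0, 0.0, 3.0]
sparse_a = SparseList(size=5, idx=[1, 4], val=[2.0, 3.0])
b = [1.0, 10.0, 100.0, 1000.0, 10000.0]

result_dense = dot_dense(dense_a, b)
result_sparse = dot_sparse_dense(sparse_a, b)
```

Both kernels compute the same value, but the second does work proportional to the number of stored entries rather than to the vector length. A compiler with knowledge of the format can generate the second form automatically.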

The ongoing effort to port Finch from its original Julia implementation to pure Python showed us a new way of using Python: as a language for implementing compilers.

I will present some key aspects of the Python language and ecosystem that played a central role in keeping us productive over the last few months of this undertaking:

  • Defining IRs with dataclasses and utilizing Structural Pattern Matching for term rewriting,
  • Using Lark for parsing custom languages into large IR structures,
  • Expressing complex rewrite schemes with rewrite-tools.

The audience will learn our approach to structuring a compiler, together with a few technical decisions made along the way. I hope these insights will be useful for engineers and scientists from closely related fields, and from projects that also involve compiling custom languages, such as probabilistic programming or hardware design.