Technical Report: "Introduction to Compiler Generation Using HACS"

By Two Sigma on March 1, 2016
A guy programming in the dark

Higher-order Attribute Contraction Schemes (HACS), a language for programming compilers, makes it possible to create a full compiler from a single source file.

Higher-order Attribute Contraction Schemes (HACS), a language for programming compilers, makes it possible to create a fully functional compiler from a single source file.

The Technical Report Introduction to Compiler Generation Using HACS, by Two Sigma Labs’ Kristoffer H. Rose, explains how to get HACS up and running, and walks through the code of a simple example with each of the main stages of a compiler as specified in HACS: lexical analysis, syntax analysis, semantic analysis, and code generation.

Many systems for writing programs that manipulate other programs, so-called meta-programming, have emerged over the years. These range from generic specification languages, where the goal is only to declare the semantics of the program manipulation, all the way to tools that support specific aspects of program execution or compiler generation.

One such system was CRSX, developed for industrial use at IBM Research by a team led by Kris Rose. CRSX is a language based on higher-order rewriting combined with higher-order abstract syntax (HOAS), further extended for handling environments natively and using pluggable parsers.

The development of the IBM DataPower XQuery compiler using CRSX proved that the approach can drastically reduce the development time of a compiler (the cited XQuery compiler was estimated to have been developed in a quarter of the traditional development time) as well as resulting in a rather more compact and high-level source program.

However, the CRSX notation, based on combinatory reduction systems, which combines λ calculus and term rewriting systems, has proven to be unwieldy for several reasons—first of all by being quite different from standard notations used in compiler construction reference works. Adding a polymorphic sort system to CRSX following contraction systems changed it to be similar to Inductive Type Systems. This helped, but it did not make the system easy enough to, for example, teach compiler construction.

HACS is an attempt to remedy this situation by providing a front-end for CRSX that allows the use of standard notations and concepts of (formal) programming language descriptions to directly program compilers and other systems for manipulating code. HACS has been successfully used to teach the graduate computer science compiler construction class at New York University.

Download PDF — 452.41 KB

This article is not an endorsement by Two Sigma of the papers discussed, their viewpoints or the companies discussed. The views expressed above reflect those of the authors and are not necessarily the views of Two Sigma Investments, LP or any of its affiliates (collectively, “Two Sigma”). The information presented above is only for informational and educational purposes and is not an offer to sell or the solicitation of an offer to buy any securities or other instruments. Additionally, the above information is not intended to provide, and should not be relied upon for investment, accounting, legal or tax advice. Two Sigma makes no representations, express or implied, regarding the accuracy or completeness of this information, and the reader accepts all risks in relying on the above information for any purpose whatsoever. Click here for other important disclaimers and disclosures.