git rebase failing intermittently while Xcode is running

I’ve had a few issues recently with git rebase failing intermittently when there are clearly no conflicts; if you run git rebase –abort and then try the rebase again it often succeeds.

It appears that this can happen when Xcode is running in the background, as described in this discussion; the problem is that an application can be running git status in the background. My own experience supports this (closing Xcode fixes the problem), so if you’re experiencing a similar problem you may find this helpful.

Loci v1.4 Released

The latest version of the Loci compiler is now available. See the release notes for detailed information about what’s included.

The main feature in this release is a brand new compiler diagnostics system with a rewritten lexer and parser. Here’s an example of errors from the old compiler:

And here’s how it looks with the new diagnostics system:

This release is therefore an enormous improvement in the usability of the compiler and the cause of errors/warnings should be very clear to the user.

GLR vs Recursive Descent Parsing

One of the oldest parts of the Loci compiler was replaced earlier this month: the parser. The switch was from a Bison GLR parser to a hand-written recursive descent parser. Rewriting a major compiler component is a significant time/effort investment (it took around 2-3 weeks of relatively intense development), so I’ll explain why I took the leap.

Original GLR Grammar

LL vs LR

If you don’t know anything about LR parsers, I’ll try to give a summary here. Basically LR parsers construct the parse tree bottom-up, trying to match a sequence of symbols to their associated non-terminal. This contrasts with LL parsers, which start from the non-terminal and try to construct the parse tree top-down. There’s an explanation here: http://stackoverflow.com/a/6824545/142088 . Out of the two, LL parsers are the more intuitive, though LR parsers are more ‘powerful’ (i.e. for every language that there’s an LL parser there’s an LR parser, but not vice versa).

Template Syntax

Of course for a generalised systems programming language like Loci it’s not this simple. Most significantly, Loci inherits C++’s template syntax, which uses ‘<‘ and ‘>’ both to delimit template arguments and for less-than/greater-than. This is combined with Loci’s multi-pass compilation, meaning that we don’t know whether a NAME is a type or a function or a value etc., making code like the following potentially ambiguous:

This code could be parsed as two comparisons, i.e.:

Or it could be a call to a templated function call:

(It’s clearly a templated function call and this is the interpretation chosen by the compiler.) All of this could’ve been made easier for the compiler; Loci could’ve been designed to use a different template syntax, such as (from the D programming language):

However in the design of Loci I deliberately focused on doing the easiest thing for the user, which is to preserve the familiarity of the C++ template (and Java generic) syntax. Loci always tries to follow language conventions where they exist, following the principle of least surprise. I personally also think that the C++ template is clearer, since the triangle brackets are noticeable different from the round brackets.

Switching to GLR

The Loci compiler started off with an LALR parser (LALR is essentially a variant of LR which is less powerful but which produces significantly smaller parsing tables). However in the context of syntax like that shown above, it was upgraded to use Bison’s GLR parsing functionality.

A GLR (Generalised LR) parser is a way of running multiple parsers simultaneously. Most of the time the action for any given input token is unambiguous (i.e. one choice), but in some cases there are multiple choices (e.g. treat ‘<‘ as less-than versus treat it as template). A normal LALR parser wouldn’t allow this (it would complain about shift/reduce conflicts), but GLR works around this problem by splitting the parser into two parsers, and hence exploring both opportunities simultaneously. If one of the parsers hits an error it simply vanishes and the other parser continues. If both parsers produce different parse trees for a given non-terminal this is an ambiguity, which can be handled by the parser implementer (e.g. one can emit an error).

Problems

(You can see the old parser, from the v1.3 release, here.)

While GLR was a good solution to the parsing problem, the parser had many problems including (from most significant to least significant):

  • Poor quality diagnostics – This is related to LL vs LR; the top-down LL parser means the context is unambiguous, so it’s easier to produce good error messages. It’s also related to the parser generator, since it’s difficult to get Bison to produce good errors (and adding extra rules for errors often seems to introduce shift/reduce conflicts).
  • AST must be held in a union – Bison requires that the AST nodes corresponding to non-terminals are kept in a union. This is unfortunate, because C++ cannot call class destructors in a union (since it doesn’t know which destructor to call). Hence all the AST nodes were referred to by heap-allocated pointer, with lots of memory leaks (I wrote the early compiler with too little regard for issues like this).
  • Parser is in one large file – It seems that Bison is designed to have the entire parser in a single file. The Loci GLR parser was around 2500 lines, which is impressively concise given the syntactical complexity of the language. However keeping all this logic in a single file makes it harder to move between different parts of the parser (e.g. type parsing versus value parsing), meaning that you tend to end up in a sea of code, not really knowing where you are. I tend to keep source files below 500 lines in length for this reason.
  • C++ <-> Bison communication is messy – This isn’t a deal breaker, but it does make the code harder to read, particularly when following the control flow of the parser. Code readability is extremely important for long-term maintenance of the compiler and for allowing others (beyond myself) to make sense of the code.
  • Invalid conflicts – LR grammars are always unambiguous, but all unambiguous grammars aren’t LR. While Bison did produce some useful shift/reduce conflicts, some conflicts simply arose from the structure of the grammar. Restructuring the grammar fixes the conflicts, but means the grammar is incomprehensible.
  • Token header file generated at build time – One of the tedious aspects of Bison is that you have to wait until build time to get a header file containing the tokens. This means that the tokens can’t be exposed in a pre-written library API, unlike the rest of the compiler code.

For a while most of these issues were manageable, since the goal was to build a compiler that works. However, now that the compiler has reasonably solid support for the language, the challenge is to make the compiler work well, part of which is to get a good user experience.

With this in mind, I decided that the poor diagnostics were no longer acceptable and proceeded to design a new parser.

New parser

Existing Compilers

Unsurprisingly there’s a lot of prior art when it comes to programming language parsers. Given the syntactic similarities between C++ and Loci, it seemed to make sense to look at the mainstream C++ compilers: Clang and GCC.

Clang

Clang is a relatively new compiler that supports C, C++ and Objective-C and targets LLVM. The design of Clang and LLVM (the two are closely linked) is well known to be excellent, since they thoroughly embrace modularity and well-defined API boundaries. The best example of this is LLVM IR, a representation documented in equisite detail that separates compiler front-ends and back-ends. As you may have guessed, the Loci compiler targets LLVM IR.

Given that it’s newer than most, Clang was designed with understanding gained from the development of older compilers. With this experience in hand, the Clang developers chose to write their C++ parser using a hand-written recursive descent parser. They say:

We are convinced that the right parsing technology for this class of languages is a hand-built recursive-descent parser. Because it is plain C++ code, recursive descent makes it very easy for new developers to understand the code, it easily supports ad-hoc rules and other strange hacks required by C/C++, and makes it straight-forward to implement excellent diagnostics and error recovery.

Clang is one of the fastest C++ compilers, produces excellent diagnostics and is used in library form as libclang in an increasing number of tools to do tasks such as code highlighting, analysis etc. So there’s every reason to believe these claims.

GCC

Unlike Clang, GCC is a relatively old compiler (by software standards) and has therefore travelled down a slightly different path. GCC started off a Bison LALR grammar, however in the early 2000s they switched to a hand-written recursive descent parser. Claimed advantages included:

  • Better error-messages
  • Better error-recovery
  • Simplification of many parts of the compiler that presently do funny things to work around oddities in the parser.
  • Easier to debug than a yacc parser. (Don’t have to understand the detailed works of yacc.)
  • Don’t have hard-to-understand data and control flow dependencies between lexer, parser, and rest of compiler.
  • Likely to be smaller, since you don’t have as many duplications due to contortions to deal with complicated parts fo the grammar.
  • Might be faster. – If the C grammar is re-written as the same time, you can perhaps share much of the code.
  • Can use same grammar is other applications (such as gdb), some of which might need a reentrant grammar, which is difficult with yacc. (A re-entrant grammar was a goal for cppexp.c.)

Building the parser

Looking at the existing compilers and analysing all the solutions available, it became clear that a hand-written recursive descent parser was the way to go. My goals were to:

  • Produce excellent diagnostics.
  • Keep the parser code as clear as possible.
  • Split the parser into multiple well-organised files.
  • Avoid using a union to hold AST nodes (and hence avoid potential memory leaks).
  • Ideally, use C++ code to avoid introducing dependencies.

Building a hand-written parser for a language like Loci is a significant project, a much larger scale parser than I’d created before. I therefore decided to first rewrite the lexer, at that time using Flex with similar issues to the Bison parser, in C++ code. This gave me a good test case to develop a structure for the parser and how to represent/issue diagnostics. When I finished the lexer I had a very clear idea about how the parser would be designed. In fact, if you look at the code you can see the lexer has an inferior design to the parser; this will be addressed in due course.

Testing the parser

Most of the compiler testing started off with using the compiler to build a program and then verify the output of running the program. Since then the testing has become more and more fine grained to reduce testing times and to achieve more comprehensive testing of the compiler.

The rewrite took this further by adding suites of unit tests for the lexer and parser; up to then only the support functionality (e.g. arrays) had unit tests. Unit tests essentially take a module, isolate it and then hammer it with a combination of pleasant and unpleasant situations to check it responds appropriately. For example, a unit test of an arithmetic module could hand it zero inputs, or max-sized inputs, to check it doesn’t explode with the edge cases.

The idea of unit testing isn’t new; (non-software) engineering regularly involves testing individual components to ensure they can withstand the expected stresses. This provides a high degree of confidence that when put together the system will operate properly, since only one component has to fail for the system to start to fall apart. Unit tests are also extremely extremely quick: a couple of seconds could cover many thousands of unit tests. And, since they depend only on the code being tested, they’re extremely reliable.

The new diagnostics

Along with the lexer and parser rewrites I added some initial infrastructure for ‘rendering’ diagnostics. Here are some examples of the many diagnostics that the compiler now emits:

These diagnostics are modelled after Clang, because they’re succinct, extremely readable and familiar to C++ developers.

Follow-on work

The new lexer and parser are now done, and the old Flex lexer and Bison parser have been removed. Further improvements will be made to both in due course. Current work is now upgrading Semantic Analysis to use the new diagnostics infrastructure, with the aim of releasing v1.4 in February with considerable improvements to the user experience. Semantic Analysis is also the next target of refactoring, as the slowest and messiest part of the compiler, to bring it up to a good level in preparation for the next batch of language features.

Summary

Re-writing the parser was a significant amount of work. I started just before Christmas (2015) and worked almost continuously until finishing in early January. However the result has been a much more usable and maintainable compiler.

The new lexer and parser were written in a different way to the existing code. A better way. Ultimately the biggest gain may have been reinvigorating my focus on improving programmer productivity and quality, which after all is what Loci is all about.

Ubuntu 15.10 kernel bug

I recently raised (as part of my job) an Ubuntu 15.10 kernel bug (titled “Kernel bug in mm/memory.c when ptrace poking to PROT_NONE map“) that’s hit when trying to use ptrace() to poke to a map that’s marked as having no permissions.

Based on comments from Linus Torvalds, it sounds like it used to be possible to use ptrace() to do things the process being controlled itself cannot do (it is possible to use ptrace() in this way on Ubuntu 14.04), but they’re removing such cases. He says:

Side note: as to why handle_mm_fault() doesn’t just do things itself,
there’s a historical situation where we used to let people do things
in ptrace() that they couldn’t do directly, and punch through
protections (and turn shared read-only pages into a dirty private
page).

So the permissions checking was up to the caller, because some callers
could do things that other callers could not.

I *think* we have gotten rid of all those cases, and I guess we could
consider just making handle_mm_fault() itself stricter. But that’s the
historical background on why callers need to check this.

So it makes sense that this might not work, however clearly it shouldn’t crash the kernel (fortunately we managed to work around the kernel bug in our own code). It’ll be interesting to see how this progresses.

Loci Compiler v1.3 released

The latest version of the Loci compiler is now available. See the release notes for detailed information about what’s included.

A lot of this release is clearing up the low level mechanisms of the language, including:

  • Specify object memory states, which is important for move semantics.
  • Adding sizeof() and alignmask() methods so that unsized types (e.g. opaque structs) can be represented.
  • Adding predicate aliases (or ‘named predicates’) to improve readability.
  • Fixing noexcept to be inferred correctly.
  • Adding static array type.
  • Fixing value templates (primarily for static array element count template argument).

There’s also been a lot of work on the language grammar and parser to work out the more difficult cases. It’s becoming increasingly clear that the Bison parser has many problems and so rewriting it as a recursive descent parser will soon be necessary. However in the meantime the priority is to augment the standard library and make the compiler easier to use.

I’m also planning to shorten the release cycle down to about 2-3 months, so the feature list for v1.4 is naturally trimmed down as well.

LLVM ARM and AArch64 nest attribute patches accepted

In the process of developing a mechanism for efficient virtual method calls in Loci I came across difficulties related to generating the appropriate assembly via LLVM IR. For more information, read the LLVM mailing list discussion.

Fortunately it turns out LLVM has been adding the required features recently, one of which is the ‘nest’ attribute that’s designed to be used for GCC’s nested functions extension to C (Clang needs to be compatible with GCC). The only issue was that while full support is already present for x86 32-bit and 64-bit, the ‘nest’ attribute wasn’t implemented on the LLVM ARM 32-bit backend or AArch64 backend.

Hence I created patches for ARM and AArch64 and these have now been accepted and both committed to LLVM. Separately support has also been implemented for PowerPC. These will all be going into LLVM 3.7.

llvm-abi announced on LLVM mailing list

I announced llvm-abi, a library I created to generate ABI-compliant LLVM IR, on the LLVM mailing list. It seems like other front-end developers are very happy to see this kind of change and are furthermore interested in changes to Clang to expose its ABI code generation functionality to ensure that everyone is using a single high quality code base rather than duplicating the same functionality over and over again.

The llvm-abi library started as a necessity for the Loci compiler, since it needs to generate C functions and must generate code that integrates very closely with C. Unfortunately I found this was a very involved task and required lots of target-dependent logic so I ended up pushing the functionality out of the compiler into a separate library so that other front-ends could benefit and code could be shared.

In the (admittedly unlikely) event that you’re a developer of an LLVM front-end, I strongly recommend using llvm-abi for your ABI compliance needs, particularly since the longer term plan is to have it share code with Clang. If you have any problems just raise an issue on the llvm-abi GitHub project.

WebAssembly – native code IR for the web

Right now various developers are getting excited about WebAssembly, which is a new IR (intermediate representation) designed for supporting languages like C and C++ (and Loci) with native-like performance. Most notably the intention is to remove the parsing overhead of asm.js (a subset of Javascript designed to be a target for native compilation), open up new low level functionality and stop using Javascript as the ‘assembly of the web’.

There’s been a long-held desire for something like this and it finally seems to be happening, given the broad support from browser vendors.

From my point of view the most interesting part of their documentation is the comparison to LLVM IR, in which they say:

The LLVM compiler infrastructure has a lot to recommend it: it has an existing intermediate representation (LLVM IR) and binary encoding format (bitcode). It has code generation backends targeting many architectures is actively developed and maintained by a large community. […] However the goals and requirements that LLVM was designed to meet are subtly mismatched with those of WebAssembly.

WebAssembly has several requirements and goals for its IR and binary encoding:

  • Portability: The IR must be the same for every machine architecture.
  • Stability: The IR and binary encoding must not change over time (or change only in ways that can be kept backward-compatible).

[…]

LLVM IR is meant to make compiler optimizations easy to implement, and to represent the constructs and semantics required by C, C++, and other languages on a large variety of operating systems and architectures. This means that by default the IR is not portable (the same program has different representations for different architectures) or stable (it changes over time as optimization and language requirements change). […] LLVM’s binary format (bitcode) was designed for temporary on-disk serialization of the IR for link-time optimization, and not for stability or compressibility […]

This is a very insightful comparison of WebAssembly versus LLVM IR. A lot of people have wanted LLVM IR to be a portable and stable representation, but that directly contradicts its goals to be excellent for performing optimisations and for splitting frontends/backends. There was a mailing list discussion in 2011 about what LLVM IR is (“LLVM IR is a compiler IR”) and the use cases it does and doesn’t support.

The new model seems to be using WebAssembly to communicate code across machines/platforms/architectures but on the machines themselves a translation to/from LLVM IR seems likely (of course platforms can use whatever internal representation they prefer). Logically then, work is beginning on a new WebAssembly back-end for LLVM.

Fortunately, now that we have WebAssembly, it looks like LLVM IR can be left to focus on its core objectives. So while WebAssembly is standardised as a stable, portable and backwards compatible representation for communicating programs between machines, LLVM IR can be continuous modified/improved to enable state-of-the-art optimisation and further develop the capabilities of front-ends and back-ends.