Skip to content
Awsaf Alam
GithubCodeforcesLinkedin

Paper Review - Rust as a Language for High Performance GC Implementation

Backend, Rust, Garbage Collector4 min read

This paper introduces an implementation of the IMMIX Garbage collector (GC) in Rust. When compared to a simple Mark-sweep collector, IMMIX is a high-performance garbage collector with a well-documented implementation that helped the authors during the implementation phase. They implemented the GC on both Rust and C and compared their performance. On microbenchmarks, they have similar performance but the Rust implementation outperforms the Boehm-Demers-Weiser (BDW) collector on gcbench microbenchmark. The authors show that a high-level language such as rust highly benefits developers and helps reduce a lot of the errors that are caused due to type-safety and memory-safety.

The authors initially identify that the maintenance of GC is problematic. In most cases, multiple levels of concurrency are needed, which makes it prone to bugs. Normally C++ would be used, but it is not very developer friendly and does not provide the safety guarantees that Rust does. The authors conclude that Rust can help solve major problems without compromising performance.

Rust has a lot of advantages as well as limitations. Some of these features such as the ownership model for variables, static checker & reference borrowing help Rust maintain type, memory, and thread safety. Variable binding in Rust gives a variable exclusive ownership of the value to which it is tied. When an item is borrowed, its ownership remains fixed. This reduces computation by a large amount. Data races are also avoided by requiring that references to both mutable (write) and immutable (read) objects be mutually exclusive. Rust has various wrapper types which guarantee data safety. It offers a safe environment, free from memory errors and data races. This, however, can sometimes be either, too limited or too costly. When implementing the GC, the authors ensure high performance, safe code, and that no changes are made to the Rust language itself.

Effective memory management requires abstraction over user-level objects and raw addresses. In order to differentiate between an address and an object reference, Rust encapsulates addressing types. Address is a location where address arithmetic is permitted, whereas object reference refers to the object at the language level and points to the raw memory that contains the object's data. It is not possible to securely cast an Address into an ObjectReference, however, this may be done via an "unsafe" cast. On the other hand, it is always possible to cast an ObjectReference to an Address.


It is standard to have a large pool of unallocated memory from which local thread collectors pull memory as needed and push it back when recovered. The concept of "block", where each block has its own memory range and meta-data, is introduced to do this. Every block that the GC manages must be assured to be in one of three states: usable, utilized, or assigned to a single thread. The global pool is the owner of every block. A list of the useable and used blocks is created. Allocators collect ownership from the useable Block list. The collector looks for memory among utilized Blocks during collection and marks them as available for subsequent allocation if they are free.

The authors use Rust to access local thread data and globally shared data by using Rust's ownership model and Wraper types. Parallelism is implemented using standard library in Rust and this improves code re-useability.

In some cases, the authors found Rust's type-safety too confining. In these cases, they used “unsafe” code. When implementing “card tables” and “mark tables”, multiple writers need to access a bytemap. However, this operation is not allowed in Rust. So, the authors needed to compromise “safe-code” by declaring an array that multiple writers can access. Hence, the responsibility here lies on the programmer instead of Rust to validate safety.

Authors claim that the Rust implementation results in much safer code and better performance overall. Although GC is operated heavily on raw memory, since Rust is typesafe, it ensures 96% of the total code is safe. The rest of the 4% unsafe code mainly comes where raw memory is used, here the authors use unsafe casting. This unsafe portion is minimized by using proper abstractions.

In the micro-benchmarks, performance is measured based on tasks such as allocation, object marking, and object tracing. Same metrics are used for both C and Rust and time spent is measured. The Rust implementation is comparable to C when performance tuning is done in performance-critical paths, and when tuning is not done, it is 10% slower.

Performance scaling of parallel GC is also measured, and the authors observed some performance degradation when the number of threads is increased. They provided a hypothesis that this could be improved by fine-tuning, and this can be identified as a possible future work that the authors left incomplete.

When comparing with the BDW collector on the gcbench and mt-gcbench microbenchmarks 79% and 2x improvement was found respectively, this is mainly because the Rust implementation uses a bump pointer allocator versus a free list allocator in BDW. GC is conservative when it comes to stacks and heap, but BDW supports dynamic expansion of discontiguous heap.

In conclusion, the authors found that Rust's programming language is restricted, but not in an unnecessary way. Most of the GC can be implemented without compromising Rust's safety guarantees. Moreover, having standard libraries, and typechecking makes it easier to find bugs and maintain the codebase while maintaining a similar level of performance compared to C.