"Memory Safe" - Gonçalo Tomás

Rust is a language with a lot of baggage for its relatively young age. There have been countless dramas, backlash, and people drinking way too much Kool-Aid, to the point that even mentioning the language tends to evoke certain... feelings.

I've decided to leave all of the drama to the side and check out the language itself since Rust is hardly the only programming language enveloped in drama... After investing around 100 hours of my free time into Rust I can confidently say I enjoy it, but first I need to address some of the claims that people have about the language.

A critique on common Rust evangelism

Many times have I heard people going on ad nauseam about the famous Microsoft report and how you should choose Rust for everything because it is "memory safe". This, frankly, ticks me off.

At first glance, we must confront the reality that not all programming languages experience the memory safety issues that Rust aims to solve. Thinking that Rust is the best language because it is memory safe completely ignores the fact that most languages and tech stacks used in production today are memory safe: Java, C#, Python, NodeJS, Erlang and many others are, for most purposes, memory safe.

Once we get through the first hurdle of Rust supposedly being the best language, whatever that means, we usually get to the argument of "it's better than C and C++". Let's examine this further.

"C/C++ are both fundamentally unsafe"

It is widely known that C and C++ suffer from many undefined behavior and memory safety issues. However, the communities around both languages have built tools and frameworks to limit much of the problem space, to the point that this is mostly a solved issue. Notably:

There are memory safe subsets of these languages: the development guidelines of MISRA C, C++ Core Guidelines, JSF Coding Standards, F' and AUTOSAR C++ prohibit the usage of some language constructs like recursion, exceptions or manual memory management (malloc, free) in favor of alternatives that make control flow more explicit. Programs built with these guidelines can then go through multiple rounds of code analysis in order to get certified as safe for automotive, aerospace and military use, which are some of the strictest environments to program for.¹
It is possible to guarantee memory safety for programs written in these languages: a lot of effort and capital has been put into attesting the safety of C/C++ codebases through formal verification methods. Here are some examples of tools that do this: AbsInt, Polyspace, MALPAS and VeriFast. Some of these tools are commercial while others are research tools.

The idea that Rust is coming to fix a whole set of industries relying on unsafe code is risible. From an outsider's perspective it's hard to quantify the collective effort that was put into making sure software in critical applications is safe, but suffice it to say that the statement "C/C++ are fundamentally unsafe" ignores the state of the art for developing in these languages.

Lastly, I should note that in order for a piece of software to be "secure", memory safety is necessary but not sufficient. Memory safety does not prevent logic bugs and indeed there have been plenty of CVEs for Java and C#, for example, which are considered memory safe languages.

TL;DR

Many programmers first hear about Rust through a fallacious argument about how it is superior to, or "safer" than, C/C++. Not only do industry veterans know that C/C++ can be verified to be safe, it's not hard to figure out that Rust can be unsafe too. This deters people from giving Rust an honest try, as was my case, which is tragic because the language is actually great.

Actually trying out Rust

It took me way longer than it should have to ignore the hype around the language and actually try it for myself instead of just borrowing someone else's opinion. I wish I had tried it sooner! Rust has a unique combination of features that are relevant to me.

First off, it's an imperative programming language with just the right amount of functional programming features built in:

Pattern matching: being used to Elixir's rich pattern matching, when I write code in an imperative programming language this is almost always the first thing I miss. Rust's pattern matching is quite robust and you can even unpack struct values within a match expression. If you don't use a catch-all clause, the compiler requires you to match every possible value.
Immutable variables: when looking at a piece of code I've never seen before, knowing that variables are immutable helps make sense of it. Usually immutable variables hold helper or intermediate values, and actual work happens on a couple of mutable variables. The mut keyword is a great way to immediately track where most of the work is happening.
Option and Result: these language constructs are quite similar to the somewhat-common-but-inconsistent pattern of wrapping the results in a {:ok, value} or {:error, error} in Elixir. In my limited experience I found Option to be most useful for nullable values and Result more common for function returns.
Functional methods: commonly used Enum functions from Elixir like map, filter and many others are present, making data processing quite enjoyable.
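
To make the list above concrete, here is a minimal sketch of the four features working together. Everything in it (the Point struct and the describe, first_even, parse_port and squares_of_odds functions) is a hypothetical name invented for illustration, not something from a real project.

```rust
// Hypothetical struct, invented for illustration.
struct Point {
    x: i32,
    y: i32,
}

/// Pattern matching: struct fields are unpacked directly in the match arms.
/// Removing the last arm makes the match non-exhaustive and the build fails.
fn describe(p: &Point) -> String {
    match p {
        Point { x: 0, y: 0 } => String::from("origin"),
        Point { x, y: 0 } => format!("on the x axis at {x}"),
        Point { x: 0, y } => format!("on the y axis at {y}"),
        Point { x, y } => format!("at ({x}, {y})"),
    }
}

/// Option: a value that may be absent (no null involved).
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// Result: an operation that may fail, similar in spirit to
/// {:ok, value} / {:error, error} in Elixir.
fn parse_port(input: &str) -> Result<u16, String> {
    input
        .trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid port {input:?}: {e}"))
}

/// Functional methods: iterator adapters fill the role of Enum.filter / Enum.map.
fn squares_of_odds(values: &[i32]) -> Vec<i32> {
    values
        .iter()
        .copied()
        .filter(|v| v % 2 != 0)
        .map(|v| v * v)
        .collect()
}

fn main() {
    let p = Point { x: 3, y: 0 }; // immutable by default
    let mut total = 0; // `mut` marks the variable that does the work
    for v in squares_of_odds(&[1, 2, 3, 4, 5]) {
        total += v;
    }
    println!("{}", describe(&p));
    println!("sum of odd squares: {total}");
    println!("{:?}", first_even(&[1, 3, 4, 5]));
    println!("{:?}", parse_port("http"));
}
```

Note that only total is declared with mut: a reader skimming main can tell at a glance which binding changes over time.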

Enums

Rust's enums are intriguing: instead of the usual way of just representing different values, which you can do with a list of atoms in Elixir, Rust allows each enum variant to carry its own data, like this:

enum Projectile {
    Laser { damage: u8 },
    Bullet { damage: u8, armor_piercing: bool },
    Arrow { damage: u8, range: u8, incendiary: bool },
}

Memory alignment issues aside, this allows modeling a surprising number of things such that I don't even have to define a struct. As soon as you add or change any of the Projectile values above, the compiler checks each usage to see if all possible values are handled. Having been bitten by many function clause errors in Elixir due to having forgotten one of the possible values, I appreciate this.
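
As a rough sketch of that exhaustiveness check, the example below matches on a Projectile enum written with struct-like variants. The effective_damage and range_of functions, and the damage rules inside them, are made up for illustration.

```rust
// The Projectile enum from the post, written with struct-like variants.
enum Projectile {
    Laser { damage: u8 },
    Bullet { damage: u8, armor_piercing: bool },
    Arrow { damage: u8, range: u8, incendiary: bool },
}

/// Hypothetical damage rule. There is no catch-all arm, so adding a new
/// variant to Projectile turns this function into a compile error until
/// the new case is handled.
fn effective_damage(p: &Projectile, armored_target: bool) -> u8 {
    match p {
        Projectile::Laser { damage } => *damage,
        // Non-piercing bullets do half damage against armor.
        Projectile::Bullet { damage, armor_piercing: false } if armored_target => damage / 2,
        Projectile::Bullet { damage, .. } => *damage,
        // Incendiary arrows get a flat bonus, capped at u8::MAX.
        Projectile::Arrow { damage, incendiary: true, .. } => damage.saturating_add(5),
        Projectile::Arrow { damage, .. } => *damage,
    }
}

/// With a catch-all arm the compiler stays silent when variants are added:
/// any new variant would quietly fall into `_`.
fn range_of(p: &Projectile) -> Option<u8> {
    match p {
        Projectile::Arrow { range, .. } => Some(*range),
        _ => None,
    }
}

fn main() {
    let shots = [
        Projectile::Laser { damage: 10 },
        Projectile::Bullet { damage: 10, armor_piercing: false },
        Projectile::Arrow { damage: 4, range: 30, incendiary: true },
    ];
    for shot in &shots {
        println!(
            "damage vs armor: {}, range: {:?}",
            effective_damage(shot, true),
            range_of(shot)
        );
    }
}
```

The trade-off shows in range_of: the catch-all arm is shorter, but it gives up the compile-time reminder that effective_damage keeps.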

Compiler, tooling & ecosystem

Speaking of the compiler, there's a lot of good things to say! I have yet to find a compiler error that didn't tell me exactly what the problem is and how to fix it.² Having debugged Erlang dialyzer errors for hours with no information to go on, this is a definite step up.

Rust is a language that tends to be on the more pedantic side, where all cases must be handled or the compiler will reject the program. I thought this was cool at the beginning, but it became somewhat annoying after a while; when I'm trying to work through the success path in my head I usually want to keep going before I think of what can go wrong, and it turns out unwrap exists for this exact purpose. This method panics if the operation does not return successfully, and you can add a linting rule to disallow it in CI. I found the ergonomics of unwrap to be mixed: it does allow you to keep writing the happy path, but the subsequent backtrack to handle the edge cases is not very clean. This is likely a skill issue and something I'll try to improve.
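
A small sketch of that backtrack, assuming a made-up task of doubling a number parsed from text. The function names double_unwrap and double_checked are hypothetical. For the CI rule, one option I'm aware of is Clippy's clippy::unwrap_used lint, which is off by default and has to be enabled explicitly.

```rust
use std::num::ParseIntError;

/// Happy-path first draft: panics if the input is not a valid integer.
fn double_unwrap(input: &str) -> i32 {
    input.trim().parse::<i32>().unwrap() * 2
}

/// The same logic after the backtrack: the `?` operator hands the parse
/// error to the caller instead of panicking, so the return type changes
/// and every caller now has to deal with a Result.
fn double_checked(input: &str) -> Result<i32, ParseIntError> {
    let n = input.trim().parse::<i32>()?;
    Ok(n * 2)
}

fn main() {
    println!("{}", double_unwrap("21"));

    match double_checked("twenty-one") {
        Ok(n) => println!("{n}"),
        Err(e) => println!("could not parse input: {e}"),
    }
}
```

The change inside the function is small. What makes the cleanup feel unclean is that the new return type ripples outward to the call sites.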

Cargo is Rust's package manager and build tool. It does what it's meant to do, does it quickly and gets out of my way. That's good enough for me.

As for the packages in crates.io, I found some statistics worth noting. Recall that Rust is a compiled language with no runtime or garbage collection, and the "no runtime" is yet another one of the features that zealots yell out into the void. Well, about 19% of Rust's packages depend on an async runtime³, which is much higher than I expected. Looking a bit deeper, it looks like some libraries like the postgres client are able to provide both synchronous and asynchronous versions of their packages, and if that's the case for most then it shouldn't be that big of an issue.

crates.io deserves some merit for being well thought out: it has all the metrics on packages that Hex.pm does and more. The number of lines of code is listed, and there's an independent link to the source code (not to the original repository but to the published package source). Finally, there's a Security tab that lists out all of the CVEs for every package. This sets a high bar for package managers that I've yet to see elsewhere.

Rust and Elixir

Rustaceans⁴ are dedicated to increasing Rust adoption, so they go out and build interesting projects like Rustler. Having a project like this that enables me to write Rust NIFs instead of C or C++ is actually pretty cool. Similar projects exist for interop with other languages, but I thought it was quite a coincidence that there was a bridge library from my current language of choice (Elixir) into Rust.

I've talked about how Rust's performance surprised me in another post. Casey Muratori's excellent Performance Aware Programming series comes to mind for one specific reason. In it, he outlines a hypothesis that high level programming languages have become so pervasive and so little has been invested in optimizations that we have grown unaware of the waste they produce.⁵ Rust to me feels like a programming language that is focused on reducing that waste with no compromises, which feels empowering.

It's interesting to compare how both languages approach failure: Elixir and the underlying OTP foundation assume that failure is certain, and build reliability through supervision trees, while Rust uses its strong type system and the borrow checker to guarantee that all possible cases are handled, with some minor exceptions. I like the idea of doing away with supervision trees because the program is verified for possible errors at compile time.

Having programmed Elixir and Erlang for the better part of 7 years now, I'll likely reach for Elixir to prototype something quick, especially if it's web related. But for other things, if time allows, I'd also like to try building them with Rust.

My learning experience

The learning resources I tried were superb. They were most welcoming to beginners and were expertly written, no doubt making the learning curve much less steep than normal. I went through 100 Exercises To Learn Rust in a couple of weeks and it has now become my benchmark for an introduction to a programming language.

The beginner experience is dramatically impacted by code completions, so I recommend turning them off and actually typing everything out in order to grow some muscle memory about writing Rust.

Initially I dreaded bumping into a borrow checker error that I couldn't fix, but so far I've been able to fix errors quite quickly, which I attribute to a mix of luck and not having tried really complex code yet. While somewhat fearful of async Rust and lifetimes, I'll definitely keep learning Rust when I can.

I'm slightly disappointed in myself for having dismissed Rust because it was overhyped. As it turns out, even if something is overblown with dubious claims, it can still hold merit. I'm glad I decided to try it with low expectations because I found something I really liked.

¹ There's also Fil-C, which is a new approach allowing existing code to be made memory safe by introducing runtime guardrails on pointers: it allows you to compile an existing C/C++ project with strict memory safety guarantees that go beyond Rust's guarantees. It is highly compatible with existing code, and the author has succeeded in compiling an operating system and many tools, including a web browser, building one of the first memory safe versions of mainstream operating systems. This comes with a performance cost, but it is a novel and interesting approach nonetheless.
² I haven't worked with async Rust yet, but I have tried out multi-threading in Rust and found the compiler to be quite helpful.
³ At the time of writing there are ~227.000 published crates, and the number of Dependents for tokio, smol and async totals ~42.500.
⁴ This is the casual term for Rust programmers and enthusiasts.
⁵ In one of his videos, on a simple example comparing a Python and C implementation, there was a speed up of around 300x on the C implementation. The deeper you look, the more you find: if C is orders of magnitude faster than Python, Assembly can also be multiple times faster than C. In a recent tweet, ffmpeg reported that a custom written assembly function resulted in a 35x speedup over the C implementation through the use of SIMD instructions. I wholeheartedly recommend Casey's Performance Aware Programming series; there are so many great nuggets of information there that I could write a whole post about it.