Rust: More Than Just Performance

Rust has been voted Stack Overflow’s most loved language five years in a row. Taking advantage of its relative youth by learning from those that came before it, Rust has attracted so many developers because it was designed to alleviate a lot of common problems seen in other languages.

The benefits of Rust go beyond just performance or static typing. It’s a hybrid between object-oriented and functional languages, with a lot of features derived from other languages that add up to something truly unique. With that in mind, I wanted to share some of my favorite aspects of this much-loved language (apparently named after a fungus!), including…

  • Immutability by default, mutability by choice
  • Simple types with powerful traits
  • Enums and pattern matching
  • No null pointer exceptions
  • Errors are obvious (and must be handled)

Let’s dive in!

(Im)mutability

We can bind variables in two different ways, depending on whether or not we want the variable to be mutable.

// The `main` function is the entrypoint of Rust applications
fn main() {
    // The types of variables rarely need to be declared, as the
    // compiler is usually able to infer them.
    //
    // `let` bindings are immutable by default...
    let my_tuple = (1, "42");
    
    // ... meaning the variable can't be reassigned
    // error: cannot assign twice to immutable variable `my_tuple`
    // my_tuple = (2, "713");

    // ... and the tuple itself can't be changed
    // error: cannot assign to `my_tuple.0`, as `my_tuple` is not declared as mutable
    // my_tuple.0 = 2;
    
    // But we can make mutable bindings with `let mut`...
    let mut another_tuple = (1, 2, 3);
    
    // ... and can reassign the variable...
    another_tuple = (3, 4, 5);

    // ... as long as length and types are the same
    //
    // error: expected integer, found `char`
    // another_tuple = (3, 4, 'c');
    //
    // error: expected a tuple with 3 elements, found one with 4 elements
    // another_tuple = (3, 4, 5, 6);
    
    // ... and we can mutate the object directly - again, as long as type is the same
    another_tuple.0 = 15;
}

Not only do we have to opt into mutability at the variable level, but functions also have to declare whether their parameters need to be mutable, and the compiler forbids us from passing an immutable reference where a mutable one is expected.

// Ignore the `&` for now; it just means a *reference* to a string.
// `&mut` means a *mutable* reference
fn add_newline(s: &mut String) {
    s.push('\n');
}

// `&String` means an *immutable* reference.
// `usize` refers to an unsigned int whose size is determined
// by platform (eg. 32- or 64-bit) and is the integer type used for
// lengths and array indices, since negative numbers make no sense
// in those contexts.
fn string_length(s: &String) -> usize {
    s.len()
}

fn main() {
    // String literals are baked into the compiled binary and can never be
    // changed, but for many purposes we need heap-allocated ones (the
    // `String` type), which we can get via the `.to_string()` calls.
    let string1 = "Go, Team Venture!".to_string();
    let mut string2 = "Go, Team... Brothers..? Guys..?".to_string();

    // Even though we declared `string2` as mutable, we can still
    // reference it immutably for the sake of the function call
    string_length(&string1);
    string_length(&string2);

    // But we can't pass immutable references when `mut` is expected
    // error: types differ in mutability
    // add_newline(&string1);

    // And we can't 'upgrade' an immutable variable to a mutable reference
    // error: cannot borrow `string1` as mutable, as it is not declared as mutable
    // add_newline(&mut string1);

    // And even though we declared `string2` as mutable, we still
    // need to declare the reference as `mut`
    add_newline(&mut string2);
}

The main reason for controlled mutability is preventing data races at compile time, but I find it also helps a lot with reasoning about my code. Unexpected mutation can be a really big (and subtle) problem, so explicitly requiring mut makes it obvious which code paths can and cannot mutate data. That is particularly helpful when calling third-party package functions.

(Aside: Notice the lack of both a return keyword and a ; in string_length? Without getting into detail about statements versus expressions, omitting them on the last line of a function returns the value of that expression automatically.)
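As a minimal sketch of that aside, here are two versions of the same function side by side. The names string_length_explicit and string_length_implicit are made up for this comparison; both behave identically, and the second is the style used throughout this post.

```rust
// Both functions return the same value; only the style differs.
fn string_length_explicit(s: &String) -> usize {
    // A `return` statement, terminated with `;`
    return s.len();
}

fn string_length_implicit(s: &String) -> usize {
    // No `return` and no `;`: the final expression is the return value.
    // Adding a `;` here would make the function return `()`, which
    // doesn't match `usize`, so the compiler would reject it.
    s.len()
}

fn main() {
    let s = "Go, Team Venture!".to_string();

    println!("{}", string_length_explicit(&s)); // prints 17
    println!("{}", string_length_implicit(&s)); // prints 17
}
```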

Types and traits

Rust's expressive type system enables us to encode business logic in types. Say we want to write a can_rent function that determines whether someone is able to rent a car or not based on a minimum age. We could decide to pass in an int (unsigned, because age can't be negative) and assume it represents age in years.

fn can_rent(age_in_years: usize) -> bool {
    age_in_years >= 25
}

That works, but nothing other than the parameter name says that the age should be in years, and the compiler simply can't enforce that. Why have a compiler if we can't make it do a lot of work for us?

Let's make our own type by wrapping a single value, otherwise known as the "newtype" pattern. There's no runtime overhead because the compiler optimizes it away to just an int, but this lets us control how the int can be used, as our code has to interact with the new type. While we're at it, let's also make a Person structure that can hold an age.

// Custom types can have unnamed fields, in which case the struct
// essentially acts like a tuple...
struct AgeInYears(usize);

// ... or they can have named fields, which more closely resembles
// objects in many other languages
struct Person {
    age: AgeInYears,
}

fn can_rent(age: AgeInYears) -> bool {
    // Use `.0` to access the first 'field' of the type
    // as if it were a tuple
    age.0 >= 25
}

fn main() {
    let person = Person { age: AgeInYears(53) };

    can_rent(person.age);     // true
    can_rent(AgeInYears(15)); // false
}

I still don't like having to manually 'convert' by doing person.age or AgeInYears(15). While Rust doesn't have function overloading, traits can give us similar functionality. They define common methods (like Go interfaces) rather than common properties (like TypeScript interfaces), and we can use them to express generic types.

struct AgeInYears(usize);

struct Person {
    age: AgeInYears,
}

// We can implement the standard library `Into` trait, which essentially
// defines how to convert from a `Person` to an `AgeInYears`
impl Into<AgeInYears> for Person {
    fn into(self) -> AgeInYears {
        // In this case it's simple - just return the actual object
        self.age
    }
}

// But we can ALSO dictate how built-in types can be converted
impl Into<AgeInYears> for usize {
    fn into(self) -> AgeInYears {
        AgeInYears(self)
    }
}

// `impl Into<AgeInYears>` just says, "accept any type that
// can be converted into an AgeInYears".
fn can_rent(age: impl Into<AgeInYears>) -> bool {
    // Using `.into()` converts the argument into the type we want
    // and we don't have to check what type the argument actually was
    age.into().0 >= 25
}

fn main() {
    let person = Person { age: AgeInYears(53) };

    // Simple calls, no more manually passing in an AgeInYears
    can_rent(person);
    can_rent(15);
}
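One variation worth knowing: the standard library documentation recommends implementing From rather than Into, because a blanket implementation then provides the matching Into automatically. The sketch below is my reworking of the example above under that assumption, not a change in behavior. can_rent keeps the same signature, and the usize suffix on the literal is only there to make its type explicit.

```rust
struct AgeInYears(usize);

struct Person {
    age: AgeInYears,
}

// Implementing `From` gives us the matching `Into` for free via a
// blanket implementation in the standard library
impl From<Person> for AgeInYears {
    fn from(person: Person) -> Self {
        person.age
    }
}

impl From<usize> for AgeInYears {
    fn from(years: usize) -> Self {
        AgeInYears(years)
    }
}

// The signature of `can_rent` is unchanged
fn can_rent(age: impl Into<AgeInYears>) -> bool {
    age.into().0 >= 25
}

fn main() {
    let person = Person { age: AgeInYears(53) };

    println!("{}", can_rent(person));   // prints true
    println!("{}", can_rent(15_usize)); // prints false
}
```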

Enums

AgeInYears(usize) is more explicit than a plain integer, but sometimes we might prefer a distinct set of possible types. For this, we can use enumerated types which, despite the name, are really tagged unions: each variant can carry its own data, although enums whose variants carry no data can still be cast to ints like typical enum types in other languages. Perhaps instead of AgeInYears we want a choice between different timespans.

enum Age {
    Days(usize),
    Years(usize),
}

struct Person {
    age: Age,
}

impl Into<Age> for Person {
    fn into(self) -> Age {
        self.age
    }
}

impl Into<Age> for usize {
    fn into(self) -> Age {
        Age::Years(self)
    }
}

fn can_rent(age: impl Into<Age>) -> bool {
    let minimum_years = 25;

    // Now that we can have different variants, we can't use the
    // `Age` value without pattern matching. Eg. if it's the
    // `Days` variant, then bind the `usize` value to the variable
    // `days` and run the block.
    match age.into() {
        Age::Days(days)   => days  >= minimum_years * 365,
        Age::Years(years) => years >= minimum_years,
    }
}

fn main() {
    // We can now decide on an instance level whether a person's
    // age should be in days or years
    let person1 = Person { age: Age::Years(53) };
    
    // Numeric literals can have `_` in them to aid readability
    let person2 = Person { age: Age::Days(7_019) };

    can_rent(person1); // true
    can_rent(person2); // false
}

One of the best parts about pattern-matching is that match is exhaustive, meaning we have to cover every single possibility or the program won't compile. That means if we add another age option...

enum Age {
    Days(usize),
    Years(usize),
    Centuries(usize),
}

... then we have to update our match expression and check for that other option.

No null pointers

Null is a useful concept, even if Tony Hoare famously called it his billion-dollar mistake. Safe Rust doesn't have null references, though, which means it doesn't have null pointer exceptions. Instead, Rust uses a different strategy (borrowed from Haskell) to express nullability - the generic Option enum.

enum Option<T> {
    Some(T),
    None,
}

The <T> is a generic type parameter which is beyond our scope here, but this basically means any Option<T> value will either be None or the Some variant that itself contains a value of type T.

And like our Age example above, we have to pattern match on an Option value to use it. The compiler enforces that we can't ever use an Option value without knowing which variant it is.
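Here is a small sketch of what that looks like in practice. The find_age and describe functions, and the names they look up, are hypothetical and exist only for illustration. The point is that the caller has to handle both Some and None before it can touch the number.

```rust
// Hypothetical helper: look up a renter's age by name.
// Returns `None` when we have no record for that name.
fn find_age(name: &str) -> Option<usize> {
    match name {
        "Rusty" => Some(53),
        "Dean" => Some(16),
        _ => None,
    }
}

fn describe(name: &str) -> String {
    // Both variants must be covered before we can use the age
    match find_age(name) {
        Some(age) => format!("{} is {} years old", name, age),
        None => format!("no age on record for {}", name),
    }
}

fn main() {
    println!("{}", describe("Rusty")); // prints "Rusty is 53 years old"
    println!("{}", describe("Hank"));  // prints "no age on record for Hank"

    // An `Option<usize>` can't be used where a plain `usize` is expected
    // error: mismatched types
    // let age: usize = find_age("Rusty");
}
```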

Error handling

I want to know every possible exception that can be raised by a code path. This is often impossible, though, as exceptions in other languages typically aren't documented in function signatures and package authors rarely note them in documentation. We have to see what happens at runtime, adjust our code, write some tests, and deploy again.

Like null, though, exceptions basically don't exist in Rust (unless a program 'panics'). And, like null, errors are represented by an enum.

enum Result<T, E> {
    Ok(T),
    Err(E),
}

As with other enums, a Result value simply cannot be used without checking which variant it is, and the compiler forces us to consider each case.

Let's implement a Centuries variant for Age. The complication is that if a person is 0 centuries old, we actually can't know if they're old enough to rent a car: they could be 1 month or 99 years old. That means can_rent is now fallible and should return an error in that situation. And, rather than returning a bool when things are okay, we now need to return an enum value that contains a bool, e.g. Ok(true).

enum Age {
    Days(usize),
    Years(usize),
    Centuries(usize),
}

struct Person {
    age: Age,
}

impl Into<Age> for Person {
    fn into(self) -> Age {
        self.age
    }
}

impl Into<Age> for usize {
    fn into(self) -> Age {
        Age::Years(self)
    }
}

fn can_rent(age: impl Into<Age>) -> Result<bool, String> {
    let minimum_years = 25;

    match age.into() {
        // We need to return bools wrapped by `Ok` now
        Age::Days(days)   => Ok(days  >= minimum_years * 365),
        Age::Years(years) => Ok(years >= minimum_years),
        
        // And we can also pattern match on literal values!
        Age::Centuries(centuries) => match centuries {
            0 => Err("We can't figure it out!".to_string()),
            
            // `_` is a catchall - since `match` is exhaustive, the compiler
            // forces us to consider every case. In our situation,
            // any non-zero century count is obviously old enough.
            _ => Ok(true)
        },
    }
}

fn main() {
    let person1 = Person { age: Age::Years(53) };
    let person2 = Person { age: Age::Days(7_019) };
    let person3 = Person { age: Age::Centuries(0) };

    // The return values are enum variants now
    can_rent(person1); // Ok(true)
    can_rent(person2); // Ok(false)
    can_rent(person3); // Err("We can't figure it out!")
}
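The calls above throw their results away. To actually use them, the caller has to match on the Result. The sketch below assumes a trimmed-down can_rent that takes an Age directly (no Into conversions, no Days variant) to keep it short. rental_message and its wording are hypothetical.

```rust
enum Age {
    Years(usize),
    Centuries(usize),
}

// A trimmed-down `can_rent` that takes an `Age` directly
fn can_rent(age: Age) -> Result<bool, String> {
    match age {
        Age::Years(years) => Ok(years >= 25),
        Age::Centuries(0) => Err("We can't figure it out!".to_string()),
        Age::Centuries(_) => Ok(true),
    }
}

// Turn each possible outcome into a message for the caller.
// Leaving out any of the three arms is a compile error.
fn rental_message(age: Age) -> String {
    match can_rent(age) {
        Ok(true) => "Here are your keys".to_string(),
        Ok(false) => "Sorry, too young".to_string(),
        Err(reason) => format!("Could not decide: {}", reason),
    }
}

fn main() {
    println!("{}", rental_message(Age::Years(53)));
    println!("{}", rental_message(Age::Years(19)));
    println!("{}", rental_message(Age::Centuries(0)));
}
```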

I used a String as my error type for simplicity, but I would generally define custom types (either a struct or an enum) to embed more useful information. The subtle implication of using Result for errors is that the function signature explicitly states the full range of possible errors that can be returned.
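As a rough sketch of what a custom error type might look like, here is one possibility. RentError and its AmbiguousAge variant are names I made up, and can_rent takes an Age directly to keep the example short. With this in place, the signature tells callers exactly which failures they need to handle.

```rust
use std::fmt;

enum Age {
    Days(usize),
    Years(usize),
    Centuries(usize),
}

// Hypothetical error type: one variant per way `can_rent` can fail
#[derive(Debug, PartialEq)]
enum RentError {
    // The age is too coarse to compare against the minimum
    AmbiguousAge { centuries: usize },
}

// `Display` controls how the error is shown to humans
impl fmt::Display for RentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RentError::AmbiguousAge { centuries } => {
                write!(f, "can't tell if {} centuries is old enough", centuries)
            }
        }
    }
}

fn can_rent(age: Age) -> Result<bool, RentError> {
    let minimum_years = 25;

    match age {
        Age::Days(days) => Ok(days >= minimum_years * 365),
        Age::Years(years) => Ok(years >= minimum_years),
        Age::Centuries(0) => Err(RentError::AmbiguousAge { centuries: 0 }),
        Age::Centuries(_) => Ok(true),
    }
}

fn main() {
    match can_rent(Age::Centuries(0)) {
        Ok(allowed) => println!("allowed: {}", allowed),
        Err(err) => println!("error: {}", err),
    }
}
```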

Wrapping up

Rust is a fantastic general purpose language that is already gaining traction among web developers.

Rust's power and expressiveness come at a cost, as the learning curve is steep (primarily due to the "ownership" memory model) and development can be slower than in most other common web languages. Paying those costs up front, however, eliminates many others down the road. After npm rewrote one of their services in Rust, their engineers claimed to "forget about the Rust service because it caused so few operational issues".

If you want to learn more, head over to Learn Rust and check out The Book - it's perhaps the best free language resource that I've ever seen.