guh.me - gustavo's personal blog

The Rust Programming Language

My personal notes on the book The Rust Programming Language.

Cargo

Create a new project: cargo new {name}. Compile a project: cargo build. Compile and run a project: cargo run. Check whether the code compiles without producing a binary: cargo check. Build for release with optimizations: cargo build --release. Update the crates the project depends on: cargo update.

Introduction

Rust imports only a few types into the scope of every program (the prelude). Variables are declared with let and are immutable by default. Mutable variables are declared with let mut. Rust has a Result enumeration whose value is either Ok or Err. .cmp can be called on anything that can be compared. match expressions are used for pattern matching. Rust allows variable shadowing (i.e. reusing a variable name with a new let). Shadowing is different from making a variable mutable: it creates a new variable, which may even have a different type. str::parse() parses a string into another type, such as a number; the target type comes from the variable's type annotation. loop creates an infinite loop. You can exit it with break and skip to the next iteration with continue. Constants are declared with const. The type of a constant must be explicitly annotated.
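To tie a few of these notes together, here is a small sketch in the spirit of the book's guessing game. The function names compare_guess and describe are made up for illustration; the block only shows shadowing, parse driven by a type annotation, cmp and match.

```rust
use std::cmp::Ordering;
use std::num::ParseIntError;

/// Parses `input` as a number and compares it with `secret`.
/// (Hypothetical helper, not taken from the book.)
fn compare_guess(input: &str, secret: u32) -> Result<Ordering, ParseIntError> {
    // Shadowing: `guess` is first a trimmed &str ...
    let guess = input.trim();
    // ... and then a u32. The annotation tells `parse` which type to produce.
    let guess: u32 = guess.parse()?;
    Ok(guess.cmp(&secret))
}

/// Turns the comparison into a message with `match`.
fn describe(input: &str, secret: u32) -> &'static str {
    match compare_guess(input, secret) {
        Ok(Ordering::Less) => "too small",
        Ok(Ordering::Greater) => "too big",
        Ok(Ordering::Equal) => "you win",
        Err(_) => "not a number",
    }
}
```

Calling describe("7", 42) goes through the Ordering::Less arm, while input that is not a number ends up in the Err arm.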

Data Types

i32 is the default integer type and is generally a good choice. usize and isize are most useful when indexing some sort of collection (their size depends on the platform). Rust has f32 and f64 primitive floating point types. Booleans have type bool. char represents a Unicode scalar value and is written with single quotes: let chr = '😻';. Tuples are written with parentheses: let tup: (i32, f64) = (500, 6.4);. Tuple elements are accessed by index with tup.0. Arrays can be defined with let months = ["Jan", "Feb", "Mar"];. An array's element type and length can be annotated with let a: [i32; 5] = [1, 2, 3, 4, 5];

Functions

Functions are defined with fn. Statements do not return values and end with a semicolon (;). Expressions evaluate to a value and are written without a trailing semicolon. Blocks are expressions with their own scope, and can be created with { let code = 5; code }.
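A quick sketch of the statement/expression distinction. The names plus_one and block_value are only for illustration.

```rust
/// The final expression has no semicolon, so it is the return value.
fn plus_one(x: i32) -> i32 {
    x + 1
}

/// Shows a block used as an expression on the right side of `let`.
fn block_value() -> i32 {
    let y = {
        let code = 5; // statement: ends with a semicolon
        code + 1 // expression: becomes the value of the whole block
    };
    plus_one(y)
}
```

Adding a semicolon after x + 1 would turn it into a statement, and the function would no longer return an i32.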

Control Flow

if expressions: if {} else if {} else {}. Because if is an expression, it can also be used on the right side of a let statement. Loops: loop, while, for. They can be exited with break.
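A minimal sketch of these constructs, with made-up function names. It also uses break with a value, which the notes above don't mention but which loop supports. first_multiple_above assumes step is greater than zero.

```rust
/// Uses `if` as an expression on the right side of `let`.
fn parity(n: u32) -> &'static str {
    let label = if n % 2 == 0 { "even" } else { "odd" };
    label
}

/// Counts up with `loop` and returns a value through `break`.
/// Assumes `step > 0`; otherwise the loop would never end.
fn first_multiple_above(step: u32, limit: u32) -> u32 {
    let mut current = step;
    loop {
        if current > limit {
            break current;
        }
        current += step;
    }
}

/// Sums the numbers 1..=n with `for`.
fn sum_to(n: u32) -> u32 {
    let mut total = 0;
    for i in 1..=n {
        total += i;
    }
    total
}
```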

Ownership

Ownership is Rust’s approach to memory management.

Data stored on the stack (LIFO) must have a known, fixed size. Data with an unknown size at compile time or a size that might change must be stored on the heap instead.

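If we do want to deeply copy the heap data of a String, not just the stack data, we can use a common method called clone. Otherwise Rust will only move the data: it copies the stack part (pointer, length and capacity), much like a shallow copy, and invalidates the original variable.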

Rust has a special annotation called the Copy trait that we can place on types like integers that are stored on the stack. If a type implements the Copy trait, an older variable is still usable after assignment.
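To make move, clone and Copy concrete, here is a small sketch. ownership_demo is a hypothetical function that just returns the values that are still usable at the end.

```rust
/// Demonstrates move, clone and Copy; returns the values that remain usable.
fn ownership_demo() -> (String, String, i32, i32) {
    let s1 = String::from("hello");
    let s2 = s1; // move: `s1` can no longer be used after this line
    let s3 = s2.clone(); // deep copy: `s2` and `s3` own separate heap data

    let x = 5;
    let y = x; // i32 implements Copy, so `x` stays valid

    (s2, s3, x, y)
}
```

Trying to return s1 instead of s2 would be rejected by the compiler, because s1 was moved.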

Ownership and Functions

The semantics for passing a value to a function are similar to those for assigning a value to a variable. Passing a variable to a function will move or copy, just as assignment does.
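A short sketch of what that means in practice. The function names consume, shout and double are invented for this example.

```rust
/// Takes ownership of `s`; the caller can no longer use its variable.
fn consume(s: String) -> usize {
    s.len()
} // `s` goes out of scope here and its memory is freed

/// Takes ownership and hands it back to the caller through the return value.
fn shout(mut s: String) -> String {
    s.push('!');
    s
}

/// `n` is copied in, so the caller's value is unaffected.
fn double(n: i32) -> i32 {
    n * 2
}
```

After let s = shout(s); the new s owns the string again, whereas after consume(s) the variable is gone for good.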

References and Borrowing

The ampersands in Rust are references, and they allow you to refer to some value without taking ownership of it.

fn main() {
    let s1 = String::from("hello");

    let len = calculate_length(&s1);

    println!("The length of '{}' is {}.", s1, len);
}

fn calculate_length(s: &String) -> usize {
    s.len()
}

References are immutable by default, but a mutable reference can be defined with &mut String.
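A minimal sketch of a mutable reference. append_world is a made-up name; the caller's variable has to be declared with let mut and passed as &mut.

```rust
/// Appends to the caller's String through a mutable reference.
fn append_world(s: &mut String) {
    s.push_str(", world");
}
```

With let mut s = String::from("hello"); and append_world(&mut s);, s ends up as "hello, world" and the caller still owns it.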

Slices

A string slice is a reference to part of a String, and it looks like this:

let s = String::from("hello world");

let hello = &s[0..5];
let world = &s[6..11];

let slice = &s[..2]; // starts at index 0
let slice = &s[3..]; // ends at len()

We can create slices using a range within brackets by specifying [starting_index..ending_index], where starting_index is the first position in the slice and ending_index is one more than the last position in the slice.
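As a sketch of slices in use, here is a version of the book's first_word idea. It takes &str so it works on both string literals and borrowed Strings.

```rust
/// Returns the first word of `s` as a slice borrowed from the input.
fn first_word(s: &str) -> &str {
    for (i, &byte) in s.as_bytes().iter().enumerate() {
        if byte == b' ' {
            // Everything before the first space.
            return &s[..i];
        }
    }
    // No space found: the whole string is one word.
    s
}
```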

Structs

A struct is composed of multiple fields.

struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}

To use a struct after we’ve defined it, we create an instance of that struct by specifying concrete values for each of the fields.

let user1 = User {
    email: String::from("someone@example.com"),
    username: String::from("someusername123"),
    active: true,
    sign_in_count: 1,
};

To get a specific value from a struct, we can use dot notation. If the instance is mutable, we can change a value by using the dot notation and assigning into a particular field.

let mut user1 = User {
    email: String::from("someone@example.com"),
    username: String::from("someusername123"),
    active: true,
    sign_in_count: 1,
};

user1.email = String::from("anotheremail@example.com");

Using the Field Init Shorthand when Variables and Fields Have the Same Name:

fn build_user(email: String, username: String) -> User {
    User {
        email,
        username,
        active: true,
        sign_in_count: 1,
    }
}

Creating Instances From Other Instances With Struct Update Syntax:

let user2 = User {
    email: String::from("another@example.com"),
    username: String::from("anotherusername567"),
    ..user1
};

Using Tuple Structs without Named Fields to Create Different Types:

struct Color(i32, i32, i32);
struct Point(i32, i32, i32);

let black = Color(0, 0, 0);
let origin = Point(0, 0, 0);

Methods are similar to functions, but they're defined within the context of a struct (or an enum or a trait object), and their first parameter is always self, which represents the instance of the struct the method is being called on.

struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };

    println!(
        "The area of the rectangle is {} square pixels.",
        rect1.area()
    );
}

Another useful feature of impl blocks is that we’re allowed to define functions within impl blocks that don’t take self as a parameter. These are called associated functions because they’re associated with the struct. They’re still functions, not methods, because they don’t have an instance of the struct to work with. You’ve already used the String::from associated function. Associated functions are often used for constructors that will return a new instance of the struct.

impl Rectangle {
    fn square(size: u32) -> Rectangle {
        Rectangle {
            width: size,
            height: size,
        }
    }
}

Rectangle::square(5);

Enums and Pattern Matching

Defining and using an enum:

enum IpAddrKind {
    V4,
    V6,
}

let four = IpAddrKind::V4;
let six = IpAddrKind::V6;

Variants can also hold data. You can put any kind of data inside an enum variant: strings, numeric types, or structs, for example.

enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

let home = IpAddr::V4(127, 0, 0, 1);

let loopback = IpAddr::V6(String::from("::1"));

We’re also able to define methods on enums:

enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

impl Message {
    fn call(&self) {
        // method body would be defined here
    }
}

let m = Message::Write(String::from("hello"));
m.call();

The Option type is an enum that is used in many places because it encodes the very common scenario in which a value could be something or it could be nothing.

enum Option<T> {
   Some(T),
   None,
}

let some_number = Some(5);
let some_string = Some("a string");

let absent_number: Option<i32> = None;

The match Control Flow Operator

match allows you to compare a value against a series of patterns and then execute code based on which pattern matches.

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter,
}

fn value_in_cents(coin: Coin) -> u8 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter => 25,
    }
}

With Option:

fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        None => None,
        Some(i) => Some(i + 1),
    }
}

Matches in Rust are exhaustive: we must exhaust every last possibility in order for the code to be valid. _ can be used as a catch-all pattern.
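A small sketch of the catch-all pattern. describe_number is a hypothetical function; without the _ arm the compiler would reject the match, because a u8 has 256 possible values.

```rust
/// Names a few small numbers and falls back to `_` for everything else.
fn describe_number(n: u8) -> &'static str {
    match n {
        1 => "one",
        3 => "three",
        5 => "five",
        _ => "something else", // covers all remaining u8 values
    }
}
```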

The if let syntax lets you combine if and let into a less verbose way to handle values that match one pattern while ignoring the rest. However, you lose the exhaustive checking that match enforces.

let some_u8_value = Some(0u8);
if let Some(3) = some_u8_value {
    println!("three");
}

Packages, Crates and Modules

Rust has a number of features that allow you to manage your code’s organization: packages, crates, modules, and paths.

A package can contain at most one library crate and any number of binary crates, but it must contain at least one crate.

src/main.rs and src/lib.rs are crate roots.

Modules let us organize code within a crate into groups for readability and easy reuse. They also control the privacy of items, which is whether an item can be used by outside code (public) or is an internal implementation detail and not available for outside use (private).

mod front_of_house {
    mod hosting {
        fn add_to_waitlist() {}
        fn seat_at_table() {}
    }

    mod serving {
        fn take_order() {}
    }
}

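To show Rust where to find an item in a module tree, we use a path, which can take two forms: an absolute path starts from the crate root, using the crate name or the literal crate; a relative path starts from the current module and uses self, super, or an identifier in the current module.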

Both absolute and relative paths are followed by one or more identifiers separated by double colons (::).

pub fn eat_at_restaurant() {
    // Absolute path
    crate::front_of_house::hosting::add_to_waitlist();

    // Relative path
    front_of_house::hosting::add_to_waitlist();
}

We can also construct relative paths that begin in the parent module by using super at the start of the path.

The way privacy works in Rust is that all items (functions, methods, structs, enums, modules, and constants) are private by default. To define a public module/function/struct/enum use pub:

mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}

        pub struct Breakfast {
            pub toast: String,
            seasonal_fruit: String,
        }

        pub enum Appetizer {
            Soup,
            Salad,
        }
    }
}

We can bring a path into a scope once and then call the items in that path as if they’re local items with the use keyword.

mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() {}
    }
}

use crate::front_of_house::hosting;

pub fn eat_at_restaurant() {
    hosting::add_to_waitlist();
    hosting::add_to_waitlist();
    hosting::add_to_waitlist();
}

Paths brought into scope with use also check privacy, like any other paths.

The as keyword lets you assign an alias for a type:

use std::fmt::Result;
use std::io::Result as IoResult;

When we bring a name into scope with the use keyword, the name available in the new scope is private. To enable the code that calls our code to refer to that name as if it had been defined in that code’s scope, we can combine pub and use.

pub use crate::front_of_house::hosting;

pub fn eat_at_restaurant() {
    hosting::add_to_waitlist();
    hosting::add_to_waitlist();
    hosting::add_to_waitlist();
}

If we want to bring all public items defined in a path into scope, we can specify that path followed by *, the glob operator:

use std::collections::*;

Common Collections

Vec<T>, also known as a vector, allows you to store more than one value in a single data structure that puts all the values next to each other in memory.

let v: Vec<i32> = Vec::new();
let v = vec![1,2,3];

let mut v = Vec::new();
// Add items to a vector
v.push(5);
v.push(6);
v.push(7);

// Read from the vector and panics on failure
let third: &i32 = &v[2];

// Read from the vector and handles failure
match v.get(2) {
    Some(third) => println!("The third element is {}", third),
    None => println!("There is no third element."),
}

// Iterate over the vector values
let v = vec![100, 32, 57];
for i in &v {
    println!("{}", i);
}

Strings work much like vectors, but they hold UTF-8 encoded text and cannot be indexed with an integer. If you need to perform operations on individual Unicode scalar values, the best way to do so is to use the chars method.

for c in "नमस्ते".chars() {
    println!("{}", c);
}

// bytes returns each raw byte, which might be appropriate for your domain:
for b in "नमस्ते".bytes() {
    println!("{}", b);
}

The type HashMap<K, V> stores a mapping of keys of type K to values of type V.

use std::collections::HashMap;

let mut scores = HashMap::new();

// Inserting a value
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

// Reading a value returns an Option<&V>
let team_name = String::from("Blue");
let score = scores.get(&team_name);

// You can iterate over each key/value pair like vectors
for (key, value) in &scores {
    println!("{}: {}", key, value);
}

// Insert a value only if it does not exist
scores.entry(String::from("Yellow")).or_insert(50);
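The entry API is also handy for updating a value based on the old one. A sketch of the classic word count, with word_counts as an illustrative name:

```rust
use std::collections::HashMap;

/// Counts how often each whitespace-separated word occurs in `text`.
fn word_counts(text: &str) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `or_insert` returns a mutable reference to the value for this key,
        // inserting 0 first if the key was not present yet.
        let count = counts.entry(word.to_string()).or_insert(0);
        *count += 1;
    }
    counts
}
```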

Error Handling

Rust groups errors into two major categories: recoverable and unrecoverable errors. Rust doesn’t have exceptions. Instead, it has the type Result<T, E> for recoverable errors and the panic! macro that stops execution when the program encounters an unrecoverable error.

When the panic! macro executes, your program will print a failure message, unwind and clean up the stack, and then quit.

panic!("crash and burn");

We can also run our programs with the flag below to display the backtrace:

RUST_BACKTRACE=1 cargo run

The Result enum is defined as having two variants, Ok and Err, as follows:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

T represents the type of the value that will be returned in a success case within the Ok variant, and E represents the type of the error that will be returned in a failure case within the Err variant.

use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let f = File::open("hello.txt");

    let f = match f {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound => match File::create("hello.txt") {
                Ok(fc) => fc,
                Err(e) => panic!("Problem creating the file: {:?}", e),
            },
            other_error => {
                panic!("Problem opening the file: {:?}", other_error)
            }
        },
    };
}

The Result<T, E> type has many helper methods defined on it to do various tasks. unwrap is a shortcut method: if the Result value is the Ok variant, it returns the value inside the Ok; if the Result is the Err variant, unwrap calls the panic! macro for us.

use std::fs::File;

fn main() {
    let f = File::open("hello.txt").unwrap();
}

Another method, expect, which is similar to unwrap, lets us also choose the panic! error message:

use std::fs::File;

fn main() {
    let f = File::open("hello.txt").expect("Failed to open hello.txt");
}

Errors can be propagated with the ? operator.

use std::fs::File;
use std::io;
use std::io::Read;

fn read_username_from_file() -> Result<String, io::Error> {
    let mut f = File::open("hello.txt")?;
    let mut s = String::new();
    f.read_to_string(&mut s)?;
    Ok(s)
}

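When ? is placed after a Result value, an Err makes the function return early with that error, while an Ok yields the inner value and execution continues. Error values that pass through ? are converted with the from function of the standard library's From trait, so they take on the error type declared in the function's return type. ? can also be chained: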

use std::fs::File;
use std::io;
use std::io::Read;

fn read_username_from_file() -> Result<String, io::Error> {
    let mut s = String::new();

    File::open("hello.txt")?.read_to_string(&mut s)?;

    Ok(s)
}

The ? operator can only be used in functions whose return type is compatible with it, such as Result or Option (the underlying std::ops::Try trait is not yet stable).
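For the Option case, a sketch along the lines of the book's example: inside a function that returns Option, ? returns None early when there is no value.

```rust
/// Returns the last character of the first line, if there is one.
fn last_char_of_first_line(text: &str) -> Option<char> {
    // `lines().next()` is None for empty input; `?` returns that None early.
    text.lines().next()?.chars().last()
}
```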

Generic Types, Traits and Lifetimes

Generic type parameters must be declared in the function signature:

fn largest<T>(list: &[T]) -> T {
    // code here...
}
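One way the body could be filled in, as a sketch rather than the book's exact listing: this variant adds a PartialOrd bound so the items can be compared, and returns Option<&T> instead of T so that it needs neither Copy nor a non-empty slice.

```rust
/// Returns a reference to the largest item, or None for an empty slice.
fn largest<T: PartialOrd>(list: &[T]) -> Option<&T> {
    let mut iter = list.iter();
    // `?` returns None right away if the slice is empty.
    let mut largest = iter.next()?;
    for item in iter {
        if item > largest {
            largest = item;
        }
    }
    Some(largest)
}
```

The same function works for any comparable element type, for example a slice of i32 or a slice of char.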

Generics also work on structs:

struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
}

We can also use different types:

struct Point<T, U> {
    x: T,
    y: U,
}

fn main() {
    let both_integer = Point { x: 5, y: 10 };
    let both_float = Point { x: 1.0, y: 4.0 };
    let integer_and_float = Point { x: 5, y: 4.0 };
}

This also works for enums:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

Method definitions must also include the generic type:

struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

We can also implement a method for a specific type:

impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

Rust compiles generic code into code that specifies the concrete type in each instance (monomorphization), so we pay no runtime cost for using generics.

Traits: Defining Shared Behavior

A trait tells the Rust compiler about functionality a particular type has and can share with other types. Trait definitions are a way to group method signatures together to define a set of behaviors necessary to accomplish some purpose.

pub trait Summary {
    fn summarize(&self) -> String;

    // This is a method with a default implementation
    fn read_more(&self) -> String {
        String::from("Read more...")
    }
}

To implement a trait:

pub struct NewsArticle {
    pub headline: String,
    pub location: String,
    pub author: String,
    pub content: String,
}

impl Summary for NewsArticle {
    fn summarize(&self) -> String {
        format!("{}, by {} ({})", self.headline, self.author, self.location)
    }
}
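A default implementation can also call the required methods of the same trait. The sketch below redefines a small Summary trait so that it stands alone; Tweet and summarize_author are names used for illustration and differ from the listing above.

```rust
trait Summary {
    // Required: every implementor must provide this.
    fn summarize_author(&self) -> String;

    // Default implementation that builds on the required method.
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

struct Tweet {
    username: String,
}

impl Summary for Tweet {
    fn summarize_author(&self) -> String {
        format!("@{}", self.username)
    }
    // `summarize` is not overridden, so the default is used.
}
```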

We can’t implement external traits on external types. This restriction is part of a property of programs called coherence, and more specifically the orphan rule, so named because the parent type is not present. This rule ensures that other people’s code can’t break your code and vice versa.

We can require traits as parameters:

pub fn notify(item: &impl Summary) {
    println!("Breaking news! {}", item.summarize());
}

This works the same as the trait bound syntax:

pub fn notify<T: Summary>(item: &T) {
    println!("Breaking news! {}", item.summarize());
}

We can also specify multiple traits:

pub fn notify(item: &(impl Summary + Display)) {
// Or like this:
pub fn notify<T: Summary + Display>(item: &T) {

When multiple traits are used, we can use where clauses:

fn some_function<T, U>(t: &T, u: &U) -> i32
    where T: Display + Clone,
          U: Clone + Debug
{

We can also use traits as a return type:

fn returns_summarizable() -> impl Summary {
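A complete, minimal sketch of returning impl Trait. It uses the standard Display trait instead of Summary so the block stands alone, and make_label is a made-up function. The caller only knows that the result implements Display, not that it is a String.

```rust
use std::fmt::Display;

/// Returns "some type that implements Display" without naming it.
fn make_label(id: u32) -> impl Display {
    format!("item-{}", id)
}
```

Note that a function returning impl Trait must return a single concrete type on every code path.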

Validating References with Lifetimes

Every reference in Rust has a lifetime, which is the scope for which that reference is valid. Most of the time, lifetimes are implicit and inferred. The main aim of lifetimes is to prevent dangling references, which cause a program to reference data other than the data it’s intended to reference.

The Rust compiler has a borrow checker that compares scopes to determine whether all borrows are valid.

Lifetime annotations don’t change how long any of the references live.

The names of lifetime parameters must start with an apostrophe (') and are usually all lowercase and very short, like generic types. Most people use the name 'a. We place lifetime parameter annotations after the & of a reference, using a space to separate the annotation from the reference’s type.

&i32        // a reference
&'a i32     // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime

In practice, this is how we return a reference to a string slice:

fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

Every reference has a lifetime, and you need to specify lifetime parameters for functions or structs that use references whenever the compiler cannot infer them.
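For structs, the annotation says that an instance cannot outlive the reference it holds. A sketch modeled on the book's ImportantExcerpt; first_sentence is a hypothetical helper added here.

```rust
/// Holds a reference, so the struct needs a lifetime parameter.
struct ImportantExcerpt<'a> {
    part: &'a str,
}

/// Borrows the text up to the first '.' (or all of it if there is none).
fn first_sentence(text: &str) -> ImportantExcerpt<'_> {
    // `split` always yields at least one piece, so the fallback is rarely hit.
    let part = text.split('.').next().unwrap_or(text);
    ImportantExcerpt { part }
}
```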

One special lifetime is 'static, which means that this reference can live for the entire duration of the program. All string literals have the 'static lifetime:

let s: &'static str = "I have a static lifetime.";

Writing Automated Tests

A test in Rust is a function that’s annotated with the test attribute.

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        assert_eq!(2 + 2, 4);
    }
}

Tests are run with Cargo: cargo test

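Results can be asserted with macros: assert! checks that a condition is true, assert_eq! checks that two values are equal, and assert_ne! checks that they differ.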

Panics can be tested with the attribute #[should_panic(expected="Message goes here...")].
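A sketch of a should_panic test, loosely following the book's Guess example. The test passes only if the function panics and the panic message contains the expected text.

```rust
pub struct Guess {
    value: i32,
}

impl Guess {
    /// Panics unless `value` is between 1 and 100.
    pub fn new(value: i32) -> Guess {
        if value < 1 || value > 100 {
            panic!("Guess value must be between 1 and 100, got {}.", value);
        }
        Guess { value }
    }

    pub fn value(&self) -> i32 {
        self.value
    }
}

#[test]
#[should_panic(expected = "between 1 and 100")]
fn greater_than_100() {
    Guess::new(200);
}
```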

We can use Result<T, E> in tests:

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() -> Result<(), String> {
        if 2 + 2 == 4 {
            Ok(())
        } else {
            Err(String::from("two plus two does not equal four"))
        }
    }
}

The #[cfg(test)] annotation on the tests module tells Rust to compile and run the test code only when you run cargo test, not when you run cargo build.

Closures

Defining a closure, with optional type annotations:

use std::thread;
use std::time::Duration;

let expensive_closure = |num: u32| -> u32 {
    println!("calculating slowly...");
    thread::sleep(Duration::from_secs(2));
    num
};

Closures implement one or more of the Fn traits. The closure above implements Fn(u32) -> u32. Closures can capture their environment and access variables from the scope in which they’re defined.

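Closures can capture values from their environment in three ways: taking ownership, borrowing mutably, and borrowing immutably. These are encoded in the three Fn traits as follows: FnOnce consumes the variables it captures, so it can be called only once; FnMut borrows values mutably and can change the environment; Fn borrows values immutably.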
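A sketch of how the three traits show up as bounds on function parameters. The call_fn, call_fn_mut and call_fn_once helpers are made up for this example.

```rust
/// Accepts a closure that only reads its environment.
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f()
}

/// Accepts a closure that may mutate captured state; calls it twice.
fn call_fn_mut<F: FnMut()>(mut f: F) {
    f();
    f();
}

/// Accepts a closure that may consume captured values; calls it once.
fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}
```

For instance, with let name = String::from("rust");, the closure || name.len() fits call_fn, a closure that increments a counter fits call_fn_mut, and move || name fits call_fn_once because it gives the String away.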

Iterators

The iterator pattern allows you to perform some task on a sequence of items in turn. An iterator is responsible for the logic of iterating over each item and determining when the sequence has finished. When you use iterators, you don’t have to reimplement that logic yourself.

In Rust, iterators are lazy.

let v1 = vec![1, 2, 3];
let v1_iter = v1.iter();

for val in v1_iter {
    println!("Got: {}", val);
}

All iterators implement a trait named Iterator that is defined in the standard library. To create your own iterator, you only have to implement the next method.
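A sketch of a custom iterator in the style of the book's Counter: implementing next is enough to get adaptors such as sum and collect for free.

```rust
/// Counts from 1 to 5.
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    type Item = u32;

    // Returning None signals that the sequence has finished.
    fn next(&mut self) -> Option<u32> {
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}
```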

Combining Iterators and Closures

fn shoes_in_size(shoes: Vec<Shoe>, shoe_size: u32) -> Vec<Shoe> {
    shoes.into_iter().filter(|s| s.size == shoe_size).collect()
}

Iterators are one of Rust’s zero-cost abstractions, by which we mean using the abstraction imposes no additional runtime overhead.

Cargo and Crates.io

In Rust, release profiles are predefined and customizable profiles with different configurations that allow a programmer to have more control over various options for compiling code. Each profile is configured independently of the others.

Cargo has two main profiles: the dev profile Cargo uses when you run cargo build and the release profile Cargo uses when you run cargo build --release.

Documentation

/// Adds one to the number given.
///
/// # Examples
///
/// ```
/// let arg = 5;
/// let answer = my_crate::add_one(arg);
///
/// assert_eq!(6, answer);
/// ```
pub fn add_one(x: i32) -> i32 {
    x + 1
}

Documentation comments use three slashes, ///, instead of two and support Markdown notation for formatting the text.

For convenience, running cargo doc --open will build the HTML for your current crate’s documentation (as well as the documentation for all of your crate’s dependencies) and open the result in a web browser.

Running cargo test will run the code examples in your documentation as tests.

Another style of doc comment, //!, adds documentation to the item that contains the comments rather than adding documentation to the items following the comments. We typically use these doc comments inside the crate root file (src/lib.rs by convention) or inside a module to document the crate or the module as a whole.

Exporting a Convenient Public API with pub use

If the structure of the crate is not convenient for others to use from another library, you don’t have to rearrange your internal organization: instead, you can re-export items to make a public structure that’s different from your private structure by using pub use.

Filename: src/lib.rs

//! # Art
//!
//! A library for modeling artistic concepts.

pub use self::kinds::PrimaryColor;
pub use self::kinds::SecondaryColor;
pub use self::utils::mix;

pub mod kinds {
    // --snip--
}

pub mod utils {
    // --snip--
}

Therefore, instead of using:

use art::kinds::PrimaryColor;
use art::utils::mix;

The library user can write:

use art::mix;
use art::PrimaryColor;

Cargo Workspaces

Cargo offers a feature called workspaces that can help manage multiple related packages that are developed in tandem. A workspace is a set of packages that share the same Cargo.lock and output directory.

Smart Pointers

A pointer is a general concept for a variable that contains an address in memory. The most common kind of pointer in Rust is a reference, which is indicated by the & symbol and borrows the value it points to.

Smart pointers, on the other hand, are data structures that not only act like a pointer but also have additional metadata and capabilities.

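Smart pointers are usually implemented using structs. They implement the Deref and Drop traits: Deref lets an instance of the smart pointer behave like a reference, and Drop customizes the code that runs when an instance goes out of scope.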

Using Box<T> to Point to Data on the Heap

Box<T> points to data on the heap. Use cases: when you have a type whose size can’t be known at compile time; when you have a large amount of data and you want to transfer ownership but ensure the data won’t be copied when you do so; when you want to own a value and you care only that it’s a type that implements a particular trait rather than being of a specific type.

let b = Box::new(5);

One type whose size can’t be known at compile time is a recursive type. However, boxes have a known size, so by inserting a box in a recursive type definition, you can have recursive types. Boxes provide only the indirection and heap allocation; they don’t have any other special capabilities, so they also don’t have the performance overhead that those capabilities would incur.
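The book's cons list is the usual illustration of this. The sketch below adds a small sum function (not from the book) to show the list being walked.

```rust
/// A recursive list; the Box gives the Cons variant a known size.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

/// Adds up all values by following the boxes until Nil.
fn sum(list: &List) -> i32 {
    match list {
        // `rest` is a &Box<List>; deref coercion turns it into &List.
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}
```

Without the Box, the compiler would report that List has infinite size.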

Treating Smart Pointers Like Regular References

Implementing the Deref trait allows you to customize the behavior of the dereference operator, *.

Deref coercion is a convenience that Rust performs on arguments to functions and methods. Deref coercion works only on types that implement the Deref trait.

Similar to how you use the Deref trait to override the * operator on immutable references, you can use the DerefMut trait to override the * operator on mutable references.
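A sketch of Deref on a home-made wrapper, following the book's MyBox idea (this MyBox does not allocate on the heap; it only demonstrates the trait). greeting is a made-up function used to show deref coercion.

```rust
use std::ops::Deref;

/// A tuple struct wrapping a single value.
struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    // `*my_box` is rewritten by the compiler to `*(my_box.deref())`.
    fn deref(&self) -> &T {
        &self.0
    }
}

fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}
```

Passing &MyBox<String> to greeting works because Rust coerces it to &String and then to &str.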

Running Code on Cleanup with the Drop Trait

Drop lets you customize what happens when a value is about to go out of scope, like a destructor in C++. You can provide an implementation for the Drop trait on any type, and the code you specify can be used to release resources like files or network connections.

std::mem::drop can be used to drop a value earlier.
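To see the order in which drops happen, here is a sketch that records each drop in a shared log. Guard and drop_order are hypothetical names, and the RefCell is only there so that several guards can write to the same Vec.

```rust
use std::cell::RefCell;

/// Records its name in the log when it is dropped.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

/// Returns the recorded events in the order they happened.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Guard { name: "a", log: &log };
        let b = Guard { name: "b", log: &log };
        drop(b); // std::mem::drop is in the prelude: `b` is dropped early
        log.borrow_mut().push(String::from("end of scope"));
    } // `_a` is dropped here, when it goes out of scope
    log.into_inner()
}
```

The explicit drop(b) runs first, and _a is only cleaned up when the inner block ends.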