Rust Development Classes

by Marcel Ibes

Preface

Welcome to the Rust Development Classes. This book and the examples in it have been written to support the online Rust development sessions. I've been guiding developers in learning Rust for a few years now, and I've found that the best way to learn Rust is by doing. This book is a collection of examples and exercises that I've used in the sessions.

It also explains how to deal with common, practical issues that you might encounter when developing with Rust.

Feel free to glance through the book, try out the examples, and ask questions during class, or by email using the link below.

If you are completely new to software development, I've started a new section: "back to basics". It covers the very basics of coding. You can explore this section at your own pace.

If you already have some development experience and want to learn Rust at a much higher pace, skip ahead to the classes section of this book. I hope that both rookie and seasoned Rust developers can learn something from this book and find it useful.

For suggestions, questions, requests, or feedback, please feel free to reach out to me at [email protected].

[Image: Ferris teaching]

See you in class!

-- Marcel

Preparation

This chapter will help you set up your Rust development environment. If you are concerned about installing anything, you can alternatively try the code snippets in the playground in this book.

Getting ready

Before we can do anything useful, we need to set up the Rust development environment. This chapter describes the steps required to install the Rust build toolchain and to install one of these development environments:

  • JetBrains RustRover (https://www.jetbrains.com/rust/)
  • Visual Studio Code with the rust-analyzer extension (https://code.visualstudio.com)

Installing Rust

There is little value in rehashing the excellent Rust Getting Started Guide: https://www.rust-lang.org/learn/get-started. Please follow the steps on this page to install Rust on your development machine. Return here when you can successfully run: cargo --version.

Good luck!

Adding some tooling

Once Rust is installed, we'll be adding the Clippy lint tool. You'll be seeing a lot of Clippy later. Open a terminal window and perform these steps:

rustup component add clippy

Also add the Rust formatting tool, rustfmt:

rustup component add rustfmt

Setting up the development environment

Pick one of the following development environments. Both are excellent choices. JetBrains RustRover is a full-blown IDE, while Visual Studio Code is a lightweight editor with a lot of extensions. We'll be using RustRover in this book.

JetBrains RustRover

In this book, we'll be using the newly developed RustRover IDE from JetBrains.

  • Step 1. Visit https://www.jetbrains.com/rust/ and download RustRover for your operating system.

  • Step 2. Launch RustRover. If there are any projects open, close them until you see the "Welcome to RustRover" screen.

  • Step 3. Open the Settings page (Cmd+,).

    • Click on "Languages and Frameworks" in the left-hand menu.
    • Click on "Rust".
    • Verify that a Rust toolchain is detected and that the "expand macros" option is checked.
    • Then expand the "Rust" entry in the left-hand menu and select "Clippy" for the External tool in External Linters.
    • It is recommended that you enable "Run external linter on the fly". This will slow down your IDE a bit, but it will show you Clippy warnings in real-time. You can always disable this later if it slows down your machine too much.
    • Then select "Rustfmt" and select "Use Rustfmt instead of the builtin formatter".
    • Then click on "Configure actions on save..."
    • Enable "Reformat code" and click "OK".

Your IDE is now ready for use.

Visual Studio Code with Rust Analyzer

Visual Studio Code is a popular editor for Rust development. It comes with a lot of extensions that make Rust development a breeze. The must-have extension for Rust development is rust-analyzer, a language server that provides code completion, refactoring, and other IDE features.

  • Step 1. Visit https://code.visualstudio.com and download VSCode for your operating system.

  • Step 2. Once installed, launch VSCode, open the Command Palette (Cmd+Shift+P), and type 'shell command' to find and run "Shell Command: Install 'code' command in PATH".

  • Step 3. Then open the Extensions tab (Cmd+Shift+X) and install these extensions:

    • rust-analyzer
    • crates
    • GitLens
    • Better TOML
    • Cargo.toml Snippets
    • Error Lens
    • and if you want: Intellij Keybindings
  • Step 4. It is highly recommended to also change these settings. Open the Command Palette (Cmd+Shift+P), type 'settings', and click on 'Preferences: Open Settings (UI)'.

    • Editor: Format on Save: true
    • Rust-analyzer › Check On Save: Command: set to clippy instead of the default check

Your IDE is now ready for use.

Test the development environment

On the RustRover welcome screen, click on "New Project" and create a new project using the Binary (application) template. This will create a new Cargo project with a main.rs file.

Explore the project and open the src > main.rs file by double-clicking on it. It should look like this:

fn main() {
    println!("Hello, world!");
}

Now run the program by clicking the play-button ▶️ in front of fn main(). Check the output and confirm you see Hello, world!

Exercise

Our first exercise is to change the program to print your name instead of "world". Press the play-button again and confirm that the output has changed.

This concludes the "Getting Ready" steps. See you in class!

Back to the basics

The other day my better half approached me and asked if I could teach her how to write software. With the whole world moving towards software and related technologies, she felt she was missing out by not knowing anything about software development. (Maybe the boredom of the global pandemic had something to do with it as well... 🤔)

Taking up the challenge, we started having regular coding sessions. Starting with the most basic "Hello World" example, she's added more functionality over the past few weeks. This section of the book captures some of the lessons and accompanying examples. I hope she sticks with my lessons, so we can expand this chapter as her skills grow.

Keep in mind that this section is intended for people with 0% programming experience. If you've coded before and know what a variable is, this section is not for you; skip ahead to the classes section to go through the Rust specifics at a much faster pace.

-- This section is for you, my coding queen!

Hello World

To develop software, you need a working development environment on your PC. This consists of two components:

  1. A compiler that can convert your source code into a program that a computer can run.
  2. An editor to write your source code and help you with common tasks.

Complete the "Getting ready" steps in the Preparation chapter to set this up. If you do not know how to open a terminal or command prompt, start JetBrains' RustRover instead and create a new Rust project. Select "Binary (Application)" as the type. Choose "hello_world" as the project name. This will construct a standard Rust project with a single source file, main.rs, with this content:

fn main() {
    println!("Hello, world!");
}

You can find the main.rs file in the src directory of your newly created project.

Exploring Hello World

The project has a single function: main. This is the entry point for your application. The code is executed line by line from top to bottom. The current example has only one statement: println!("Hello, world!"); As you can imagine, this prints "Hello, world!" to standard output, which is the screen.

Code is held together in blocks that start with a { and end with a }. So all the code between these braces belongs to the main function. A function is defined by the fn keyword. The parentheses ( and ) define what input a function can take. In our case, there is no input to the main function.

Standard Rust binary code always has a main function. It is placed in a file called main.rs in src.

The ; marks the end of a statement. It is used by Rust to identify where a statement ends and the next statement begins. Line breaks and spaces have no special meaning in Rust. They are just used to make the code more readable.

Expanding Hello World

In this paragraph we'll introduce a single variable first_name to the Hello World example:

fn main() {
    let first_name = "Marcel";
    println!("Hello, {first_name}!");
}

A variable is used to store data in the computer's memory. Think of the computer's memory as a cabinet with drawers. A variable is a single drawer in such a cabinet. The name of the variable, in our case first_name, is like a label that you put on the drawer so that you can remember what is stored in it.

[Image: a cabinet with drawers]

In Rust a variable is defined by the let statement. The line let first_name = "Marcel"; means:

  1. Label a (new) drawer in the cabinet with "first_name"
  2. Put "Marcel" in the drawer

By convention, variables in Rust use snake_case. Variable names cannot contain spaces; use the underscore _ instead of a space.

Now that we've stored my first name in memory, we can print it to the screen. One way to do that is: println!("Hello, {}!", first_name);

The two curly brackets {} are a placeholder for the variable that follows the comma; in our case, the single variable first_name. There are many ways to format strings. See Formatting output for more examples.

A more convenient way to print the variable is to use it inline inside the {} brackets, like this: println!("Hello, {first_name}!");. This is the form used in the example above, and it prints the value of the first_name variable.

We'll use this way of printing variables throughout the book.
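
To see both styles side by side, here is a minimal sketch. The last_name variable and its value are assumptions added for this illustration only; both println! lines print the same text.

```rust
fn main() {
    let first_name = "Marcel";
    // last_name is added for this illustration only.
    let last_name = "Ibes";

    // Empty placeholders are filled, in order, by the values after the comma.
    println!("Hello, {} {}!", first_name, last_name);

    // Inline placeholders name the variable directly inside the brackets.
    println!("Hello, {first_name} {last_name}!");
}
```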

Run the program by clicking on the play (▶️) icon in front of fn main() and selecting Run 'Run...'. This will compile the above example into machine language and run your program. You can see the output at the bottom of the screen. It looks like this:

/Users/me/.cargo/bin/cargo run --color=always --package hello_world --bin hello_world
   Compiling hello_world v0.1.0 (/Projects/hello_world)
    Finished dev [unoptimized + debuginfo] target(s) in 1.48s
     Running `target/debug/hello_world`
Hello, Marcel!

Process finished with exit code 0

Changing variables

In the next example, we'll explore how to change the contents of a variable. In the cabinet analogy, we'll replace the contents of the drawer with something else without relabeling the drawer.

fn main() {
    let mut first_name = "Marcel";
    println!("Hello, {first_name}!");

    first_name = "Tom";
    println!("Greetings, {first_name}!");
}

In Rust, you need to specify in advance that you may want to replace the contents of the variable - the drawer - later in your program. You do this with the mut keyword. In the example above, we create the variable first_name and write "Marcel" into it, but at the same time we mark the variable as 'mutable', telling Rust that we will change the contents of first_name at some point.

After the let mut statement, the variable first_name has the value "Marcel", so the first println! prints: Hello, Marcel!.

The statement first_name = "Tom"; then assigns a new value to first_name. The contents of first_name are replaced with "Tom". In essence, we've opened the first_name drawer, removed "Marcel" and replaced it with "Tom". From now on, every time you check the first_name drawer, it will contain the value "Tom".

Note that we don't use the let statement when reassigning a variable.

Now that we've replaced the first_name value, the second println! prints: Greetings, Tom!. Try this out yourself by running the program with the play (▶️) symbol in front of fn main().

By mutating existing variables, we reuse the same drawer over and over again. This saves space in memory, i.e., in the cabinet.

Exercise

Try adding a few more greetings to this program. Use a different first name for each greeting.

Back to the basics - functions

In the previous chapter, we explored variables a bit. You wrote a small program that printed several greetings with different first names. If you review the code that you've written, you will see that it is quite repetitive. It will probably resemble something like this:

fn main() {
    let mut first_name = "Marcel";
    println!("Hello, {first_name}!");

    first_name = "Tom";
    println!("Hello, {first_name}!");

    first_name = "Dick";
    println!("Hello, {first_name}!");

    first_name = "Harry";
    println!("Hello, {first_name}!");
}

Functions are used to get rid of these repetitions and to make the software development more efficient.

The signature of a function is: fn [name] ([parameters]) -> [result type] {}, where:

  • [name] is a name of your choosing that describes what the function does
  • [parameters] define the input to the function (optional)
  • [result type] defines the kind of output you are returning from your function (optional)

Here are some examples of function signatures:

fn do_nothing() {}
fn say_hello(first_name: &str) {}
fn read_temperature() -> i32 {}
fn sum_numbers(x: i32, y: i32) -> i32 {}

Functions without an arrow in the signature do not return any data. Functions without any input parameters do not need any input to work.
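
The signatures above have empty bodies, so the two that promise a result would not compile as written. As a minimal sketch, here are the same four functions with small bodies filled in. The bodies, and the fixed temperature of 21, are made up for this illustration. The last line of a function without a ; is its result; we'll come back to that in a later chapter.

```rust
// No input, no output.
fn do_nothing() {}

// One input parameter, no output.
fn say_hello(first_name: &str) {
    println!("Hello, {first_name}!");
}

// No input, returns a number.
// A real program would read a sensor here; 21 is a made-up value.
fn read_temperature() -> i32 {
    21
}

// Two inputs, returns their sum.
fn sum_numbers(x: i32, y: i32) -> i32 {
    x + y
}

fn main() {
    do_nothing();
    say_hello("Marcel");

    let temperature = read_temperature();
    println!("It is {temperature} degrees.");

    let total = sum_numbers(2, 3);
    println!("2 + 3 = {total}");
}
```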

In these examples you see that we've introduced these data types: i32 and str. i32 represents a signed 32-bit integer (a whole number), and str represents a string of text.

Data types

Going back to the cabinet analogy for the use of memory, these data types define the size of the drawer in the cabinet. Sometimes these drawers can be subdivided into smaller compartments to hold small items like screws, nails, or bolts.

The smallest data type in a computer is called a bit. It can have one of two values: 1 or 0. If you were to divide a drawer into the smallest possible compartments, you could store a single bit in each compartment. This is the smallest unit of data that a computer can work with.

The smallest number type that Rust can work with is the i8. This is a number that is made up of 8 bits. In terms of drawers, it is a drawer that holds 8 bit compartments. The drawer can hold a single number from -128 to 127. There is also an unsigned variant: u8. This drawer is equal in size, but it cannot store negative numbers. The smallest value is 0, the highest is 255.

If you need to store bigger numbers, you can double the size of the drawer into an i16 or u16. If that is still not big enough, you can double it again to an i32 or u32. And again: i64 and u64, and again: i128 and u128. At this time the i128 and u128 are the biggest drawers available for numeric data types. If you need even bigger numbers, you need to do some clever math by storing parts of the number in different drawers, but we'll leave that aside for now.
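
If you'd like to check these drawer sizes yourself, here is a small sketch. BITS, MIN and MAX are constants that Rust's standard library provides on every integer type; they are not something we've defined in this book.

```rust
fn main() {
    // BITS is the number of bit compartments in the drawer.
    // MIN and MAX are the smallest and largest numbers that fit in it.
    println!("i8:  {} bits, {} to {}", i8::BITS, i8::MIN, i8::MAX);
    println!("u8:  {} bits, {} to {}", u8::BITS, u8::MIN, u8::MAX);
    println!("i16: {} bits, {} to {}", i16::BITS, i16::MIN, i16::MAX);
    println!("u16: {} bits, {} to {}", u16::BITS, u16::MIN, u16::MAX);
    println!("i32: {} bits, {} to {}", i32::BITS, i32::MIN, i32::MAX);
    println!("u32: {} bits, {} to {}", u32::BITS, u32::MIN, u32::MAX);
}
```

Each step up doubles the number of bits, which is why the largest value grows so quickly.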

Rust also has data types for numbers with decimal points: f32 and f64.

The other data type is the str. This is a drawer that is dynamically sized when you assign a text value to it. As long as there is enough room in the cabinet, you can make this drawer very, very big.

The final data type that we'll cover now is the bool. A bool is similar to a bit in the way that it can have two states: true or false.

Rust will attempt to infer a data type whenever it can. This is also why we did not need to specify a data type for first_name in our earlier examples. Rust figured out that this must be a str type, because we assigned it a string value.

Taking vs borrowing

In the above example you see another character: &. It is used in combination with a datatype, like: &str. If you pass a value to a function, by default, that function will take the variable. It will open the drawer, take the contents of the drawer, and close it. After calling the function, the drawer is empty.

More often than not, you want to use whatever you've stored in the drawer at a later point in your program. You can do this by adding the ampersand (&) in front of the type in the function signature. By doing this you tell the function that it can use the contents of the drawer, but that it must put them back in the drawer when done. Just like in real life, Rust will make sure that no function can change a drawer that is already opened by another function.
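
Here is a minimal sketch of the difference. It uses String, a text type that owns its contents and that is introduced later in this book. The take_name and borrow_name functions are made up for this illustration.

```rust
// Takes the String: after the call, the caller's drawer is empty.
fn take_name(name: String) -> usize {
    name.len()
}

// Borrows the text: the caller keeps the contents of the drawer.
fn borrow_name(name: &str) -> usize {
    name.len()
}

fn main() {
    let first_name = String::from("Marcel");

    // Lending with & leaves first_name usable afterwards.
    let borrowed_length = borrow_name(&first_name);
    println!("{first_name} has a length of {borrowed_length}.");

    // Passing first_name without & hands it over to the function.
    let taken_length = take_name(first_name);
    println!("The name had a length of {taken_length}.");

    // The next line would not compile, because first_name was taken:
    // println!("{first_name}");
}
```

Try removing the // in front of the last println! to see how the compiler reports the emptied drawer.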

Back to the functions

With that little side step out of the way, let's create a small function to help us greet several people.

fn main() {
    let mut first_name = "Marcel";
    greet(first_name);

    first_name = "Tom";
    greet(first_name);

    first_name = "Dick";
    greet(first_name);

    first_name = "Harry";
    greet(first_name);
}

fn greet(some_name: &str) {
    println!("Hello, {some_name}!");
}

We've introduced the greet function that has a single parameter: some_name of type &str, and that does not return any data.

The some_name parameter of the greet function acts as a locally declared variable with the name some_name.

I've intentionally given the input parameter of greet a different name to the first_name variable used in main to demonstrate that these names are not related. You can give the variable and the input parameter any name you want.

Exercise

Rename the first_name variable and the some_name parameter to a name of your choosing. Run the program to verify that these names have no effect on the logic of the software. You can use the Shift+F6 keyboard combination to quickly rename variables and fields in RustRover.

Although we now have two functions, our program still starts at the top of the main function and finishes at its closing }, i.e., the end of the main function.

In this example, we've not dramatically decreased the size of our program. We've actually added a few lines. However, the greeting logic is now all in a single place. If we want to change our greeting, we can do this by modifying a single line. For example:

fn main() {
    let mut first_name = "Marcel";
    greet(first_name);

    first_name = "Tom";
    greet(first_name);

    first_name = "Dick";
    greet(first_name);

    first_name = "Harry";
    greet(first_name);
}

fn greet(first_name: &str) {
    println!("{first_name}! I greet you.");
}

Because of the & in front of str we can re-use the first_name variable in subsequent calls to greet.

Benefits of functions

The real benefit of functions becomes clear when their complexity grows. Imagine we want to extend our greeting:

fn main() {
    let mut first_name = "Marcel";
    greet(first_name);

    first_name = "Tom";
    greet(first_name);

    first_name = "Dick";
    greet(first_name);

    first_name = "Harry";
    greet(first_name);
}

fn greet(first_name: &str) {
    println!("{first_name}! I greet you.");
    println!("Welcome to the world of Rust!");
}

Adding this single instruction to the greet function would have taken four additional lines if we hadn't used the greet function. In this way, functions help to reduce complexity.

Exercise

Add a new function that says goodbye after greeting a person.

Exercise

The name you pick for a variable or a function is completely at your discretion. This book and the examples within are written in English. If you are a non-native English speaker, attempt to rewrite the previous example using variable and function names in your local language. Of course, you should also translate the output to your local language.

Back to the basics - processing input

In the previous chapter, we've looked at functions. In this chapter, we'll add a function to read some input from the user of our program, and do some basic processing on it.

fn main() {
    println!("What is your name?");
    let input = read_string();
    println!("Your name is: {input}");
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input
}

This example prints the line: What is your name?, then waits for the user to type a name and hit the [Enter] key. It then prints Your name is: followed by the user's name.

We're using the read_string() function to read the user's input. Let's explore that function.

We see that read_string returns a String data type. A String is a convenient wrapper around the str data type that we've used in the previous chapters. Unlike str, it can be mutated, which is exactly what we need in our function.

Before we can read anything, we need to create a place to store the user's input. The String::new() function creates a new String in memory, i.e., the 'drawer' that can hold a String type. We use the mut keyword to indicate that we'll be changing the String later on. The :: separator is used to indicate that we'll be calling a function that belongs to the preceding item. In this case we call the new() function that belongs to the String datatype.

A function that belongs to a datatype is called an associated function. Because the new() function is associated with String, Rust knows we are creating a new String.

The next line actually reads a line of text from the "standard input". stdin is short for "standard input". The standard input is usually the keyboard.

We're using Rust's standard library std::io to take care of the actual reading. io stands for input & output.

The dot . chains together a sequence of operations. In our case:

  1. std::io::stdin() to get access to the keyboard
  2. read_line(&mut input) to read a string of text and store the result in input
  3. expect("can not read user input") to stop the program with this error message in case something prevents us from reading (no keyboard??)

It does not matter if you write all these operations on a single line, or separate them over several lines. The ; at the end determines the end of the statement, not the line break.

Notice that the read_line function takes &mut input as the input parameter. You've learned that the & is used to lend an item to a function. Similar to let vs let mut, we can indicate whether our variable can be changed by the function or not. In the cabinet analogy, whether the contents of the drawer may be replaced by the function or not.

The &mut keyword indicates that the read_line function can modify the contents of the input variable, i.e. the drawer labelled with input.
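
As a small sketch of the same idea with our own function: add_exclamation is a made-up function that borrows a String with &mut and changes it. It uses String::from to create a String that already contains some text, and the push method to append a single character.

```rust
// `text: &mut String` means: lend me the drawer, and allow me to change what is in it.
fn add_exclamation(text: &mut String) {
    text.push('!');
}

fn main() {
    // The variable itself must be declared with mut as well.
    let mut greeting = String::from("Hello");

    add_exclamation(&mut greeting);

    // The change made inside the function is visible here.
    println!("{greeting}");
}
```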

Finally, the contents of input are returned at the end of the read_string function. Notice that there is no ; at the end of the input line. Omitting the ; tells Rust to return the content of that variable as the result of the function. It is shorthand for return input; which would do the same.
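
To illustrate the two ways of returning a value, here is a short sketch with two made-up functions that do exactly the same thing.

```rust
// The last line has no ; so its value becomes the result of the function.
fn double_implicit(x: i32) -> i32 {
    x * 2
}

// The same function, written with an explicit return statement.
fn double_explicit(x: i32) -> i32 {
    return x * 2;
}

fn main() {
    let a = double_implicit(4);
    let b = double_explicit(4);
    println!("{a} and {b}");
}
```

Most Rust code uses the first form for the final result, and keeps return for leaving a function early.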

I can see how the above is quite a lot to take in. Let's take some of these new concepts and dig into them a bit. We'll revisit our example later.

Strings, associated functions and methods

In the above example, we've seen that we can create a new String with String::new(). You've learned that the :: separator is used to call an associated function; a function that belongs to the String type. There are more associated functions available on the String type. The RustRover editor will actually show what functions you can use when you pause typing after the :: separator. For example, there is the from() associated function to create a new String filled with a piece of text.

fn main() {
    let my_name = String::from("Marcel");
    println!("Your name is: {my_name}");
}

After we've created a String we can call certain functions that belong to the String object, not the type. These are called "methods". A method is called with the dot '.' separator, rather than the '::' separator. Here's an example:

fn main() {
    let my_name = String::from("Marcel");
    let my_name_in_lower_case = my_name.to_lowercase();
    println!("Your name is: {my_name_in_lower_case}");
}

So what is the difference between an associated function and a method?

I'll try to explain it with the same cabinet analogy we've used thus far.

Think of an associated function as an instruction that comes with the cabinet and describes how to construct the drawer. The instruction is related (or associated) with the drawer, but only tells you how to build the drawer. It does not describe how to manipulate the item that is stored in the drawer.

A method is an instruction that belongs to the finished drawer. It tells Rust how to manipulate an item that is stored in that specific drawer.

In our case the to_lowercase() method takes the item from the my_name drawer and converts all the characters to lower case. We are storing the result of the to_lowercase() operation in a new drawer called: my_name_in_lower_case.

Typically associated functions are used to create a new instance of a particular type. Methods are used to manipulate the particular type after it has been created.
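
Here is a minimal sketch that shows both in one place. The build_greeting function is made up for this illustration; String::new() and push_str() are part of Rust's standard String type.

```rust
fn build_greeting() -> String {
    // Associated function, called with :: on the type. It creates an empty String.
    let mut greeting = String::new();

    // Methods, called with . on the value. They change the String we just created,
    // which is why greeting is declared with mut.
    greeting.push_str("Hello, ");
    greeting.push_str("Marcel!");

    greeting
}

fn main() {
    let greeting = build_greeting();
    println!("{greeting}");
}
```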

Let's look at another String method:

fn main() {
    let bad_name = String::from("Marcil");
    let fixed_name = bad_name.replace("i", "e");
    println!("Your name is: {fixed_name}");
}

Exercise

Call a method on a newly created String that converts the text to upper case and print the result.

Clippy

Now that our code becomes a bit more complex, we're likely going to run into some coding mistakes, or bugs. Rust comes with a 'friend' that can help identify these mistakes and who often offers some suggestions on how they can be remediated. This 'friend' is called: Clippy.

You can ask Clippy to inspect your program by:

  1. Double-tap the 'ctrl' key on your keyboard to open the "run anything" popup window.
  2. Type cargo clippy and press the Enter key

Inspect the output window

If it looks like this, you're good:

/Users/marcel/.cargo/bin/cargo clippy --color=always
    Finished dev [unoptimized + debuginfo] target(s) in 0.00s

Process finished with exit code 0

You can use the down arrow key (⬇️) on your keyboard after opening the "run anything" window, on subsequent runs to quickly select cargo clippy from the list of previous commands.

Do not ignore the warnings given by Clippy!

These warnings often point to a programming mistake or a missed opportunity for optimization.

Processing input

Back to our original example:

fn main() {
    println!("What is your name?");
    let input = read_string();
    println!("Your name is: {input}");
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input
}

There is a method on String that strips whitespace, such as spaces and line breaks, from the beginning and the end of a piece of text. It is called trim().
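
As a quick sketch of what trim() does, using a fixed piece of text rather than user input: the clean function and the sample text are made up for this illustration. Note that only the whitespace at the beginning and the end is removed; the space between the two words stays.

```rust
// Returns the text without whitespace at the beginning and the end.
fn clean(text: &str) -> &str {
    text.trim()
}

fn main() {
    let raw = "  Marcel Ibes \n";
    let cleaned = clean(raw);

    // The square brackets make the removed spaces and line break visible.
    println!("before: [{raw}]");
    println!("after:  [{cleaned}]");
}
```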

Exercise

Modify the above example to clean the input with the trim() method and print the result. Try the example by adding a number of spaces to the beginning of your name when typing. Remember that you can run the program with the play (▶️) symbol in front of fn main().

Exercise

Now modify your code to include a new function: read_clean_string, that reads the input, cleans it with trim(), and then returns it. You need to call the to_string() method on the cleaned result to convert it to a String. You can check the spoiler if you get stuck.

Congratulations! By completing the above exercise, you've created a fully functional program that reads some data, does some manipulation on the data and outputs it (to the screen). You will see that pretty much all software programs follow this exact paradigm.

In the next chapters we'll build on this concept, so make sure you have a good understanding of variables and functions before moving on.

Back to the basics - making choices

In the previous chapter, you learned how to get input from the user, manipulate the input, and display it on the screen. Another common operation is to make a choice based on the input provided. In Rust, this is done with the if statement.

Let's start with a small modification to our earlier example.

fn main() {
    println!("What is your name?");
    let input = read_string();
    if input == "" {
        println!("You did not enter your name...");
    } else {
        println!("Your name is: {input}");
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    let cleaned_input = input.trim().to_string();
    cleaned_input
}

Note that we've extended the read_string() function to trim the whitespace from the input before returning.

The if statement has this format:

if [comparison] { code to execute when 'true' } else { code to execute when 'false' }

Common comparisons are:

  • a == b; a equals b
  • a != b; a does not equal b
  • a < b; a is less than b
  • a <= b; a is less than, or equal to b
  • a > b; a is greater than b
  • a >= b; a is greater than, or equal to b

The else block is optional.

The same { and } braces are used to group the statements that should be executed when the if is true or false.
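
Comparisons work on numbers as well as on text. Here is a small sketch of an if without an else block; the describe_age function and the age limit of 18 are assumptions made up for this illustration.

```rust
fn describe_age(age: u8) -> String {
    // Start with a default value...
    let mut description = String::from("adult");

    // ...and only replace it when the comparison is true. There is no else block.
    if age < 18 {
        description = String::from("minor");
    }

    description
}

fn main() {
    let age = 12;
    let description = describe_age(age);
    println!("Age {age}: {description}");
}
```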

You can chain if statements together to cover multiple conditions:

fn main() {
    println!("What is your name?");
    let input = read_string();
    if input == "" {
        println!("You did not enter your name...");
    } else if input == "Marcel" {
        println!("{input} is a great name!");
    } else {
        println!("Your name is: {input}");
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    let cleaned_input = input.trim().to_string();
    cleaned_input
}

Note that only one of the if branches is ever executed.

You can match on multiple conditions in a single if statement. To do this, separate the conditions with either:

  • || for 'or'
  • && for 'and'

So imagine we want to print "{} is a great name!" when the user types either "Marcel" or "marcel", we can use this if block:

if input == "" {
    println!("You did not enter your name...");
} else if input == "Marcel" || input == "marcel" {
    println!("{input} is a great name!");
} else {
    println!("Your name is: {input}");
}
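
The example above uses || for 'or'. As a sketch of how && differs, here are two made-up functions that combine a number and a bool. The names, age limits and messages are assumptions for this illustration; >= means "greater than, or equal to".

```rust
// && : both sides must be true.
fn may_enter(age: u8, has_ticket: bool) -> bool {
    age >= 18 && has_ticket
}

// || : at least one side must be true.
fn gets_discount(age: u8, is_student: bool) -> bool {
    age < 12 || is_student
}

fn main() {
    let age = 20;
    let has_ticket = true;
    let is_student = false;

    if may_enter(age, has_ticket) {
        println!("Welcome!");
    } else {
        println!("Sorry, you can not come in.");
    }

    if gets_discount(age, is_student) {
        println!("You get a discount.");
    } else {
        println!("You pay the full price.");
    }
}
```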

Exercise

Add a fourth branch to the if block that prints "I love the name {}!" Pick two names for which you want to print this message, and create an appropriate condition for this branch.

Back to the basics - Data structures

Until now, we've used the read_string() function to read a single data item. In this chapter, we'll explore more complex data structures.

There are many cases where you need to work on multiple data elements that somehow belong together. Think of an address, which consists of the street, house number, postal code and city. Also, names are typically split into a first name and a last name.

In Rust this type of data can be grouped in a struct. Like so:

struct Address {
    street: String,
    house_number: i16,
    postal_code: String,
    city: String,
}

Like variables, field names are defined in snake_case. The name of the structure is defined in PascalCase. This means that the name is written without spaces, using capital letters to separate words.
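
To show how such a struct is filled and read, here is a minimal sketch using the Address struct. The format_address function and the sample address are made up for this illustration. format! works like println!, but gives the text back as a String instead of printing it, and the dot . is used to reach a field inside the struct.

```rust
struct Address {
    street: String,
    house_number: i16,
    postal_code: String,
    city: String,
}

// Borrows an Address and builds a single line of text from its fields.
fn format_address(address: &Address) -> String {
    format!(
        "{} {}, {} {}",
        address.street, address.house_number, address.postal_code, address.city
    )
}

fn main() {
    // Every field gets a value when the struct is created.
    let address = Address {
        street: String::from("Main Street"),
        house_number: 12,
        postal_code: String::from("1234 AB"),
        city: String::from("Amsterdam"),
    };

    println!("{}", format_address(&address));
}
```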

We'll extend our example to ask for both the first name and the last name.

struct Person {
    first_name: String,
    last_name: String,
}

fn main() {
    println!("What is your first name?");
    let first_name = read_string();
    println!("What is your last name?");
    let last_name = read_string();
    let person = Person {
        first_name: first_name,
        last_name: last_name,
    };
    print_person(&person);
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    let cleaned_input = input.trim().to_string();
    cleaned_input
}

fn print_person(person: &Person) {
    println!("Hello {} {}", person.first_name, person.last_name);
}

Exercise

Run cargo clippy and see if there are any recommendations for this code. Implement the recommendations until Clippy is happy.

Although it is not strictly needed, it is a convention to write data definitions, like the Person structure, at the top of your code.

We can access individual fields of a struct with the dot . separator, as you can see in the print_person function.

Note that we lend our person to the print_person function with the & operator. The & must be added in two places: in the signature of the print_person function, which takes a &Person as input, and in the calling function, which passes &person.
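To see why lending matters, here is a minimal sketch that lends the same person twice. The names are hard-coded instead of read from the keyboard, and greeting() is a hypothetical helper that I've added only for this illustration. It uses format!, which works like println! but returns the text as a String instead of printing it.

```rust
struct Person {
    first_name: String,
    last_name: String,
}

// Hypothetical helper: borrows the person and builds the greeting text.
fn greeting(person: &Person) -> String {
    format!("Hello {} {}", person.first_name, person.last_name)
}

fn main() {
    let person = Person {
        first_name: "Marcel".to_string(),
        last_name: "Ibes".to_string(),
    };

    // We only lend `person`, so we can keep using it afterwards.
    println!("{}", greeting(&person));
    println!("{}", greeting(&person));
}
```

If greeting() took a Person instead of a &Person, the second call would not compile, because the first call would have taken the person away.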

Exercise

Add an "age" field to your person and ask the user for their age. Change the print_person function to write "You are {} years old." after the greeting.

Adding an associated function

Remember that when we looked at the String type, there was an associated function new() that we used to construct a new String. We can do the same for the Person type.

Rewrite the example like this:

struct Person {
    first_name: String,
    last_name: String,
}

impl Person {
    fn new(first_name: String, last_name: String) -> Person {
        Person {
            first_name,
            last_name,
        }
    }
}

fn main() {
    println!("What is your first name?");
    let first_name = read_string();
    println!("What is your last name?");
    let last_name = read_string();
    let person = Person::new(first_name, last_name);
    print_person(&person);
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    let cleaned_input = input.trim().to_string();
    cleaned_input
}

fn print_person(person: &Person) {
    println!("Hello {} {}", person.first_name, person.last_name);
}

The impl Person { } block is used to define associated functions, or methods that belong to the Person type. We've added a new() function that takes two input parameters: first_name and last_name and returns a new Person object.

Adding a method

You can add methods in the same impl block. The only difference between an associated function and a method is in the signature: a method takes self as its first input parameter, usually borrowed as &self. We can rewrite the example to change print_person into a print method on the Person type:

struct Person {
    first_name: String,
    last_name: String,
}

impl Person {
    fn new(first_name: String, last_name: String) -> Person {
        Person {
            first_name,
            last_name,
        }
    }

    fn print(&self) {
        println!("Hello {} {}", self.first_name, self.last_name);
    }
}

fn main() {
    println!("What is your first name?");
    let first_name = read_string();
    println!("What is your last name?");
    let last_name = read_string();
    let person = Person::new(first_name, last_name);
    person.print();
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    let cleaned_input = input.trim().to_string();
    cleaned_input
}

Note the use of self.first_name and self.last_name in the print method to access the fields of the Person. We've also changed print_person(&person) from the previous example to person.print().

Exercise

Try to add an associated function new_from_input that reads first- and last-name from input and constructs a new Person with this data. All the needed instructions can be found in the above example, within the main() and Person::new() functions.

A Rust struct can embed other structures. This can be useful when you want to add an address to the Person. Like so:

struct Address {
    street: String,
    house_number: i16,
    postal_code: String,
    city: String
}

struct Person {
    first_name: String,
    last_name: String,
    address: Address
}
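As a minimal sketch of how such an embedded structure could be filled and read, the example below builds a Person with an Address and reaches the inner fields by chaining the dot separator. The address values are made up, and address_line() is a hypothetical helper that exists only for this illustration.

```rust
struct Address {
    street: String,
    house_number: i16,
    postal_code: String,
    city: String,
}

struct Person {
    first_name: String,
    last_name: String,
    address: Address,
}

// Hypothetical helper: reads fields from the person and from the embedded address.
fn address_line(person: &Person) -> String {
    format!(
        "{} {} lives at {} {}, {} {}",
        person.first_name,
        person.last_name,
        person.address.street,
        person.address.house_number,
        person.address.postal_code,
        person.address.city
    )
}

fn main() {
    // The Address is created right inside the Person (made-up example data).
    let person = Person {
        first_name: "Marcel".to_string(),
        last_name: "Ibes".to_string(),
        address: Address {
            street: "Main Street".to_string(),
            house_number: 12,
            postal_code: "1234 AB".to_string(),
            city: "Amsterdam".to_string(),
        },
    };

    println!("{}", address_line(&person));
}
```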

Back to the basics - Loops

Let's revisit one of the earlier examples. We've used this code to explore the use of functions:

fn main() {
    let mut first_name = "Marcel";
    greet(first_name);

    first_name = "Tom";
    greet(first_name);

    first_name = "Dick";
    greet(first_name);

    first_name = "Harry";
    greet(first_name);
}

fn greet(first_name: &str) {
    println!("{first_name}! I greet you.");
}

As you can see there is a lot of repetition in the main function. We're calling greet four times in the same way, only passing in a different first_name. Luckily Rust allows us to create a list of items, a so-called vector, or Vec. Once we have such a list, we can use a loop to iterate through these items and execute a block of code using the item as an input parameter. Here's how the example will look when using a list:

fn main() {
    let list_of_names = vec!["Marcel", "Tom", "Dick", "Harry"];
    for first_name in list_of_names {
        greet(first_name);
    }
}

fn greet(first_name: &str) {
    println!("{first_name}! I greet you.");
}

The first thing we do is create a list of names. If you copy this example into the RustRover editor, you will see that list_of_names is of type Vec<&str>. In human speech: a list of strings.

The vec![] statement is a useful Rust helper to quickly create a vector of items of the same type.
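To illustrate what the helper saves you, here is a small sketch that builds the same list twice: once with vec![] and once by hand with Vec::new() and push(). Both function names are hypothetical, and {:?} is a print format that can show a whole list at once.

```rust
// Builds the list in one go with the vec![] helper.
fn numbers_with_macro() -> Vec<i32> {
    vec![1, 2, 3]
}

// Builds the same list by hand: start empty, then add the items one by one.
fn numbers_by_hand() -> Vec<i32> {
    let mut numbers = Vec::new();
    numbers.push(1);
    numbers.push(2);
    numbers.push(3);
    numbers
}

fn main() {
    println!("{:?}", numbers_with_macro());
    println!("{:?}", numbers_by_hand());
}
```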

We then use the for ... in ... { } statement to run through each of the names. The for loop is executed four times. On every pass, first_name is updated to hold the next value from list_of_names. This is how we can greet four times, each time with a different name.

The for in statement can be used to iterate through nearly any sequence of items. Here are a few examples:

fn main() {
    println!("I count from 1 to 9: ");

    for number in 1..10 {
        print!("{number},")
    }
}

fn main() {
    println!("And I count from 1 to 10: ");
    for number in 1..=10 {
        print!("{number},")
    }
}

fn main() {
    println!("I can spell 'Marcel':");
    for letter in "Marcel".chars() {
        print!("{letter},")
    }
}

The .. and ..= operators are a common way to create a sequence of numbers to run through.
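The difference between the two operators is easy to miss, so here is a short sketch that puts them side by side. It assumes the collect() method, which gathers all numbers of a sequence into a Vec; the two function names are hypothetical.

```rust
// start..end stops just before `end`.
fn exclusive(start: i32, end: i32) -> Vec<i32> {
    (start..end).collect()
}

// start..=end includes `end` itself.
fn inclusive(start: i32, end: i32) -> Vec<i32> {
    (start..=end).collect()
}

fn main() {
    println!("1..4  gives {:?}", exclusive(1, 4));
    println!("1..=4 gives {:?}", inclusive(1, 4));
}
```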

Exercise

Amend the above example to count from 10 to 20.

A sequence is just another type. Like other types it has a number of methods that are useful to change the iteration logic. There is a rev() method to reverse the order:

fn main() {
    println!("I count from 10 down to 1:");
    for number in (1..=10).rev() {
        print!("{number},")
    }
}

Exercise

There is a step_by(..) method on the number sequence. Knowing this, can you change one of the above examples to count from 1 to 10, skipping all even numbers?

The result of the rev() method is another sequence, so you can chain operations together. Like:

fn main() {
    println!("I count from 9 down to 1:");
    for number in (1..=10).rev().skip(1) {
        print!("{number},")
    }
}

You are encouraged to call other functions inside the for loop as complexity increases:

fn main() {
    for number in 1..=10 {
        odd_or_even(number);
    }
}

fn odd_or_even(number: i32) {
    if number % 2 == 0 {
        println!("{number} is an even number");
    } else {
        println!("{number} is an odd number");
    }
}

The number % 2 computes the remainder of number divided by 2. If the remainder is 0, we know we have an even number.

Lists or Vectors

As seen above, we can use the vec![] statement to quickly create a Vector (= list) of items. It can be used to create a list of predefined numbers, like:

fn main() {
    let list_of_numbers = vec![10, 25, 8, 14, 3, 42];
    println!("I can list these numbers:");
    for number in list_of_numbers {
        print!("{number},")
    }
}

Once a Vec is created, you can use its methods to manipulate the list of items. Run this example to see what it prints:

fn main() {
    let mut list_of_numbers = vec![10, 25, 8, 14, 3, 42];
    list_of_numbers.remove(0);
    list_of_numbers.push(59);

    println!("I can list these numbers:");
    for number in list_of_numbers {
        print!("{number},")
    }
}

Don't forget the mut keyword if you want to change the Vector.
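The sketch below combines a few of these methods in a small, hypothetical helper. It relies on the fact that remove() hands back the item it removed, and it assumes that the list is not empty.

```rust
// Hypothetical helper: moves the first item of the list to the end.
// The `mut` in front of the parameter allows us to change the list we were given.
fn move_first_to_end(mut numbers: Vec<i32>) -> Vec<i32> {
    let first = numbers.remove(0); // remove() returns the removed item
    numbers.push(first); // push() adds an item at the end
    numbers
}

fn main() {
    let numbers = move_first_to_end(vec![1, 2, 3]);
    // len() tells us how many items the list holds.
    println!("The list now holds {} items: {:?}", numbers.len(), numbers);
}
```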

Exercise

Amend the example and manipulate the list in a few different ways. See what happens if you try to remove an item that is beyond the length of the Vector.

Note that borrowing values also applies to for loops. So this code will fail to compile:

fn main() {
    let list_of_numbers = vec![10, 25, 8, 14, 3, 42];
    println!("I can list these numbers:");
    for number in list_of_numbers {
        print!("{number},")
    }

    println!();
    println!("Oops, I can not list these numbers again:");
    for number in list_of_numbers {
        print!("{number},")
    }
}

The reason is simple if you know what is happening. Let's go back to the cabinet with the drawers. Imagine that each number in list_of_numbers is in a separate drawer in our cabinet; 6 drawers in total. During the first for loop we go through the drawers, taking the number from each drawer and using it in the print! function. This means that once the first for loop finishes, all the drawers are empty.

What we want in this case is to borrow the numbers during the first loop. We don't care about taking them in the second loop, because we have no further use for the numbers anyway. So with a small change, our code is fine again:

fn main() {
    let list_of_numbers = vec![10, 25, 8, 14, 3, 42];
    println!("I can list these numbers:");
    for number in &list_of_numbers {
        print!("{number},")
    }

    println!();
    println!("Now, I *can* list these numbers again:");
    for number in list_of_numbers {
        print!("{number},")
    }
}

See the & in front of the list_of_numbers indicating that we want to borrow these items, rather than take them.

If you revisit the error message on the earlier example, you will see that there is actually a very useful suggestion embedded in the output:

help: consider borrowing to avoid moving into the for loop: `&list_of_numbers`

It is embedded in this block:

4   |     for number in list_of_numbers {
    |                   ---------------
    |                   |
    |                   `list_of_numbers` moved due to this implicit call to `.into_iter()`
    |                   help: consider borrowing to avoid moving into the for loop: `&list_of_numbers`

The number '4' in front of the block points to the line number for which the fix is suggested.

As you can see, this is exactly the fix we've implemented; replace list_of_numbers on line 4 with &list_of_numbers.

It is worthwhile to explore the compiler error messages. They often suggest a fix for the issue at hand.

When you borrow values from a list in a loop, make sure that any function used inside the for block also borrows the value:

fn main() {
    let list_of_numbers = vec![10, 25, 8, 14, 3, 42];
    println!("I can list these numbers:");
    for number in &list_of_numbers {
        print_number(number)
    }
}

fn print_number(number: &i32) {
    print!("{number},")
}

If the print_number function did not borrow the number, but took it instead, the code would not compile. It would be similar to you borrowing an item from me, but then giving it away to someone else. The Rust compiler is fair and will not let you do that!
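Sometimes a function really does want the number itself. As a hedged sketch: for simple number types like i32, you can place a * in front of the borrowed value to hand over a copy. This works because Rust can copy such small values; it does not work the same way for types like String. The double() function is a hypothetical example.

```rust
// Takes the number itself, not a borrowed number.
fn double(number: i32) -> i32 {
    number * 2
}

fn main() {
    let list_of_numbers = vec![10, 25, 8];
    for number in &list_of_numbers {
        // `number` is a borrowed &i32; `*number` passes a copy of the value.
        println!("{} doubled is {}", number, double(*number));
    }

    // The list was only borrowed, so it is still available here.
    println!("The list still has {} items", list_of_numbers.len());
}
```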

Back to the basics - conversions

So far we've used the read_string() function to capture some input from the user. This has been quite useful so far, but sometimes you need a data type other than text. For example, if you want to work with numbers. In this chapter, we'll look at how to convert the user's text-based input into a number. We'll also cover the basics of error handling.

We'll start with an earlier example that I've modified slightly:

fn main() {
    println!("How old are you?");
    let age = read_string();
    println!("You are {age} years old.");
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

Note that I've removed the intermediate cleaned_input and now return the result of input.trim().to_string() directly from the read_string() function. Remember that we can do this by omitting the ; at the end of the last statement in a function.

Capturing age as a number

The goal is to capture the age of the user and convert it into one of the numeric data types: i8, i16, i32, u8, u16 or u32.

Exercise

Consider the range of numbers these data types can hold and choose what would be the best fit.

Now let's modify the above code and add a new read_number function.

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    let age = read_number();
    println!("You are {age} years old.");
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> u8 {
    let input = read_string();
    u8::from_str(&input).expect("input is not a number")
}

As you can see, I've chosen the u8 type. u8 holds numbers from 0 to 255, which should be enough for any living person at this point. The alternative would have been i8, which goes up to 127. According to Wikipedia, the oldest human was 122, which does not leave us much room. i8 also holds negative numbers, which we don't need in our case. Since u8 and i8 both occupy the same amount of memory, 8 bits or 1 byte, u8 makes the most sense. Anything larger, like u16 or u32, would be a waste of memory.
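If you'd rather check these ranges than take my word for it, the following sketch asks Rust itself. It uses the MIN and MAX values that every numeric type provides, and std::mem::size_of, which reports how many bytes a type occupies.

```rust
fn main() {
    // Smallest and largest value each type can hold.
    println!("u8 holds {} to {}", u8::MIN, u8::MAX);
    println!("i8 holds {} to {}", i8::MIN, i8::MAX);
    println!("u16 holds {} to {}", u16::MIN, u16::MAX);

    // Memory used by a single value of each type, in bytes.
    println!("u8 uses {} byte", std::mem::size_of::<u8>());
    println!("i8 uses {} byte", std::mem::size_of::<i8>());
    println!("u16 uses {} bytes", std::mem::size_of::<u16>());
    println!("u32 uses {} bytes", std::mem::size_of::<u32>());
}
```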

There are a few other things that have been added to the code. The RustRover editor highlighted an error when I added the u8::from_str related method. Apparently it was missing some code that it needed for this operation and suggested that I add the line use std::str::FromStr;. So I did. We'll discuss importing external functions later, for now just make sure you include the use statement at the top of your main.rs file.

The associated u8::from_str function will try to convert the input to a u8 number type. This can fail in many ways: the user could have typed something that isn't a number, or they could have typed a number that isn't in the u8 range, like 300, or they could have just hit the enter key and not given us any input.

The expect() method will deal with these errors. Well... handling the error might be a bit of an exaggeration. It only makes sure that the program does not continue with an invalid return value. In fact, our program will crash with an output like this:

thread 'main' panicked at 'input is not a number: ParseIntError { kind: Overflow }', src/main.rs:20:26
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/std/src/panicking.rs:483
   1: core::panicking::panic_fmt
             at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/panicking.rs:85
   2: core::option::expect_none_failed
             at /rustc/7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/option.rs:1234
   3: core::result::Result<T,E>::expect
             at /Users/mibes/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/result.rs:933
   4: hello_world::read_number
             at ./src/main.rs:20
   5: hello_world::main
             at ./src/main.rs:5
   6: core::ops::function::FnOnce::call_once
             at /Users/mibes/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

The crucial piece of information is: input is not a number: ParseIntError { kind: Overflow }.
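To explore the different ways the conversion can fail without crashing the program, you could print the outcome of from_str directly, as in this sketch. The describe() helper and the list of inputs are hypothetical. The output shows either an Ok(...) or an Err(...) value; we'll come back to what these mean later in the book.

```rust
use std::str::FromStr;

// Hypothetical helper: shows the raw outcome of the conversion for a given text.
fn describe(text: &str) -> String {
    format!("{:?} gives {:?}", text, u8::from_str(text))
}

fn main() {
    // Not a number, out of range for u8, empty input, and a valid number.
    let inputs = vec!["abc", "300", "", "42"];
    for text in inputs {
        println!("{}", describe(text));
    }
}
```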

Surely there must be a better way to handle such input errors. Of course there is. Lots of ways, actually. Let's start with the simplest case. We'll provide a default age when someone makes an input error.

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    let age = read_number();
    println!("You are {age} years old.");
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> u8 {
    let input = read_string();
    u8::from_str(&input).unwrap_or(0)
}

As you can see, I replaced the expect() with unwrap_or(0). In Rust, unwrap() attempts to retrieve a valid value from the preceding function or method. It is similar to expect(), but does not provide a meaningful message if it fails. You should rarely use unwrap or expect because it often indicates that you are not handling error conditions properly.

However, unwrap's brother unwrap_or() can be quite useful. It tries to take a valid value, just like unwrap, but if it fails, it returns the default value instead, in our case 0. So unlike unwrap or expect, unwrap_or will not crash your program.

Run the above example and see what happens if you enter an invalid u8 number.

Considering that I have never seen a baby type on a keyboard, returning 0 seems like an acceptable default. We can even check for the 0 in our main function.

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    let age = read_number();

    if age > 0 {
        println!("You are {age} years old.");
    } else {
        println!("You've entered an invalid age");
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> u8 {
    let input = read_string();
    u8::from_str(&input).unwrap_or(0)
}

Obviously, returning a default value cannot be used in all cases, and you may even question whether it is legitimate not to accept 0 as a valid input.

In the next chapter, we'll examine the use of optional return values to handle our case in a nicer way.

Back to the basics - optional values

We'll continue with the code from the previous chapter. If you've skipped ahead, you may want to circle back and ensure you cover that chapter, before continuing.

In our current code, we're returning the default value 0 when the user makes an input error. In essence, we are using the perfectly valid, although unlikely, age of 0 to indicate an error. Rather than relying on default values, Rust allows us to return an optional value from read_number(). This means we can return a valid u8 number, or nothing. This is achieved with the Option type.

The signature of Option is: Option< [embedded type] > Where [embedded type] is the data type we want our Option to wrap. In our case we would use Option<u8> to wrap the u8 in an Option.

Let's amend our code to reflect this.

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    let optional_age = read_number();

    if optional_age.is_some() {
        println!("You are {} years old.", optional_age.unwrap());
    } else {
        println!("You've entered an invalid age");
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> Option<u8> {
    let input = read_string();
    u8::from_str(&input).ok()
}

The return value of read_number() has been changed to return an Option<u8>. The unwrap_or(0) has been replaced with ok().

Valid return values from an Option<u8> are: Some(u8) or None. The ok() method will attempt to take a valid u8 from the from_str() method. If one is found, it will return it as Some(number), otherwise it will return None.

Because the u8 is now wrapped inside the Option, we can test if there was a return value. Our if statement now checks optional_age.is_some(), which is true when there is a valid u8 returned in the Option.
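To make the two possible outcomes visible, here is a minimal sketch that runs the same conversion on fixed texts instead of keyboard input. The parse_number() helper is hypothetical; it does what read_number() does, minus the reading.

```rust
use std::str::FromStr;

// Hypothetical helper: the same conversion as read_number(), but on a given text.
fn parse_number(text: &str) -> Option<u8> {
    u8::from_str(text).ok()
}

fn main() {
    // A valid number is wrapped in Some(...).
    println!("{:?}", parse_number("42"));
    // Anything that cannot be converted becomes None.
    println!("{:?}", parse_number("abc"));
    println!("{:?}", parse_number("300"));
}
```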

Exercise

Run the example and see what happens when you enter a 0 as input.

Much nicer, isn't it?!

There is however one nasty piece of code in our program: optional_age.unwrap().

Because we are first testing that there is a valid value in the Option, the unwrap will always succeed. So from that perspective it is a completely valid piece of code.

However, in the previous chapter I told you to be sparing in the use of unwrap, so there must surely be a way to handle the optional return without resorting to unwrap or expect.

In Rust, we can use the if let statement for this:

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    let optional_age = read_number();

    if let Some(age) = optional_age {
        println!("You are {} years old.", age);
    } else {
        println!("You've entered an invalid age");
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> Option<u8> {
    let input = read_string();
    u8::from_str(&input).ok()
}

The if let Some(age) = optional_age does two things in one step:

  1. It attempts to assign the value inside optional_age, which is an Option<u8>, to age.
  2. It checks if this was successful.

For the most part it behaves in the same way as a regular if statement. But in the positive case, there is now a new local variable age that you can use within that block of code. age is of type u8.

There you have it! A piece of code that handles input errors elegantly, without the need for unwrap or expect.

Exercise

Rewrite the above example, without the intermediate optional_age variable.

Matching results

As an alternative to the if let statement, Rust offers a match statement. match allows you to run a piece of code for each possible return value. In the case of an Option, this is either Some(value) or None. We'll rewrite the above code using a match block.

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    let optional_age = read_number();

    match optional_age {
        None => {
            println!("You've entered an invalid age");
        }
        Some(age) => {
            println!("You are {} years old.", age);
        }
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> Option<u8> {
    let input = read_string();
    u8::from_str(&input).ok()
}

This code behaves identically to the previous example. The match statement checks the result of optional_age and runs either the None block or the Some(age) block. The => is used as a separator between the return value and the block that is run.

It is up to the developer - you - to decide if you prefer the if let construct, or a match block.

If you are executing only a single command in a match block, you can ditch the { } pairs and write the code more concisely. Here's the same code written in fewer lines:

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    match read_number() {
        None => println!("You've entered an invalid age"),
        Some(age) => println!("You are {} years old.", age),
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> Option<u8> {
    let input = read_string();
    u8::from_str(&input).ok()
}

Back to the basics - returning results

Previously we've used an Option<u8> to return a valid age, or nothing. As you've seen, an Option can either have a Some(value) or None. Sometimes it is more useful, maybe even better, to return a specific value in the positive case, and another value in an error condition.

As an alternative to Option, Rust supports the Result type. Where an Option can only hold a single data type, a Result can return one of two possible data types. It has this signature:

Result< [good_type], [bad_type] >

The good_type and bad_type can be any valid Rust type.

In our case we will use this combination: Result<u8,String>. This returns a u8 in the positive case, and a String in case of an error. The values are specified with either Ok(value) or Err(err_value). In our case we'll use: Ok(u8) or Err(String). We use a match block to analyze the result.

This is what the full example would look like:

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    match read_number() {
        Ok(age) => println!("You are {age} years old."),
        Err(err) => println!("{err}"),
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> Result<u8, String> {
    let input = read_string();
    u8::from_str(&input).or(Err("You've entered an invalid age".to_string()))
}

Let's look at the code.

The value in the .or(Err("You've entered an invalid age".to_string())) method at the end of read_number() is returned when the from_str conversion fails. When this happens, we are returning Err("You've entered an invalid age".to_string()).

As described at the beginning of this chapter, Err(...) holds the [bad_type] value of the Result. In our case from_str already takes care of setting the Ok(...) when the conversion is successful.

The match block is very similar to the one we've used in the previous chapter, when we checked the returned Option. Only in this case it has two different arms: Ok(age) and Err(err). Both age and err are local variables that get set with the value from the 'good' or 'bad' response, respectively.

The program still behaves the same as in the previous chapter. However, in this case the error message to be printed is set in the read_number() function, not in main().
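If you want to see both arms of the match at work without typing anything, a sketch like the one below can help. parse_age() is a hypothetical helper that applies the same .or(...) conversion to a fixed text.

```rust
use std::str::FromStr;

// Hypothetical helper: the same conversion as read_number(), but on a given text.
fn parse_age(text: &str) -> Result<u8, String> {
    u8::from_str(text).or(Err("You've entered an invalid age".to_string()))
}

fn main() {
    let inputs = vec!["42", "abc"];
    for text in inputs {
        match parse_age(text) {
            // The 'good' value ends up in `age`.
            Ok(age) => println!("You are {age} years old."),
            // The 'bad' value, our error message, ends up in `err`.
            Err(err) => println!("{err}"),
        }
    }
}
```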

Exercise

Rewrite the error message to a more generic "You've entered an invalid number. Please enter a value between 0 and 255."

Now that we are returning a more generic error message, we can use the read_number() function to request more numbers than just age, and print the same error if needed.

Since we are returning the error message from the read_number() function, we can now also distinguish between an invalid input and no input.

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    match read_number() {
        Ok(age) => println!("You are {age} years old."),
        Err(err) => println!("{err}"),
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> Result<u8, String> {
    let input = read_string();

    if input.is_empty() {
        Err("You did not enter any data".to_string())
    } else {
        u8::from_str(&input).or(Err("You've entered an invalid number".to_string()))
    }
}

We test input.is_empty() to check if the user did not enter any data. We use this in an if block to differentiate between no input and some input.

Notice the lack of ; in the last five lines. This ensures that the result of the if block is returned by the read_number() function.
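The same trick works for any if block at the end of a function. Here is a minimal sketch with a hypothetical describe_age() function; the limit of 18 is just an assumption for the example.

```rust
// The if block is the last thing in the function and has no ; behind it,
// so the value of the branch that runs becomes the return value.
fn describe_age(age: u8) -> String {
    if age < 18 {
        "minor".to_string()
    } else {
        "adult".to_string()
    }
}

fn main() {
    println!("{}", describe_age(12));
    println!("{}", describe_age(40));
}
```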

Back to the basics - structuring code

I could have also called this chapter, "cleanup my project." Now that our application is growing, it would be unwise to continue adding all functions to the main.rs file. At some point, it would become too big to work on in an effective way.

In this chapter, we'll split the code from the previous chapter into two files:

  • main.rs
  • utils.rs

Both these files reside under src within our project. Let's first create the new 'utils.rs' file. Follow these steps:

  1. Select the src folder in the project tree on the left
  2. Then click "File" -> "New" -> "Rust File"
  3. Enter utils as the file name (without .rs)

There should be a banner at the top of the empty utils.rs file: "File is not included in module tree..." Click on the "Attach file to main.rs" link in the banner.

Your main.rs should now look like this:

mod utils;

use std::str::FromStr;

fn main() {
    println!("How old are you?");
    match read_number() {
        Ok(age) => println!("You are {} years old.", age),
        Err(err) => println!("{}", err),
    }
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

fn read_number() -> Result<u8, String> {
    let input = read_string();

    if input.is_empty() {
        Err("You did not enter any data".to_string())
    } else {
        u8::from_str(&input).or(Err("You've entered an invalid number".to_string()))
    }
}

At the top of main.rs the mod utils; line has been added.

Rust code is organized in so-called "modules". In its simplest form each Rust file under src is a module. You need to explicitly add the modules you need in your program with the mod statement. Otherwise, the functions in that module are unavailable.

Now we'll start the cleanup. Cut & paste the two functions read_string() and read_number() in their entirety to the utils.rs file. Do the same with the use std::str::FromStr; line.

By default, functions in another module are private to that module. This means they can be used within that module by other functions, but they can not be used in other files. Because we want to use both read_string() and read_number() in (future) programs we need to make them publicly available. You do this by adding pub in front of the fn.

When done, your files look like this:

utils.rs

use std::str::FromStr;

pub fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

pub fn read_number() -> Result<u8, String> {
    let input = read_string();

    if input.is_empty() {
        Err("You did not enter any data".to_string())
    } else {
        u8::from_str(&input).or(Err("You've entered an invalid number".to_string()))
    }
}

main.rs

mod utils;

fn main() {
    println!("How old are you?");
    match read_number() {
        Ok(age) => println!("You are {} years old.", age),
        Err(err) => println!("{}", err),
    }
}

Exercise

Run clippy and see if there are any errors. If there is an error implement the suggestion that Clippy makes.

Did you notice that there is a clickable link embedded in the error that takes you directly to the source of the problem?

When done, your main.rs file looks like this:

use crate::utils::read_number;

mod utils;

fn main() {
    println!("How old are you?");
    match read_number() {
        Ok(age) => println!("You are {} years old.", age),
        Err(err) => println!("{}", err),
    }
}

Exercise

Confirm that the code runs and that it behaves the same as before.
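To experiment with private and public functions in a single file, for example in the playground, you can also write a module inline. The sketch below is an illustration under that assumption: it uses mod utils { ... } instead of a separate utils.rs file, and the clean() and shout() functions are hypothetical.

```rust
mod utils {
    // Private: can only be used inside the utils module.
    fn clean(text: &str) -> String {
        text.trim().to_string()
    }

    // Public: can be used from outside the module, thanks to `pub`.
    pub fn shout(text: &str) -> String {
        clean(text).to_uppercase()
    }
}

use crate::utils::shout;

fn main() {
    // Called by its short name, made possible by the `use` line above.
    println!("{}", shout("  hello "));

    // Called with the module name in front; this form needs no `use` line.
    println!("{}", utils::shout("  rust "));

    // utils::clean("x") would not compile here, because clean() is private.
}
```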

Now that we have cleaned up the project, I would like to make one final change. Rather than printing the question in main(), before reading the input, I'd like to have two new functions in utils.rs:

  • ask_for_a_string(question: &str) -> String
  • ask_for_a_number(question: &str) -> Result<u8, String>

These functions should wrap around the existing read_string() and read_number() functions, and print the question, before capturing the result.

Exercise

Implement these two functions in utils.rs. Use the ask_for_a_number() function in main(). Fix the compiler errors until the program runs again. Ignore any warning for now if you please. You can check the spoiler if you get stuck.

Exercise

Review main.rs and (hopefully) conclude that the code became clearer, i.e. the main() function is less noisy and describes better what it is actually doing.

Version control

This is also a good moment to introduce the concept of version control. All modern development environments come with an integrated Version Control System (VCS). RustRover has built-in support for Git (amongst others). Git is the standard VCS that Rust uses. If your project files do not light up in red or green, you may need to install Git on your PC: Installing Git. By default, the main.rs should be colored red.

What does a VCS do, and why would you need a VCS?

The VCS keeps track of the changes you make to the code of your program. Once you have completed a set of changes, you can commit those changes to the VCS. It is like taking a snapshot of your code at that moment. The VCS allows you to roll back errors to a previous commit, or compare your current code to a previous commit. This is very useful as your program grows. The VCS also provides a kind of "reset" option to undo any changes if you get stuck and your code won't compile or work anymore. This is called a rollback.

The good thing is that initializing a new Rust project with the cargo init command will already initialize the Git VCS system for your project. Unfortunately, it's not yet "tuned" for use with the RustRover editor. There is a file called .gitignore in the top-level directory of the project. You can open it by double-clicking on it. By default, it looks like this:

/target

The .gitignore file contains a list of files and directories that should be excluded from version control. The target directory is excluded because this is where the compiled project code goes. Including it would make the version control database unnecessarily large without adding any value.

RustRover stores a number of project configuration files in a directory called .idea. You most likely would like to exclude that directory from the version control as well. You do this by adding the .idea line to the .gitignore file:

/target
.idea

Now save the file. The final step is to select the project (name) at the top of the project tree on the left. This is the one in bold. Now open the Git menu from the main menu at the top. Select "Selected Directory", and click on "+ Add".

If you are on an older version of RustRover, this menu may be called "VCS". In that case, look for similar menu options, but it would be better to upgrade to the latest release of the RustRover editor.

After this step, your project files should be colored green. You're all set!

Committing files to Git

I use this four-step approach to commit my files:

  1. Select the files: Select the top level project in the project tree on the left, and open the Git menu from the main menu at the top. Select "Selected Directory", and click on "Commit Directory..."
  2. Check the changes: Verify that the changed files show up in the top window and are marked with a checkbox.
  3. Comment: Provide a meaningful comment to describe what you have changed since the previous commit. If this is the first time you are committing code, write that: "Initial commit".
  4. Commit: Click on the "Commit" button.

After committing, the change list at the top should be empty.

You can use the "Project" tab at the far left of the screen to go back to the project tree view.

I'll point out later on when it is a good time to commit any changes. We'll also explore some of the features of Git, like checking the history of a file.

At least now we have a good starting point with our cleaned-up project in Git!

Back to the basics - Reading and writing files

Up until now, we've used the keyboard and the screen to input some data and display some values. This is a common form of input and output, or I/O. Another common practice is reading and writing data to files. Storing data in files allows you to persist data over time. This saves the user from inputting the same data over and over again.

In this chapter we'll build upon the main.rs and utils.rs from the previous chapter. We'll ask the user for some input, save it to a file. Then we'll read the content back and print it on the screen. Don't worry, we'll do this in baby steps.

Adding I/O capabilities to our project

At the top of our main.rs file there is this 'use' statement: use crate::utils::ask_for_a_number;. This statement allows us to use the ask_for_a_number function from the utils module. ask_for_a_number is a function you've written yourself. Rust also provides a whole library of functions developed by the Rust community; the standard library.

You can use functions from the standard library by adding a use std:: statement to the source file. In our case we'll add this line to main.rs: use std::fs::write;

Adding this line allows us to use the write function from the fs (filesystem) module in the standard library.

Showing documentation for a function

In RustRover you can view instructions on the use of a function, by pressing and holding the "cmd" button on your keyboard and hovering the mouse over the function in the use statement. The function will become underlined, like a link on a web page. You can click the link to open a new tab with the documentation (and the source code of the function). Ignore the source mumbo jumbo and focus on the documentation directly over the function. In the newer versions of RustRover this is rendered in a nicely formatted way.

Exercise

Use the above method to highlight the write function, click on it, and read the documentation. You can close the fs.rs tab afterwards.

We've seen that write can be used to write contents to a file. Exactly what we need!

The complete Rust standard library is documented here: doc.rust-lang.org

Getting some data from the user.

We'll use the Person structure from the Data structures chapter to hold the user input. Add the code above the main() function, and include an age field:

struct Person {
    first_name: String,
    last_name: String,
    age: u8,
}

Now we'll ask the user to provide some input to fill this structure:

main.rs

use crate::utils::{ask_for_a_number, ask_for_a_string};
use std::fs::write;

mod utils;

struct Person {
    first_name: String,
    last_name: String,
    age: u8,
}

fn main() {
    let first_name = ask_for_a_string("What is your first name?");
    let last_name = ask_for_a_string("What is your last name?");
    let age = ask_for_a_number("How old are you?").unwrap_or(0);
    let person = Person {
        first_name,
        last_name,
        age,
    };
}

For simplicity’s sake I've used the .unwrap_or(0) to fall back to a default age of 0 when the user makes an input error.
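If you'd like to see what unwrap_or does in isolation, here is a small sketch. The helper name age_or_zero is made up for this illustration and is not part of our project; it converts text directly with u8::from_str instead of asking the user.

```rust
use std::str::FromStr;

// Hypothetical helper, for illustration only: turn text into an age,
// falling back to 0 when the text is not a valid u8.
fn age_or_zero(text: &str) -> u8 {
    // from_str returns a Result. unwrap_or(0) takes the value out of an Ok,
    // or uses 0 when the Result is an Err.
    u8::from_str(text).unwrap_or(0)
}

fn main() {
    println!("{}", age_or_zero("42"));
    println!("{}", age_or_zero("forty-two"));
    // 300 does not fit in a u8 (maximum 255), so this is an error as well.
    println!("{}", age_or_zero("300"));
}
```

The downside of this shortcut is that the user never learns that the input was rejected. That is acceptable for now, but keep it in mind for real programs.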

Exercise

Run the above code and see if you can input first name, last name and age.

Now we'll write the contents of Person to a file named people.txt. For this we'll create a new function: write_person:

fn write_person(person: &Person) -> std::io::Result<()> {
    let mut output = String::new();
    output.push_str(&person.first_name);
    output.push('\n');
    output.push_str(&person.last_name);
    output.push('\n');
    output.push_str(&person.age.to_string());
    output.push('\n');
    write("people.txt", output)
}

Unfortunately the Person data structure cannot be written to a file as-is. We can however write a String to a file. The write_person function creates a new empty String called output and appends the fields of person to the output.

The push_str method adds the contents of the provided field to output. The push method adds a single character to output.

The '\n' represents the new line character. By pushing this to the output, we effectively add an "Enter" after the field's content has been added. We do this so that we can read the file later in an easy way.
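To make the new line characters visible, you could build the same text in a function that returns the String instead of writing it to disk. This is only a sketch: person_as_text is a name I've made up here, and it takes the three fields as separate parameters so that the example stands on its own.

```rust
// Hypothetical helper: builds the same text as write_person,
// but returns it instead of writing it to a file.
fn person_as_text(first_name: &str, last_name: &str, age: u8) -> String {
    let mut output = String::new();
    output.push_str(first_name); // push_str adds a whole piece of text
    output.push('\n'); // push adds a single character
    output.push_str(last_name);
    output.push('\n');
    output.push_str(&age.to_string());
    output.push('\n');
    output
}

fn main() {
    let text = person_as_text("Ada", "Lovelace", 36);
    // {:?} shows each new line character as \n instead of breaking the line.
    println!("{:?}", text);
}
```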

Finally we use the write function to write the output to the people.txt file.

The std::io::Result<()> that we are returning is actually the output of the write function. We are passing it on unchanged to the main function. The std::io::Result is similar to the Result type we've explored earlier. It holds either an Ok(()) or an Err containing a std::io::Error. We can use the error to notify the user if something bad happened when writing the people.txt file to the hard drive.

The () in the std::io::Result<()> is Rust speak for "nothing". So this means that in the positive case, when we return Ok(()), we are actually returning nothing to the calling function, other than the fact that the operation was a success.
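Here is a small sketch that shows both outcomes of such a Result. The save_text helper and the file names are assumptions for this example only. It writes to the system's temporary directory (via std::env::temp_dir) so that it does not touch your project files, and it uses the Path type, which we have not covered yet, to describe the file location.

```rust
use std::env::temp_dir;
use std::fs::write;
use std::path::Path;

// Hypothetical helper: writes text to the given path and passes the
// result of write() straight back to the caller.
fn save_text(path: &Path, text: &str) -> std::io::Result<()> {
    write(path, text)
}

fn main() {
    // A file in the temporary directory: writing it should normally succeed.
    let good_path = temp_dir().join("rust_class_example.txt");
    // A file inside a directory that does not exist: writing it fails,
    // because write() does not create missing directories.
    let bad_path = temp_dir()
        .join("no_such_directory_for_rust_class")
        .join("example.txt");

    println!("{:?}", save_text(&good_path, "hello\n"));
    println!("{:?}", save_text(&bad_path, "hello\n"));
}
```

The first line printed should be Ok(()): success, with nothing else to report. The second should be an Err that describes what went wrong; the exact wording depends on your operating system.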

Now that we've created the write_person function, we can add it to the bottom of the main() function, immediately after creating the person variable:

write_person(&person);

Exercise

Run the program, input the needed fields and wait for the program to finish. Now check the top-level project directory and look for the people.txt file. Open it by double-clicking. Explore the contents.

If all went well, you should have a people.txt file that holds the data you've inputted. Pretty neat, don't you think?

Exercise

Use the instructions from the previous chapter to commit your code. Use the message: "write Person to disk", or something similar.

Checking for errors

If your program compiled and ran, but the people.txt file was not created, this chapter is for you!

We've seen that the write function can possibly return an Error as part of the Result. We've ignored this error for now. Let's add some code to the main.rs to write a message to the user in case of a failure.

Exercise

Add a match block for the call to write_person(&person) in the main() function. Do this by replacing the write_person(&person); line with:

   match write_person(&person) {
       
   }

Notice the red line under match. Click on it and wait for the red light-bulb to appear. Click on the light-bulb and select "Add remaining patterns". See what happens.

I've added some feedback to the match block. main.rs now looks like this:

use crate::utils::{ask_for_a_number, ask_for_a_string};
use std::fs::write;

mod utils;

struct Person {
    first_name: String,
    last_name: String,
    age: u8,
}

fn main() {
    let first_name = ask_for_a_string("What is your first name?");
    let last_name = ask_for_a_string("What is your last name?");
    let age = ask_for_a_number("How old are you?").unwrap_or(0);
    let person = Person {
        first_name,
        last_name,
        age,
    };

    match write_person(&person) {
        Ok(_) => println!("people.txt was written successfully"),
        Err(err) => println!("There was an error while writing people.txt: {}", err),
    }
}

fn write_person(person: &Person) -> std::io::Result<()> {
    let mut output = String::new();
    output.push_str(&person.first_name);
    output.push('\n');
    output.push_str(&person.last_name);
    output.push('\n');
    output.push_str(&person.age.to_string());
    output.push('\n');
    write("people.txt", output)
}

The _ in the Ok(_) statement means that we want to ignore the data that was wrapped in the Ok(). As seen before the write_person function is returning () in an Ok which means: "nothing". So there is no need for us to create a variable for this. It is empty anyway.

Exercise

Run the program again and watch for any errors.

Reading the data

You can take a break here if you want. Make sure to commit your changes, before you leave!

Now that we've written the people.txt file we'll use a new function to read the Person back from the file. You've guessed it... read_person.

To read something from a file, we need to use a new function from the standard library: read_to_string. Replace the use std::fs::write line at the top of main.rs with this:

use std::fs::{write, read_to_string};

Now add the read_person function:

fn read_person() -> Result<Person, std::io::Error> {
    let input = read_to_string("people.txt")?;
    let mut lines = input.split('\n');
    let first_name = lines.next().unwrap_or("").to_string();
    let last_name = lines.next().unwrap_or("").to_string();
    let age_as_string = lines.next().unwrap_or("0").to_string();
    let age = u8::from_str(&age_as_string).unwrap_or(0);
    let person = Person {
        first_name,
        last_name,
        age,
    };
    Ok(person)
}

Exercise

Notice that the from_str is highlighted in red. Use the same "light-bulb" method from before to "Import" the missing 'use' statement.

The read_person function does exactly the opposite of the write_person function.

It starts with the read_to_string("people.txt") to read the contents of the people.txt file to the input variable.

Notice the ? at the end of the line. The ? is a super powerful feature of Rust. It checks the result of the preceding expression. If there is an error, it exits the function, returning the error. Otherwise, it assigns the value in the Ok to the variable. So the ? is like a match block in disguise.

We know for sure that input will hold the contents of the people.txt file. In case of an error, due to the ?, the read_person function would have already stopped and returned the error.
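To make the "match block in disguise" idea concrete, here are two versions of the same function, one written with match and one with ?. This is a sketch: the function names are made up for this example, and it converts text to a number rather than reading a file, so you can run it without a people.txt.

```rust
use std::num::ParseIntError;
use std::str::FromStr;

// Written out with a match block.
fn half_with_match(text: &str) -> Result<u8, ParseIntError> {
    let number = match u8::from_str(text) {
        Ok(value) => value,
        // Stop here and hand the error to the caller.
        Err(err) => return Err(err),
    };
    Ok(number / 2)
}

// The same logic with the ? operator.
fn half_with_question_mark(text: &str) -> Result<u8, ParseIntError> {
    let number = u8::from_str(text)?;
    Ok(number / 2)
}

fn main() {
    println!("{:?}", half_with_match("10"));
    println!("{:?}", half_with_question_mark("10"));
    println!("{:?}", half_with_match("ten"));
    println!("{:?}", half_with_question_mark("ten"));
}
```

Both versions behave the same for these inputs. The ? version simply leaves out the boilerplate.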

Next we use the split('\n') method on input.

Do you recognize the \n character? This matches up with the separator we've used when creating the people.txt file. Because we were so clever before, we can now easily re-create the input fields with the split method.

The next three lines read the split values, falling back to a default (an empty string "" for the names, "0" for the age) in case something went wrong.

Recall the use of unwrap_or from the earlier chapter.
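If you want to see split, next, and unwrap_or working together on their own, here is a minimal sketch. The second_line helper is hypothetical and only exists for this illustration.

```rust
// Hypothetical helper: returns the second line of a text,
// or an empty string when there is no second line.
fn second_line(input: &str) -> String {
    let mut lines = input.split('\n');
    // Every call to next() hands out the following piece of the text.
    let _first = lines.next();
    // next() returns None when the pieces have run out;
    // unwrap_or("") replaces that None with an empty string.
    lines.next().unwrap_or("").to_string()
}

fn main() {
    println!("{:?}", second_line("Ada\nLovelace\n36\n"));
    println!("{:?}", second_line("Ada"));
}
```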

Now our age field needs a special treatment. We read the age_as_string which holds the text-version of the age field. We use the same from_str associated function, from before, to convert the String to a u8.

Finally, we create a new Person structure from these fields, and return this wrapped in an Ok.

Let's add the read_person function to main() and see if it works. The main.rs should look like this:

use crate::utils::{ask_for_a_number, ask_for_a_string};
use std::fs::{read_to_string, write};
use std::str::FromStr;

mod utils;

struct Person {
    first_name: String,
    last_name: String,
    age: u8,
}

fn main() {
    let first_name = ask_for_a_string("What is your first name?");
    let last_name = ask_for_a_string("What is your last name?");
    let age = ask_for_a_number("How old are you?").unwrap_or(0);
    let person = Person {
        first_name,
        last_name,
        age,
    };

    match write_person(&person) {
        Ok(_) => println!("people.txt was written successfully"),
        Err(err) => println!("There was an error while writing people.txt: {}", err),
    }

    match read_person() {
        Ok(person) => println!("people.txt was read successfully"),
        Err(err) => println!("There was an error while reading people.txt: {}", err),
    }
}

fn write_person(person: &Person) -> std::io::Result<()> {
    let mut output = String::new();
    output.push_str(&person.first_name);
    output.push('\n');
    output.push_str(&person.last_name);
    output.push('\n');
    output.push_str(&person.age.to_string());
    output.push('\n');
    write("people.txt", output)
}

fn read_person() -> Result<Person, std::io::Error> {
    let input = read_to_string("people.txt")?;
    let mut lines = input.split('\n');
    let first_name = lines.next().unwrap_or("").to_string();
    let last_name = lines.next().unwrap_or("").to_string();
    let age_as_string = lines.next().unwrap_or("0").to_string();
    let age = u8::from_str(&age_as_string).unwrap_or(0);
    let person = Person {
        first_name,
        last_name,
        age,
    };
    Ok(person)
}

Printing the Person

There is one thing left to do; print the Person back to the user.

Exercise

Use the knowledge from the previous chapters to add a new print() method to Person that prints the three fields back to the user.

Modify the last match block to look like this:

    match read_person() {
        Ok(person) => {
            println!("people.txt was read successfully:");
            person.print();
        },
        Err(err) => println!("There was an error while reading people.txt: {}", err),
    }

Run the program and see how your print method works.

Congratulations! Another major milestone achieved!

Cleaning up

Now that our program is functional, let's take some time to clean it up. We'll start by creating an associated function to construct the Person:

impl Person {
    fn new() -> Self {
        let first_name = ask_for_a_string("What is your first name?");
        let last_name = ask_for_a_string("What is your last name?");
        let age = ask_for_a_number("How old are you?").unwrap_or(0);
        Person {
            first_name,
            last_name,
            age,
        }
    }
}

I'd also like to change write_person and read_person into write_people and read_people:

The write_people function is a pretty straightforward change from write_person:

fn write_people(people: Vec<Person>) -> std::io::Result<()> {
    let mut output = String::new();

    for person in people {
        output.push_str(&person.first_name);
        output.push('\n');
        output.push_str(&person.last_name);
        output.push('\n');
        output.push_str(&person.age.to_string());
        output.push('\n');
    }

    write("people.txt", output)
}

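For read_people I've split the code between read_person and read_people. The Split type in the signature of read_person comes from the standard library, so it needs its own 'use' statement (use std::str::Split;):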

fn read_person(lines: &mut Split<char>) -> Option<Person> {
    let first_name = lines.next()?;
    let last_name = lines.next()?;
    let age_as_string = lines.next()?;
    let age = u8::from_str(age_as_string).unwrap_or(0);
    let person = Person {
        first_name: first_name.to_string(),
        last_name: last_name.to_string(),
        age,
    };
    Some(person)
}

fn read_people() -> Result<Vec<Person>, std::io::Error> {
    let input = read_to_string("people.txt")?;
    let mut lines = input.split('\n');
    let mut people = vec![];
    while let Some(person) = read_person(&mut lines) {
        people.push(person)
    }
    Ok(people)
}

read_people is the function that will be called from main. It now returns a list of Person, a Vec<Person>. The while let is similar to the if let we've seen in previous chapters, but while let keeps looping for as long as the pattern matches. In this case, it adds a person to the people vector until the read_person function returns None.

I've modified the read_person function, and added a few question marks (?). The ? ensures that None is returned when there is no more data that can be read. This typically means, we've reached the end of the file.
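The same pattern, stripped down to its essentials, might look like the sketch below. Instead of a Person it reads pairs of lines. The names read_pair and read_pairs are assumptions for this example, and the 'a annotations are lifetimes, which we haven't covered yet. You can ignore them for now.

```rust
use std::str::Split;

// Hypothetical helper: reads two lines, or returns None
// as soon as one of them is missing.
fn read_pair<'a>(lines: &mut Split<'a, char>) -> Option<(&'a str, &'a str)> {
    let key = lines.next()?;
    let value = lines.next()?;
    Some((key, value))
}

// Keeps calling read_pair until it returns None.
fn read_pairs(input: &str) -> Vec<(&str, &str)> {
    let mut lines = input.split('\n');
    let mut pairs = vec![];
    while let Some(pair) = read_pair(&mut lines) {
        pairs.push(pair);
    }
    pairs
}

fn main() {
    // After the final '\n', split yields one last empty piece. read_pair
    // takes it as a key, finds no value to go with it, and returns None.
    let pairs = read_pairs("a\n1\nb\n2\n");
    println!("{:?}", pairs);
}
```

The comment in main also explains why the loop in read_people stops cleanly, even though our file ends with a new line character.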

The main function is adjusted to use the new functions:

fn main() {
    let person = Person::new();
    match write_people(vec![person]) {
        Ok(_) => println!("people.txt was written successfully"),
        Err(err) => println!("There was an error while writing people.txt: {}", err),
    }

    match read_people() {
        Ok(_people) => println!("people.txt was read successfully"),
        Err(err) => println!("There was an error while reading people.txt: {}", err),
    }
}

Exercise

Now that all the needed functions are defined, please move all functions, except main to the utils.rs. Also move the struct Person and impl Person. Use pub to make functions, fields and structs, accessible by main.rs. Make sure the code compiles. You can have clippy check this for you.

Exercise

If you want to make your code really nice, create a new db.rs file, and move the people-related code into that file, rather than into utils.rs.

Validate that your code matches with: this.

Now that the project is nice and tidy, we're ready to move to the next chapter!

Back to the basics - A small database

In this chapter, we'll add some more people to the people.txt file, and explore how iterators can be used to perform various operations on this list of Person.

First update main to read the data for 4 people:

use crate::utils::{read_people, write_people, Person};

mod utils;

fn main() {
    let mut people = vec![];
    for _ in 0..4 {
        let person = Person::new();
        people.push(person)
    }

    match write_people(people) {
        Ok(_) => println!("people.txt was written successfully"),
        Err(err) => println!("There was an error while writing people.txt: {err}"),
    }

    match read_people() {
        Ok(people) => print_people(people),
        Err(err) => println!("There was an error while reading people.txt: {err}"),
    }
}

fn print_people(people: Vec<Person>) {
    println!(
        "people.txt was read successfully. {} people found.",
        people.len()
    );
}

Exercise

Run the program and enter the information for the people. Include two children (age < 18). When done, check the contents of people.txt.

If all went well, your program should end with the line: people.txt was read successfully. 4 people found.

Now that we have some data in our database, we do not want to re-enter this data when we restart the program. This means we need to find a way to check if the database already has entries, and only show the data input if it does not. We'll make the following changes:

use crate::utils::{read_people, write_people, Person};

mod utils;

fn main() {
    match read_or_create_db() {
        Ok(people) => print_people(people),
        Err(err) => println!("There was an error while reading people.txt: {}", err),
    }
}

fn print_people(people: Vec<Person>) {
    println!(
        "people.txt was read successfully. {} people found.",
        people.len()
    );
}

fn create_db() -> std::io::Result<Vec<Person>> {
    let mut people = vec![];
    for _ in 0..4 {
        let person = Person::new();
        people.push(person)
    }

    write_people(people)?;
    read_people()
}

fn read_or_create_db() -> std::io::Result<Vec<Person>> {
    match read_people() {
        Ok(people) => {
            if people.len() < 4 {
                // not enough people in the database; let's re-create
                create_db()
            } else {
                Ok(people)
            }
        }
        Err(_err) => {
            // database error; let's re-create
            create_db()
        }
    }
}

Notice the lack of ; at the end of the match and if arms. This means we are returning either the result of create_db(), or Ok(people) as the result of read_or_create_db().
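The same rule can be seen in a much smaller function. This is a sketch with a made-up helper, describe_age, that is not part of our project:

```rust
// Hypothetical helper: the last expression in each branch has no ;
// at the end, so its value becomes the result of the function.
fn describe_age(age: u8) -> String {
    if age < 18 {
        "child".to_string()
    } else {
        "adult".to_string()
    }
}

fn main() {
    println!("{}", describe_age(12));
    println!("{}", describe_age(40));
}
```

If you add a ; after either to_string() call, the compiler will complain that the branch no longer produces a String.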

Because the code is getting more complex, with a nested if block inside a match, I've added some comments to help myself (and others) read the code.

You can add a comment by prepending // to a line. Comments are ignored by the compiler.

Exercise

Run the program again and notice that it skips the input and immediately prints the number of people found in the database. Now open the file people.txt and delete the last 3 lines. Save it and run the program again. Check that it asks for the people data. For completeness, you should also delete the people.txt file completely and confirm that you are asked to enter the people again.

At this stage we have created a small database with 4 people in it. The database is read on startup and if successful, the print_people function is called with the list of people read.

Filtering people

We've done all this groundwork in order to experiment with iterators. Iterators allow you to perform a series of tasks on a sequence of items, one item at a time. We'll start by implementing a filter that keeps only the children in our list of people:

fn print_people(people: Vec<Person>) {
    let kids: Vec<Person> = people.into_iter().filter(is_kid).collect();

    println!(
        "people.txt was read successfully. {} kids found.",
        kids.len()
    );
}

fn is_kid(person: &Person) -> bool {
    person.age < 18
}

Exercise

Run the program and check the output.

All the action is done with this one line:

let kids: Vec<Person> = people.into_iter().filter(is_kid).collect();

Let's focus on the second part of the statement. We see the following methods that are called in sequence on people:

  • into_iter()
  • filter()
  • collect()

into_iter and collect are each other's opposites. In our case into_iter takes the vector of people and creates an iterator over it. The methods that follow use the iterator to act on each element individually. The result of these operations is sent to collect, which gathers all the remaining items and converts them back into a vector.

The filter method runs the is_kid function for every item. It keeps the items for which the function returns true and drops the items for which it returns false.

We had to explicitly tell Rust the type of kids:

let kids: Vec<Person>

Rust will automatically figure out the type of a variable in most cases. In this case, the collect method needs a little help, because it can gather the results into several kinds of collections and has to be told which one we want.
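You can try the same three steps on plain numbers, without needing a people.txt file. This is a sketch with made-up names (is_even and keep_even):

```rust
// The filter function receives a reference to each item.
fn is_even(number: &u32) -> bool {
    number % 2 == 0
}

// Hypothetical helper: keeps only the even numbers.
fn keep_even(numbers: Vec<u32>) -> Vec<u32> {
    // The type annotation tells collect what kind of collection to build.
    let even: Vec<u32> = numbers.into_iter().filter(is_even).collect();
    even
}

fn main() {
    let even = keep_even(vec![1, 2, 3, 4, 5, 6]);
    println!("{:?}", even);
}
```

Try removing the : Vec<u32> annotation from the let line and read the compiler error. It should ask for a type annotation.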

Exercise

Experiment with the is_kid function by changing the test condition. See what effect this has on the number of items that is returned.

Operations on iterators can be chained together. Imagine we only want to display teenagers. We could add another filter to the chain of operations:

fn print_people(people: Vec<Person>) {
    let kids: Vec<Person> = people
        .into_iter()
        .filter(is_kid)
        .filter(is_a_teenager)
        .collect();

    println!(
        "people.txt was read successfully. {} teenagers found.",
        kids.len()
    );
}

fn is_kid(person: &Person) -> bool {
    person.age < 18
}

fn is_a_teenager(person: &Person) -> bool {
    person.age >= 10
}

Another common operation is map. map takes the input and converts it to another type. In our case, I'd like to display the names of the teenagers. I will use the map function to take these names from the Person and make a String with their name.

fn print_people(people: Vec<Person>) {
    let kids: Vec<String> = people
        .into_iter()
        .filter(is_kid)
        .filter(is_a_teenager)
        .map(extract_name)
        .collect();

    println!(
        "people.txt was read successfully. These teenagers found: {:?}",
        kids
    );
}

fn is_kid(person: &Person) -> bool {
    person.age < 18
}

fn is_a_teenager(person: &Person) -> bool {
    person.age >= 10
}

fn extract_name(person: Person) -> String {
    format!("{} {}", person.first_name, person.last_name)
}

Note that the vector's signature has changed to: Vec<String>.

The format! macro in the extract_name function works similarly to print!. It formats the output in the same way, but instead of printing the result to the screen, it returns the resulting String.
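Because format! hands you a String, you can keep working with the result before (or instead of) printing it. A minimal sketch, with a hypothetical label helper:

```rust
// Hypothetical helper: combines a name and an age into one String.
fn label(name: &str, age: u8) -> String {
    format!("{} ({})", name, age)
}

fn main() {
    let text = label("Ada", 36);
    // Nothing has been printed so far; text is an ordinary String.
    println!("{} is {} characters long", text, text.len());
}
```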

Exercise

Experiment with format! to display the children in different ways.

The cool thing that we can do with a vector of strings is that we can use join to stitch these individual strings together into one big String, separated by a specific piece of text:

fn print_people(people: Vec<Person>) {
    let kids: Vec<String> = people
        .into_iter()
        .filter(is_kid)
        .filter(is_a_teenager)
        .map(extract_name)
        .collect();

    let kid_names = kids.join(", ");

    println!(
        "people.txt was read successfully. These teenagers found: {}",
        kid_names
    );
}

Closures

In the examples above, we've created three functions that contain only a single line of code: is_kid(), is_a_teenager(), and extract_name(). This is a bit of a waste of space, especially if we don't use these functions anywhere else in our code.

Rust provides a way to quickly create an anonymous function: a closure. A closure is a function without a name, which is why closures are also called anonymous functions. Closures have a slightly different signature than regular functions. In most cases you do not need to specify the input and output types of a closure; Rust will figure them out.

So the fn is_kid(person: &Person) -> bool can be rewritten as a closure like this:

|person| person.age < 18

We use the pipe symbol | to enclose the input parameters. Again, you typically do not need to specify the types of the input parameters: Rust infers them from the way the closure is used. Closures can also capture their environment, meaning they can use variables from the surrounding code.
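Using a variable from the surrounding code is something a regular function like is_kid cannot do. Here is a small sketch, working on plain numbers. The ages_below helper and its limit parameter are assumptions made for this example.

```rust
// Hypothetical helper: keeps the ages that are below the given limit.
fn ages_below(ages: Vec<u8>, limit: u8) -> Vec<u8> {
    ages.into_iter()
        // The closure uses limit, a variable from the surrounding function.
        // filter passes a reference to each item, so *age reads its value.
        .filter(|age| *age < limit)
        .collect()
}

fn main() {
    println!("{:?}", ages_below(vec![5, 17, 18, 40], 18));
    println!("{:?}", ages_below(vec![5, 17, 18, 40], 10));
}
```

Notice that collect does not need a type annotation here. The return type of the function already tells Rust that we want a Vec<u8>.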

If we rewrite our iterator chain with closures, it looks like this:

fn print_people(people: Vec<Person>) {
    let kids: Vec<String> = people
        .into_iter()
        .filter(|person| person.age < 18)
        .filter(|person| person.age >= 10)
        .map(|person| format!("{} {}", person.first_name, person.last_name))
        .collect();

    let kid_names = kids.join(", ");

    println!(
        "people.txt was read successfully. These teenagers found: {}",
        kid_names
    );
}

We can now also delete the three functions we've replaced with closures. The combination of iterators and closures allows you to write very expressive code in a condensed way.

Exercise

Rewrite the iterator in the print_people function to display the last names of seniors (65+). If you need to amend your database you can either modify it directly in RustRover (by double-clicking the people.txt file), or delete the file and re-run your code to input the data from scratch.

Back to the basics

Spoilers

This section contains the completed exercises. Only visit this section when you're completely stuck. Learning programming is about doing. Making mistakes is part of the learning process.

Back to the basics - processing input

Spoiler

fn main() {
    println!("What is your name?");
    let input = read_clean_string();
    println!("Your name is: {}", input);
}

fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input
}

fn read_clean_string() -> String {
    let input = read_string();
    input.trim().to_string()
}

Back to the basics - structuring code

Spoiler

main.rs

use crate::utils::ask_for_a_number;

mod utils;

fn main() {
    match ask_for_a_number("How old are you?") {
        Ok(age) => println!("You are {age} years old."),
        Err(err) => println!("{err}"),
    }
}

utils.rs

use std::str::FromStr;

pub fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

pub fn read_number() -> Result<u8, String> {
    let input = read_string();

    if input.is_empty() {
        Err("You did not enter any data".to_string())
    } else {
        u8::from_str(&input).or(Err("You've entered an invalid number".to_string()))
    }
}

fn ask(question: &str) {
    println!("{question}");
}

pub fn ask_for_a_string(question: &str) -> String {
    ask(question);
    read_string()
}

pub fn ask_for_a_number(question: &str) -> Result<u8, String> {
    ask(question);
    read_number()
}

Back to the basics - Reading and writing files

Spoiler

main.rs

use crate::utils::{read_people, write_people, Person};

mod utils;

fn main() {
    let person = Person::new();
    let people = vec![person];
    match write_people(people) {
        Ok(_) => println!("people.txt was written successfully"),
        Err(err) => println!("There was an error while writing people.txt: {}", err),
    }

    match read_people() {
        Ok(_people) => println!("people.txt was read successfully"),
        Err(err) => println!("There was an error while reading people.txt: {}", err),
    }
}

utils.rs

use std::fs::{read_to_string, write};
use std::str::{FromStr, Split};

pub struct Person {
    pub first_name: String,
    pub last_name: String,
    pub age: u8,
}

impl Person {
    pub fn new() -> Self {
        let first_name = ask_for_a_string("What is your first name?");
        let last_name = ask_for_a_string("What is your last name?");
        let age = ask_for_a_number("How old are you?").unwrap_or(0);
        Person {
            first_name,
            last_name,
            age,
        }
    }
}

pub fn read_string() -> String {
    let mut input = String::new();
    std::io::stdin()
        .read_line(&mut input)
        .expect("can not read user input");
    input.trim().to_string()
}

pub fn read_number() -> Result<u8, String> {
    let input = read_string();

    if input.is_empty() {
        Err("You did not enter any data".to_string())
    } else {
        u8::from_str(&input).or(Err("You've entered an invalid number".to_string()))
    }
}

fn ask(question: &str) {
    println!("{}", question);
}

pub fn ask_for_a_string(question: &str) -> String {
    ask(question);
    read_string()
}

pub fn ask_for_a_number(question: &str) -> Result<u8, String> {
    ask(question);
    read_number()
}

pub fn write_people(people: Vec<Person>) -> std::io::Result<()> {
    let mut output = String::new();

    for person in people {
        output.push_str(&person.first_name);
        output.push('\n');
        output.push_str(&person.last_name);
        output.push('\n');
        output.push_str(&person.age.to_string());
        output.push('\n');
    }

    write("people.txt", output)
}

fn read_person(lines: &mut Split<char>) -> Option<Person> {
    let first_name = lines.next()?;
    let last_name = lines.next()?;
    let age_as_string = lines.next()?;
    let age = u8::from_str(age_as_string).unwrap_or(0);
    let person = Person {
        first_name: first_name.to_string(),
        last_name: last_name.to_string(),
        age,
    };
    Some(person)
}

pub fn read_people() -> Result<Vec<Person>, std::io::Error> {
    let input = read_to_string("people.txt")?;
    let mut lines = input.split('\n');
    let mut people = vec![];
    while let Some(person) = read_person(&mut lines) {
        people.push(person)
    }
    Ok(people)
}

Expanding Hello World

To get the most out of these lessons, I will suggest some books and snippets that you can read to make yourself familiar with the particular topic at hand.

Recommended reading: Rust Getting Started; chapter 1 only.

Objective

We'll take the bog-standard Hello World and expand the code throughout this chapter with additional functionality. During this exercise, we'll cover these topics:

  • formatting output
  • variables
  • functions & ownership
  • 'production' deployment

Formatting output

Rust has built-in formatting capabilities that can be used to print data. I'm starting with this chapter because many of our early examples will require some sort of output to demonstrate correct behavior.

We'll use the pre-constructed Hello World main.rs as our baseline:

fn main() {
    println!("Hello, world!");
}

Press the "play" button in the above sample to see what the code does in real time.

As you would expect, this code prints Hello, world! to the standard output with a newline at the end. Strings can include the {} placeholder, which will be replaced by the arguments that follow the string.

fn main() {
    println!("Hello, {} world!", "Rust");
}

Rust figures out the "right" format for most data types through the Display trait. (I'll cover what traits are later; don't worry about it for now.)

So this works just fine:

fn main() {
    println!("Hello, {} time!", 1);
}

As in many other programming languages, you can pass several arguments, one for each placeholder:

fn main() {
    println!("Hello, {}. You were here {} times!", "Marcel", 2);
}

If you want to re-use an argument, or the arguments are in a different order than your format, you can number the placeholders:

fn main() {
    println!("Hello, {1}. {1} was here {0} times!", 2, "Marcel");
}

(Complex) types that do not implement the Display trait often implement the Debug trait. Content for debug purposes can be visualized with {:?}, like this:

fn main() {
    println!("Hello, {:?}...", ("Marcel", 1));
}

The {} and {:?} formats will cover 99% of the use cases. More complex formatting use-cases are described here: String formatting
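
To give a flavour of what the formatting mini-language can do beyond {} and {:?}, here is a minimal sketch of three other specifiers: right alignment with a width, zero padding, and hexadecimal output. The helper function names are made up for this illustration (functions are covered later in this chapter).

```rust
// Hypothetical helpers, named for this illustration only.

// Pads the word with spaces on the left, up to a width of 8 characters.
fn right_aligned(word: &str) -> String {
    format!("{:>8}", word)
}

// Pads the number with leading zeros, up to a width of 5 digits.
fn zero_padded(number: u32) -> String {
    format!("{:05}", number)
}

// Formats the number as lowercase hexadecimal with a 0x prefix.
fn as_hex(number: u32) -> String {
    format!("{:#x}", number)
}

fn main() {
    println!("[{}]", right_aligned("Rust"));
    println!("{}", zero_padded(42));
    println!("{}", as_hex(255));
}
```

The part after the colon inside the braces is the format specification; an empty {} simply means "use the defaults".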

Exercise

With the information from the above-mentioned link, try to format the following number: 3.14159265359, such that it only shows the first 2 decimal places: 3.14.

Good luck!

fn main() {
    println!("π is {}!", 3.14159265359);
}

Reference material

Variables

In Rust, variables are created with the let statement. By default, this creates an immutable variable. An immutable variable cannot be changed after it has been assigned a value.

fn main() {
    let name = "Rust";
    println!("Hello, {} world!", name);
}

In the above example we use the {} placeholder to insert the value of the name variable into the string. Since Rust 1.58 you can also pass the variable directly in the string format, like this:

fn main() {
    let name = "Rust";
    println!("Hello, {name} world!");
}

It is a matter of taste, but I prefer this method, because it is easier to read and understand. We'll use this method in the rest of the book.

Back to the variables. The let statement creates a variable binding. By default, these bindings are immutable. If we attempt to change this variable, we get a compilation error:

fn main() {
    let name = "Rusty";
    name = "Marcel";
    println!("Welcome, {name} to the Rust world!");
}

Errors and warnings returned by the compiler are typically very descriptive and will often provide a hint on how to fix the issue. Do not ignore these comments.

In our case, the compiler suggests the following fix: make this binding mutable: mut name. Let's try that.

fn main() {
    let mut name = "Rusty";
    name = "Marcel";
    println!("Welcome, {name} to the Rust world!");
}

By adding mut to the let statement we have made the variable mutable. This means we can change the value of the variable after it has been assigned.

This is a great moment to introduce you to "Clippy", because our code is suboptimal. If you don't use "on-the-fly" Clippy, you can run it manually by double-pressing the "Ctrl" key to open the "Run Anything" window and typing cargo clippy.

Notice the output from Clippy.

warning: value assigned to `name` is never read
 --> src/main.rs:2:13
  |
2 |     let mut name = "Rusty";
  |             ^^^^
  |
  = note: `#[warn(unused_assignments)]` on by default
  = help: maybe it is overwritten before being read?

Clippy is your pedantic Rust friend. Listen to its advice and save yourself a lot of trouble down the road! Run cargo clippy often! It is faster than a regular compilation, and the output is beneficial, especially for rookie Rust developers.

Clippy points out that we never read the initial value of the name variable before overwriting it with a new value. This is an opportunity for optimization, or an indicator that we have a logic error. Let's change the code:

fn main() {
    let mut name = "Rusty";
    println!("Welcome, {name} to the Rust world!");
    name = "Marcel";
    println!("Welcome, {name} to the Rust world!");
}

Re-run cargo clippy (remember to double-press Ctrl) and check the output. Clippy should be happy now.

Type assignment and inference

We can explicitly annotate a variable with a type by writing : followed by the type after the variable name. For example:

fn main() {
    let name: &str = "Rusty";
    println!("Welcome, {name} to the Rust world!");
}

Whenever possible, Rust will try to infer the type of the variable. The compiler will look at the value you assign, and it will look ahead in your code to determine how you are using the variable and try to assign a type to it. This is best illustrated with some examples where the type is not immediately obvious.

fn main() {
    let a = 1;
    let b = 2;
    let c: u32 = a + b;
    println!("c = {c}");
}

If you type the above example live (instead of lazily copying & pasting), you will see that the type of a and b changes to u32 the moment you add the line let c: u32 = a + b;. The Rust compiler is smart enough to figure out that, without additional type conversions, a and b must be u32 due to the assignment to c later in the code.
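
The effect of inference can also be made visible in code by asking for the size of a variable in bytes. This is a minimal sketch: the function names are invented for this illustration, and std::mem::size_of_val is used only to reveal which type the compiler picked.

```rust
use std::mem::size_of_val;

// Nothing constrains `a` here, so Rust falls back to its default
// integer type, i32 (4 bytes).
fn size_without_hint() -> usize {
    let a = 1;
    size_of_val(&a)
}

// Assigning `a` to a u64 later on makes the compiler infer u64
// (8 bytes) for `a` as well.
fn size_with_hint() -> usize {
    let a = 1;
    let _b: u64 = a;
    size_of_val(&a)
}

fn main() {
    println!("without hint: {} bytes", size_without_hint());
    println!("with hint: {} bytes", size_with_hint());
}
```

The literal 1 is identical in both functions; only the later use of a differs.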

A list of built-in data types can be found here.

Exercise

Fix the following example by changing the types of a, b and/or c. Try a few different combinations and see what happens.

fn main() {
    let a: u8 = 128;
    let b: u8 = 128;
    let c: u32 = a + b;
    println!("c = {c}");
}

Shadowing

Rust allows you to shadow a variable. This means you can re-use the same variable name for a new variable. The new variable will shadow the old variable, effectively hiding it from the rest of the code.

fn main() {
    let name = "Rusty";
    println!("Welcome, {name} to the Rust world!");

    let name = "Marcel";
    println!("Welcome, {name} to the Rust world!");
}

You can use this feature to change the type of a variable while keeping the same name. This can be useful when you want to ensure that the old value can no longer accidentally be used.

fn main() {
    let name = "Rusty";
    println!("Welcome, {name} to the Rust world!");

    let name = name.to_uppercase();
    println!("Welcome, !{name}! to the Rust world!");
}

Note that the second name variable has a String type, while the first name variable has a &str type.
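
A common use of shadowing is converting user input from text to a number while keeping one name. The following is a minimal sketch; the function name next_birthday is hypothetical and chosen only for this illustration.

```rust
// Hypothetical helper: takes an age as text and returns the age
// at the next birthday.
fn next_birthday(input: &str) -> u8 {
    // First `age`: a &str with surrounding whitespace removed.
    let age = input.trim();
    // Second `age` shadows the first and has type u8.
    // Invalid input falls back to 0.
    let age: u8 = age.parse().unwrap_or(0);
    // From here on, only the numeric `age` is reachable.
    age.saturating_add(1)
}

fn main() {
    println!("Next birthday: {}", next_birthday(" 41 "));
}
```

After the second let, the text version of age can no longer be used by accident.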

Reference material

Functions & ownership (part 1)

In this chapter, we'll explore Rust functions and ownership of variables. This is where stuff gets 'interesting'. Let's look at ownership and lifetimes first.

Because the memory of a Rust program is not managed by a garbage collector at runtime, the Rust compiler needs to know at compile time when memory can be freed. This can only be done deterministically when the compiler knows when a variable is no longer needed. This is where ownership comes in. Ownership of variables is a concept that sets Rust apart from most other mainstream languages.

Variables are scoped in Rust. This means that a variable is only valid within the scope it is defined in. The scope is defined by the curly braces {}. When a variable goes out of scope, it is dropped.

Default variable lifetime

As you can see, variables cannot outlive their scope. This is an essential concept in Rust, and it is enforced by the compiler. This concept is referred to as "lifetimes" of variables.
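
The scoping rule can be sketched in code as follows. This is a minimal illustration; the function name scope_demo is invented for this example.

```rust
// Hypothetical helper that builds a short report about which
// variables are reachable in which scope.
fn scope_demo() -> String {
    let outer = String::from("outer");

    let report = {
        // `inner` only exists between these curly braces.
        let inner = String::from("inner");
        format!("{outer} and {inner}")
    }; // `inner` goes out of scope and is dropped here.

    // Using `inner` at this point would not compile,
    // because it no longer exists in this scope.
    format!("{report}; afterwards only {outer}")
}

fn main() {
    println!("{}", scope_demo());
}
```

The inner block can see outer, but the outer scope cannot see inner once the block has ended.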

All the lifetime constraints are checked at compile time. The compiler has no way of knowing the actual runtime behavior of the program, so it will only accept code guaranteed to be safe at runtime. This can be frustrating at first, especially when you're coming from a language that does not have these constraints, and you are sure, or believe you are sure, that your code is correct.

Let's dive into the concept of ownership.

Functions

When you define a function in Rust, you can pass variables to the function. The function can either take ownership of the variable or borrow the variable.

Rust functions are defined with the fn statement, followed by the function name, and the parameters in parentheses. The return type is defined after the parameters, with a -> followed by the type. The function body is enclosed in curly braces {}. Those braces define the scope of the function.

For example:

fn build_greeting(name: &str) -> String {
    format!("Welcome, {name} to the Rust world!")
}

The function build_greeting takes a &str as a parameter and returns a String. The function body uses the format! macro to build a string and returns it.

Note the lack of a ; at the end of the function body. This is because the last expression in a function is the return value. There is no need to use the return keyword in Rust.
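
To see the function in action, it can be called from main like this (a minimal sketch that repeats build_greeting so the example is complete):

```rust
fn build_greeting(name: &str) -> String {
    // No semicolon: the value of format! is the return value.
    format!("Welcome, {name} to the Rust world!")
}

fn main() {
    // The returned String is now owned by `greeting` in main.
    let greeting = build_greeting("Rusty");
    println!("{greeting}");
}
```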

Moving variables

When you pass a variable to a function by value, the variable is moved to the function. This means that the function takes ownership of the variable, and the variable is dropped when the function returns. (Simple types that implement the Copy trait, such as integers, are copied instead of moved.)

Moving variables into functions

The calling scope no longer has access to the variable after it has been moved to the function. You can only move a variable once. If you try to use a variable after it has been moved, the compiler will complain.
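
A move can be sketched like this. The function names count_names and double are hypothetical and exist only for this illustration.

```rust
// Takes ownership of the vector; it is dropped when this function returns.
fn count_names(names: Vec<String>) -> usize {
    names.len()
}

// Integers implement Copy, so the caller keeps its own value.
fn double(number: i32) -> i32 {
    number * 2
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    let count = count_names(names); // `names` is moved into the function
    println!("Counted {count} names");
    // Using `names` here would not compile: the value has been moved.

    let age = 21;
    let doubled = double(age); // `age` is copied, not moved
    println!("{age} doubled is {doubled}");
}
```

Uncommenting a use of names after the call is a quick way to see the compiler's "moved value" error for yourself.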

Borrowing variables

An alternative way of passing variables to a function is by lending the variable to the function. This means that the function can use the variable, but the ownership remains with the calling function. The variable is not dropped when the function goes out of scope. This is called: "borrowing". You use the & symbol to borrow a variable.

Borrowing variables to functions

You can have many read-only borrows at the same time, or exactly one mutable borrow, but not both at once. This ensures that the variable is not modified while other code is reading or modifying it, which could lead to undefined behavior.
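
The borrowing rules can be sketched as follows. The function names length_of and shout are made up for this illustration.

```rust
// Read-only borrow: the function can read the text but not change it.
fn length_of(name: &str) -> usize {
    name.len()
}

// Mutable borrow: the function may change the caller's String.
fn shout(name: &mut String) {
    name.push('!');
}

fn main() {
    let mut name = String::from("Rusty");

    // Several read-only borrows may exist at the same time.
    let first = &name;
    let second = &name;
    println!("{first} and {second}");

    // The read-only borrows are not used after this point,
    // so a single mutable borrow is now allowed.
    shout(&mut name);

    // Ownership never left main, so `name` is still usable here.
    println!("{name} has {} characters", length_of(&name));
}
```

Moving the first println! below the call to shout makes the read-only and mutable borrows overlap, and the compiler will reject the code.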

Cloning

If you are having trouble with ownership and borrowing, you can clone the variable. This creates a new variable with the same value, and you can pass this to the function. Cloning can be expensive, but in practice the actual cost is negligible for most situations.

Cloning variables

In case you need to clone a variable, you can use the clone() method. "Expensive" types, like large structs, can often be wrapped in a smart pointer like Rc or Arc. This allows you to clone the reference to the struct instead of the struct itself. These smart pointers are also a great way of dealing with code that behaves correctly at runtime but that the compiler cannot verify at compile time.

Don't fight the borrow checker. Use clone(), possibly with Rc or Arc to get around lifetime issues. You can always revisit your code and search for these clones and optimize once your Rust toolset has expanded.
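
The difference between cloning the data and cloning a reference-counted pointer can be sketched like this. The function names join_names and count_shared are hypothetical.

```rust
use std::rc::Rc;

// Takes ownership of a full vector.
fn join_names(names: Vec<String>) -> String {
    names.join(", ")
}

// Takes ownership of one reference-counted handle to the vector.
fn count_shared(names: Rc<Vec<String>>) -> usize {
    names.len()
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];

    // clone() copies the vector and every string in it ...
    let joined = join_names(names.clone());
    // ... so the original is still available afterwards.
    println!("{joined} ({} names)", names.len());

    // Cloning an Rc only copies a pointer and increments a counter.
    let shared = Rc::new(names);
    let count = count_shared(Rc::clone(&shared));
    println!("{count} names, still reachable here: {shared:?}");
}
```

Writing Rc::clone(&shared) instead of shared.clone() is a common convention that makes it obvious that only the handle is cloned.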

Enhance the Hello World with functions

Let's enhance our Hello World with a greeter function (how original!):

fn greet(name: &str) {
    println!("Welcome, {name} to the Rust world!");
}

fn main() {
    greet("Rusty");
}

We're using the variable type: &str in the above code. As seen, the & means we are borrowing the variable. The &str type is a string slice, which is a reference to a string. "Rusty" is a literal string (implicitly typed to: &'static str). We can assign string literals to &str without conversion. String slices have a fixed length and cannot be modified.

If you need to modify a string, you can use the String type. This is a growable, mutable string. You can convert a &str to a String by calling the to_string() method on the string slice. The String type offers many methods that allow you to manipulate the string.

If we re-write the above example to use String, it would look like this:

fn greet(name: String) {
    println!("Welcome, {name} to the Rust world!");
}

fn main() {
    greet("Rusty".to_string());
}

We can modify the greet function to greet many folks by passing a list (= vector) of strings:

fn greet(names: Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    greet(vec!["Rusty".to_string(), "Marcel".to_string()]);
}

As explained earlier, there are two things you should know about passing variables:

  1. In Rust, functions are greedy. When you pass a variable to a function, it will take the variable and not give it back, unless you tell it otherwise; i.e. the variable is moved to the function.
  2. In Rust only one function can own a variable at any moment in time.

So with these two rules in mind, let's look at the following code:

fn greet(names: Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(names);
    names.push("John".to_string());
    greet(names);
}

The goal that I had in mind with this code is to greet a number of people: "Rusty" and "Marcel". Then add a person ("John") to the list and greet the lot again.

When we try to run this, we get an error:

error[E0382]: borrow of moved value: `names`
 --> src/main.rs:8:5
  |
6 |     let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
  |         --------- move occurs because `names` has type `std::vec::Vec<std::string::String>`, which does not implement the `Copy` trait
7 |     greet(names);
  |           ----- value moved here
8 |     names.push("John".to_string());
  |     ^^^^^ value borrowed here after move

What has happened is that we gave the "greet" function the "names" variable. It happily took this variable, and as noted before, it does not return this to the calling function ("main").

In our case, after the first call to "greet", the names variable can no longer be used in "main". The "greet" function owns the "names" variable, and because there is no further use for it after println!, the variable is freed when "greet" returns.

We can change this behavior by lending the variable to the "greet" function. This is called borrowing and is done with the & operator, in this way:

fn greet(names: &Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(&names);
    names.push("John".to_string());
    greet(&names);
}

Please note that although this code compiles and runs, Clippy has an optimization recommendation for you!

In the above code, we temporarily lend the "names" variable to the "greet" function, but keep the ownership in the "main" function. In this way, the "names" variable is freed at the end of the "main" function.
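
The Clippy recommendation mentioned above is most likely its ptr_arg lint (an assumption based on how current Clippy versions behave): a function that only reads a list can accept a slice, &[String], instead of &Vec<String>. A minimal sketch of that change, with a hypothetical build_greeting helper split off so the text can be inspected:

```rust
// Accepting a slice works for vectors, arrays and parts of either.
fn build_greeting(names: &[String]) -> String {
    format!("Welcome, {names:?} to the Rust world!")
}

fn greet(names: &[String]) {
    println!("{}", build_greeting(names));
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    // A &Vec<String> is converted to a &[String] automatically.
    greet(&names);
    names.push("John".to_string());
    greet(&names);
}
```

The calling code does not change at all; only the function signature becomes more flexible.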

We'll revisit the topic of ownership and borrowing a few more times, because this can become 'painful' quickly when not understood 100%.

Borrowing and mutating

Imagine we want the "greet" function to clear the list of names after greeting (not debating whether this is good practice or not!). You could come up with something like this:

fn greet(names: &Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(&names);
    names.push("John".to_string());
    greet(&names);
}

The compiler won't let you run this code though:

error[E0596]: cannot borrow `*names` as mutable, as it is behind a `&` reference
 --> src/main.rs:3:5
  |
1 | fn greet(names: &Vec<String>) {
  |                 ------------ help: consider changing this to be a mutable reference: `&mut std::vec::Vec<std::string::String>`
2 |     println!("Welcome, {names:?} to the Rust world!");
3 |     names.clear();
  |     ^^^^^ `names` is a `&` reference, so the data it refers to cannot be borrowed as mutable

Although we're lending the "names" to the "greet" function, we're not explicitly allowing the "greet" function to modify the list of names. If we want to do this, we need to pass it as a "mutable reference", like this:

fn greet(names: &mut Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(&mut names);
    names.push("John".to_string());
    greet(&mut names);
}

Why is there a difference? Although multiple functions can hold a read-only reference simultaneously, only one function can hold a mutable reference at any moment in time. This becomes more evident when we look at multi-threading. Just keep it in mind for now.

We can mix-and-match as needed:

fn greet_and_replace(names: &mut Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
}

fn greet(names: &Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet_and_replace(&mut names);
    greet(&names);
}

There is another way to pass and return ownership: actually taking and returning the variable as part of the function. Let's look at that.

Returning data from a function

We'll stick with the greeter, but modify the example slightly.

fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    let new_names = greet(names);
    greet(new_names);
}

In this example, we're not lending the "names" to "greet", but we give the ownership by passing the variable. The "greet" function modifies the list (that it owns) and returns the modified list to "main". By doing this, it also passes the ownership to "main"!

There are two things to notice:

  1. the "mut" keyword before "names" is mandatory to allow the "greet" function to mutate the list. This has no relationship to ownership, as you can see from the fact that "names" in "main" is not mutable.
  2. the ; is missing on the last statement in the "greet" function.

In Rust, the value of the last expression in a function is returned when it is not followed by a ;. This means that the type of the last expression must match the return type of the function. Try this:

fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string())
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    let new_names = greet(names);
    greet(new_names);
}

You will see that the compiler will complain:

expected struct `std::vec::Vec`, found `()`

This is because the .push() function is returning nothing (), which does not match the Vec<String> that is expected. You can determine the signature of a function by holding down the "cmd" button (on a Mac) and hovering over the function. Often you can click (while holding the "cmd" button) to navigate to the source-code of the function, and if you're lucky, there is actually some documentation and a usage example.

Note that adding a ; to the "names" statement gives a very similar error!

fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names;
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    let new_names = greet(names);
    greet(new_names);
}

Maybe obvious, but you cannot pass a variable to a function when you are no longer the owner:

fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(names);
    greet(names);
}

Nevertheless, you will see this type of error a lot during your first weeks with Rust!

error[E0382]: use of moved value: `names`
  --> src/main.rs:11:11
   |
9  |     let names = vec!["Rusty".to_string(), "Marcel".to_string()];
   |         ----- move occurs because `names` has type `std::vec::Vec<std::string::String>`, which does not implement the `Copy` trait
10 |     greet(names);
   |           ----- value moved here
11 |     greet(names);
   |           ^^^^^ value used here after move

You can fix this in two ways:

  1. Lend the variable to the "greet" function as per the previous examples.
  2. If it is really your intention to pass the same data twice, make a clone.

fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(names.clone());
    greet(names);
}

Don't be afraid to use clone() to get out of these situations (initially). You can always revisit your code and search for these clones and optimize once your Rust toolset has expanded. Or as a colleague said: "sprinkle your code with these clones() until it compiles." :-)

The same happens in this example:

fn greet(name: String) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    let other_name = name;
    greet(name);
    greet(other_name);
}

The two solutions:

fn greet(name: String) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    let other_name = name.clone();
    greet(name);
    greet(other_name);
}

and:

fn greet(name: &String) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    let other_name = &name;
    greet(&name);
    greet(other_name);
}

Note that the type of "other_name" is different in the two examples: "String" vs. "&String".

Be careful with the second example. Although it is more efficient, it sets you up for another common error. Check these two examples:

fn greet(name: String) {
    println!("Welcome {name}");
}

fn main() {
    let mut name = "Marcel".to_string();
    let other_name = name.clone();
    name = "Horaci".to_string();
    greet(name);
    greet(other_name);
}

vs:

fn greet(name: &String) {
    println!("Welcome {name}");
}

fn main() {
    let mut name = "Marcel".to_string();
    let other_name = &name;
    name = "Horaci".to_string();
    greet(&name);
    greet(other_name);
}

You will curse at this one a few more times in the next weeks:

error[E0506]: cannot assign to `name` because it is borrowed
  --> src/main.rs:8:5
   |
7  |     let other_name = &name;
   |                      ----- borrow of `name` occurs here
8  |     name = "Horaci".to_string();
   |     ^^^^ assignment to borrowed `name` occurs here
9  |     greet(&name);
10 |     greet(other_name);
   |           ---------- borrow later used here

Swapping the two lines will do away with the error for now, but this was clearly not the intention of the programmer:

fn greet(name: &String) {
    println!("Welcome {name}");
}

fn main() {
    let mut name = "Marcel".to_string();
    name = "Horaci".to_string();
    let other_name = &name;
    greet(&name);
    greet(other_name);
}

You can fix it as it was intended using interior mutability. This is a more advanced topic. For now, remember that you can use Rc and RefCell to get around these issues. We'll revisit this topic later.

Warning: advanced example upcoming; ignore at will.

use std::cell::RefCell;
use std::rc::Rc;

fn greet(name: Rc<RefCell<String>>) {
    println!("Welcome {}", name.borrow());
}

fn main() {
    let name = Rc::new(RefCell::new("Marcel".to_string()));
    let other_name = name.clone();
    greet(name.clone());
    greet(other_name.clone());
    name.replace("Horaci".to_string());
    greet(name);
    greet(other_name);
}

Although you can pretty much forget about this example for now, it is worth noting that the .clone() operations in this example are cloning the reference (Rc) to the String, not the String itself. The inner RefCell wrapper allows us to mutate the String. So for folks coming from the C-world, we are effectively creating a (smart) pointer to a string.

Reference material

'Production' deployment

Deploying software to production is much more than building a production-optimized binary. However, it is an important step, and the only step we'll cover in this chapter.

So far, we've been building unoptimized development builds. These are faster to build, but not optimized for speed or size.

One way of building an optimized version of your application is to double-tap Ctrl and enter cargo build --release from within RustRover.

Let's return to an earlier example and do just that:

fn greet(name: &str) {
    println!("Welcome, {name} to the Rust world!");
}

fn main() {
    greet("Rusty");
}

After compilation is complete we'll find our 'production'-ready binary in target/release as opposed to target/debug. This is where we'll discover one of the great aspects of Rust:

du -hs ./target/release/hello_rust_world
280K	./target/release/hello_rust_world

Our optimized binary is 280KB in size!

Binary size will increase as our application becomes more complex and as we pull in more dependencies, but it will typically still be significantly smaller than what many other high-level languages produce.

(Optimized) Rust code competes with C in terms of absolute speed: Benchmark games

Session 2 - Beyond Hello World

During this lesson, we'll move beyond the Hello World examples from the previous session and introduce concepts like control flow and complex data types.

Objective

After this session, you should comprehend these mechanisms in Rust:

  • control flow
  • returning results
  • complex data types
  • optionals

Control flow

Recommended reading: Control flow

The Control flow chapter of the Rust book covers all the standard use cases in a very clear way. People with experience in any other programming language should have no issues with this concept. That's why I won't repeat it again in this chapter and assume you're familiar with the basics.

In this chapter, we'll focus on some of the hidden gems that may not be immediately recognized when reading the book.

Assigning values from an if statement

fn main() {
    let age = 18;
    let is_child = if age < 18 { true } else { false };
    println!("Is child: {is_child}");
}

Notice the lack of ; behind the 'true' and 'false'. This allows us to return the value as a result of the if statement.

Extending the concept of returning from an if statement, we could write something like this. (Admittedly, there are better ways of getting to the same result!)

fn main() {
    let age = 15;
    let is_teenager = if age < 18 {
        if age >= 10 {
            true
        } else {
            false
        }
    } else {
        false
    };
    println!("Is teenager: {is_teenager}");
}

We can combine this with a function:

fn is_child_a_teenager(age: i32) -> bool {
    age >= 10
}

fn main() {
    let age = 15;
    let is_teenager = if age < 18 {
        is_child_a_teenager(age)
    } else {
        false
    };
    println!("Is teenager: {is_teenager}");
}

To effectively end up with:

fn is_child(age: i32) -> bool {
    age < 18
}

fn is_child_a_teenager(age: i32) -> bool {
    age >= 10
}

fn main() {
    let age = 15;
    let is_teenager = is_child(age) && is_child_a_teenager(age);
    println!("Is teenager: {is_teenager}");
}

Check these examples and pay attention to the (lack of) semicolons ; where we are returning results.

Did you run cargo clippy on the first two examples? Did you see how Clippy can help you write better code?!

There is another control flow construct in Rust that is commonly used: the match operator.

The match operator

Match operators can be used to match a single value against a variety of patterns. This is best demonstrated with an example:

fn main() {
    let age = 15;

    match age {
        0..=9 => println!("child"),
        10..=17 => println!("teenager"),
        _ => println!("adult"),
    }
}

Match patterns must be exhaustive. If not all cases are explicitly handled, the last arm must be a "catch-all" without conditions.
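
When the patterns already cover every possible value, no catch-all is needed. A minimal sketch, assuming the age is stored as a u8 (a choice made for this illustration, so that the full range of values is small and known):

```rust
// Every value a u8 can hold (0 through 255) is covered by one of
// the three ranges, so the compiler accepts this match without
// a catch-all arm.
fn age_group(age: u8) -> &'static str {
    match age {
        0..=9 => "child",
        10..=17 => "teenager",
        18..=u8::MAX => "adult",
    }
}

fn main() {
    println!("{}", age_group(15));
}
```

Removing any one of the three arms leaves a gap, and the compiler will report which values are not covered.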

As with if statements, you can return a value from a match statement.

fn main() {
    let age = 15;

    let description = match age {
        0..=9 => "child",
        10..=17 => "teenager",
        _ => "adult",
    };

    println!("{description}");
}

Notice the ; at the end of the match block!

Match blocks can also be used to replace a set of if...else statements.

fn main() {
    let age = 15;

    let description = match age {
        _ if age < 10 => "child",
        _ if age >= 18 => "adult",
        _ => "teenager",
    };

    println!("{description}");
}

The _ in the above examples is a wildcard pattern: it matches any value, and we ignore that value. Similarly, variables that are not used can be marked with an _ prefix. If needed, we can capture the value in a variable rather than ignore it:

fn main() {
    let age = 15;

    let description = match age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    println!("{description}");
}

If you are - like me - a fan of early returns, you can use match blocks to quickly stop execution. Typically this is used to stop processing in case of an error, but for the sake of demonstration, have a look at this example:

fn age_group(age: i32) -> String {
    let valid_age = match age {
        _ if age < 0 => return "not born yet".to_string(),
        _ if age > 150 => return "seriously?!!".to_string(),
        validated => validated,
    };

    match valid_age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    }
}

fn main() {
    let age = 15;
    let description = age_group(age);
    println!("{description}");
}

Notice that valid_age only gets a value assigned when 0 <= age <= 150. The other conditions - with the explicit return - exit the function early with an 'error' condition.

You can do some super powerful stuff with matching on patterns, like matching on parts of an array or struct, but we'll cover that a bit later.

Reference material

Loops

If you read the Control flow chapter, you have already been introduced to loops. To recap, there are basically three ways to create a loop:

  • for
  • loop
  • while

for loops

A super common way to repeat an action a number of times is with a for loop:

fn main() {
    for i in 0..10 {
        print!("{} ", i)
    }

    println!();
}

Like many other development languages, Rust also supports custom step-sizes while iterating:

fn main() {
    for i in (0..10).step_by(2) {
        print!("{} ", i)
    }

    println!();
}

... or reverse traversal:

fn main() {
    for i in (0..10).rev() {
        print!("{} ", i)
    }

    println!();
}

loop loops

For custom scenarios a loop might be better suited. loop loops are often combined with a break statement to exit the loop. As per the Rust book, it can be very useful to capture the result of a loop in a variable.

fn main() {
    let mut i = 103;

    let result = loop {
        i += 1;
        if i % 25 == 0 {
            break i;
        }
    };

    println!("{result} can be divided by 25");
}

Of course, you could just print the value of i in the above example. Pretend you didn't notice that.

inner loops

Rust supports labels for loops. This can be useful when you have nested loops, and you want to break out of the outer loop from the inner loop.

fn main() {
    'outer: loop {
        println!("Entered the outer loop");
        loop {
            println!("Entered the inner loop");
            break 'outer;
        }

        println!("This point will never be reached");
    }

    println!("Exited the outer loop");
}
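
Labels work on for and while loops as well. The sketch below, with the made-up function name first_pair_with_product, searches for two numbers below 10 whose product equals a target and leaves both loops as soon as a match is found. The (0, 0) "nothing found" marker is an assumption made to keep the example short.

```rust
fn first_pair_with_product(target: u32) -> (u32, u32) {
    // (0, 0) is our marker for "nothing found".
    let mut found = (0, 0);

    'outer: for a in 1..10 {
        for b in a..10 {
            if a * b == target {
                found = (a, b);
                // Leaves the inner and the outer loop at once.
                break 'outer;
            }
        }
    }

    found
}

fn main() {
    let (a, b) = first_pair_with_product(12);
    println!("{a} x {b} = 12");
}
```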

while loops

Conditional loops with while might be used to achieve a similar result.

fn main() {
    let mut i = 103;

    while i % 25 != 0 {
        i += 1;
    }

    println!("{i} can be divided by 25");
}

Don't go crazy!

As you've seen, results of if, match and loop blocks can be assigned to a variable. Although this is a very powerful mechanism, don't go crazy on nesting these variations. The fact that something can be done does not mean it must be done. Someone needs to maintain and debug this code!

Imagine you need to work on something like the below:

fn main() {
    for i in &[0, 2, 4, 17, 29, 45, 102] {
        let age = *i;
        let description = match age {
            0..=9 => {
                if age < 2 {
                    "baby"
                } else {
                    match age {
                        2..=3 => "toddler",
                        _ => "child",
                    }
                }
            }
            10..=17 => "teenager",
            a => {
                if a < 30 {
                    "young adult"
                } else {
                    let mut i = a;
                    loop {
                        i += 1;
                        if i == 65 {
                            break "adult";
                        } else if i > 65 {
                            break "elderly";
                        }
                    }
                }
            }
        };

        println!("{age} = {description}");
    }
}
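
For comparison, here is one possible flattened version of the same logic. It is a sketch under one assumption: ages are taken as u32, so negative values (which the nested version happens to label "young adult") cannot occur. The function name describe is made up for this illustration.

```rust
fn describe(age: u32) -> String {
    // One flat match with range patterns replaces the nested
    // if / match / loop combination.
    let label = match age {
        0..=1 => "baby",
        2..=3 => "toddler",
        4..=9 => "child",
        10..=17 => "teenager",
        18..=29 => "young adult",
        30..=64 => "adult",
        _ => "elderly",
    };

    label.to_string()
}

fn main() {
    for age in [0, 2, 4, 17, 29, 45, 102] {
        println!("{age} = {}", describe(age));
    }
}
```

Every age group is now visible at a glance, and the compiler checks that the ranges cover all possible values.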

Using variables in loops

Let's have a quick look at the below example. At first glance, this may seem like valid code that greets someone 10 times.

fn greet(name: String) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    for _i in 0..10 {
        greet(name);
    }
}

When we compile the code, we are greeted with our old friend: error[E0382]: use of moved value: `name`. This happens because ownership of the name variable is passed to the greet() function during the first iteration of the loop. Subsequent iterations no longer have access to name, so the code does not compile.

In this case we can easily borrow the name variable to get us out of this situation:

fn greet(name: &str) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    for _i in 0..10 {
        greet(&name);
    }
}
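
If the function really needs to own its argument, another way out is to hand it a copy on every iteration. The sketch below assumes a variation of greet that takes a String and returns the greeting instead of printing it, which makes it easier to check.

```rust
fn greet(name: String) -> String {
    format!("Welcome {name}")
}

fn main() {
    let name = "Marcel".to_string();
    for _i in 0..3 {
        // Each iteration hands `greet` its own copy; `name` stays usable.
        println!("{}", greet(name.clone()));
    }

    println!("{name} is still available here");
}
```

Cloning allocates a new String every time, so borrowing remains the cheaper choice whenever the function only needs to read the value.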

Reference material

Returning results

Let's take a look at how you return results in Rust. Just to recap, in Rust, the way functions return values is a bit different from other languages. In Rust, the last expression in a function is the return value. This means that you don't need to use the return keyword to return a value from a function. The last expression in the function is automatically returned.

fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    let result = add(1, 2);
    println!("The result is: {result}");
}

As seen in the previous example, the last expression in the add function is a + b, which is automatically returned from the function. This is a very convenient feature of Rust.

Note that there is no ; at the end of the a + b expression. Adding a ; would turn the expression into a statement, so the function body would no longer produce the result of a + b but the unit type () instead, and the compiler would reject it because the function promises an i32. This is a common mistake for people coming from other languages. The unit type () is what a function returns when no return type is specified; in C this would be void.
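
A small sketch of the difference; the report function is made up for this illustration:

```rust
fn add(a: i32, b: i32) -> i32 {
    a + b // no semicolon: this expression is the return value
}

// No return type specified, so this function returns the unit type ().
fn report(value: i32) {
    println!("The value is: {value}");
}

fn main() {
    let sum = add(1, 2);

    // We can even store the unit value, although there is little use for it.
    let nothing: () = report(sum);
    println!("{nothing:?}");
}
```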

If you combine this with what you've learned in the previous chapter, you can see that you can return the result of a match block, an if statement, or a loop block. Here's an example showing how to return the result of an if statement:

fn max(a: i32, b: i32) -> i32 {
    if a > b {
        a
    } else {
        b
    }
}

fn main() {
    let result = max(1, 2);
    println!("{result} is the bigger number");
}

Rust also supports the return keyword. This is especially useful when you want to return early from a function. Here's an example:

fn max(a: i32, b: i32) -> i32 {
    if a > b {
        return a;
    }
    b
}

fn main() {
    let result = max(1, 2);
    println!("{result} is the bigger number");
}

The last two examples are equivalent. The return keyword is often used to make code more readable and to prevent nested if statements.

The Result type

So far, we have been returning "errors" as well as good results in the same way. This is not very useful, especially for downstream error handling.

In Rust, functions that have a positive and a negative path typically return a Result type. Result is an enumeration that holds the positive result in Ok() and the negative result (the error) in Err(). This is referred to as a "recoverable error" type.

Let's rewrite the previous example to introduce this concept.

fn age_group(age: i32) -> Result<String, String> {
    let valid_age = match age {
        _ if age < 0 => return Err("not born yet".to_string()),
        _ if age > 150 => return Err("seriously?!!".to_string()),
        validated => validated,
    };

    let result = match valid_age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    Ok(result)
}

fn main() {
    let age = 15;
    match age_group(age) {
        Ok(description) => println!("{}", description),
        Err(err) => println!("Error: {}", err),
    }
}

This example introduces a few new concepts. First we have the Result type itself. As you can see it is defined in this way: Result<T,E>, where T = the positive result type, and E = the error result type. There are also a few short-hands in the code to facilitate the use of Result:

  • Ok(...)
  • Err(...)

So when returning a Result we should use one of these shorthands to indicate if the Result is 'Ok' or an 'Err'. On the receiving end, we can use a match block to easily separate the Ok and Err responses.
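
Many functions in the standard library return a Result as well. As a sketch, the made-up function parse_age below wraps the standard parse method and translates its error into the String error type used in this chapter.

```rust
fn parse_age(input: &str) -> Result<i32, String> {
    // `parse::<i32>()` returns a Result of its own; we translate it into ours.
    match input.trim().parse::<i32>() {
        Ok(age) => Ok(age),
        Err(_) => Err(format!("'{input}' is not a number")),
    }
}

fn main() {
    for input in ["15", "fifteen"] {
        match parse_age(input) {
            Ok(age) => println!("Parsed age: {age}"),
            Err(err) => println!("Error: {err}"),
        }
    }
}
```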

There may be situations where an unrecoverable error occurs. These are situations that cannot be handled downstream and have only one remedy: stop execution.

You can use the panic! macro in these situations. It is used similarly to the println! macro that we've used extensively so far.

fn age_group(age: i32) -> Result<String, String> {
    if age < 0 || age > 150 {
        panic!("age is out of range");
    }

    let result = match age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    Ok(result)
}

fn main() {
    let age = -1;
    match age_group(age) {
        Ok(description) => println!("{}", description),
        Err(err) => println!("Error: {}", err),
    }
}

Check the output of the above example.

Type alias

If you are using the Result<String, String> result type throughout your code, it makes sense to create a type alias for this combination. This is done with the type keyword like this:

type AgeResult = Result<String, String>;

fn age_group(age: i32) -> AgeResult {
    if age < 0 || age > 150 {
        panic!("age is out of range");
    }

    let result = match age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    Ok(result)
}

fn main() {
    let age = 20;
    match age_group(age) {
        Ok(description) => println!("{}", description),
        Err(err) => println!("Error: {}", err),
    }
}
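
An alias can also keep a type parameter. The sketch below assumes a made-up alias AppResult<T> that fixes only the error type, so functions with different positive results can share it.

```rust
// One alias for every function that reports its errors as a String.
type AppResult<T> = Result<T, String>;

fn check_age(age: i32) -> AppResult<i32> {
    if age < 0 || age > 150 {
        return Err("age is out of range".to_string());
    }
    Ok(age)
}

fn age_group(age: i32) -> AppResult<String> {
    match check_age(age) {
        Ok(a) if a < 10 => Ok("child".to_string()),
        Ok(a) if a >= 18 => Ok("adult".to_string()),
        Ok(a) => Ok(format!("teenager of {a} years old")),
        Err(err) => Err(err),
    }
}

fn main() {
    match age_group(20) {
        Ok(description) => println!("{description}"),
        Err(err) => println!("Error: {err}"),
    }
}
```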

The ? operator

Rust provides a very convenient operator to test whether an operation succeeded. If it did, the operator captures the positive result. If it did not, the operator returns from the function with the error. This is especially useful when a function runs a sequence of operations that could each fail. Like so:

fn too_young(age: i32) -> Result<i32, String> {
    if age < 0 {
        Err("too young".to_string())
    } else {
        Ok(age)
    }
}

fn too_old(age: i32) -> Result<i32, String> {
    if age > 150 {
        Err("too old".to_string())
    } else {
        Ok(age)
    }
}

fn check_age(age: i32) -> Result<i32, String> {
    let age = too_young(age)?;
    let age = too_old(age)?;
    Ok(age)
}

fn age_group(age: i32) -> Result<String, String> {
    let validated_age = check_age(age)?;

    let result = match validated_age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    Ok(result)
}

fn main() {
    let age = 200;
    match age_group(age) {
        Ok(description) => println!("{}", description),
        Err(err) => println!("Error: {}", err),
    }
}

The ? operator can be used as long as the error types of the functions match, or the error can be converted into the error type of the calling function.
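
To see what the operator saves us, here is a sketch that shows the same check twice: once with ? and once written out by hand for the case where the error types are identical. The names check_short and check_long are made up for this comparison.

```rust
fn too_old(age: i32) -> Result<i32, String> {
    if age > 150 {
        Err("too old".to_string())
    } else {
        Ok(age)
    }
}

// With the ? operator.
fn check_short(age: i32) -> Result<i32, String> {
    let age = too_old(age)?;
    Ok(age)
}

// Roughly what ? does for us, written out by hand.
fn check_long(age: i32) -> Result<i32, String> {
    let age = match too_old(age) {
        Ok(value) => value,
        Err(err) => return Err(err),
    };
    Ok(age)
}

fn main() {
    println!("{:?}", check_short(200));
    println!("{:?}", check_long(200));
}
```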

if let statement

If you are only interested in the positive result, you can create a conditional path to handle the positive case. Or in reverse, you can create a path that takes care of the error situation and let the positive case pass through. Here's an example:

fn age_group(age: i32) -> Result<String, String> {
    let valid_age = match age {
        _ if age < 0 => return Err("not born yet".to_string()),
        _ if age > 150 => return Err("seriously?!!".to_string()),
        validated => validated,
    };

    let result = match valid_age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    Ok(result)
}

fn main() {
    let age = 15;
    let age_result = age_group(age);

    if let Ok(description) = age_result {
        println!("{}", description);
    }
}

Now check what happens if you add the negative path as well:

fn age_group(age: i32) -> Result<String, String> {
    let valid_age = match age {
        _ if age < 0 => return Err("not born yet".to_string()),
        _ if age > 150 => return Err("seriously?!!".to_string()),
        validated => validated,
    };

    let result = match valid_age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    Ok(result)
}

fn main() {
    let age = 15;
    let age_result = age_group(age);

    if let Ok(description) = age_result {
        println!("{}", description);
    }

    if let Err(err) = age_result {
        println!("Error: {}", err);
    }
}

We see a familiar error:

error[E0382]: use of moved value: `age_result`
  --> src/main.rs:25:23
   |
21 |     if let Ok(description) = age_result {
   |               ----------- value moved here
...
25 |     if let Err(err) = age_result {
   |                       ^^^^^^^^^^ value used here after partial move
   |
   = note: move occurs because value has type `std::string::String`, which does not implement the `Copy` trait

What happened?

When we execute the if let statement, we are actually moving part of the Result into the description variable, which moves the ownership with it. This means that age_result is no longer available after the if let statement. Often you can borrow the Result in these situations:

fn age_group(age: i32) -> Result<String, String> {
    let valid_age = match age {
        _ if age < 0 => return Err("not born yet".to_string()),
        _ if age > 150 => return Err("seriously?!!".to_string()),
        validated => validated,
    };

    let result = match valid_age {
        _ if age < 10 => "child".to_string(),
        _ if age >= 18 => "adult".to_string(),
        a => format!("teenager of {} years old", a),
    };

    Ok(result)
}

fn main() {
    let age = 15;
    let age_result = age_group(age);

    if let Ok(description) = &age_result {
        println!("{}", description);
    }

    if let Err(err) = &age_result {
        println!("Error: {}", err);
    }
}

Reference material

Complex types

So far, we have been using simple types. In our business, we're typically dealing with complex data types. In Rust, we use struct to name and group together multiple values that somehow belong together.

We can adapt our previous example to handle a more complex data type.

struct Person {
    name: String,
    age: u8,
}

fn age_group(person: &Person) -> String {
    if person.age > 150 {
        panic!("age is out of range");
    }

    if person.age < 10 {
        return "child".to_string();
    }

    if person.age >= 18 {
        return "adult".to_string();
    }

    format!("teenager of {} years old", person.age)
}

fn main() {
    let person = Person {
        name: "Marcel".to_string(),
        age: 40,
    };

    let description = age_group(&person);
    println!("{} is a {}", person.name, description);
}

In this example, we've grouped "name" and "age" in a structure called "Person". We've typed age to be of type u8, and name to be a String. We pass a reference to person to the age_group function, so that the function only borrows it and we can still use person later on in our println! statement.

In Rust, a struct is not limited to holding data; you can also add functionality to a struct type. Let's say we want to add the age_group() function to Person. We can rewrite the example like this:

struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn age_group(&self) -> String {
        if self.age > 150 {
            panic!("age is out of range");
        }

        if self.age < 10 {
            return "child".to_string();
        }

        if self.age >= 18 {
            return "adult".to_string();
        }

        format!("teenager of {} years old", self.age)
    }
}

fn main() {
    let person = Person {
        name: "Marcel".to_string(),
        age: 40,
    };

    println!("{} is a {}", person.name, person.age_group());
}

Notice that the age_group method is borrowing self (&self) in order to reference its own properties.

Functions that take self, &self or &mut self as their first parameter are referred to as methods.

For smaller struct types, it may be fine to create the struct directly, like in the above example. Often however, it is a better choice to implement a new function to construct the structure with the mandatory fields:

struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: String, age: u8) -> Self {
        Person { name, age }
    }

    fn age_group(&self) -> String {
        if self.age > 150 {
            panic!("age is out of range");
        }

        if self.age < 10 {
            return "child".to_string();
        }

        if self.age >= 18 {
            return "adult".to_string();
        }

        format!("teenager of {} years old", self.age)
    }
}

fn main() {
    let person = Person::new("Marcel".to_string(), 40);
    println!("{} is a {}", person.name, person.age_group());
}

Notice that the new function is not referencing &self. Other languages may call the new function a static function. In Rust these types of functions are called associated functions. The function is returning Self which is the same as if it was returning Person.

Also notice the shorthands for age and name in the new() function where Person is constructed.

Destructuring structs

You can destructure a struct into its parts with a simple let statement:

struct Person {
    name: String,
    age: u8,
}

impl Person {
    fn new(name: String, age: u8) -> Self {
        Person { name, age }
    }
}

fn main() {
    let person = Person::new("Marcel".to_string(), 40);
    let Person { name: my_name, age } = person;
    println!("{my_name} is {age} years old");
}
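
The let statement above moves name out of person, so person can no longer be used as a whole afterwards. If that is not what you want, you can destructure a reference instead. A minimal sketch, using a made-up summary function:

```rust
struct Person {
    name: String,
    age: u8,
}

fn summary(person: &Person) -> String {
    // Destructuring a reference borrows the fields: `name` is a &String
    // and `age` is a &u8, so the Person itself is left intact.
    let Person { name, age } = person;
    format!("{name} is {age} years old")
}

fn main() {
    let person = Person {
        name: "Marcel".to_string(),
        age: 40,
    };

    println!("{}", summary(&person));
    // `person` was only borrowed, so it can still be used here.
    println!("{} can be used again", person.name);
}
```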

Reference material

Optionals

In Rust, there is no concept of "null" or "nil", which is a good thing! To work with a variable that may not have a value, you wrap its type in an Option. Clearly, you need to test whether an Option has a value before you can use it, which is what we will explore in this chapter.

struct Person {
    name: String,
    age: u8,
    address: Option<String>,
}

impl Person {
    fn new(name: String, age: u8) -> Self {
        Person {
            name,
            age,
            address: None,
        }
    }
}

fn main() {
    let person = Person::new("Marcel".to_string(), 40);
    let Person {
        name: my_name,
        age,
        address: _address,
    } = person;

    println!("{my_name} is {age} years old");
}

This example behaves exactly the same as the previous one, but has a placeholder for an optional address, that is initialized without a value by the new function.

Let's expand the example and assign a value to address:

struct Person {
    name: String,
    age: u8,
    address: Option<String>,
}

impl Person {
    fn new(name: String, age: u8) -> Self {
        Person {
            name,
            age,
            address: None,
        }
    }
}

fn main() {
    let mut person = Person::new("Marcel".to_string(), 40);
    person.address = Some("Developer Ave 10".to_string());

    let Person {
        name: my_name,
        age,
        address,
    } = person;

    if let Some(my_address) = address {
        println!("{my_name} is {age} years old and lives at {my_address}");
    } else {
        println!(
            "{my_name} is {age} years old. We have no address on file"
        );
    }
}

You can see that we use the None and Some shorthand wrappers to assign a value, or no value, to an Option. Before an optional value can be used, it has to be tested to see if it contains a value. The if let Some(...) statement is a typical way of doing this.

In the above example, the if let Some(my_address) = address statement creates a new variable my_address in case address has a value. The my_address variable is scoped to the if block.

If you know for a fact that an Option has a value, you can forcefully "unwrap" the Option with either the unwrap() or the expect() method. Both methods panic if the value is None. expect() allows you to provide a meaningful message while panicking. That is why you are unlikely to use unwrap() anywhere outside of development: it makes debugging unnecessarily hard.

Same example with expect():

struct Person {
    name: String,
    age: u8,
    address: Option<String>,
}

impl Person {
    fn new(name: String, age: u8) -> Self {
        Person {
            name,
            age,
            address: None,
        }
    }
}

fn main() {
    let mut person = Person::new("Marcel".to_string(), 40);
    person.address = Some("Developer Ave 10".to_string());

    if person.address.is_some() {
        println!(
            "{} is {} years old and lives at {}",
            person.name,
            person.age,
            person.address.expect("address has no value")
        );
    } else {
        println!(
            "{} is {} years old. We have no address on file",
            person.name, person.age
        );
    }
}

The use of if and is_some() is not the idiomatic way of doing this. The if let Some(...) statement is the preferred way of doing this. The above example is just to show you that you can use is_some() to check if an Option has a value.

See what happens when you remove the address assignment and the is_some() check:

struct Person {
    name: String,
    age: u8,
    address: Option<String>,
}

impl Person {
    fn new(name: String, age: u8) -> Self {
        Person {
            name,
            age,
            address: None,
        }
    }
}

fn main() {
    let person = Person::new("Marcel".to_string(), 40);
    println!(
        "{} is {} years old and lives at {}",
        person.name,
        person.age,
        person.address.expect("address has no value")
    );
}

Panicking is not undefined behaviour, but it is not a good practice to panic in production code. It is better to handle the situation gracefully. In the above example, we could have used the if let Some(...) statement to handle the situation.
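
Another graceful option is to supply a fallback value. The sketch below uses the standard as_deref() and unwrap_or() methods of Option; the address_line function, the trimmed-down Person and the fallback text are made up for this illustration.

```rust
struct Person {
    name: String,
    address: Option<String>,
}

fn address_line(person: &Person) -> String {
    // `as_deref` turns the Option<String> into an Option<&str> without
    // moving it; `unwrap_or` supplies a fallback instead of panicking.
    let address = person.address.as_deref().unwrap_or("an unknown address");
    format!("{} lives at {}", person.name, address)
}

fn main() {
    let mut person = Person {
        name: "Marcel".to_string(),
        address: None,
    };
    println!("{}", address_line(&person));

    person.address = Some("Developer Ave 10".to_string());
    println!("{}", address_line(&person));
}
```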

Since Rust 1.65 you can also use the let else statement to handle the situation. Here's the above example using the let else statement:

struct Person {
    name: String,
    age: u8,
    address: Option<String>,
}

impl Person {
    fn new(name: String, age: u8) -> Self {
        Person {
            name,
            age,
            address: None,
        }
    }
}

fn main() {
    let person = Person::new("Marcel".to_string(), 40);
    let Some(address) = person.address else {
        println!(
            "{} is {} years old. We have no address on file",
            person.name, person.age
        );
        return;
    };

    println!(
        "{} is {} years old and lives at {}",
        person.name, person.age, address
    );
}

The let else can be a useful pattern to avoid deep nesting of if let Some statements. It uses the quick return pattern.

Note that the else part of the let else statement must diverge from the happy path. In the above example, the else part is a println! statement and a return statement. The return statement is necessary to stop the execution of the function. If you don't return, you will get a compiler error.

You will typically use return, break, continue or panic! to diverge from the happy path.

Functions in Rust that deal with optional data will either return an Option or panic!. If a function that could point to a non-existent item does not return an Option, you can expect it to panic when it can't find the item you're requesting. Check the documentation to understand the different behaviour.

Here's such an example:

Panics:

fn main() {
    let mut names = vec!["Tom", "Dick", "Harry"];
    let last_name_in_list = names.remove(3);
    println!("{last_name_in_list}")
}

If the intent was to remove and display the last item from the list, the pop() function would have been a better fit.

fn main() {
    let mut names = vec!["Tom", "Dick", "Harry"];
    if let Some(last_name_in_list) = names.pop() {
        println!("{last_name_in_list}")
    }
}
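
If you only want to look at an item without removing it, the get() function is the Option-returning counterpart of indexing. A small sketch with a made-up name_at function:

```rust
// `&[&str]` is a borrowed view (a slice) of a list of string slices.
fn name_at(names: &[&str], index: usize) -> String {
    // `get` returns an Option instead of panicking on a bad index.
    match names.get(index) {
        Some(name) => name.to_string(),
        None => format!("no name at index {index}"),
    }
}

fn main() {
    let names = vec!["Tom", "Dick", "Harry"];
    println!("{}", name_at(&names, 2));
    println!("{}", name_at(&names, 3));
}
```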

Similar to if let, Rust supports a while let variant to loop until a None value is found. Let's adapt the above example to print all names in the list (in reverse order).

fn main() {
    let mut names = vec!["Tom", "Dick", "Harry"];
    while let Some(last_name_in_list) = names.pop() {
        println!("{last_name_in_list}")
    }
}

Enumerations

In the previous chapter we looked at the Option type. If we examine the definition of Option we see that it is actually an enumeration:

pub enum Option<T> {
    None,
    Some(T),
}

An enumeration is a type that can have a fixed set of values. In the case of Option, the values are None and Some(T).

Unlike in other languages, Rust's enumerations can hold data. In the case of Option, the Some variant holds a value of type T. T is a generic type parameter, which means that Option can hold any type of value.

Let's look at another example of an enumeration:

enum Direction {
    Up,
    Down,
    Left,
    Right,
}

In this example, Direction is an enumeration with four variants: Up, Down, Left, and Right. Each variant represents a direction.

Imagine we would like to include the velocity of the movement in the Direction enumeration. We can do this by associating data with each variant:

enum Direction {
    Up(u32),
    Down(u32),
    Left(u32),
    Right(u32),
}

The power of enumerations becomes apparent when we use them in combination with match expressions. A match expression allows us to match on the different variants of an enumeration and execute different code based on the variant.

Let's look at an example:

fn main() {
    move_player(Direction::Up(10));
}

enum Direction {
    Up(u32),
    Down(u32),
    Left(u32),
    Right(u32),
}

fn move_player(direction: Direction) {
    match direction {
        Direction::Up(velocity) => println!("Moving up with velocity: {velocity}"),
        Direction::Down(velocity) => println!("Moving down with velocity: {velocity}"),
        Direction::Left(velocity) => println!("Moving left with velocity: {velocity}"),
        Direction::Right(velocity) => println!("Moving right with velocity: {velocity}"),
    }
}

The associated data in the enumeration variants can be different for each variant. For an elevator, we might want to treat the vertical movement differently from the horizontal movement. We can do this by associating different data with the Up and Down variants.

fn main() {
    move_elevator(Direction::Up(10));
}

enum Direction {
    Up(u32),
    Down(u32),
    Left,
    Right,
}

fn move_elevator(direction: Direction) {
    match direction {
        Direction::Up(velocity) | Direction::Down(velocity) => println!("Moving up or down with velocity: {velocity}"),
        Direction::Left | Direction::Right => println!("Call 911!"),
    }
}
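
Variants can also carry named fields, just like a struct. The sketch below uses a made-up Command enumeration to show the three flavours side by side.

```rust
enum Command {
    // No data.
    Stop,
    // One unnamed value.
    Move(u32),
    // Named fields, like a struct.
    Teleport { x: i32, y: i32 },
}

fn describe(command: &Command) -> String {
    match command {
        Command::Stop => "Stopping".to_string(),
        Command::Move(velocity) => format!("Moving with velocity: {velocity}"),
        Command::Teleport { x, y } => format!("Teleporting to ({x}, {y})"),
    }
}

fn main() {
    let commands = [
        Command::Stop,
        Command::Move(10),
        Command::Teleport { x: 3, y: -4 },
    ];

    for command in &commands {
        println!("{}", describe(command));
    }
}
```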

C-style Enumerations

Rust also has C-style enumerations, which are similar to enumerations in C. They are defined using the enum keyword without any associated data:

#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

fn main() {
    println!("Blue: {}", Color::Blue as u8);
}

This is a less-commonly used feature in Rust, but it can be useful when you need to map values to integers.
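
The as cast only works from the enumeration to the integer. Going the other way needs a conversion of your own, because not every integer is a valid variant. A minimal sketch, with a made-up color_from_u8 function:

```rust
#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// Returns None for numbers that do not correspond to a variant.
fn color_from_u8(value: u8) -> Option<Color> {
    match value {
        0 => Some(Color::Red),
        1 => Some(Color::Green),
        2 => Some(Color::Blue),
        _ => None,
    }
}

fn main() {
    for value in [2, 7] {
        match color_from_u8(value) {
            Some(color) => println!("{value} maps to variant {}", color as u8),
            None => println!("{value} is not a valid color"),
        }
    }
}
```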

All about self

In this chapter we'll explore the self keyword and its uses.

In previous chapters we have seen some examples of methods that take &self as a parameter. We've also seen that, inside a method, self gives access to the fields of the instance.

For example:

struct Circle {
    radius: f64,
}

impl Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius.powi(2)
    }
}

fn main() {
    let circle = Circle { radius: 10.0 };
    println!("Area: {}", circle.area());
}

In this example, the area method uses self to read the radius of the circle instance it was called on. Why do we need the & in fn area(&self)?

The & means that the method receives a reference to the Circle instance. We don't want to take ownership of the Circle instance when calling the area method; we only want to borrow it.

So... Are there other ways to use self? Yes, there are. Let's explore them.

self in methods

In the previous example, we used &self to borrow the Circle instance. We can also use self to take ownership of the instance. This is useful when we want to 'consume' the instance and return a new instance in its place. This is commonly used to implement type conversion methods.

For example:

struct Circle {
    radius: f64,
}

struct BigCircle {
    radius: f64,
}

impl Circle {
    fn new(radius: f64) -> Circle {
        Circle { radius }
    }

    fn into_big_circle(self) -> BigCircle {
        BigCircle { radius: self.radius * 2.0 }
    }
}

fn main() {
    let big_circle = Circle::new(2.0).into_big_circle();
    println!("Radius: {}", big_circle.radius);
}

In this example, the into_big_circle method takes ownership of the Circle instance and returns a new BigCircle instance with the new radius.

mut self in methods

We can also use mut self to take ownership of the instance and make it mutable inside the method. This is useful when we want to modify the instance and hand it back to the caller. This is commonly used to implement 'builder' patterns.

For example:

struct Circle {
    radius: f64,
}

impl Circle {
    fn new() -> Circle {
        Circle { radius: 1.0 }
    }

    fn with_radius(mut self, radius: f64) -> Circle {
        self.radius = radius;
        self
    }
}

fn main() {
    let circle = Circle::new().with_radius(10.0);
    println!("Radius: {}", circle.radius);
}

In this example, the with_radius method takes ownership of the Circle instance, modifies it, and returns it to the caller.

&mut self in methods

We can also use &mut self to take a mutable reference to the instance. This is useful when we want to modify the instance in place, but don't want to take ownership of it.

For example:

struct Circle {
    radius: f64,
}

impl Circle {
    fn new() -> Circle {
        Circle { radius: 1.0 }
    }

    fn set_radius(&mut self, radius: f64) {
        self.radius = radius;
    }
}

fn main() {
    let mut circle = Circle::new();
    circle.set_radius(10.0);
    println!("Radius: {}", circle.radius);
}
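
To compare the receivers side by side, here is a sketch with a made-up Counter type that uses &self, &mut self and self in one place.

```rust
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }

    // &self: read-only borrow.
    fn current(&self) -> u32 {
        self.count
    }

    // &mut self: mutable borrow, changes the instance in place.
    fn increment(&mut self) {
        self.count += 1;
    }

    // self: takes ownership; the counter can't be used afterwards.
    fn finish(self) -> String {
        format!("counted to {}", self.count)
    }
}

fn main() {
    let mut counter = Counter::new();
    counter.increment();
    counter.increment();
    println!("Current count: {}", counter.current());

    let report = counter.finish();
    println!("{report}");
    // counter.current(); // would not compile: `counter` was moved by `finish`
}
```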

Self

The Self keyword refers to the type that the impl block belongs to. It is useful when we want to return a new instance of the same type from a method.

For example:

struct Circle {
    radius: f64,
}

impl Circle {
    fn new(radius: f64) -> Self {
        Circle { radius }
    }

    fn with_radius(mut self, radius: f64) -> Self {
        self.radius = radius;
        self
    }
}

fn main() {
    let circle = Circle::new(1.0).with_radius(10.0);
    println!("Radius: {}", circle.radius);
}

Exercises

This might be a good time to take a break and do some exercises. Have a look at the exercises in the Exercises section. Complete the exercise that is related to this session.

See you next time!

Session 3 - Tools of the trait

In this session we'll explore shared behaviour, so-called traits.

Because we will be relying more on the Rust standard library, it makes sense to point out that the Rust standard library documentation can be found at: https://doc.rust-lang.org.

Rust libraries are organized in crates. When a crate gets published, its documentation is generated and published automatically. For a full list of available crates, navigate to: crates.io

Traits

In the previous session we implemented some functions on a struct. In Rust, such functions can be grouped together to form a so-called trait. As with humans, a trait is a set of characteristics shared by everything that has it: every type that implements the trait exposes the trait's functions.

trait Fruitiness {
    fn is_sweet(&self) -> bool;
}

struct Pear {}

struct Lemon {}

impl Fruitiness for Pear {
    fn is_sweet(&self) -> bool {
        true
    }
}

impl Fruitiness for Lemon {
    fn is_sweet(&self) -> bool {
        false
    }
}

fn print_sweetness(id: &str, fruit: impl Fruitiness) {
    println!("{} is sweet? {}", id, fruit.is_sweet());
}

fn main() {
    let pear = Pear {};
    let lemon = Lemon {};
    print_sweetness("pear", pear);
    print_sweetness("lemon", lemon);
}
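
Note that print_sweetness takes ownership of the fruit, so each fruit can only be passed once. If you want to keep using the fruit, the function can borrow it instead. A sketch, using a made-up sweetness_label function that returns the text rather than printing it:

```rust
trait Fruitiness {
    fn is_sweet(&self) -> bool;
}

struct Pear {}

impl Fruitiness for Pear {
    fn is_sweet(&self) -> bool {
        true
    }
}

// Borrows the fruit (&impl ...) instead of taking ownership of it.
fn sweetness_label(id: &str, fruit: &impl Fruitiness) -> String {
    format!("{} is sweet? {}", id, fruit.is_sweet())
}

fn main() {
    let pear = Pear {};
    println!("{}", sweetness_label("pear", &pear));
    // `pear` was only borrowed, so it can be passed again.
    println!("{}", sweetness_label("same pear", &pear));
}
```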

Traits can have default implementations that are available to all types that implement that trait. The default implementation can be overridden when desired.

trait Fruitiness {
    fn is_sweet(&self) -> bool {
        self.sweetness() >= 0.5
    }
    fn sweetness(&self) -> f32;
}

struct Pear {}

struct Lemon {}

impl Fruitiness for Pear {
    fn sweetness(&self) -> f32 {
        0.6
    }
}

impl Fruitiness for Lemon {
    fn sweetness(&self) -> f32 {
        0.2
    }
}

fn print_sweetness(id: &str, fruit: impl Fruitiness) {
    println!("{} is sweet? {}", id, fruit.is_sweet());
}

fn main() {
    let pear = Pear {};
    let lemon = Lemon {};
    print_sweetness("pear", pear);
    print_sweetness("lemon", lemon);
}
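
Overriding a default implementation looks exactly like implementing any other trait function. In the sketch below, a made-up Grapefruit type replaces the default is_sweet with its own rule, while Pear keeps the default.

```rust
trait Fruitiness {
    fn is_sweet(&self) -> bool {
        self.sweetness() >= 0.5
    }
    fn sweetness(&self) -> f32;
}

struct Pear {}

struct Grapefruit {}

impl Fruitiness for Pear {
    fn sweetness(&self) -> f32 {
        0.6
    }
}

impl Fruitiness for Grapefruit {
    fn sweetness(&self) -> f32 {
        0.5
    }

    // Overrides the default: this fruit is never reported as sweet.
    fn is_sweet(&self) -> bool {
        false
    }
}

fn main() {
    println!("pear is sweet? {}", Pear {}.is_sweet());
    println!("grapefruit is sweet? {}", Grapefruit {}.is_sweet());
}
```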

Traits are used throughout Rust, and often types implement a combination of traits, like the Display trait we discussed right at the beginning of this course.

use std::fmt::{Display, Formatter, Result};

trait Fruitiness {
    fn is_sweet(&self) -> bool {
        self.sweetness() >= 0.5
    }
    fn sweetness(&self) -> f32;
}

struct Pear {}

struct Lemon {}

impl Fruitiness for Pear {
    fn sweetness(&self) -> f32 {
        0.6
    }
}

impl Display for Pear {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result {
        f.write_str("A pear")
    }
}

impl Fruitiness for Lemon {
    fn sweetness(&self) -> f32 {
        0.2
    }
}

impl Display for Lemon {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        f.write_str("A lemon")
    }
}

fn print_sweetness(fruit: impl Fruitiness + Display) {
    println!("{} is sweet? {}", fruit, fruit.is_sweet());
}

fn main() {
    let pear = Pear {};
    let lemon = Lemon {};
    print_sweetness(pear);
    print_sweetness(lemon);
}

The signature of the fmt function is the same for every type that implements the Display trait. RustRover will actually scaffold the implementation for you the moment you type: impl Display for Pear {}. Use the alt + Enter keyboard combination while the cursor is on the (red underlined) line. Then pick "Implement members."

If you are more "mouse" oriented, you can also click on the red (or yellow) light bulb and pick the same option from the list.

Notice the "use" statements at the top of our code. This is Rust's way of importing external objects and traits into our application. Often these are automatically added by the IDE. You can use the same alt + Enter combination to add missing imports.

You can see that traits can be combined using the + operator. If you need a combination of many traits, you can move the trait bounds into a where clause to make your code more readable.

use std::fmt::{Debug, Display, Formatter, Result};

trait Fruitiness {
    fn is_sweet(&self) -> bool {
        self.sweetness() >= 0.5
    }
    fn sweetness(&self) -> f32;
}

struct Pear {}

struct Lemon {}

impl Fruitiness for Pear {
    fn sweetness(&self) -> f32 {
        0.6
    }
}

impl Display for Pear {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result {
        f.write_str("A pear")
    }
}

impl Debug for Pear {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result {
        f.write_str("A debugged pear")
    }
}

impl Fruitiness for Lemon {
    fn sweetness(&self) -> f32 {
        0.2
    }
}

impl Display for Lemon {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        f.write_str("A lemon")
    }
}

impl Debug for Lemon {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result {
        f.write_str("A debugged lemon")
    }
}

fn print_sweetness<T>(fruit: T)
    where
        T: Fruitiness + Display + Debug,
{
    println!("{} is sweet? {}", fruit, fruit.is_sweet());
}

fn main() {
    let pear = Pear {};
    let lemon = Lemon {};
    print_sweetness(pear);
    print_sweetness(lemon);
}

Without going into detail right now on how the mechanism actually works, I do want to point out that common traits can often be derived automatically by Rust. For example, the hand-written Debug implementation in the above code can be replaced with a derived implementation. It saves a few lines of code!

use std::fmt::{Debug, Display, Formatter, Result};

trait Fruitiness {
    fn is_sweet(&self) -> bool {
        self.sweetness() >= 0.5
    }
    fn sweetness(&self) -> f32;
}

#[derive(Debug)]
struct Pear {}

#[derive(Debug)]
struct Lemon {}

impl Fruitiness for Pear {
    fn sweetness(&self) -> f32 {
        0.6
    }
}

impl Display for Pear {
    fn fmt(&self, f: &mut Formatter<'_>) -> Result {
        f.write_str("A pear")
    }
}

impl Fruitiness for Lemon {
    fn sweetness(&self) -> f32 {
        0.2
    }
}

impl Display for Lemon {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        f.write_str("A lemon")
    }
}

fn print_sweetness<T>(fruit: T)
    where
        T: Fruitiness + Display + Debug,
{
    println!("{:?} is sweet? {}", fruit, fruit.is_sweet());
}

fn main() {
    let pear = Pear {};
    let lemon = Lemon {};
    print_sweetness(pear);
    print_sweetness(lemon);
}

Reference material

Traits: Defining Shared Behavior

Errors (part 1)

In Rust, errors are returned through the Result type as we've seen in previous chapters. Errors are 'normal' structs. Nevertheless, there is a convention to implement certain traits on types that represent an error: the std::fmt::Display and std::error::Error traits from the standard library.

A typical error implementation could look like this:

type SampleResult<T> = std::result::Result<T, SampleError>;

#[derive(Debug, Clone)]
struct SampleError {
    pub message: String,
}

impl SampleError {
    fn new(msg: &str) -> Self {
        SampleError {
            message: msg.to_string(),
        }
    }
}

impl std::fmt::Display for SampleError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "Sample error: {}", self.message)
    }
}

impl std::error::Error for SampleError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        None
    }
}

fn failable_function() -> SampleResult<u32> {
    Err(SampleError::new("oops"))
}

fn main() {
    if let Err(err) = failable_function() {
        println!("Error: {err}");
    }
}

It could be useful to convert (low-level) errors into your custom error type as you propagate errors up in the chain. Rust has a "std::convert::From" trait that facilitates this conversion.

You can use the From trait to write conversion code for all types of structs, not only errors.

use std::fs::read_to_string;
use std::io;

type SampleResult<T> = std::result::Result<T, SampleError>;

#[derive(Debug, Clone)]
struct SampleError {
    pub message: String,
}

impl std::fmt::Display for SampleError {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(f, "Sample error: {}", self.message)
    }
}

impl std::error::Error for SampleError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        None
    }
}

impl std::convert::From<io::Error> for SampleError {
    fn from(io_err: io::Error) -> Self {
        SampleError {
            message: io_err.to_string(),
        }
    }
}

fn read_from_file() -> SampleResult<String> {
    match read_to_string("non_existing_file.txt") {
        Err(err) => Err(err.into()),
        Ok(content) => Ok(content),
    }
}

fn main() {
    if let Err(err) = read_from_file() {
        println!("Error: {err}");
    }
}

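The into() function in the above code is the counterpart of the from() function. It comes from the std::convert::Into trait, which the standard library implements automatically for every type that has a matching From implementation. That is why you can use both, whilst only implementing the From trait. We could have also used Err(SampleError::from(err)).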
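
To illustrate that From and Into are not limited to errors, here is a small sketch with two hypothetical temperature types (Celsius and Fahrenheit are made up for this example). Only From is implemented; into() comes along for free:

```rust
// Hypothetical types, used only to illustrate From and Into.
#[derive(Debug, PartialEq)]
struct Celsius(f64);

#[derive(Debug, PartialEq)]
struct Fahrenheit(f64);

impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

fn main() {
    // Explicit conversion through the From implementation.
    let boiling = Fahrenheit::from(Celsius(100.0));

    // Into is provided automatically; the type annotation tells the
    // compiler which target type we want.
    let freezing: Fahrenheit = Celsius(0.0).into();

    println!("{boiling:?} {freezing:?}");
}
```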

The match function() { Err(err) => ... , Ok(c) => ... } pattern used in the above sample is a common way to check results. The Err(err) path can be used to quickly exit the function in case of an error.
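
Rust's ? operator is a shorthand for this early-exit pattern: on Err it returns from the function, converting the error through the From trait on the way out. The sketch below is a trimmed-down variant of the sample above; the path parameter on read_from_file is my own addition to keep the example small and testable:

```rust
use std::fs::read_to_string;
use std::io;

#[derive(Debug)]
struct SampleError {
    message: String,
}

impl From<io::Error> for SampleError {
    fn from(io_err: io::Error) -> Self {
        SampleError {
            message: io_err.to_string(),
        }
    }
}

// The ? operator returns early when read_to_string fails, and converts
// the io::Error into a SampleError using the From implementation above.
fn read_from_file(path: &str) -> Result<String, SampleError> {
    let content = read_to_string(path)?;
    Ok(content)
}

fn main() {
    if let Err(err) = read_from_file("non_existing_file.txt") {
        println!("Error: {}", err.message);
    }
}
```

The function body does the same work as the match statement, in a single line.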

Dealing with boilerplate

The above error handling code is quite verbose. There are popular crates that can help reduce the amount of boilerplate code you need to write: anyhow and thiserror. These two crates follow a different approach to error handling, and which one to pick depends on your use case.

anyhow makes it extremely easy to write concise error handling code, but is less useful for capturing and dealing with upstream errors. thiserror is more verbose, but allows you to define your own error types and implement the Error trait on them.

I'd suggest you use anyhow when you are writing a client application, a CLI tool, or some other binary where you are not exposing your error types to other libraries or applications. Use thiserror when you are writing a library and want to define your own error types.

The anyhow crate

The anyhow crate is a popular choice for error handling in Rust. It allows you to write concise error handling code without having to define your own error types. Here's how you can use it:

use anyhow::{anyhow, Result};

fn fallible_function() -> Result<u32> {
    Err(anyhow!("oops"))
}

fn main() -> Result<()> {
    if let Err(err) = fallible_function() {
        println!("Error: {err}");
    }
    Ok(())
}

With the anyhow! macro, you can create an error with a custom message. The Result type imported here is anyhow::Result<T>, a type alias for std::result::Result<T, anyhow::Error>. The anyhow::Error type can wrap almost any error that implements the standard Error trait. This makes it easy to return errors from functions without having to define your own error types.

The thiserror crate

The thiserror crate can help reduce the boilerplate code of creating custom error types. Here's how you can use it:

use thiserror::Error;

#[derive(Error, Debug)]
enum SampleError {
    #[error("An error occurred: {0}")]
    Custom(String),
    #[error("An IO error occurred: {0}")]
    Io(#[from] std::io::Error),
    #[error("Some other error occurred: {0}")]
    Other(String),
}

fn fallible_function() -> Result<u32, SampleError> {
    Err(SampleError::Custom("oops".to_string()))
}

fn fallible_io_operation() -> Result<u32, SampleError> {
    let _ = std::fs::File::open("non_existent_file.txt")?;
    Ok(42)
}

fn main() -> Result<(), SampleError> {
    if let Err(err) = fallible_function() {
        match err {
            SampleError::Custom(msg) => println!("Custom error: {}", msg),
            SampleError::Io(err) => println!("IO error: {}", err),
            SampleError::Other(msg) => println!("Other error: {}", msg),
        }
    }

    fallible_io_operation()?;
    Ok(())
}

As you can see, the thiserror crate allows you to define your own error types and implement the Error trait on them with minimal boilerplate code. It allows "source" errors to be converted into your custom error type, and provides a convenient way to match on different error variants.

We'll revisit the error topic in a bit. First, I'd like to cover the structure of a Rust project.

Organizing code & Project structure

Now that the samples are getting larger, it's a good moment to discuss how Rust code can be organized. There are different ways a Rust project can be structured. We'll look at some best practices and common patterns.

Rust code can be split into multiple .rs files. Each Rust file is called a module. These modules must be kept together by a 'parent' Rust file. This parent can be one of the following:

  • main.rs for applications
  • lib.rs for a library
  • mod.rs for sub-modules

An application with a main.rs and a single module person.rs is structured like this:

|- Cargo.toml
|- src
    |- main.rs
    |- person.rs

The main.rs must 'stitch' the modules together using a mod statement. In our case: mod person.

Filename: src/person.rs

#[derive(Debug)]
pub struct Person {
    pub name: String,
}

Filename: src/main.rs

use crate::person::Person;

mod person;

fn main() {
    let me = Person {
        name: "Marcel".to_string(),
    };
    println!("{:?}", me);
}

Output

Person { name: "Marcel" }

The pub keyword in front of struct and name means that we expose these items for use outside of the module, i.e. make them public.

The use statement imports the type(s) from the module, whereby sub-modules are separated by ::

crate refers to the current project.
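
To see what pub does and does not expose, here is a minimal sketch that uses an inline module so that it fits in one file. The age field, the new constructor, and the is_adult function are hypothetical additions, not part of the sample above:

```rust
mod person {
    #[derive(Debug)]
    pub struct Person {
        pub name: String,
        // No pub: only code inside this module can read or set this field.
        age: u8,
    }

    impl Person {
        // A public constructor is needed, because code outside the module
        // cannot fill in the private age field itself.
        pub fn new(name: &str, age: u8) -> Self {
            Person {
                name: name.to_string(),
                age,
            }
        }

        pub fn is_adult(&self) -> bool {
            self.age >= 18
        }
    }
}

use person::Person;

fn main() {
    let me = Person::new("Marcel", 42);
    // Accessing me.age here would not compile: the field is private.
    println!("{} is an adult? {}", me.name, me.is_adult());
}
```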

Splitting files into submodules

A single Rust file can hold multiple modules. Each module is identified and scoped with a mod name {...} statement for private modules, or pub mod name {...} for public modules.

|- Cargo.toml
|- src
    |- main.rs
    |- database.rs

Filename: src/database.rs

pub mod project {
    #[derive(Debug)]
    pub struct Project {
        pub name: String,
    }
}

pub mod person {
    use crate::database::project::Project;

    #[derive(Debug)]
    pub struct Person {
        pub name: String,
        pub project: Option<Project>,
    }
}

Filename: src/main.rs

use crate::database::person::Person;
use crate::database::project::Project;

mod database;

fn main() {
    let project = Project {
        name: "Rust book".to_string(),
    };
    let person = Person {
        name: "Marcel".to_string(),
        project: Some(project),
    };

    println!("{:?}", person);
}

Output

Person { name: "Marcel", project: Some(Project { name: "Rust book" }) }

Typically, each module is kept in its own file, and the mod statement is used to 'stitch' the modules together. The exception is tests, which are often kept in the same file as the module they are testing.

Here's an example of a test in the same file as the module:

fn main() {
    let my_greeting = reverse_string("Hello World");
    println!("{my_greeting}");
}

fn reverse_string(input: &str) -> String {
    input.chars().rev().collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_reverse_string() {
        assert_eq!(reverse_string("Hello World"), "dlroW olleH");
    }
}

By the way, you can run the tests with cargo test from the command line. This will run all tests in the project. We'll cover testing in more detail in a later chapter.

Larger modules

As your project grows, you may want to separate the modules into different files and group these into a submodule. To do this, you create a new directory under src with the name of the module and include a mod.rs file that typically only contains the mod statements to 'stitch' the Rust files of the module together.

|- Cargo.toml
|- src
    |- main.rs
    |- database
          |- mod.rs
          |- person.rs
          |- project.rs

Filename: src/database/mod.rs

pub mod person;
pub mod project;

Filename: src/database/project.rs

#[derive(Debug)]
pub struct Project {
    pub name: String,
}

Filename: src/database/person.rs

use crate::database::project::Project;

#[derive(Debug)]
pub struct Person {
    pub name: String,
    pub project: Option<Project>,
}

Filename: src/main.rs

use crate::database::person::Person;
use crate::database::project::Project;

mod database;

fn main() {
    let project = Project {
        name: "Rust book".to_string(),
    };
    let person = Person {
        name: "Marcel".to_string(),
        project: Some(project),
    };

    println!("{:?}", person);
}

Output

Person { name: "Marcel", project: Some(Project { name: "Rust book" }) }

The layout you choose depends a lot on the type of project you're working on. If you're working with a larger team on the same codebase, it may be easier to split modules into separate files.

Workspaces

Even larger projects can be split into separate binaries and libraries that are kept together in a workspace. The Cargo Workspaces chapter in the Rust book explains this concept in detail.

Let's say we want to split the database from our binary into a reusable library. We can do this by creating the following workspace structure:

|- Cargo.toml
|- database_lib
|   |- Cargo.toml
|   |- src
|       |- lib.rs
|       |- person.rs
|       |- project.rs
|- my_project
    |- Cargo.toml
    |- src
        |- main.rs

The top-level Cargo.toml would look like this:

[workspace]
members = [
    "database_lib",
    "my_project",
]
resolver = "2"

The Cargo.toml in the database_lib would look like this:

[package]
name = "database_lib"
version = "0.1.0"
edition = "2021"

[dependencies]

The Cargo.toml in the my_project would look like this:

[package]
name = "my_project"
version = "0.1.0"
edition = "2021"

[dependencies]
database_lib = { path = "../database_lib" }

The lib.rs would not include a main function, but instead a pub mod statement for each module:

pub mod person;
pub mod project;

If you push this project to GitHub (or any other Git repository), you can use the dependencies section in the Cargo.toml to include the database_lib in another project. See the next chapter for more details.

Make sure that all the types you want to use in my_project are public, by adding pub in front of the type.

Using types from external modules

In the above example we are importing the Project-type through the use crate::database::project::Project statement. By doing so, we can use Project in the rest of the code, without the need for typing out the full path all the time.

Alternatively, we could have written:

mod database;

fn main() {
    let project = crate::database::project::Project {
        name: "Rust book".to_string(),
    };
    let person = crate::database::person::Person {
        name: "Marcel".to_string(),
        project: Some(project),
    };

    println!("{person:?}");
}

You can see how such code can become very verbose, very quickly.

In the case of a naming conflict - two types from different crates share the same name - you can import the type's path, and associate an alias. Like so:

use crate::database::person as prs;
use crate::database::project as prj;

mod database;

fn main() {
    let project = prj::Project {
        name: "Rust book".to_string(),
    };
    let person = prs::Person {
        name: "Marcel".to_string(),
        project: Some(project),
    };

    println!("{person:?}");
}

You can also create an alias for a specific type:

use crate::database::person::Person as Someone;
use crate::database::project::Project;

mod database;

fn main() {
    let project = Project {
        name: "Rust book".to_string(),
    };
    let person = Someone {
        name: "Marcel".to_string(),
        project: Some(project),
    };

    println!("{person:?}");
}
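
A naming conflict you are likely to run into in practice is Result: both std::fmt and std::io define their own Result type. The sketch below shows how aliases keep the two apart. The Greeting type and the read_greeting function are hypothetical and only serve the illustration:

```rust
use std::fmt::Result as FmtResult;
use std::fmt::{Display, Formatter};
use std::io::Result as IoResult;

struct Greeting;

impl Display for Greeting {
    // FmtResult is std::fmt::Result under a different name.
    fn fmt(&self, f: &mut Formatter<'_>) -> FmtResult {
        f.write_str("Hello")
    }
}

// IoResult<String> is std::io::Result<String>. This hypothetical function
// never fails; it only demonstrates the aliased return type.
fn read_greeting() -> IoResult<String> {
    Ok(Greeting.to_string())
}

fn main() {
    match read_greeting() {
        Ok(greeting) => println!("{greeting}"),
        Err(err) => println!("Error: {err}"),
    }
}
```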

Importing Crates

Now that you know how to organize your Rust project, let's look at how third-party Crates can be added to the application. Dependencies are, amongst other things, managed by the Cargo.toml file.

This is the Cargo.toml file that was generated for our HelloWorld example:

[package]
name = "hello_rust_world"
version = "0.1.0"
authors = ["Marcel <[email protected]>"]
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]

To add a dependency, often you just need to add a single line to the [dependencies] section. To import the rand crate version 0.7.3, you would add this line:

[dependencies]
rand = "0.7.3"

You can find the latest version of a crate on crates.io.

Instead of editing Cargo.toml manually, you can use the cargo add command to add a dependency. For example, to add the serde-json crate, you would run:

$ cargo add serde-json
    Updating crates.io index
warning: translating `serde-json` to `serde_json`
      Adding serde_json v1.0.114 to dependencies.
             Features:
             + std
             - alloc
             - arbitrary_precision
             - float_roundtrip
             - indexmap
             - preserve_order
             - raw_value
             - unbounded_depth
    Updating crates.io index

As you can see, the cargo add command automatically translates the crate name to the correct name, and adds the latest version of the crate to the Cargo.toml file. It also lists the features that are available for the crate.

Limiting the features

When you add a dependency, Cargo will automatically import all the default features of that crate. This is usually what you want, as it makes the crate easier to use. When more control is required, additional parameters can be passed. You can, for example, extend and limit the features you are importing, by passing the features argument:

[dependencies]
serde = { version = "1.0.106", features = ["derive"] }

This will import the serde crate version 1.0.106, including the derive feature.

In case you do not want to import the default features, you can pass the default-features = false argument:

chrono = { version = "0.4", default-features = false, features = [
    "clock",
    "std",
] }

In this example, the chrono crate version 0.4 is imported, but without the default features. Instead, the clock and std features are imported.

Importing from Git or local

Sometimes you need a Crate that is not publicly published. In that case you can import it directly from a Git repository:

[dependencies]
kp-lib-rs = { git = "https://bitbucket.forge.somewhere.com/scm/someservice/rust_common.git", tag = "v0.0.555", features = ["common", "smgr"] }

You can also import a Crate from your local machine:

[dependencies]
smgr_model = { path = "../smgr_model" }

Random numbers

Let's try this out and extend our HelloWorld example to generate a bunch of random numbers.

In RustRover when you are adding a dependency, you can press Ctrl + Space while the cursor is between the "" to get the latest version of that dependency.

Once you have added the dependency to the Cargo.toml, a symbol with three stacked crates appears in front of the dependency. When you click on that icon, you are automatically taken to the documentation of that crate, for that particular version.

Filename: Cargo.toml

[package]
name = "random_numbers"
version = "0.1.0"
authors = ["Marcel <[email protected]>"]
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
rand = "0.7.3"

Filename: main.rs

use rand::prelude::*;

fn main() {
    let mut rng = rand::thread_rng();

    for _i in 1..10 {
        let random_number: u8 = rng.gen(); // generates an integer between 0 and 255
        println!("Generated: {}", random_number);
    }
}

Testing

In the previous chapters, we've looked at modularizing a Rust application. Modules are a key component for adding unit tests to a Rust application. Test modules get a special annotation, so that the module is only compiled and run during testing: #[cfg(test)]

In RustRover you can easily scaffold a test module by typing tmod. A test function can then be added by typing tfn. A typical unit test looks like this:

fn capitalize(value: String) -> String {
    value.to_uppercase()
}

fn main() {
    println!("{}", capitalize("Rust".to_string()));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_caps() {
        let input = "Rust".to_string();
        let output = capitalize(input);
        assert_eq!("RUST", output);
    }
}

There are three assertions available out-of-the-box that can be used during unit testing:

  • assert!()
  • assert_eq!()
  • assert_ne!()
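
Here is a short sketch that uses all three assertions in one test. It reuses the capitalize idea from above, but takes a &str instead of a String (my own simplification), and the module name assertion_examples is arbitrary:

```rust
fn capitalize(value: &str) -> String {
    value.to_uppercase()
}

fn main() {
    println!("{}", capitalize("Rust"));
}

#[cfg(test)]
mod assertion_examples {
    use super::*;

    #[test]
    fn test_assertions() {
        // assert! passes when the expression evaluates to true.
        assert!(capitalize("").is_empty());

        // assert_eq! passes when both values are equal.
        assert_eq!("RUST", capitalize("Rust"));

        // assert_ne! passes when the values differ. All three macros
        // accept an optional message that is shown on failure.
        assert_ne!("Rust", capitalize("Rust"), "input should be upper-cased");
    }
}
```

When assert_eq! or assert_ne! fails, the test output shows both values, which usually makes them more helpful than a plain assert! on a comparison.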

You can use the play-button ▶️ in RustRover to run a single unit test, or a whole test module. Alternatively the key combination Ctrl + r does the same, based on the position of the cursor in the test module.

Cargo can also execute tests: cargo test. Remember the double-ctrl "Run Anything" operation in RustRover to easily execute Cargo commands.

Benchmark testing

Benchmarks are slightly harder to set up, as they require an external Crate to run. We use criterion.

First we'll include the criterion dependency in Cargo.toml:

[dev-dependencies]
criterion = "0.3.1"

[[bench]]
name = "capitalize"
harness = false

Then we create a benches directory that will hold our benchmark test. We'll also create a library with a single public function: capitalize_string, which is what we'll benchmark.

Our project structure looks like this:

|- Cargo.toml
|- benches
    |- capitalize.rs
|- src
    |- lib.rs

Notice that the name in the Cargo.toml file matches the capitalize.rs name in the benches directory.

Filename: src/lib.rs

pub fn capitalize_string(input: String) -> String {
    input.to_uppercase()
}

Filename: benches/capitalize.rs

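// Assumes the package is named "capitalize" in Cargo.toml, so that the library crate is called capitalize.
use capitalize::capitalize_string;
use criterion::{criterion_group, criterion_main, Criterion};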

fn bench_caps(c: &mut Criterion) {
    c.bench_function("capitalize strings", |b| {
        b.iter(|| capitalize_string("rust".to_string()))
    });
}

criterion_group!(benches, bench_caps);
criterion_main!(benches);

The || (pipe-symbols) in the benchmark code are used to create a closure. We'll look into closures in the next chapter.

You can start the test by running cargo bench.

cargo bench builds a release version of the application for benchmarking.

Criterion benchmarks the current run, compares the result to a previous run, and shows the performance delta.

Output from first run

capitalize strings      time:   [223.59 ns 225.70 ns 228.04 ns]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

Output from second run

capitalize strings      time:   [219.78 ns 222.41 ns 226.20 ns]
                        change: [-4.7158% -3.0100% -1.3074%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

Errors - part 2

Previously, we've seen how to create and handle errors in Rust. In this chapter, we'll look at how to log errors.

Using println to print an error to standard output has little practical use outside an example. If you want to output errors in production code, you should use a logging framework. The most widely used one in Rust is the log crate.

Include it in the Cargo.toml file:

[dependencies]
log = "0.4"

The log crate does not include an implementation. You need to combine it with a specific implementation in order for it to work.

We use the env_logger logging implementation. It is very easy to use and provides a good starting point for logging in Rust. It reads the RUST_LOG environment variable to set the log level. Possible values are trace, debug, info, warn, and error.

Add the following to your Cargo.toml file:

[dependencies]
log = "0.4"
env_logger = "0.11"

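Without re-doing the whole "Error" example again, here's how you can replace the println statement with some logging variants. The comments in the sample assume that the program is started with RUST_LOG=info. When RUST_LOG is not set, env_logger only prints messages at the error level: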

use log::*;

fn main() {
    env_logger::init();
    trace!("This line is not printed");
    debug!("Neither is this line");
    info!("Playground initialized...");
    warn!("Failed to retrieve user {} from the database: {}", "bob", "user not found");
    error!("Failed to retrieve user {} from the database: {}", "bob", "I/O error; timeout");
}

Development teams will have their own set of rules on the use of the various log levels. We use these:

  • TRACE: lowest level step-through of the program code. Only practically useful in a test/dev environment due to the large amount of data logged.
  • DEBUG: optional logs to help (developers) identify why a particular situation/error occurred.
  • INFO: (regular) information that is useful for an operation team to monitor proper behaviour of the application.
  • WARN: expected (user) errors, that have no effect on the correct functioning of the application.
  • ERROR: error situations that need an action to get remediated. The application cannot recover from these errors.

In general, make sure to include a unique identifier in all your log statements. You want to be able to grep log output and find all related log-lines easily.

We'll expand on logging in a later chapter, where we'll cover how to write meaningful log messages.

Ownership (part 2)

In this chapter, we'll explore a number of common ownership issues. For each one, I'll first demonstrate the naive code that causes the borrow issue, and then a proposed solution.

Loops

Issue

fn main() {
    let people = vec!["Marcel", "Tom", "Dick", "Harry"];
    for person in people {
        println!("Hi {person}")
    }
    for person in people {
        println!("Hi {person}")
    }
}

The issue is that the first for loop takes ownership of the people vector: each value held in people is moved to person. This means that at the end of the first loop, people has been consumed by the loop and can no longer be used.

Solution

To fix this case, you can simply borrow from people on the first loop.

fn main() {
    let people = vec!["Marcel", "Tom", "Dick", "Harry"];
    for person in &people {
        println!("Hi {person}")
    }
    for person in people {
        println!("Hi {person}")
    }
}
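
If you still need the vector after the last loop, you can borrow in every loop. The sketch below shows two equivalent ways of borrowing: passing a slice to a function, and calling iter(). The greetings helper is hypothetical and only exists for this illustration:

```rust
// Hypothetical helper: it takes a borrowed slice, so the caller keeps
// ownership of the vector.
fn greetings(people: &[&str]) -> Vec<String> {
    people.iter().map(|person| format!("Hi {person}")).collect()
}

fn main() {
    let people = vec!["Marcel", "Tom", "Dick", "Harry"];

    // &people borrows the vector for the duration of the call.
    for greeting in greetings(&people) {
        println!("{greeting}");
    }

    // iter() borrows as well; it behaves like looping over &people.
    for person in people.iter() {
        println!("Bye {person}");
    }

    // The vector was never moved, so it can still be used here.
    println!("{} people are still in the vector", people.len());
}
```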

Optionals

Issue

struct Person {
    first_name: Option<String>,
}

fn main() {
    let person = Person {
        first_name: Some("Marcel".to_string()),
    };

    if let Some(first_name) = person.first_name {
        println!("Hi {}", first_name)
    }

    if let Some(first_name) = person.first_name {
        println!("Hi {}", first_name)
    }
}

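The issue here is that the String inside person.first_name is moved into the first_name binding by the first if let Some(first_name) statement. This leaves person partially moved, so person.first_name cannot be used a second time.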

Solution

struct Person {
    first_name: Option<String>,
}

fn main() {
    let person = Person {
        first_name: Some("Marcel".to_string()),
    };

    if let Some(first_name) = &person.first_name {
        println!("Hi {first_name}")
    }

    if let Some(first_name) = person.first_name {
        println!("Hi {first_name}")
    }
}

You can solve this quite easily by borrowing person.first_name in the initial if let statement. This changes the type of first_name inside that block from String to &String, so nothing is moved and person can still be used in the second if let statement.
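
An alternative that you will often see is Option's as_deref function, which turns a borrowed Option<String> into an Option<&str> without moving anything. A minimal sketch, where the greeting function is a hypothetical helper:

```rust
struct Person {
    first_name: Option<String>,
}

// Hypothetical helper that only needs to read the name.
fn greeting(person: &Person) -> String {
    // as_deref borrows the String inside the Option as a &str,
    // so person.first_name itself is left untouched.
    match person.first_name.as_deref() {
        Some(first_name) => format!("Hi {first_name}"),
        None => "Hi stranger".to_string(),
    }
}

fn main() {
    let person = Person {
        first_name: Some("Marcel".to_string()),
    };

    // Both calls work, because greeting only borrows person.
    println!("{}", greeting(&person));
    println!("{}", greeting(&person));
}
```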

Async tasks

Issue

use tokio::task::JoinHandle;
use tokio::time::Duration;

struct AnswerToLife {
    answer: i32,
}

impl AnswerToLife {
    fn compute(&mut self) -> JoinHandle<()> {
        let task = tokio::spawn(async move {
            tokio::time::sleep(Duration::from_secs(3)).await;
            self.answer = 42;
        });
        println!("We are computing the answer to life. Please be patient...");
        task
    }

    fn print(&self) {
        println!("The answer to life is {}", self.answer);
    }
}

#[tokio::main]
async fn main() {
    let mut big_question = AnswerToLife { answer: 0 };
    let task = big_question.compute();
    task.await.unwrap();
    big_question.print();
}

If this looks like abracadabra to you, have a look at the chapter on Asynchronous programming.

The problem here is that the async block captures self, which is a mutable reference, and moves it into the new tokio task. This is an issue, because tokio::spawn requires everything the task captures to be valid for the 'static lifetime: the compiler cannot determine at compile time how long the spawned task will run, so it cannot prove that the reference stays valid. The compiler will hint at introducing a 'static lifetime for self, but that will likely set you off on a wild goose chase. Focus on what you're trying to accomplish. In this case we want to set answer after doing some intense computation. In the current data model we need a mutable reference to self to do this. Why don't we solve that, and see if we can change answer without a reference to self?

Solution

use std::sync::Arc;
use tokio::sync::Mutex;
use tokio::task::JoinHandle;
use tokio::time::Duration;

struct AnswerToLife {
    answer: Arc<Mutex<i32>>,
}

impl AnswerToLife {
    fn compute(&self) -> JoinHandle<()> {
        let lockable_reference_to_answer = self.answer.clone();
        let task = tokio::spawn(async move {
            tokio::time::sleep(Duration::from_secs(3)).await;
            let mut locked_answer = lockable_reference_to_answer.lock().await;
            *locked_answer = 42;
        });
        println!("We are computing the answer to life. Please be patient...");
        task
    }

    async fn print(&self) {
        let locked_answer = self.answer.lock().await;
        println!("The answer to life is {locked_answer}");
    }
}

#[tokio::main]
async fn main() {
    let big_question = AnswerToLife {
        answer: Arc::new(Mutex::from(0)),
    };
    let task = big_question.compute();
    task.await.unwrap();
    big_question.print().await;
}

By wrapping the answer in an Arc we solve two issues at once. First, compute no longer needs a mutable reference, because the Arc itself is only read. Second, we can clone the Arc, which is a cheap operation, and move the clone into the tokio task.

Remember that we are cloning the Arc, a reference-counted pointer to the answer, not the answer itself.

To allow changes to the answer after computing it, we need to wrap it in a Mutex.

Make sure to use the tokio::sync::Mutex here, not the one from the standard library: its lock() function is async, so the task awaits the lock instead of blocking the thread.
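
The Arc plus Mutex pattern is not specific to tokio. As a rough sketch, here is the same idea with plain operating system threads from the standard library. I've swapped tokio::spawn for std::thread::spawn, and because nothing is awaited here, the standard library Mutex is the one to use in this variant. The get function is a hypothetical addition:

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

struct AnswerToLife {
    answer: Arc<Mutex<i32>>,
}

impl AnswerToLife {
    fn compute(&self) -> JoinHandle<()> {
        // Cloning the Arc copies the pointer and increments the
        // reference count; the i32 itself is not copied.
        let shared_answer = Arc::clone(&self.answer);
        thread::spawn(move || {
            // The clone was moved into the thread, so no reference
            // to self is needed here.
            let mut locked_answer = shared_answer.lock().unwrap();
            *locked_answer = 42;
        })
    }

    fn get(&self) -> i32 {
        *self.answer.lock().unwrap()
    }
}

fn main() {
    let big_question = AnswerToLife {
        answer: Arc::new(Mutex::new(0)),
    };
    // join() waits for the thread to finish, like awaiting the task.
    big_question.compute().join().unwrap();
    println!("The answer to life is {}", big_question.get());
}
```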

Some more Traits

In Chapter 5.1 we introduced the concept of traits. In this chapter we'll look at some more advanced features of traits.

  • How can traits be passed to functions?
  • Associated types
  • Type conversion traits

Passing traits to functions

In the previous chapter we saw how we can implement a trait for a struct. We can also write a function that accepts any value whose type implements a trait. This is useful when we want to write functions that can operate on different types that implement the same trait.

For example:

trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius.powi(2)
    }
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side.powi(2)
    }
}

fn print_area<S: Shape>(shape: S) {
    println!("Area: {}", shape.area());
}

fn main() {
    let circle = Circle { radius: 10.0 };
    let square = Square { side: 5.0 };

    print_area(circle);
    print_area(square);
}

In this example, we define a Shape trait with an area method. We then implement the Shape trait for the Circle and Square structs. We can then pass instances of Circle and Square to the print_area function, which takes any type that implements the Shape trait.

We use the S: Shape syntax to specify that the print_area function takes a type S that implements the Shape trait. S is referred to as a generic type parameter, and S: Shape as a trait bound.

When using generics in this way, the compiler uses so-called static dispatch: it determines at compile time which implementation of the Shape trait to call. Static dispatch can lead to faster code than dynamic dispatch, but it requires the compiler to generate a separate copy of the function for each concrete type it is used with (this is called monomorphization), which can increase the size of the binary.
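As a small variation on the example above, the sketch below borrows the shape instead of taking ownership, and shows the impl Trait shorthand, which is another way to write the same statically dispatched function. The function names describe_generic and describe_impl are made up for this illustration.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side.powi(2)
    }
}

// Explicit generic type parameter with a trait bound, borrowing the shape.
fn describe_generic<S: Shape>(shape: &S) -> String {
    format!("Area: {}", shape.area())
}

// The same static dispatch, written with `impl Trait` in argument position.
fn describe_impl(shape: &impl Shape) -> String {
    format!("Area: {}", shape.area())
}

fn main() {
    let square = Square { side: 5.0 };
    println!("{}", describe_generic(&square));
    println!("{}", describe_impl(&square));
    // `square` is still usable, because we only lent it to the functions.
    println!("Side: {}", square.side);
}
```

Which of the two forms you use is mostly a matter of taste. The explicit parameter is needed when you want to name the type, for instance to require that two arguments have the same type.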

Dynamic dispatch

In some cases, you may want to use dynamic dispatch instead of static dispatch. Dynamic dispatch allows you to work with trait objects, which are objects that implement a trait but have an unknown concrete type at compile time.

To use dynamic dispatch, you can use the dyn keyword to refer to a trait object. For example:

trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius.powi(2)
    }
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side.powi(2)
    }
}

fn print_area(shape: &dyn Shape) {
    println!("Area: {}", shape.area());
}

fn main() {
    let circle = Circle { radius: 10.0 };
    let square = Square { side: 5.0 };

    print_area(&circle);
    print_area(&square);
}

Dynamic dispatch is more flexible than static dispatch, but it is slightly less efficient, because every method call is looked up at runtime through the trait object's vtable.
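A typical place where this flexibility pays off is a collection that holds different kinds of shapes. The sketch below stores boxed trait objects in a vector; the total_area function is a made-up helper for this illustration.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius.powi(2)
    }
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side.powi(2)
    }
}

// Hypothetical helper: sums the areas of a mixed list of shapes.
// Each call to area() is dispatched at runtime.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    // A Vec<S> with a generic S could only hold one concrete type;
    // a Vec<Box<dyn Shape>> can mix circles and squares.
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    println!("Total area: {}", total_area(&shapes));
}
```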

Associated types

In Rust, you can define associated types for a trait. Associated types are types that are associated with a trait, but are not specified until the trait is implemented.

For example:

use std::fmt::Display;

trait Shape {
    type Output: Display;

    fn area(&self) -> Self::Output;
}

struct Circle {
    radius: f64,
}

impl Shape for Circle {
    type Output = f64;

    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius.powi(2)
    }
}

struct Square {
    side: f32,
}

impl Shape for Square {
    type Output = f32;

    fn area(&self) -> f32 {
        self.side.powi(2)
    }
}

fn print_area<S: Shape>(shape: S) {
    println!("Area: {}", shape.area());
}

fn main() {
    let circle = Circle { radius: 10.0 };
    let square = Square { side: 5.0 };

    print_area(circle);
    print_area(square);
}

In the above example, we define a Shape trait with an associated type Output. The Output type is not specified in the trait definition, but is specified when the trait is implemented for a type. The only requirement is that the Output type must implement the Display trait. This allows us to use the Display trait in the print_area function.

When implementing the Shape trait for the Circle and Square structs, we specify the Output type as f64 and f32, respectively.
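An associated type can also be constrained by the caller. In the sketch below, describe accepts every Shape, while double_area only accepts shapes whose Output is f64. Both function names are made up for this illustration.

```rust
use std::fmt::Display;

trait Shape {
    type Output: Display;

    fn area(&self) -> Self::Output;
}

struct Circle {
    radius: f64,
}

impl Shape for Circle {
    type Output = f64;

    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius.powi(2)
    }
}

struct Square {
    side: f32,
}

impl Shape for Square {
    type Output = f32;

    fn area(&self) -> f32 {
        self.side.powi(2)
    }
}

// Works for every Shape, because Output is guaranteed to implement Display.
fn describe<S: Shape>(shape: &S) -> String {
    format!("Area: {}", shape.area())
}

// Only accepts shapes whose associated Output type is f64.
fn double_area<S: Shape<Output = f64>>(shape: &S) -> f64 {
    shape.area() * 2.0
}

fn main() {
    let circle = Circle { radius: 1.0 };
    let square = Square { side: 5.0 };

    println!("{}", describe(&circle));
    println!("{}", describe(&square));
    println!("Doubled: {}", double_area(&circle));
    // double_area(&square); // would not compile: Square's Output is f32
}
```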

Type conversion Traits

Rust provides a number of traits that allow you to convert between types. For example, the From and Into traits allow you to convert from one type to another. In Chapter 4.6 we looked at a C-style enumeration:

#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

We can use the From trait to convert from a u8 to a Color:

impl From<u8> for Color {
    fn from(value: u8) -> Self {
        match value {
            0 => Color::Red,
            1 => Color::Green,
            2 => Color::Blue,
            _ => panic!("Invalid value"),
        }
    }
}

The same can be done for the other direction:

impl From<Color> for u8 {
    fn from(color: Color) -> Self {
        match color {
            Color::Red => 0,
            Color::Green => 1,
            Color::Blue => 2,
        }
    }
}

Now we can easily convert between Color and u8:

#[repr(u8)]
#[derive(Debug)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

impl From<Color> for u8 {
    fn from(color: Color) -> Self {
        match color {
            Color::Red => 0,
            Color::Green => 1,
            Color::Blue => 2,
        }
    }
}

impl From<u8> for Color {
    fn from(value: u8) -> Self {
        match value {
            0 => Color::Red,
            1 => Color::Green,
            2 => Color::Blue,
            _ => panic!("Invalid value"),
        }
    }
}

fn main() {
    let color: Color = 1.into();
    println!("{color:?}");
    let green_value: u8 = Color::Green.into();
    println!("{green_value:?}");
}

TryFrom

The conversion from a u8 to a Color can fail when an integer is passed that is not in the range of the enumeration. In this case the TryFrom conversion trait would have been the better choice. The TryFrom trait is similar to the From trait, but it returns a Result instead of panicking. The Error type is specified as an associated type.

use std::convert::TryFrom;

#[repr(u8)]
#[derive(Debug)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

impl TryFrom<u8> for Color {
    type Error = String;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Color::Red),
            1 => Ok(Color::Green),
            2 => Ok(Color::Blue),
            _ => Err(format!("Invalid Color value: {value}")),
        }
    }
}

fn main() {
    match Color::try_from(10) {
        Ok(color) => println!("{color:?}"),
        Err(err) => println!("{err}"),
    }
}
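Because try_from returns a Result, it combines nicely with the ? operator. The sketch below uses a made-up helper, decode_palette, that converts a list of raw bytes and stops at the first invalid one. It also shows that implementing TryFrom gives you try_into for free.

```rust
use std::convert::{TryFrom, TryInto};

#[derive(Debug, PartialEq)]
enum Color {
    Red,
    Green,
    Blue,
}

impl TryFrom<u8> for Color {
    type Error = String;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Color::Red),
            1 => Ok(Color::Green),
            2 => Ok(Color::Blue),
            _ => Err(format!("Invalid Color value: {value}")),
        }
    }
}

// Hypothetical helper: converts raw bytes, stopping at the first invalid one.
fn decode_palette(raw: &[u8]) -> Result<Vec<Color>, String> {
    let mut palette = Vec::new();
    for &value in raw {
        // `?` returns early with the Err produced by try_from.
        palette.push(Color::try_from(value)?);
    }
    Ok(palette)
}

fn main() {
    // TryInto is available automatically once TryFrom is implemented.
    let green: Result<Color, String> = 1u8.try_into();
    println!("{green:?}");

    println!("{:?}", decode_palette(&[0, 2]));
    println!("{:?}", decode_palette(&[0, 7]));
}
```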

If you prefer to use String representations of the enumeration variants, you can use the FromStr trait. This trait allows you to convert a string to an enumeration variant.

use std::fmt::{Display, Formatter};
use std::str::FromStr;

enum Color {
    Red,
    Green,
    Blue,
}

impl FromStr for Color {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Red" => Ok(Color::Red),
            "Green" => Ok(Color::Green),
            "Blue" => Ok(Color::Blue),
            _ => Err(format!("Invalid Color value: {s}")),
        }
    }
}

impl Display for Color {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        match self {
            Color::Red => write!(f, "Red"),
            Color::Green => write!(f, "Green"),
            Color::Blue => write!(f, "Blue"),
        }
    }
}

fn main() {
    match "Green".parse::<Color>() {
        Ok(color) => println!("{color}"),
        Err(err) => println!("{err}"),
    }
}

Although you can implement these enum conversions yourself, you may want to look at the great strum crate, which provides a lot of useful macros to make this easier.

Exercises

Continue with the exercise in the Exercises section. Have a great time coding!

Session 4 - Better functions with Functional programming

Although Rust is not a functional programming language, it has many features that make it a great choice for functional programming. In this session, we'll look at some of these features and how they can be used to write functional code.

Objective

After this session, you should be able to:

  • use functional programming with iterators
  • use functional programming with Option and Result

Iterators

Most likely, the first place where you get a whiff of the power of functional programming with Rust is when you want to iterate over a collection of items. In Rust, iterators are a powerful tool to work with collections. They are a way to perform some operation on each item in a collection, one at a time. This is a common pattern in programming, and Rust's iterators make it easy to do.

Rust's iterators are lazy, meaning they have no effect until you call methods that consume the iterator to produce a result. This is a powerful feature, as it allows you to chain together a series of operations on a collection without having to create intermediate collections.

Suppose we want to capitalize a list of cities. We could do this with a loop:

fn main() {
    let cities = vec!["rome", "barcelona", "berlin"];
    let mut cities_caps = vec![];

    for city in cities {
        cities_caps.push(city.to_uppercase());
    }

    println!("{cities_caps:?}");
}

Although this works, it is not the most idiomatic way of accomplishing this task in Rust. Let's see how we can use iterators to achieve the same task in a more idiomatic way.

fn main() {
    let cities = vec!["rome", "barcelona", "berlin"];
    let cities_caps: Vec<String> = cities.into_iter().map(|city| city.to_uppercase()).collect();
    println!("{cities_caps:?}");
}

This code is more concise and idiomatic. It uses the map method to transform each item in the vector to uppercase, and the collect method to collect the results into a new vector. The |city| syntax is a closure, which is a way to define a function inline. We'll cover closures in more detail in the next chapter.

A good place to get familiar with closures is the chapter on closures in the Rust book. Let's redo the capitalize example from above, this time storing the closure in a variable.

fn main() {
    let capitalize = |value: &str| value.to_uppercase();
    let cities = vec!["rome", "barcelona", "berlin"];
    let cities_caps: Vec<String> = cities.into_iter().map(capitalize).collect();
    println!("{cities_caps:?}");
}

Let's break down the example.

As per the Rust documentation, the into_iter() method creates a consuming iterator, that is, one that moves each value out of the vector (from start to end). The vector cannot be used after calling this. The more commonly used iter() method returns an iterator over the slice, while borrowing the values.

Iterators support methods like map, filter, take, skip, for_each and flatten, which you may be familiar with from map/reduce functions in other languages. We use the map() method to apply the capitalize operation to each element in the vector.

The collect() method is used to collect all the elements returned by the iterator and capture them in a new vector.

In Rust, the performance of iterators is typically on par with, and sometimes better than, hand-written loops. See Comparing Performance: Loops vs. Iterators in the Rust book.

By using iterators we can be more expressive and concise in our code. This gives the compiler the opportunity to optimize the code better, and it makes the code easier to read and understand.
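The difference between iter() and into_iter() described above is easiest to see side by side. In this sketch, capitalize_all is a made-up helper that only borrows the cities, so the vector is still usable afterwards.

```rust
// Hypothetical helper: borrows the cities and returns capitalized copies.
fn capitalize_all(cities: &[&str]) -> Vec<String> {
    // iter() borrows: the closure receives references to the items.
    cities.iter().map(|city| city.to_uppercase()).collect()
}

fn main() {
    let cities = vec!["rome", "barcelona", "berlin"];

    let cities_caps = capitalize_all(&cities);
    println!("{cities_caps:?}");

    // Still allowed, because the vector was only borrowed.
    println!("{cities:?}");

    // into_iter() consumes the vector; using `cities` after this line
    // would be a compile error.
    let lengths: Vec<usize> = cities.into_iter().map(|city| city.len()).collect();
    println!("{lengths:?}");
}
```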

Operations during an iteration can be chained together. If we're only interested in cities that start with a "b", we can easily include a filter in the iteration.

fn main() {
    let capitalize = |value: &str| value.to_uppercase();
    let cities = vec!["rome", "barcelona", "berlin"];
    let cities_caps: Vec<String> = cities
        .into_iter()
        .filter(|c| c.starts_with("b"))
        .map(capitalize)
        .collect();

    println!("{cities_caps:?}");
}

Notice that we can directly apply the filter logic within the iterator using the closure syntax.

It is a good practice to first reduce the dataset, before performing computations on the data. So if we are interested in the second city starting with a "B", first skip and take, and then map the result:

fn main() {
    let capitalize = |value: &str| value.to_uppercase();
    let cities = vec!["rome", "barcelona", "berlin"];
    let city_in_caps: Option<String> = cities
        .into_iter()
        .filter(|c| c.starts_with("b"))
        .skip(1)
        .take(1)
        .map(capitalize)
        .next();

    println!("{city_in_caps:?}");
}

Notice that in this case we use the next method to get the first element of the iterator. The next method returns an Option type, which is Some if there is an element, and None if there is no element.

If needed, you can combine iterators with 'regular' loops, like in this example:

fn main() {
    let capitalize = |value: &str| value.to_uppercase();
    let cities = vec!["rome", "barcelona", "berlin"];
    let cities_caps = cities
        .into_iter()
        .filter(|c| c.starts_with("b"))
        .map(capitalize);

    for city in cities_caps {
        println!("{city:?}");
    }
}

In such a model, it may be useful to have the index for the element while looping. You can obtain this with the .enumerate() method.

fn main() {
    let capitalize = |value: &str| value.to_uppercase();
    let cities = vec!["rome", "barcelona", "berlin"];
    let cities_caps = cities
        .into_iter()
        .filter(|c| c.starts_with("b"))
        .map(capitalize)
        .enumerate();

    for (idx, city) in cities_caps {
        println!("{idx}: {city:?}");
    }
}

Iterators are lazy

What does this mean? It means that the iterator does not do anything until you call a method that consumes the iterator. Let's look at an example:

fn main() {
    let numbers_iter = [1, 2, 3, 4, 5].iter().map(|n| {
        print!("processing {n} -> ");
        n * 2
    });

    println!("numbers_iter created");

    for n in numbers_iter {
        println!("{n}");
    }
}

When you run this code, you will see that the println!("numbers_iter created") statement is executed before the print!("processing {n} -> ") statement. This is because the map method is lazy, and does not do anything until the iterator is consumed.

numbers_iter created
processing 1 -> 2
processing 2 -> 4
processing 3 -> 6
processing 4 -> 8
processing 5 -> 10
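Laziness also means that an iterator never does more work than the consumer asks for. The sketch below counts how often the map closure runs; processed_count is a made-up helper, and the Cell is only there so the closure can update the counter.

```rust
use std::cell::Cell;

// Hypothetical helper: returns how many elements the map closure processed
// when the consumer only asks for `wanted` results.
fn processed_count(wanted: usize) -> usize {
    let processed = Cell::new(0);

    let doubled: Vec<i32> = [1, 2, 3, 4, 5]
        .iter()
        .map(|n| {
            processed.set(processed.get() + 1);
            n * 2
        })
        .take(wanted)
        .collect();

    // We never get more results than there are elements.
    assert_eq!(doubled.len(), wanted.min(5));
    processed.get()
}

fn main() {
    // Asking for two results runs the closure twice, not five times.
    println!("take(2) processed {} elements", processed_count(2));
    println!("take(5) processed {} elements", processed_count(5));
    println!("take(0) processed {} elements", processed_count(0));
}
```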

Functional programming with Options and Results

In the previous chapter, we've seen how to use functional programming with iterators. In this chapter we'll see how to use functional programming with Option and Result.

Option

The Option type is used in Rust to represent a value that can be either something or nothing. The Option type is defined as follows:

enum Option<T> {
    Some(T),
    None,
}

The Option type is generic, meaning that it can hold any type of value. The Some variant holds a value of type T, and the None variant represents the absence of a value.

It can be very useful to use a map method on an Option to transform the value inside the Option. If the Option contains a value, the map method applies the function to the value and returns a new Option containing the result. If the Option is None, the map method does nothing and returns None.

Let's look at an example:

fn main() {
    let maybe_number = Some(5);
    let maybe_number_plus_one = maybe_number.map(|number| number + 1);
    println!("{maybe_number_plus_one:?}"); // Some(6)

    let maybe_number: Option<i32> = None; // the compiler cannot infer the type from None alone
    let maybe_number_plus_one = maybe_number.map(|number| number + 1);
    println!("{maybe_number_plus_one:?}"); // None
}

In a similar way you can also assign a default value to an Option using the unwrap_or method. If the Option contains a value, the unwrap_or method returns the value. If the Option is None, the unwrap_or method returns the default value.

fn main() {
    let maybe_number = Some(5);
    let number = maybe_number.unwrap_or(0);
    println!("{number:?}"); // 5

    let maybe_number = None;
    let number = maybe_number.unwrap_or(0);
    println!("{number:?}"); // 0
}

If the map operation returns an Option and you want to flatten the result, you can use the and_then method. The and_then method applies the function to the value inside the Option and returns the result. If the Option is None, the and_then method does nothing and returns None.

fn main() {
    let only_even_numbers = |number| -> Option<i32> {
        if number % 2 == 0 {
            Some(number)
        } else {
            None
        }
    };

    let maybe_even = Some(6).and_then(only_even_numbers);
    println!("{maybe_even:?}"); // Some(6)

    let maybe_odd = Some(5).and_then(only_even_numbers);
    println!("{maybe_odd:?}"); // None
}
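These methods become really useful when you chain them. The sketch below combines and_then, map and unwrap_or in a single expression; first_even_doubled is a made-up helper for this illustration.

```rust
fn only_even(number: i32) -> Option<i32> {
    if number % 2 == 0 {
        Some(number)
    } else {
        None
    }
}

// Hypothetical helper: doubles the first number if it is even,
// and falls back to 0 in every other case.
fn first_even_doubled(numbers: &[i32]) -> i32 {
    numbers
        .first() // Option<&i32>: None for an empty slice
        .copied() // Option<i32>
        .and_then(only_even) // None if the number is odd
        .map(|n| n * 2) // transform the value, if there is one
        .unwrap_or(0) // default when any step produced None
}

fn main() {
    println!("{}", first_even_doubled(&[4, 1])); // 8
    println!("{}", first_even_doubled(&[3, 4])); // 0
    println!("{}", first_even_doubled(&[])); // 0
}
```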

Results

The Result type is used in Rust to represent a value that can be either a success or a failure. The Result type is defined as follows:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

The Result type is also generic, meaning that it can hold any type of success value and any type of error value. The Ok variant holds a success value of type T, and the Err variant holds an error value of type E. In that respect, Result is similar to Option, but with the added ability to hold an error value.

With that in mind, you can see how Result can be used in a similar way to Option. You can use a map method on a Result to transform the value inside the Result. If the Result contains a success value, the map method applies the function to the value and returns a new Result containing the result. If the Result is an error, the map method does nothing and returns the error.

fn main() {
    let maybe_number: Result<i32, &str> = Ok(5);
    let maybe_number_plus_one = maybe_number.map(|number| number + 1);
    println!("{maybe_number_plus_one:?}"); // Ok(6)

    let maybe_number: Result<i32, &str> = Err("error");
    let maybe_number_plus_one = maybe_number.map(|number| number + 1);
    println!("{maybe_number_plus_one:?}"); // Err("error")
}

Many of the operations that can be performed on Option can also be performed on Result. For example, you can use the unwrap_or method to assign a default value to a Result. If the Result contains a success value, the unwrap_or method returns the value. If the Result is an error, the unwrap_or method returns the default value.
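Here is a minimal sketch of these operations on a Result. The parse_percentage function is a made-up helper; it uses map_err to turn the parse error into a String, and_then to add a validation step, and unwrap_or to fall back to a default.

```rust
// Hypothetical helper: parses a percentage and rejects values above 100.
fn parse_percentage(input: &str) -> Result<u8, String> {
    input
        .trim()
        .parse::<u8>()
        // Convert the parse error into our own error type.
        .map_err(|err| format!("invalid number: {err}"))
        // Only runs for Ok values; may turn them into an Err.
        .and_then(|value| {
            if value <= 100 {
                Ok(value)
            } else {
                Err(format!("{value} is more than 100"))
            }
        })
}

fn main() {
    println!("{:?}", parse_percentage("42")); // Ok(42)
    println!("{:?}", parse_percentage("150")); // Err("150 is more than 100")

    // unwrap_or replaces any error with a default value.
    let fallback = parse_percentage("abc").unwrap_or(0);
    println!("{fallback}"); // 0
}
```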

Switching between Option and Result

You can use the ok_or method to convert an Option to a Result. If the Option contains a value, the ok_or method returns an Ok containing the value. If the Option is None, the ok_or method returns an Err containing the error value you passed in.

And vice versa, you can use the ok method to convert a Result to an Option. If the Result contains a success value, the ok method returns a Some containing the value. If the Result is an error, the ok method returns None and the error value is discarded.

fn main() {
    let maybe_number = Some(5);
    let result = maybe_number.ok_or("error");
    println!("{result:?}"); // Ok(5)

    let maybe_number: Option<i32> = None;
    let result = maybe_number.ok_or("error");
    println!("{result:?}"); // Err("error")

    let result: Result<i32, &str> = Ok(5);
    let maybe_number = result.ok();
    println!("{maybe_number:?}"); // Some(5)

    let result: Result<i32, &str> = Err("error");
    let maybe_number = result.ok();
    println!("{maybe_number:?}"); // None
}
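In practice, ok_or is often used together with the ? operator to turn a missing value into a proper error. The sketch below uses two made-up helpers, find_port and find_port_or_none, that look up a setting in a list of key/value pairs.

```rust
// Hypothetical helper: looks up `key` and parses its value as a port number.
fn find_port(settings: &[(&str, &str)], key: &str) -> Result<u16, String> {
    let raw = settings
        .iter()
        .find(|(name, _)| *name == key) // Option: None if the key is absent
        .map(|(_, value)| *value)
        .ok_or(format!("missing setting: {key}"))?; // Option -> Result, then `?`

    raw.parse::<u16>()
        .map_err(|err| format!("invalid port {raw:?}: {err}"))
}

// Result -> Option: keep the value and discard the reason for a failure.
fn find_port_or_none(settings: &[(&str, &str)], key: &str) -> Option<u16> {
    find_port(settings, key).ok()
}

fn main() {
    let settings = [("host", "localhost"), ("port", "8080"), ("bad", "eighty")];

    println!("{:?}", find_port(&settings, "port"));
    println!("{:?}", find_port(&settings, "timeout"));
    println!("{:?}", find_port(&settings, "bad"));
    println!("{:?}", find_port_or_none(&settings, "bad"));
}
```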

Passing functions around

In Rust, functions are first-class citizens, which means you can pass them around as arguments to other functions. This is a powerful feature that allows you to write more flexible and reusable code.

Let's say we have a function that takes a string, applies a transformation to it, and returns the result. We can define a type alias for the transformation function, and then pass a transformation as an argument to that function.

Like this:

type Transformation = fn(String) -> String;

fn main() {
    let input = "Hello, world!".to_string();
    let result = transform_string(input, to_upper_case);
    let result = transform_string(result, reverse);
    println!("The result is: {}", result);
}

fn transform_string(input: String, operation: Transformation) -> String
{
    operation(input)
}

fn to_upper_case(input: String) -> String {
    input.to_uppercase()
}

fn reverse(input: String) -> String {
    input.chars().rev().collect()
}

The transform_string function takes a string and a transformation function, and applies the transformation to the string. In the main function, we pass the to_upper_case and reverse functions as arguments to transform_string.

The type Transformation is a type alias for a function that takes a String and returns a String. This allows us to pass any function that matches this signature as an argument to transform_string.

We can also rewrite the example using closures. The to_upper_case and reverse functions can be replaced with closures that do the same thing. As long as the closure has the same signature as the Transformation type alias and does not capture any variables from its environment, it can be passed as an argument to transform_string. Closures that do capture variables cannot be converted to a plain fn type.

Like this:

type Transformation = fn(String) -> String;

fn main() {
    let to_upper_case = |input: String| input.to_uppercase();
    let reverse = |input: String| input.chars().rev().collect();

    let input = "Hello, world!".to_string();
    let result = transform_string(input, to_upper_case);
    let result = transform_string(result, reverse);
    println!("The result is: {}", result);
}

fn transform_string(input: String, operation: Transformation) -> String
{
    operation(input)
}

Because the to_upper_case and reverse closures have the same signature as the Transformation type alias, they can be stored in an array of transformations, and we can iterate over the array to apply each transformation to the input string.

type Transformation = fn(String) -> String;

fn main() {
    let to_upper_case = |input: String| input.to_uppercase();
    let reverse = |input: String| input.chars().rev().collect();
    let transformations = [to_upper_case, reverse];

    let mut input = "Hello, world!".to_string();
    for transformation in transformations {
        input = transform_string(input, transformation);
    }

    println!("The result is: {}", input);
}

fn transform_string(input: String, operation: Transformation) -> String
{
    operation(input)
}

You can see how powerful this feature is. It allows you to write more flexible and reusable code by passing functions around as arguments to other functions. This is a common pattern in Rust, and you'll see it used in many libraries and frameworks.
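A plain fn type only accepts functions and closures that do not capture anything. If you also want to accept closures that capture variables, you can make the function generic over the Fn trait instead. The sketch below is one way to do that; add_suffix is a made-up helper that returns a capturing closure.

```rust
// Generic over any callable with the right signature:
// plain functions as well as closures that capture variables.
fn transform_string<F>(input: String, operation: F) -> String
where
    F: Fn(String) -> String,
{
    operation(input)
}

fn to_upper_case(input: String) -> String {
    input.to_uppercase()
}

// Hypothetical helper: builds a closure that captures `suffix`.
fn add_suffix(suffix: String) -> impl Fn(String) -> String {
    move |input: String| format!("{input}{suffix}")
}

fn main() {
    let input = "Hello, world".to_string();

    // A plain function still works.
    let result = transform_string(input, to_upper_case);

    // A capturing closure works too, which a plain fn type would not allow.
    let result = transform_string(result, add_suffix("!!!".to_string()));

    println!("The result is: {result}");
}
```

The trade-off is that each distinct closure has its own type, so storing several of them in one array requires boxing them, for example as Box<dyn Fn(String) -> String>.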

Callbacks

Another common use case for passing functions around is callbacks. A callback is a function that is passed as an argument to another function and is called by that function at a later time. It is often used to provide the caller with a way to display events or notifications.

Let's look at an example of using callbacks in Rust.

type CallbackFn = fn(u8);

fn main() {
    let callback = |progress: u8| println!("Progress: {progress:03}%");
    copy_imaginary_file("file.txt", callback);
}

fn copy_imaginary_file(filename: &str, callback: CallbackFn)
{
    println!("Copying file: {filename}");
    for i in 0..=10 {
        callback(i * 10);
    }
    println!("File copied!");
}
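Callbacks often need to remember something, for example to collect the events they receive. That requires a closure that captures and modifies a variable, which a plain fn type does not accept. The sketch below is a variant of the example above that takes any FnMut(u8) instead; record_progress is a made-up helper.

```rust
// Variant of the example above: accepts any callback that can be called
// repeatedly and may modify the variables it captured.
fn copy_imaginary_file(filename: &str, mut callback: impl FnMut(u8)) {
    println!("Copying file: {filename}");
    for i in 0..=10 {
        callback(i * 10);
    }
    println!("File copied!");
}

// Hypothetical helper: collects every progress value reported by the copy.
fn record_progress(filename: &str) -> Vec<u8> {
    let mut history = Vec::new();
    // The closure borrows `history` mutably for the duration of the call.
    copy_imaginary_file(filename, |progress| history.push(progress));
    history
}

fn main() {
    let history = record_progress("file.txt");
    println!("Received {} progress updates: {history:?}", history.len());
}
```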

Exercises

Let's revisit the Exercises chapter and see if we can solve them with your knowledge on iterators and functional programming. Good luck!

Session 5 - Close the loop with closures

I recommend familiarizing yourself with closures by reading the Closures: Anonymous Functions that Can Capture Their Environment chapter in the Rust Book.

During this session, we'll look at some practical uses of closures as well as functional programming in Rust. We will also take the first steps in asynchronous Rust programming.

Closures

In Chapter 4.1 we had a first look at a closure. In this chapter, we'll dive deeper into closures, especially how they can capture their environment, and the effect this has on ownership and borrowing.

What does "capturing their environment" mean? It means that a closure can use variables from the scope in which it was defined. An example will make this clearer.

fn main() {
    let name = "Marcel".to_string();
    let greeting = "Hello".to_string();

    let print_greeting = || {
        println!("{greeting}, {name}");
    };

    print_greeting();
    println!("{name}");
}

In this case the closure borrows name and greeting from the scope in which it was defined. This is called a "borrowing closure". The closure borrows the variables, and does not take ownership of them.
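A closure can also borrow a variable mutably. In that case the closure itself has to be declared with mut, and the variable cannot be used elsewhere as long as the closure is still in use. A minimal sketch, with count_greetings as a made-up helper:

```rust
// Hypothetical helper: greets everyone and returns how many greetings were printed.
fn count_greetings(names: &[&str]) -> usize {
    let mut count = 0;

    // The closure changes `count`, so it borrows it mutably
    // and must itself be declared `mut`.
    let mut greet = |name: &str| {
        count += 1;
        println!("Hello, {name}");
    };

    for &name in names {
        greet(name);
    }

    // `greet` is not used after this point, so the mutable borrow has ended
    // and we can read `count` again.
    count
}

fn main() {
    let greeted = count_greetings(&["Marcel", "Alice"]);
    println!("Greeted {greeted} people");
}
```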

If you want to take ownership of a variable, you can use the move keyword. This will move the variables into the closure.

fn main() {
    let name = "Marcel".to_string();
    let greeting = "Hello".to_string();

    let print_greeting = move || {
        println!("{greeting}, {name}");
    };

    print_greeting();
}

At first glance, this looks like the previous example, but there is a subtle difference. The closure now takes ownership of name and greeting. This means that the variables are moved into the closure, and are no longer available in the outer scope.

Let's confirm this by trying to use name after the closure has been called.

fn main() {
    let name = "Marcel".to_string();
    let greeting = "Hello".to_string();

    let print_greeting = move || {
        println!("{greeting}, {name}");
    };

    print_greeting();
    println!("{name}");
}

This will result in a compilation error:

error[E0382]: borrow of moved value: `name`
  --> src/main.rs:10:15
   |
2  |     let name = "Marcel".to_string();
   |         ---- move occurs because `name` has type `String`, which does not implement the `Copy` trait
...
5  |     let print_greeting = move || {
   |                          ------- value moved into closure here
6  |         println!("{greeting}, {name}");
   |                                ---- variable moved due to use in closure
...
10 |     println!("{name}");
   |               ^^^^^^ value borrowed here after move
   |

If you have heard the sentence: "I'm fighting the borrow checker", this is what it means. The Rust compiler is preventing you from making a mistake that could lead to undefined behavior at runtime.

There are different ways to solve this issue. You could clone the variables before moving them into the closure, or you could use a reference to the variables. Let's look at both options.

fn main() {
    let name = "Marcel".to_string();
    let greeting = "Hello".to_string();

    let use_name = name.clone();
    let use_greeting = greeting.clone();

    let print_greeting = move || {
        println!("{use_greeting}, {use_name}");
    };

    print_greeting();
    println!("{name}");
}

This will work, but it's not very elegant. You have to clone the variables, and then use the clones in the closure. This is not very efficient, especially if the variables are large.

A better way is to pass references to the closure as arguments. This way you don't have to clone the variables, and the closure does not need to capture them at all.

fn main() {
    let name = "Marcel".to_string();
    let greeting = "Hello".to_string();

    let print_greeting = move |name: &str, greeting: &str| {
        println!("{greeting}, {name}");
    };

    print_greeting(&name, &greeting);
    println!("{name}");
}

This will work in many cases, but it's not always possible to use references. An almost-always foolproof way to solve this issue is to use the Rc and RefCell types. Let's look at these next.

An Rc is a reference-counted pointer to immutable data. This means that you can have multiple references to the same data, and the data will only be dropped when the last reference is dropped. An Rc is used when you want to have multiple owners of the same data.

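A RefCell is a mutable memory location with dynamically checked borrow rules. This means that you can obtain a mutable reference to the data through a shared reference to the RefCell. The usual borrow rules (many readers or exactly one writer) still apply, but they are checked at runtime instead of at compile time, and the program panics if they are violated.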

Let's look at the above example using an Rc:

use std::rc::Rc;

fn main() {
    let name = Rc::new("Marcel".to_string());
    let greeting = Rc::new("Hello".to_string());

    let name2 = name.clone();
    let greeting2 = greeting.clone();

    let print_greeting = move || {
        println!("{greeting2}, {name2}");
    };

    print_greeting();
    println!("{name}");
}

It looks a bit like the earlier example where we cloned the variables, but there is a subtle difference. We are not cloning the variables, we are cloning the Rc pointers. This means that we are not cloning the data, we are cloning the reference to the data. This is much more efficient, especially if the data is large.

To complete the example, we'll look at the RefCell type. We'll use the RefCell type to make the name and greeting variables mutable.

use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    let name = Rc::new(RefCell::new("Marcel".to_string()));
    let greeting = Rc::new(RefCell::new("Hello".to_string()));

    let name2 = name.clone();
    let greeting2 = greeting.clone();

    let print_greeting = move || {
        println!("{}, {}", greeting2.borrow(), name2.borrow());
    };

    print_greeting();
    println!("{}", name.borrow());

    // Change the name and greeting
    *name.borrow_mut() = "Alice".to_string();
    *greeting.borrow_mut() = "Goodbye".to_string();

    print_greeting();
    println!("{}", name.borrow());
}

This will print:

Hello, Marcel
Marcel
Goodbye, Alice
Alice

In this example we use the borrow and borrow_mut methods to get a reference to the data. The borrow method returns an immutable reference, and the borrow_mut method returns a mutable reference. The RefCell checks at runtime that the references are used correctly: calling borrow_mut while another borrow is still alive causes a panic.
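To see the runtime check in action without crashing the program, you can use try_borrow_mut, which returns an Err instead of panicking. The two helper functions in this sketch are made up for this illustration.

```rust
use std::cell::RefCell;

// Hypothetical helper: tries to get a mutable borrow while a shared borrow is alive.
fn can_mutate_while_reading(cell: &RefCell<String>) -> bool {
    let _reader = cell.borrow(); // stays alive until the end of the function
    let writable = cell.try_borrow_mut().is_ok(); // refused at runtime
    writable
}

// Hypothetical helper: tries the same after the shared borrow has been dropped.
fn can_mutate_after_reading(cell: &RefCell<String>) -> bool {
    {
        let reader = cell.borrow();
        println!("Reading: {reader}");
    } // `reader` is dropped here, which releases the shared borrow

    let writable = cell.try_borrow_mut().is_ok();
    writable
}

fn main() {
    let name = RefCell::new("Marcel".to_string());

    println!("While reading: {}", can_mutate_while_reading(&name)); // false
    println!("After reading: {}", can_mutate_after_reading(&name)); // true
}
```

With borrow_mut instead of try_borrow_mut, the first function would panic at runtime. Keeping borrows short-lived, as in the second function, avoids this.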

Asynchronous programming

Since you are now familiar with the basic Rust concepts, let's ease our way into some asynchronous programming.

The Rust standard library provides a way to execute tasks on different threads. In this book we'll focus on the relatively new async/.await syntax. We use tokio as our asynchronous runtime. There are other good asynchronous runtimes for Rust, such as async_std, but because of my/our practical experience with tokio, this is the one we'll use.

To get a quick introduction on the topic, I'd recommend a quick read through this blog post: Async-await on stable Rust!

To get started, we need to add the tokio crate to our Cargo.toml:

[dependencies]
tokio = { version = "1", features = ["full"] }

You can 'tune' your application by removing unneeded tokio features.

use std::time::Duration;
use tokio::spawn;
use tokio::time::sleep;

#[tokio::main]
async fn main() {
    let async_task = spawn(async {
        println!("Async task started.");
        sleep(Duration::from_secs(1)).await;
        println!("Async task done.");
    });

    println!("Launching task...");
    async_task.await.expect("async task failed");
    println!("Ready!");
}

This example is best experienced on your local machine, where you can see the output in real-time.

Let's explore this example in a bit more detail.

The async keyword marks a function or a block for asynchronous execution. Instead of running immediately, it produces a Future. These Futures are not immediately scheduled. Actually, if a Future is never await-ed or spawned, it may never be executed.

Keep your eye out for any compiler warnings, or Clippy hints, while writing async code. Especially for things like:

warning: unused something that must be used ... futures do nothing unless you .await or poll them

The tokio::spawn(async {}) call starts an asynchronous task in the background and returns a JoinHandle. You can return a value from the async block in the normal way. The result can be captured by await-ing the JoinHandle.

Every time an .await is encountered and the awaited operation is not ready yet, the scheduler will temporarily pause the task and see if there is other work to be executed.

Let's extend our example to run a bunch of parallel tasks and await the results.

use std::time::Duration;
use tokio::spawn;
use tokio::time::sleep;

#[tokio::main]
async fn main() {
    let mut tasks = vec![];
    for id in 0..5 {
        let t = spawn(async move {
            println!("Async task {} started.", id);
            sleep(Duration::from_millis((5 - id) * 100)).await;
            println!("Async task {} done.", id);
            let result = id * id; // silly calculation...
            (id, result)
        });

        tasks.push(t);
    }

    println!("Launched {} tasks...", tasks.len());
    for task in tasks {
        let (id, result) = task.await.expect("task failed");
        println!("Task {} completed with result: {}", id, result);
    }
    println!("Ready!");
}

What has changed compared to the previous example?

Notice the move keyword in front of the async block. It is needed if you want to use outside variables, in this case id, inside the async block: move transfers ownership of those variables into the block. Because spawn may run the task on another thread, the compiler also checks that they can be sent across threads safely. More on that later.

We are returning the id and result in a tuple, which we destructure after await-ing the result of the async task.
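The same two ideas, move and returning a tuple from a task, also apply to plain OS threads. The following is a rough std-only analogy, not the tokio API: thread::spawn stands in for tokio::spawn and join() for .await. The function name squares_on_threads is made up for this sketch.

```rust
use std::thread;

/// Spawns one thread per id and collects `(id, id * id)` from each.
fn squares_on_threads(count: u64) -> Vec<(u64, u64)> {
    let mut handles = vec![];
    for id in 0..count {
        // `move` gives the closure its own `id` (a copy, as u64 is Copy),
        // so the thread does not borrow the loop variable.
        handles.push(thread::spawn(move || (id, id * id)));
    }

    handles
        .into_iter()
        .map(|handle| {
            // join() waits for the thread and hands back its return value.
            let (id, result) = handle.join().expect("thread panicked");
            (id, result)
        })
        .collect()
}

fn main() {
    for (id, result) in squares_on_threads(5) {
        println!("Thread {id} returned {result}");
    }
}
```

Without the move keyword this does not compile, because the closure would borrow id while the thread may outlive the loop iteration.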

Using a JoinSet

In the previous example, tasks complete in reverse order. If you observe the output from the main thread, though, you notice that nothing is printed until task '0' finishes. This is because we are iterating through the tasks in a sequence that does not match the order in which tasks complete. Typically, you want to act as soon as a task finishes, irrespective of the state of other tasks that are running in parallel.

This can be accomplished with the tokio::task::JoinSet. We have changed the example such that it uses the JoinSet's spawn method to launch new tasks, and then uses the join_next() method to wait for the next task to complete. Once all tasks are completed, join_next() returns None, which ends the while let loop.

use std::time::Duration;
use tokio::task::JoinSet;
use tokio::time::sleep;

#[tokio::main]
async fn main() {
    let mut tasks = JoinSet::new();
    for id in 0..5 {
        tasks.spawn(async move {
            println!("Async task {} started.", id);
            sleep(Duration::from_millis((5 - id) * 100)).await;
            println!("Async task {} done.", id);
            let result = id * id; // silly calculation...
            (id, result)
        });
    }

    println!("Launched {} tasks...", tasks.len());
    while let Some(task) = tasks.join_next().await {
        let (id, result) = task.expect("task failed");
        println!("Task {} completed with result: {}", id, result);
    }
    println!("Ready!");
}

Run both examples and compare the output. You will notice that the second example prints the results in the order in which the tasks complete.

Sharing data & ownership (part 2)

In Chapter 2.3 we've looked at ownership of data and how data can be borrowed by functions. In this chapter, we'll explore how data can be shared between threads.

Let's say we're implementing a scaffold to look up a user in a database asynchronously and add it to a list of users. With the stuff learned so far, you could come up with something like this:

use tokio::spawn;

#[derive(Debug)]
struct User {
    id: String,
}

async fn lookup_user(id: &str, users: &mut Vec<User>) {
    users.push(User { id: id.to_string() })
}

#[tokio::main]
async fn main() {
    let mut users: Vec<User> = vec![];

    let lookup_users = spawn(async move {
        lookup_user("bob", &mut users).await;
        lookup_user("tom", &mut users).await;
    });

    lookup_users.await.expect("failed to lookup users");
    println!("Users: {users:?}");
}

Unfortunately this does not compile:

14 |       let mut users: Vec<User> = vec![];
   |           --------- move occurs because `users` has type `std::vec::Vec<User>`, which does not implement the `Copy` trait
   ...
22 |       println!("Users: {users:?}");
   |                               ^^^^^ value borrowed here after move

Apparently we cannot use users in main after it moved into the async task. We need to find a way to use a reference to users that we can use inside the asynchronous task...

Well, maybe you paid attention to the 'advanced' example in chapter 2.3, did some reading online on Rc and RefCell (like RefCell and the Interior Mutability Pattern), and that led you to adapt the code in this way:

use std::cell::RefCell;
use std::rc::Rc;
use tokio::spawn;

#[derive(Debug)]
struct User {
    id: String,
}

async fn lookup_user(id: &str, users: Rc<RefCell<Vec<User>>>) {
    users.borrow_mut().push(User { id: id.to_string() })
}

#[tokio::main]
async fn main() {
    let users = Rc::new(RefCell::new(vec![]));
    let users_for_task = users.clone();

    let lookup_users = spawn(async move {
        lookup_user("bob", users_for_task.clone()).await;
        lookup_user("tom", users_for_task).await;
    });

    lookup_users.await.expect("failed to lookup users");
    println!("Users: {:?}", *users.borrow());
}

Another compiler error:

error[E0277]: `std::rc::Rc<std::cell::RefCell<std::vec::Vec<User>>>` cannot be sent between threads safely
   --> src/main.rs:19:24
    |
19  |     let lookup_users = spawn(async move {
    |                        ^^^^^ `std::rc::Rc<std::cell::RefCell<std::vec::Vec<User>>>` cannot be sent between threads safely
    | 
   ::: /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.2.20/src/task/spawn.rs:127:21
    |
127 |         T: Future + Send + 'static,
    |                     ---- required by this bound in `tokio::task::spawn::spawn`
    |
    = help: within `impl std::future::Future`, the trait `std::marker::Send` is not implemented for `std::rc::Rc<std::cell::RefCell<std::vec::Vec<User>>>`
    = note: required because it appears within the type `[static generator@src/main.rs:19:41: 22:6 users_for_task:std::rc::Rc<std::cell::RefCell<std::vec::Vec<User>>> _]`
    = note: required because it appears within the type `std::future::GenFuture<[static generator@src/main.rs:19:41: 22:6 users_for_task:std::rc::Rc<std::cell::RefCell<std::vec::Vec<User>>> _]>`
    = note: required because it appears within the type `impl std::future::Future`

So apparently we cannot use Rc when using threads: 'the trait std::marker::Send is not implemented for std::rc::Rc<std::cell::RefCell<std::vec::Vec<User>>>'.

Remember from the previous chapter that when we spawn an async move block, the Rust compiler checks whether the captured variables can be sent between threads safely. This is one of those cases where it is unsafe to do so. Rc is not thread-safe, because its reference count is not updated atomically, and a RefCell cannot be shared between threads either.

Let's review what an Rc is by reading the chapter in the Rust book: Rc, the Reference Counted Smart Pointer

This may also be a good time to read a bit about the Send and the Sync trait: Send and Sync.

Luckily for us, Rust has thread-safe replacements for Rc and RefCell: Arc and Mutex.
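If you want to check for yourself which types the compiler accepts, a small compile-time probe can help. This is a sketch with made-up helper names (require_send, require_sync, arc_mutex_is_thread_safe), and it uses the standard library Mutex so that it runs without any extra crates.

```rust
use std::sync::{Arc, Mutex};

/// Compiles only for types that may be moved to another thread.
fn require_send<T: Send>() {}

/// Compiles only for types that may be shared by reference between threads.
fn require_sync<T: Sync>() {}

/// The checks happen at compile time; if this function compiles, they passed.
fn arc_mutex_is_thread_safe() -> bool {
    require_send::<Arc<Mutex<Vec<String>>>>();
    require_sync::<Arc<Mutex<Vec<String>>>>();

    // Uncommenting the next line fails to compile with the same kind of
    // "cannot be sent between threads safely" error as shown above:
    // require_send::<std::rc::Rc<std::cell::RefCell<Vec<String>>>>();
    true
}

fn main() {
    println!("Arc<Mutex<_>> is Send + Sync: {}", arc_mutex_is_thread_safe());
}
```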

Arc is a counted reference to an object that can be shared between threads. Arc stands for 'Atomically Reference Counted'.

Wrapping an object in an Arc allows us to share a reference to that object across threads. Unfortunately for us, that reference is read-only.

The goal of the exercise is to add the user, that was retrieved from the database, to the shared list of users.

To safely change data that is shared between threads, we need to wrap it in a Mutex. Mutex stands for mutual exclusion. This means that a Mutex safeguards access to the data that it wraps, by only allowing access to a single thread at any moment in time.

With these tools in hand, let's rewrite the example one last time:

use std::sync::Arc;
use tokio::spawn;
use tokio::sync::Mutex;

#[derive(Debug)]
struct User {
    id: String,
}

async fn lookup_user(id: &str, users: Arc<Mutex<Vec<User>>>) {
    let mut editable_users = users.lock().await;
    editable_users.push(User { id: id.to_string() })
}

#[tokio::main]
async fn main() {
    let users = Arc::new(Mutex::new(vec![]));
    let users_for_task = users.clone();

    let lookup_users = spawn(async move {
        lookup_user("bob", users_for_task.clone()).await;
        lookup_user("tom", users_for_task).await;
    });

    lookup_users.await.expect("failed to lookup users");
    let users_with_access = users.lock().await;
    println!("Users: {users_with_access:?}");
}

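We use tokio::sync::Mutex here rather than the one from the standard library. Its lock() method is asynchronous, so a task that waits for the lock yields to the scheduler instead of blocking the thread, and its guard can safely be held across an .await. The standard library Mutex is still fine in a tokio application, as long as the lock is held only briefly and never across an .await.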

As pointed out previously, when cloning an Rc or Arc, we are cloning the reference to the data, not the data itself.
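A quick way to convince yourself of this is to look at the reference count and the pointer behind both handles. The sketch below uses only the standard library; clone_shares_data is a name made up for the illustration.

```rust
use std::sync::Arc;

/// Returns (strong count after cloning, whether both handles share one allocation).
fn clone_shares_data() -> (usize, bool) {
    let users = Arc::new(vec!["bob".to_string(), "tom".to_string()]);
    let users_for_task = Arc::clone(&users); // same as users.clone()

    // Two handles, one Vec: the count went up, the data was not copied.
    (Arc::strong_count(&users), Arc::ptr_eq(&users, &users_for_task))
}

fn main() {
    let (count, same) = clone_shares_data();
    println!("strong count: {count}, same allocation: {same}");
}
```

Some people prefer to write Arc::clone(&users) instead of users.clone() to make it obvious to the reader that only the reference is cloned.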

In the above example we create a clone of users and store it in users_for_task. The reason for this clone step is that we need access to users inside the lookup_users task, but also afterwards in the main() function when we print the user list. If we didn't clone users, it would move into the task (due to async move) and we could not use it later on.

For completeness, here's the broken example without the initial clone step.

use std::sync::Arc;
use tokio::spawn;
use tokio::sync::Mutex;

#[derive(Debug)]
struct User {
    id: String,
}

async fn lookup_user(id: &str, users: Arc<Mutex<Vec<User>>>) {
    let mut editable_users = users.lock().await;
    editable_users.push(User { id: id.to_string() })
}

#[tokio::main]
async fn main() {
    let users = Arc::new(Mutex::new(vec![]));

    let lookup_users = spawn(async move {
        lookup_user("bob", users.clone()).await;
        lookup_user("tom", users.clone()).await;
    });

    lookup_users.await.expect("failed to lookup users");
    let users_with_access = users.lock().await;
    println!("Users: {users_with_access:?}");
}

In our example we use a Mutex to protect the user list from simultaneous modification by concurrent threads. Often, data is read frequently but modified only seldom. In that case an RwLock might be a better alternative to a Mutex. An RwLock gives the same kind of protection as a Mutex, but distinguishes between read and write operations. It allows many read operations simultaneously, but only a single write operation, and no reads while a write is in progress.

Let's adjust our example to use an RwLock.

use std::sync::Arc;
use tokio::spawn;
use tokio::sync::RwLock;

#[derive(Debug)]
struct User {
    id: String,
}

async fn lookup_user(id: &str, users: Arc<RwLock<Vec<User>>>) {
    let mut editable_users = users.write().await;
    editable_users.push(User { id: id.to_string() })
}

#[tokio::main]
async fn main() {
    let users = Arc::new(RwLock::new(vec![]));
    let users_for_task = users.clone();

    let lookup_users = spawn(async move {
        lookup_user("bob", users_for_task.clone()).await;
        lookup_user("tom", users_for_task).await;
    });

    lookup_users.await.expect("failed to lookup users");
    let users_with_access = users.read().await;
    println!("Users: {:?}", users_with_access.as_slice());
}

Notice the read() vs. write() methods to acquire the different types of locks.
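To make the difference between the two kinds of lock visible, here is a small sketch. Note the assumptions: it uses std::sync::RwLock and a scoped OS thread instead of the tokio types, so that it runs without extra crates, and read_blocks_writers_only is a made-up name. The non-blocking try_read() and try_write() calls simply report whether the lock could be taken.

```rust
use std::sync::RwLock;
use std::thread;

/// While the calling thread holds a read lock, another thread tries to take
/// a read lock and a write lock. Returns (read succeeded, write succeeded).
fn read_blocks_writers_only() -> (bool, bool) {
    let users = RwLock::new(vec!["bob".to_string()]);

    // Held until the end of this function.
    let _reader = users.read().unwrap();

    thread::scope(|scope| {
        scope
            .spawn(|| {
                // A second reader is welcome while the first one is active...
                let can_read = users.try_read().is_ok();
                // ...but a writer has to wait until all readers are gone.
                let can_write = users.try_write().is_ok();
                (can_read, can_write)
            })
            .join()
            .expect("thread panicked")
    })
}

fn main() {
    let (can_read, can_write) = read_blocks_writers_only();
    println!("second reader allowed: {can_read}, writer allowed: {can_write}");
}
```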

To summarize:

Wrap an object in an Arc if you want to share it between threads. If the object has to be mutable, also wrap it in a Mutex. If you have many read operations, and only a few write operations, use an RwLock instead of a Mutex.

Scope of variables

We've seen how we can use a Mutex to protect data that is shared between threads. What you may not have noticed is that we never unlock it explicitly. Calling lock() returns a guard, and the lock is released automatically when that guard is dropped, which happens when it goes out of scope.

This is a very powerful feature of Rust, and it's called RAII (Resource Acquisition Is Initialization). It's a pattern that is used in Rust to ensure that resources are cleaned up when they are no longer needed.

All variables in Rust have a scope, and when they go out of scope, they are dropped. This is the reason why we can use Mutex and RwLock in Rust without having to worry about unlocking them. The lock is automatically released when the guard returned by lock(), read() or write() goes out of scope.

So how does scope work in Rust? And what are ways to limit the scope of variables?

Scope in Rust is defined by the curly braces {}. Variables declared inside the curly braces are only available inside those curly braces. This is called a block. These blocks can be part of a function definition, a loop, an if statement, or standalone. Here are some examples:

fn main() {
    let x = 42;

    {
        let y = 24;
        println!("x: {}, y: {}", x, y);
    }

    // y is not available here
    println!("x: {}", x);
}

Loops also have their own scope:

fn main() {
    let x = 42;

    for i in 0..3 {
        let y = i * 2;
        println!("x: {}, y: {}", x, y);
    }

    // y is not available here
    println!("x: {}", x);
}
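You can watch values being dropped at the end of their scope by implementing the Drop trait. The following sketch uses a made-up Noisy type that writes its name to a log when it is dropped; the names are for illustration only.

```rust
use std::cell::RefCell;

/// Records its name in a shared log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the events in the order in which they happened.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(vec![]);
    {
        let _outer = Noisy { name: "outer dropped", log: &log };
        {
            let _inner = Noisy { name: "inner dropped", log: &log };
            log.borrow_mut().push("end of inner block");
        } // _inner goes out of scope here
        log.borrow_mut().push("end of outer block");
    } // _outer goes out of scope here
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{event}");
    }
}
```

A lock guard behaves like Noisy: its Drop implementation is where the lock gets released.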

Scope and Locks

When using a Mutex or RwLock it's important to limit the scope of the lock to the smallest possible scope. This ensures that the lock is released as soon as possible. This is important because the lock is blocking other threads from accessing the data.

Here's an example of how to limit the scope of a Mutex:

use std::sync::{Arc, Mutex};

fn main() {
    let data = Arc::new(Mutex::new(0));

    {
        let mut locked_data = data.lock().unwrap();
        *locked_data += 1;
        // locked_data goes out of scope here, which releases the lock
    }

    println!("Data: {:?}", data);
}

Alternatively you could extract the content of the {} block into a function:

use std::sync::{Arc, Mutex};

fn main() {
    let data = Arc::new(Mutex::new(0));
    increment_data(&data);
    println!("Data: {:?}", data);
}

fn increment_data(data: &Arc<Mutex<i32>>) {
    let mut locked_data = data.lock().unwrap();
    *locked_data += 1;
    // locked_data goes out of scope here, which releases the lock
}
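A third option, when neither an extra block nor an extra function fits, is to drop the guard explicitly with drop(). The sketch below uses the non-blocking try_lock() to show when the lock is free; explicit_drop is a made-up name.

```rust
use std::sync::Mutex;

/// Returns whether the mutex could be locked again
/// (while the guard is alive, after the guard was dropped).
fn explicit_drop() -> (bool, bool) {
    let data = Mutex::new(0);

    let mut locked_data = data.lock().unwrap();
    *locked_data += 1;
    // try_lock() does not block; it fails because the guard is still alive.
    let free_while_held = data.try_lock().is_ok();

    drop(locked_data); // releases the lock right here, before the scope ends

    let free_after_drop = data.try_lock().is_ok();
    (free_while_held, free_after_drop)
}

fn main() {
    let (while_held, after_drop) = explicit_drop();
    println!("free while held: {while_held}, free after drop: {after_drop}");
}
```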

Sharing data through channels

An alternative approach to sharing data between threads is to pass messages between threads using so-called channels. With channels, each thread owns its own data, and the messages are passed between threads in a thread-safe manner. Let's rewrite our example using channels.

use tokio::join;
use tokio::spawn;
use tokio::sync::mpsc;

#[derive(Debug)]
struct User {
    id: String,
}

async fn lookup_user(id: &str, user_chan: mpsc::Sender<User>) {
    user_chan
        .send(User { id: id.to_string() })
        .await
        .expect("can not send user on channel");
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel(100);
    let mut users = vec![];

    spawn(async move {
        join! {
            lookup_user("bob", tx.clone()),
            lookup_user("tom", tx),
        }
    });

    while let Some(user) = rx.recv().await {
        println!("received: {user:?}");
        users.push(user);
    }

    println!("Users: {users:?}");
}

As you can see, the users list does not need to be wrapped in a Mutex. This is because the main thread is the only thread that is manipulating the users Vec. The lookup_user() function is returning the User through the Sender half of the mpsc::channel. The data on the channel is automatically synchronized between threads.

mpsc stands for 'multi-producer, single-consumer' and supports sending many values from many producers to a single consumer. See Module tokio::sync for other channel types.

The channel constructor returns a tuple: tx and rx. These represent the Sender and Receiver halves of the channel. Only the Sender half can be cloned, hence the multi-producer term.
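One detail is easy to miss: the receiving loop only ends once every Sender, the original and all its clones, has been dropped. In our tokio example that happens when both lookup_user calls have finished. The sketch below shows the same rule with the standard library's std::sync::mpsc and OS threads (an analogy, not the tokio API); collect_users is a made-up name.

```rust
use std::sync::mpsc;
use std::thread;

/// Two producer threads each send one user id; the consumer collects them.
fn collect_users() -> Vec<String> {
    let (tx, rx) = mpsc::channel();

    for id in ["bob", "tom"] {
        let tx = tx.clone(); // one Sender per producer
        thread::spawn(move || {
            tx.send(id.to_string()).expect("receiver hung up");
            // this clone of tx is dropped when the thread ends
        });
    }
    drop(tx); // drop the original Sender, otherwise the loop below never ends

    // Iterating the receiver stops once every Sender has been dropped.
    let mut users: Vec<String> = rx.iter().collect();
    users.sort(); // arrival order depends on thread scheduling
    users
}

fn main() {
    println!("Users: {:?}", collect_users());
}
```

If you remove the drop(tx) line, the program hangs: the receiver keeps waiting for a message that the remaining Sender could still send.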

Notice that we are using the join! macro from the tokio crate to await both lookup_user tasks.

If at all possible, prefer channels over sharing data between threads. This saves you from performance bottlenecks where threads are waiting on a Mutex lock. Channels like the one above are buffered, so that (up to the configured capacity, 100 in our example) senders can continue to send data without delay.

Exercises

Now that you've learned about asynchronous programming with tokio, it's time to put your knowledge to the test. Check out the corresponding exercise in the Exercises section. See you next time!

Session 6 - Where the rubber hits the road

In this session, we'll look at some practical examples of all we've learned so far. Amongst others, we'll create a simple web server.

Hello Rust Web with Axum

Using the knowledge we have gained so far on closures and async programming, we will create a small web server using the axum framework. There are other frameworks around, but axum has proven to fit our use cases. It supports HTTP/1, HTTP/2 and web sockets. For most use cases, it is extremely straightforward to use.

Let's get started!

Include the Axum framework in Cargo.toml:

[dependencies]
tokio = { version = "1", features = ["full"] }
axum = "0.7"

Let's create a simple web server that greets the visitor. We'll use the Router to define our routes. The Router is a collection of routes that can be used to match incoming requests. We'll use the get method to define a route that matches GET requests to the /hello/:visitor path. This route will call the greet_visitor function. We expect a 'visitor' path parameter (the :visitor segment), which we'll extract using the extract::Path extractor.

use axum::{extract::Path, Router, routing::get, serve};
use tokio::net::TcpListener;

#[tokio::main]
async fn main() {
    // set up our application with a greeting route at "/hello/:visitor"
    let app = Router::new().route("/hello/:visitor", get(greet_visitor));

    // start the server on port 3000
    let listener = TcpListener::bind("0.0.0.0:3000").await.unwrap();
    serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
async fn greet_visitor(Path(visitor): Path<String>) -> String {
    format!("Hello, {visitor}!")
}

Build and run this code on your local machine, then open http://localhost:3000/hello/world in your browser.

It does not get much simpler than this!

Currently, our route responds to GET requests. We can also respond to POST, PUT, DELETE, and other HTTP methods.

use axum::{
    extract::Path,
    Router,
    routing::{delete, get},
    serve,
};
use tokio::net::TcpListener;

#[tokio::main]
async fn main() {
    // set up our application with routes at "/hello/:visitor" and "/bye"
    let app = Router::new()
        .route("/hello/:visitor", get(greet_visitor))
        .route("/bye", delete(say_goodbye));

    // start the server on port 3000
    let listener = TcpListener::bind("0.0.0.0:3000").await.unwrap();
    serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
async fn greet_visitor(Path(visitor): Path<String>) -> String {
    format!("Hello, {visitor}!")
}

/// Say goodbye to the visitor.
async fn say_goodbye() -> String {
    "Goodbye".to_string()
}

As you can see in the example above, we added a new route that responds to DELETE requests. You can append routes to the Router using the route method. The delete method is used to define a route that matches DELETE requests to the /bye path. This route will call the say_goodbye function.

You can test these requests with curl from the command line:

  • GET: curl http://localhost:3000/hello/world
  • DELETE: curl -X DELETE http://localhost:3000/bye

Dealing with JSON

Our example returns a plain text string, which is not very useful. A JSON response might be better suited for our web server. Serde is the de facto standard when dealing with marshalling and unmarshalling JSON in Rust. Let's include that crate in the Cargo.toml.

[dependencies]
tokio = { version = "1", features = ["full"] }
axum = "0.7"
serde = { version = "1", features = ["derive"] }

We'll include a Greeting struct and let Rust derive the Serialize and Deserialize traits from the serde crate. For convenience, we also include a new() function to easily create a Greeting.

use axum::{
    extract::Path,
    Json,
    Router, routing::{delete, get}, serve,
};
use serde::{Deserialize, Serialize};
use tokio::net::TcpListener;

#[derive(Serialize, Deserialize)]
struct Greeting {
    greeting: String,
    visitor: String,
}

impl Greeting {
    fn new(greeting: &str, visitor: String) -> Self {
        Greeting {
            greeting: greeting.to_string(),
            visitor,
        }
    }
}

#[tokio::main]
async fn main() {
    // set up our application with routes at "/hello/:visitor" and "/bye"
    let app = Router::new()
        .route("/hello/:visitor", get(greet_visitor))
        .route("/bye", delete(say_goodbye));

    // start the server on port 3000
    let listener = TcpListener::bind("0.0.0.0:3000").await.unwrap();
    serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
/// We use `Json` to automatically serialize the `Greeting` struct to JSON.
async fn greet_visitor(Path(visitor): Path<String>) -> Json<Greeting> {
    Json(Greeting::new("Hello", visitor))
}

/// Say goodbye to the visitor.
async fn say_goodbye() -> String {
    "Goodbye".to_string()
}

GET result:

{
  "greeting": "Hello",
  "visitor": "world"
}

Stateful Web with Axum

So far, we've seen how we can build a simple web app with axum. Let's add some state to our web application, and build a global counter of the number of requests to our REST API. We'll use an AtomicU16 to keep track of the number of visits. The AtomicU16 is a thread-safe integer that we can use to increment the number of visits, without worrying about concurrent access.

By using an Arc we can share the state between the different request handlers. The Arc is a reference-counted pointer that allows us to share the state between different threads. The State extractor allows us to access the shared state in our request handlers.

So essentially all requests are pointing to the same AppState struct, and we can use the AtomicU16 to keep track of the number of visits across all requests.

use axum::{
    extract::{Path, State},
    routing::{delete, get},
    Json, Router,
};
use serde::{Deserialize, Serialize};
use std::sync::{atomic::AtomicU16, atomic::Ordering::Relaxed, Arc};

#[derive(Serialize, Deserialize)]
struct Greeting {
    greeting: String,
    visitor: String,
    visits: u16,
}

struct AppState {
    number_of_visits: AtomicU16,
}

impl Greeting {
    fn new(greeting: &str, visitor: String, visits: u16) -> Self {
        Greeting {
            greeting: greeting.to_string(),
            visitor,
            visits,
        }
    }
}

#[tokio::main]
async fn main() {
    // Create a shared state for our application. We use an Arc so that we clone the pointer to the state and
    // not the state itself. The AtomicU16 is a thread-safe integer that we use to keep track of the number of visits.
    let app_state = Arc::new(AppState {
        number_of_visits: AtomicU16::new(1),
    });

    // set up our application with routes at "/hello/:visitor" and "/bye"
    let app = Router::new()
        .route("/hello/:visitor", get(greet_visitor))
        .route("/bye", delete(say_goodbye))
        .with_state(app_state);

    // start the server on port 3000
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
/// We also use the `State` extractor to access the shared `AppState` and increment the number of visits.
/// We use `Json` to automatically serialize the `Greeting` struct to JSON.
async fn greet_visitor(
    State(app_state): State<Arc<AppState>>,
    Path(visitor): Path<String>,
) -> Json<Greeting> {
    let visits = app_state
        .number_of_visits
        .fetch_add(1, Relaxed);
    Json(Greeting::new("Hello", visitor, visits))
}

/// Say goodbye to the visitor.
async fn say_goodbye() -> String {
    "Goodbye".to_string()
}

You can test the web server by running the following command:

$ curl http://127.0.0.1:3000/hello/world

Result:

{
  "greeting": "Hello",
  "visitor": "world",
  "visits": 1
}
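The counter works because fetch_add does two things in one atomic step: it adds to the counter and returns the value from before the addition. That is why the first visitor sees the initial value. Here is a std-only sketch with scoped threads standing in for concurrent requests; count_visits is a made-up name.

```rust
use std::sync::atomic::{AtomicU16, Ordering::Relaxed};
use std::thread;

/// Returns (value returned by the first fetch_add, final counter value).
fn count_visits() -> (u16, u16) {
    let visits = AtomicU16::new(1);

    // fetch_add returns the value from *before* the increment.
    let first = visits.fetch_add(1, Relaxed);

    // Four threads add 100 visits each, without any lock.
    thread::scope(|scope| {
        for _ in 0..4 {
            scope.spawn(|| {
                for _ in 0..100 {
                    visits.fetch_add(1, Relaxed);
                }
            });
        }
    });

    (first, visits.load(Relaxed))
}

fn main() {
    let (first, total) = count_visits();
    println!("first visitor saw {first}, counter is now {total}");
}
```

Keep in mind that fetch_add wraps around on overflow, so a u16 counter starts again at 0 after 65,535. For a real application a wider type such as AtomicU64 may be the safer choice.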

Some statistics using Apache Bench

ab -n 1000 -c 1 http://127.0.0.1:3000/hello/world

Concurrency Level:      1
Time taken for tests:   0.112 seconds
Complete requests:      1000

ab -n 1000 -c 100 http://127.0.0.1:3000/hello/world

Concurrency Level:      100
Time taken for tests:   0.072 seconds
Complete requests:      1000

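Disregarding the actual numbers, the run with 100 concurrent clients finishes the same 1000 requests in less time than the run with a single client, which shows that our axum server processes requests in parallel where it can.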

Axum routes with a Handler

As your application grows, it is probably not a good idea to group the logic for all your routes within the web server. This is where the Handler concept comes in. The Handler is a struct that contains the business logic to drive the web application. The Handler is responsible for processing the request and returning a response. This is a common pattern in web development.

Let's refactor our greeting application to use a Handler struct. We will create a WebHandler struct that will contain the logic for greeting a visitor and the logic for saying goodbye. We'll also take the opportunity to separate the data model into a separate module.

main.rs

use std::sync::Arc;

use axum::{
    extract::{Path, State},
    Json,
    Router, routing::{delete, get},
};

use crate::handler::WebHandler;
use crate::model::Greeting;

mod handler;
mod model;

#[tokio::main]
async fn main() {
    // Create a shared state for our application. We use an Arc so that we clone the pointer to the state and
    // not the state itself.
    let app_state = Arc::new(WebHandler::default());

    // set up our application with routes at "/hello/:visitor" and "/bye"
    let app = Router::new()
        .route("/hello/:visitor", get(greet_visitor))
        .route("/bye", delete(say_goodbye))
        .with_state(app_state);

    // start the server on port 3000
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
/// We also use the `State` extractor to access the shared `Handler` and call the `greet` method.
/// We use `Json` to automatically serialize the `Greeting` struct to JSON.
async fn greet_visitor(
    State(handler): State<Arc<WebHandler>>,
    Path(visitor): Path<String>,
) -> Json<Greeting> {
    Json(handler.greet(visitor))
}

/// Say goodbye to the visitor.
async fn say_goodbye(State(handler): State<Arc<WebHandler>>) -> String {
    handler.say_goodbye()
}

handler.rs

use std::{
    sync::atomic::AtomicU16,
    sync::atomic::Ordering::Relaxed,
};

use crate::model::Greeting;

/// A handler for our web application.
pub struct WebHandler {
    number_of_visits: AtomicU16,
}

impl WebHandler {
    /// Greet the visitor and increment the number of visits.
    pub fn greet(&self, visitor: String) -> Greeting {
        let visits = self
            .number_of_visits
            .fetch_add(1, Relaxed);
        Greeting::new("Hello", visitor, visits)
    }

    /// Say goodbye to the visitor.
    pub fn say_goodbye(&self) -> String {
        "Goodbye".to_string()
    }
}

impl Default for WebHandler {
    fn default() -> Self {
        WebHandler {
            // start at 1: fetch_add returns the value from before the increment
            number_of_visits: AtomicU16::new(1),
        }
    }
}

model.rs

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct Greeting {
    greeting: String,
    visitor: String,
    visits: u16,
}

impl Greeting {
    pub(crate) fn new(greeting: &str, visitor: String, visits: u16) -> Self {
        Greeting {
            greeting: greeting.to_string(),
            visitor,
            visits,
        }
    }
}

By splitting the logic into a Handler struct, we have made our code more modular and easier to maintain. The Handler struct contains the business logic for our application, while the model module contains the data model. This separation of concerns makes our code more organized and easier to understand.

Keeping the business logic separate from the web I/O is a good practice. It helps with testing and also allows you to add other interfaces to your application, such as a command-line interface or a gRPC API.

Handler Trait

In the above example, we used a Handler struct to contain the business logic for our application. However, we can also use a trait to define the interface for our handler. This allows us to define multiple handlers that implement the same interface. This is extremely useful when you want to create a pluggable architecture for your application.

Using a trait to define the interface for your handler allows you to 'mock' the handler in your tests. This makes it easier to test your web endpoints in isolation.

Let's refactor our example to use a Handler trait instead of a concrete WebHandler struct.

main.rs

use std::sync::Arc;

use axum::{
    extract::{Path, State},
    Json,
    Router, routing::{delete, get},
};

use crate::handler::{GreetingHandler, WebHandler};
use crate::model::Greeting;

mod handler;
mod model;

type AppState = Arc<dyn GreetingHandler>;

#[tokio::main]
async fn main() {
    // Create a shared state for our application. We use an Arc so that we clone the pointer to the state and
    // not the state itself.
    let app_state: AppState = Arc::new(WebHandler::default());

    // set up our application with routes at "/hello/:visitor" and "/bye"
    let app = Router::new()
        .route("/hello/:visitor", get(greet_visitor))
        .route("/bye", delete(say_goodbye))
        .with_state(app_state);

    // start the server on port 3000
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
/// We also use the `State` extractor to access the shared `Handler` and call the `greet` method.
/// We use `Json` to automatically serialize the `Greeting` struct to JSON.
async fn greet_visitor(
    State(handler): State<AppState>,
    Path(visitor): Path<String>,
) -> Json<Greeting> {
    Json(handler.greet(visitor))
}

/// Say goodbye to the visitor.
async fn say_goodbye(State(handler): State<AppState>) -> String {
    handler.say_goodbye()
}

handler.rs

use std::{
    sync::atomic::AtomicU16,
    sync::atomic::Ordering::Relaxed,
};

use crate::model::Greeting;

/// A trait for handling greetings.
pub trait GreetingHandler: Send + Sync {
    fn greet(&self, visitor: String) -> Greeting;
    fn say_goodbye(&self) -> String;
}

/// A greeting handler implementation for our web application.
pub struct WebHandler {
    number_of_visits: AtomicU16,
}

impl GreetingHandler for WebHandler {
    /// Greet the visitor and increment the number of visits.
    fn greet(&self, visitor: String) -> Greeting {
        let visits = self
            .number_of_visits
            .fetch_add(1, Relaxed);
        Greeting::new("Hello", visitor, visits)
    }

    /// Say goodbye to the visitor.
    fn say_goodbye(&self) -> String {
        "Goodbye".to_string()
    }
}

impl Default for WebHandler {
    fn default() -> Self {
        WebHandler {
            // start at 1: fetch_add returns the value from before the increment
            number_of_visits: AtomicU16::new(1),
        }
    }
}

Our model.rs remains the same.

By using a Handler trait, we have made our code more modular and easier to extend. In this example we use dynamic dispatch to allow different implementations of the GreetingHandler trait to be used at runtime. This allows us to create different handlers for different environments, such as a test environment or a production environment.
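To give an idea of what such a test handler could look like, here is a stripped-down sketch. It is not the chapter's code: the trait is simplified to return a plain String instead of a Greeting so that it needs no other modules, and MockHandler and handle_request are names made up for the illustration.

```rust
use std::sync::Arc;

/// Simplified stand-in for the chapter's trait.
trait GreetingHandler: Send + Sync {
    fn greet(&self, visitor: String) -> String;
}

/// A test double with a fixed, predictable answer.
struct MockHandler;

impl GreetingHandler for MockHandler {
    fn greet(&self, visitor: String) -> String {
        format!("mock greeting for {visitor}")
    }
}

/// Stands in for a request handler: it only knows the trait,
/// not the concrete type behind the Arc.
fn handle_request(handler: Arc<dyn GreetingHandler>, visitor: &str) -> String {
    handler.greet(visitor.to_string())
}

fn main() {
    let state: Arc<dyn GreetingHandler> = Arc::new(MockHandler);
    println!("{}", handle_request(state, "world"));
}
```

Because handle_request accepts any Arc<dyn GreetingHandler>, a test can pass in the mock while production code passes in the real handler.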

Dynamic dispatch comes with a small performance penalty, as each call goes through a lookup at runtime and the compiler cannot optimize the code as much as with static dispatch. We can make a final update to our code to use static dispatch.

Static Dispatch

In the previous example, we used dynamic dispatch to allow different implementations of the GreetingHandler trait to be used at runtime. However, we can also use static dispatch to allow the compiler to optimize the code more efficiently. This is done with generics: the concrete type that implements the GreetingHandler trait is known at compile time.
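Before we look at the full application, here is the difference in its smallest form. This is a standalone sketch with a simplified trait (it takes a &str and returns a String), not one of the project files; greet_dynamic and greet_static are made-up names.

```rust
trait GreetingHandler {
    fn greet(&self, visitor: &str) -> String;
}

struct WebHandler;

impl GreetingHandler for WebHandler {
    fn greet(&self, visitor: &str) -> String {
        format!("Hello, {visitor}!")
    }
}

/// Dynamic dispatch: one compiled function, the method is looked up at runtime.
fn greet_dynamic(handler: &dyn GreetingHandler, visitor: &str) -> String {
    handler.greet(visitor)
}

/// Static dispatch: the compiler generates a copy of this function
/// for every concrete type G that it is called with.
fn greet_static<G: GreetingHandler>(handler: &G, visitor: &str) -> String {
    handler.greet(visitor)
}

fn main() {
    println!("{}", greet_dynamic(&WebHandler, "world"));
    println!("{}", greet_static(&WebHandler, "world"));
}
```

Both functions give the same result. The generic version trades a slightly larger binary for calls that the compiler can resolve, and possibly inline, at compile time.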

main.rs

use crate::{
    handler::WebHandler,
    web::AxumWebServer,
};

mod handler;
mod model;
mod web;

#[tokio::main]
async fn main() {
    let handler = WebHandler::default();
    let web_server = AxumWebServer::new(handler);
    web_server.start().await;
}

handler.rs

use std::{
    sync::atomic::AtomicU16,
    sync::atomic::Ordering::Relaxed,
};


use crate::model::Greeting;

/// A trait for handling greetings.
pub trait GreetingHandler: Send + Sync + 'static {
    fn greet(&self, visitor: String) -> Greeting;
    fn say_goodbye(&self) -> String;
}

/// A greeting handler implementation for our web application.
pub struct WebHandler {
    number_of_visits: AtomicU16,
}

impl GreetingHandler for WebHandler {
    /// Greet the visitor and increment the number of visits.
    fn greet(&self, visitor: String) -> Greeting {
        let visits = self
            .number_of_visits
            .fetch_add(1, Relaxed);
        Greeting::new("Hello", visitor, visits)
    }

    /// Say goodbye to the visitor.
    fn say_goodbye(&self) -> String {
        "Goodbye".to_string()
    }
}

impl Default for WebHandler {
    fn default() -> Self {
        WebHandler {
            number_of_visits: AtomicU16::new(0),
        }
    }
}

web.rs

use std::sync::Arc;

use axum::{
    extract::{Path, State},
    Json,
    Router,
    routing::{delete, get},
};

use crate::{
    handler::GreetingHandler,
    model::Greeting,
};

type AppState<G> = Arc<G>;

pub struct AxumWebServer<G: GreetingHandler> {
    app_state: AppState<G>,
}

impl<G: GreetingHandler> AxumWebServer<G> {
    pub fn new(handler: G) -> Self {
        // Create a shared state for our application. We use an Arc so that we clone the pointer to the state and
        // not the state itself.
        let app_state: AppState<G> = Arc::new(handler);
        AxumWebServer { app_state }
    }

    pub async fn start(&self) {
        let app_state = self.app_state.clone();

        // set up our application with the greeting and goodbye routes
        let app = Router::new()
            .route("/hello/:visitor", get(Self::greet_visitor))
            .route("/bye", delete(Self::say_goodbye))
            .with_state(app_state);

        // start the server on port 3000
        let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
        axum::serve(listener, app).await.unwrap();
    }


    /// Extract the `visitor` path parameter and use it to greet the visitor.
    /// We also use the `State` extractor to access the shared `Handler` and call the `greet` method.
    /// We use `Json` to automatically serialize the `Greeting` struct to JSON.
    async fn greet_visitor(
        State(handler): State<AppState<G>>,
        Path(visitor): Path<String>,
    ) -> Json<Greeting> {
        Json(handler.greet(visitor))
    }

    /// Say goodbye to the visitor.
    async fn say_goodbye(State(handler): State<AppState<G>>) -> String {
        handler.say_goodbye()
    }
}

The model.rs stays the same.

By using static dispatch, we have eliminated the performance penalty of dynamic dispatch and given the compiler more room to optimize. As you can see, the statically dispatched version of our code comes with more cognitive overhead: the generic parameter G now appears in the state type, the server struct, and every handler function. This is where you have to decide what is more important to you: performance or ease of development.

Finally, we have isolated the web server logic in the web.rs file. Our main.rs file is now very clean and only contains the minimum code to start the application.

With these latest changes, our code is getting a little bit closer to a production-ready state.

Benchmarking is really the only way to know if the performance improvements are worth the added complexity. The overhead that the indirection of dynamic dispatch introduces is usually negligible. Remember that the execution time of the business logic will typically dwarf the overhead of the dynamic dispatch of the REST API calls.

Axum and authentication

So far, our Axum application has been open to the public. In this chapter, we will add authentication to our application. We will start with Basic Authentication and then move on to JSON Web Tokens (JWT).

Basic Authentication

Basic Authentication is a simple username and password authentication scheme built into the HTTP protocol. The client sends the username and password, Base64-encoded but not encrypted, in the Authorization header of the request. The server responds with a 401 Unauthorized status code if the credentials are incorrect.
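To see what the client actually puts on the wire, here is a small std-only sketch that builds the header value by hand. The helper names base64_encode and basic_auth_header are hypothetical; in a real client you would normally rely on your HTTP library (or curl -u) to do this for you.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard Base64 with padding, written out by hand to stay std-only.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group; missing bytes count as 0.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Positions that carry no input data are written as '=' padding.
        if chunk.len() > 1 {
            out.push(ALPHABET[(n >> 6) as usize & 63] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[n as usize & 63] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Build the value of the Authorization header for Basic Authentication.
fn basic_auth_header(username: &str, password: &str) -> String {
    let credentials = format!("{username}:{password}");
    format!("Basic {}", base64_encode(credentials.as_bytes()))
}

fn main() {
    // prints "Authorization: Basic YWRtaW46cGFzc3dvcmQ="
    println!("Authorization: {}", basic_auth_header("admin", "password"));
}
```

Because Base64 is trivially reversible, Basic Authentication should only be used over an encrypted connection (HTTPS).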

The Axum add-on crate axum-extra provides a typed-header extractor that we can use for Basic Authentication. The TypedHeader<Authorization<Basic>> extractor takes the Authorization header from the request and parses it into a username and password.

Let's explore Basic Authentication with an example. We will enhance our simple web server and add a reset-visits endpoint that requires Basic Authentication.

First, add the axum-extra crate to your Cargo.toml:

[dependencies]
serde = { version = "1.0.197", features = ["derive"] }
tokio = { version = "1", features = ["full"] }
axum = "0.7"
axum-extra = { version = "0.9", features = ["typed-header"] }
serde_json = "1"

We'll also include serde_json for JSON serialization and deserialization.

Next, let's add the reset-visits endpoint to our web server.

main.rs

use std::sync::Arc;

use axum::{
    extract::{Path, State},
    http::StatusCode,
    response::IntoResponse,
    routing::{delete, get},
    Json, Router,
};
use axum_extra::{
    headers::{authorization::Basic, Authorization},
    TypedHeader,
};
use serde_json::json;

use crate::handler::{GreetingHandler, WebHandler};
use crate::model::Greeting;

mod handler;
mod model;

type AppState = Arc<dyn GreetingHandler>;

#[tokio::main]
async fn main() {
    // Create a shared state for our application. We use an Arc so that we clone the pointer to the state and
    // not the state itself.
    let app_state: AppState = Arc::new(WebHandler::default());

    // set up our application with the greeting, goodbye, and reset-visits routes
    let app = Router::new()
        .route("/hello/:visitor", get(greet_visitor))
        .route("/bye", delete(say_goodbye))
        .route("/reset-visits", delete(reset_visits))
        .with_state(app_state);

    // start the server on port 3000
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
/// We also use the `State` extractor to access the shared `Handler` and call the `greet` method.
/// We use `Json` to automatically serialize the `Greeting` struct to JSON.
async fn greet_visitor(
    State(handler): State<AppState>,
    Path(visitor): Path<String>,
) -> Json<Greeting> {
    Json(handler.greet(visitor))
}

/// Say goodbye to the visitor.
async fn say_goodbye(State(handler): State<AppState>) -> String {
    handler.say_goodbye()
}

/// Reset the number of visits.
async fn reset_visits(
    TypedHeader(Authorization(creds)): TypedHeader<Authorization<Basic>>,
    State(handler): State<AppState>,
) -> impl IntoResponse {
    if creds.username() != "admin" || creds.password() != "password" {
        return (
            StatusCode::UNAUTHORIZED,
            Json(json!({"error": "Unauthorized"})),
        );
    };

    handler.reset_visits();
    (StatusCode::OK, Json(json!({"ok": "Visits reset"})))
}

And finally, let's add the reset_visits method to our GreetingHandler trait and implementation.

handler.rs

use std::{sync::atomic::AtomicU16, sync::atomic::Ordering::Relaxed};

use crate::model::Greeting;

/// A trait for handling greetings.
pub trait GreetingHandler: Send + Sync {
    fn greet(&self, visitor: String) -> Greeting;
    fn say_goodbye(&self) -> String;
    fn reset_visits(&self);
}

/// A greeting handler implementation for our web application.
pub struct WebHandler {
    number_of_visits: AtomicU16,
}

impl GreetingHandler for WebHandler {
    /// Greet the visitor and increment the number of visits.
    fn greet(&self, visitor: String) -> Greeting {
        let visits = self.number_of_visits.fetch_add(1, Relaxed);
        Greeting::new("Hello", visitor, visits)
    }

    /// Say goodbye to the visitor.
    fn say_goodbye(&self) -> String {
        "Goodbye".to_string()
    }

    /// Reset the number of visits.
    fn reset_visits(&self) {
        self.number_of_visits.store(0, Relaxed);
    }
}

impl Default for WebHandler {
    fn default() -> Self {
        WebHandler {
            number_of_visits: AtomicU16::new(0),
        }
    }
}

OK, let's look at the reset_visits in a bit more detail.

We use the TypedHeader<Authorization<Basic>> extractor to extract the Authorization header from the request. We use a destructuring pattern in the function parameter to pull the credentials out of the nested wrappers: TypedHeader(Authorization(creds)).
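If destructuring in a parameter position is new to you, the following std-only sketch shows the same pattern in isolation. The three types below are local stand-ins defined only for this example; they are not the real axum_extra types, and describe is a hypothetical function.

```rust
// Local stand-ins (assumptions), NOT the real axum_extra types.
struct TypedHeader<T>(T);
struct Authorization<C>(C);
struct Credentials {
    username: String,
    password: String,
}

/// The pattern in the parameter list peels off both wrappers in one step,
/// so the function body can work with `creds` directly.
fn describe(
    TypedHeader(Authorization(creds)): TypedHeader<Authorization<Credentials>>,
) -> String {
    format!(
        "user={} password_length={}",
        creds.username,
        creds.password.len()
    )
}

fn main() {
    let header = TypedHeader(Authorization(Credentials {
        username: "admin".to_string(),
        password: "password".to_string(),
    }));
    println!("{}", describe(header));
}
```

Without the pattern, the body would have to unwrap the value itself, for example with header.0.0 on the tuple structs.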

We then check if the username and password are correct. If the credentials are incorrect, we return a 401 Unauthorized status code. If the credentials are correct, we call the reset_visits method on our GreetingHandler and return a 200 OK status code.

The json! macro from the serde_json crate is used to create a JSON response. This is a convenient way to create JSON without having to write out the JSON structures manually.

Test the reset-visits endpoint with curl:

curl -X DELETE -u admin:password http://localhost:3000/reset-visits

See what happens when you use the wrong credentials. Or when you omit the credentials altogether.

JSON Web Tokens (JWT)

JSON Web Tokens (JWT) are a more common way to authenticate REST requests. JWT is an open standard that defines a compact and self-contained way of securely transmitting information between parties as a JSON object. JWTs can be signed using a shared secret (with an HMAC algorithm) or a public/private key pair (using RSA or ECDSA). The website jwt.io provides a good introduction to JWT.

We'll use the jsonwebtoken crate to work with JWTs. Add the jsonwebtoken crate to your Cargo.toml:

[dependencies]
serde = { version = "1.0.197", features = ["derive"] }
tokio = { version = "1", features = ["full"] }
axum = "0.7"
axum-extra = { version = "0.9", features = ["typed-header"] }
serde_json = "1"
jsonwebtoken = "9"

Now, let's add JWT authentication to our web server. We'll add a login endpoint that returns a JWT token. We'll then update the reset-visits endpoint to require a JWT token.

main.rs

use std::sync::Arc;

use crate::{
    handler::{GreetingHandler, WebHandler},
    model::{Greeting, OurJwtPayload},
};

use axum::{
    extract::{Path, State},
    http::StatusCode,
    response::IntoResponse,
    routing::{delete, get, post},
    Json, Router,
};
use axum_extra::{
    headers::{
        authorization::{Basic, Bearer},
        Authorization,
    },
    TypedHeader,
};
use jsonwebtoken::{DecodingKey, Validation};
use serde_json::json;

mod handler;
mod model;

const SECRET_SIGNING_KEY: &[u8] = b"keep_th1s_@_secret";

type AppState = Arc<dyn GreetingHandler>;

#[tokio::main]
async fn main() {
    // Create a shared state for our application. We use an Arc so that we clone the pointer to the state and
    // not the state itself.
    let app_state: AppState = Arc::new(WebHandler::default());

    // set up our application with the greeting, goodbye, login, and reset-visits routes
    let app = Router::new()
        .route("/hello/:visitor", get(greet_visitor))
        .route("/bye", delete(say_goodbye))
        .route("/login", post(login))
        .route("/reset-visits", delete(reset_visits))
        .with_state(app_state);

    // start the server on port 3000
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}

/// Extract the `visitor` path parameter and use it to greet the visitor.
/// We also use the `State` extractor to access the shared `Handler` and call the `greet` method.
/// We use `Json` to automatically serialize the `Greeting` struct to JSON.
async fn greet_visitor(
    State(handler): State<AppState>,
    Path(visitor): Path<String>,
) -> Json<Greeting> {
    Json(handler.greet(visitor))
}

/// Say goodbye to the visitor.
async fn say_goodbye(State(handler): State<AppState>) -> String {
    handler.say_goodbye()
}

/// Log in with Basic Authentication and return a signed JWT.
async fn login(
    TypedHeader(Authorization(creds)): TypedHeader<Authorization<Basic>>,
) -> impl IntoResponse {
    if creds.username() != "admin" || creds.password() != "password" {
        return (
            StatusCode::UNAUTHORIZED,
            Json(json!({"error": "Unauthorized"})),
        );
    };

    let Ok(jwt) = jsonwebtoken::encode(
        &jsonwebtoken::Header::default(),
        &OurJwtPayload::new(creds.username().to_string()),
        &jsonwebtoken::EncodingKey::from_secret(SECRET_SIGNING_KEY),
    ) else {
        return (
            StatusCode::INTERNAL_SERVER_ERROR,
            Json(json!({"error": "Failed to generate token"})),
        );
    };

    (StatusCode::OK, Json(json!({"jwt": jwt})))
}

/// Reset the number of visits.
async fn reset_visits(
    TypedHeader(Authorization(bearer)): TypedHeader<Authorization<Bearer>>,
    State(handler): State<AppState>,
) -> impl IntoResponse {
    let token = bearer.token();
    let decoding_key = DecodingKey::from_secret(SECRET_SIGNING_KEY);

    let Ok(jwt) =
        jsonwebtoken::decode::<OurJwtPayload>(token, &decoding_key, &Validation::default())
    else {
        return (
            StatusCode::UNAUTHORIZED,
            Json(json!({"error": "Invalid token"})),
        );
    };

    let username = jwt.claims.sub;
    handler.reset_visits();

    (
        StatusCode::OK,
        Json(json!({"ok": format_args!("Visits reset by {username}")})),
    )
}

Update the model.rs file to include the OurJwtPayload struct:

use std::time::{Duration, SystemTime};

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct Greeting {
    greeting: String,
    visitor: String,
    visits: u16,
}

impl Greeting {
    pub(crate) fn new(greeting: &str, visitor: String, visits: u16) -> Self {
        Greeting {
            greeting: greeting.to_string(),
            visitor,
            visits,
        }
    }
}

#[derive(Serialize, Deserialize)]
pub struct OurJwtPayload {
    pub sub: String,
    pub exp: usize,
}

impl OurJwtPayload {
    pub fn new(sub: String) -> Self {
        // expires by default in 60 minutes from now
        let exp = SystemTime::now()
            .checked_add(Duration::from_secs(60 * 60))
            .expect("valid timestamp")
            .duration_since(SystemTime::UNIX_EPOCH)
            .expect("valid duration")
            .as_secs() as usize;

        OurJwtPayload { sub, exp }
    }
}

We can now use curl to test the login and reset-visits endpoints:

# login

curl -X POST -u admin:password http://localhost:3000/login

You should get a response similar to:

{
  "jwt": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImV4cCI6MTcxMTYzMDc1OH0.O6bFdi080bS2OIRTcHD2VgeTGwk-r14mfqodsWikZg4"
}

With this token, you can now test the reset-visits endpoint:

curl -X DELETE -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJhZG1pbiIsImV4cCI6MTcxMTYzMDc1OH0.O6bFdi080bS2OIRTcHD2VgeTGwk-r14mfqodsWikZg4" http://localhost:3000/reset-visits

Try to change the token or omit it altogether to see what happens.

You can copy & paste the token and inspect it at jwt.io.

Security

Be aware that a JWT is not a secure place to store sensitive information. The token is encoded, not encrypted: anyone who holds the token can decode it and read its contents. Do not put sensitive information in the token.
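To demonstrate how little protection the encoding offers, the following std-only sketch reads the payload of the example token used in the curl command above without knowing the signing key. The helpers base64url_decode and jwt_payload are hypothetical names written for this illustration; they do not verify the signature and are no substitute for the jsonwebtoken crate.

```rust
/// Decode unpadded base64url (the alphabet JWTs use) using only std.
/// Returns None if the input contains a character outside that alphabet.
fn base64url_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buffer: u32 = 0; // bits collected so far
    let mut bits: u32 = 0; // number of valid bits in `buffer`
    for c in input.bytes() {
        let value = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'-' => 62,
            b'_' => 63,
            _ => return None,
        };
        buffer = (buffer << 6) | u32::from(value);
        bits += 6;
        if bits >= 8 {
            // A full byte is available: emit it and keep the leftover bits.
            bits -= 8;
            out.push((buffer >> bits) as u8);
            buffer &= (1 << bits) - 1;
        }
    }
    Some(out)
}

/// Return the JSON payload (the middle part) of a `header.payload.signature` token.
/// The signature is NOT checked here; this only shows that the payload is readable.
fn jwt_payload(token: &str) -> Option<String> {
    let mut parts = token.split('.');
    let _header = parts.next()?;
    let payload = parts.next()?;
    let _signature = parts.next()?;
    String::from_utf8(base64url_decode(payload)?).ok()
}

fn main() {
    let token = "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.\
                 eyJzdWIiOiJhZG1pbiIsImV4cCI6MTcxMTYzMDc1OH0.\
                 O6bFdi080bS2OIRTcHD2VgeTGwk-r14mfqodsWikZg4";
    // prints {"sub":"admin","exp":1711630758}
    println!("{}", jwt_payload(token).unwrap_or_default());
}
```

The output shows the sub and exp claims from OurJwtPayload in plain JSON, readable by anyone who intercepts the token.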

What you can trust, once the signature has been verified, is that the content of the token has not been changed. The token is signed with a secret key. If the token is tampered with, the signature will not match and the token will be invalid. So always verify the token before using its contents.

Exercises

This might be a good time to add some web endpoints to your exercise project. Complete the corresponding exercise in the Exercises section. Looking forward to seeing you next time!

Logging

This chapter wouldn't be complete if we did not talk about logging. Logging is an important part of any application. It allows you to track the behavior of your application and diagnose problems when they occur. The de facto logging library in Rust is log. The log crate provides a logging interface (a facade) that lets you write log messages independently of the logging implementation that eventually outputs them.

These are a few of the most popular logging implementations:

  • env_logger: A simple logger that reads the RUST_LOG environment variable.
  • pretty_env_logger: A prettier logger that reads the RUST_LOG environment variable.
  • slog: A structured logger that allows you to create custom loggers.
  • simple_logger: A simple logger that logs to stdout.
  • fern: A feature-rich logger that allows you to configure log levels and output.

In this chapter, we'll use env_logger as it is really easy to set up and use. To use env_logger, add it to your Cargo.toml file:

[dependencies]
log = "0.4"
env_logger = "0.11"

Let's create a simple logger in our application. First, we need to initialize the logger in our main function:

fn main() {
    env_logger::init();
    log::info!("Hello, world!");
}

Interestingly, there is no output if you run the above code. This is because env_logger reads the RUST_LOG environment variable to determine the log level, and when that variable is not set, it only prints error messages. To see the log message, you need to set the RUST_LOG environment variable to info:

$ RUST_LOG=info cargo run

env_logger::init() must be called before any log messages are written. Messages that are logged before the logger is initialized are silently discarded.

Log levels

The log crate provides several log levels:

  • trace
  • debug
  • info
  • warn
  • error

You can set the log level by setting the RUST_LOG environment variable to the desired log level. The level acts as a threshold: setting RUST_LOG to info, as we did above, shows info, warn, and error messages and hides debug and trace messages.
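The threshold rule can be sketched in a few lines of std-only Rust. The Level enum and is_enabled function below are local stand-ins written for this illustration; the log crate ships its own Level type and does this filtering for you.

```rust
/// Local stand-in for the log levels, ordered from most to least severe.
/// (Illustration only; this is not the `log` crate's own type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// A message is printed when it is at least as severe as the configured level.
fn is_enabled(message: Level, configured: Level) -> bool {
    // The derived ordering follows declaration order: Error < Warn < ... < Trace.
    message <= configured
}

fn main() {
    let configured = Level::Info; // think: RUST_LOG=info
    for level in [
        Level::Error,
        Level::Warn,
        Level::Info,
        Level::Debug,
        Level::Trace,
    ] {
        println!("{level:?} enabled: {}", is_enabled(level, configured));
    }
}
```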

Useful logging

Logging is a powerful tool for debugging and diagnosing problems in your application. Here are a few tips to help you make the most of logging:

  • Use debug! for debug messages that are useful for tracking the behavior of your application during development.
  • Use info! for informational messages that are useful for tracking the behavior of your application.
  • Use warn! for warning messages that indicate potential, but recoverable, problems in your application.
  • Use error! for error messages that indicate problems that are not recoverable and need to be addressed.

One of the biggest mistakes developers make is designing log messages with low volume and single threading in mind. Once such an application is deployed in a production environment, the volume increases, and multiple threads are writing logs simultaneously, the log messages become little more than noise. Log messages must have context and be actionable.

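Here's an example of a simple application that logs messages to the console. It uses the rand crate to simulate random delays and failures, so add rand = "0.8" to your dependencies as well: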

use std::thread;

use log::{info, warn};

fn main() {
    env_logger::init();

    thread::scope(|s| {
        for i in 0..10 {
            s.spawn(move || fake_database_operation(i));
        }
    });
}

fn fake_database_operation(_account_number: i32) {
    info!("Reading account record from database");

    // randomly sleep up to 0.5 seconds
    let sleep_duration = std::time::Duration::from_millis((rand::random::<f64>() * 500.0) as u64);
    thread::sleep(sleep_duration);

    if sleep_duration.as_millis() > 400 {
        warn!("Timeout reading account record from database");
        return;
    } else {
        info!("Successfully read account record from database");
    }

    info!("Updating account record in database");

    // fail 50% of the time
    if rand::random::<f64>() > 0.5 {
        warn!("Failed to update account record in database; record locked");
    } else {
        info!("Successfully updated account record in database");
    }
}

This code spawns 10 threads, each of which simulates reading and updating an account record in a database. The log messages indicate when the account record is read, when it is updated, and when there is a timeout or an update failure. The code will randomly fail with a timeout or a record-locked error.

While at first glance this code seems to have good logging, let's take a look at the output to see if it's actually useful.

Here's a potential output.

[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Reading account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Successfully read account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Updating account record in database
[2024-03-29T11:07:11Z INFO  logging_example] Successfully updated account record in database
[2024-03-29T11:07:11Z INFO  logging_example] Successfully read account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Updating account record in database
[2024-03-29T11:07:11Z INFO  logging_example] Successfully updated account record in database
[2024-03-29T11:07:11Z INFO  logging_example] Successfully read account record from database
[2024-03-29T11:07:11Z INFO  logging_example] Updating account record in database
[2024-03-29T11:07:11Z WARN  logging_example] Failed to update account record in database; record locked
[2024-03-29T11:07:12Z INFO  logging_example] Successfully read account record from database
[2024-03-29T11:07:12Z INFO  logging_example] Updating account record in database
[2024-03-29T11:07:12Z INFO  logging_example] Successfully updated account record in database
[2024-03-29T11:07:12Z INFO  logging_example] Successfully read account record from database
[2024-03-29T11:07:12Z INFO  logging_example] Updating account record in database
[2024-03-29T11:07:12Z WARN  logging_example] Failed to update account record in database; record locked
[2024-03-29T11:07:12Z WARN  logging_example] Timeout reading account record from database
[2024-03-29T11:07:12Z WARN  logging_example] Timeout reading account record from database
[2024-03-29T11:07:12Z WARN  logging_example] Timeout reading account record from database
[2024-03-29T11:07:12Z WARN  logging_example] Timeout reading account record from database
[2024-03-29T11:07:12Z WARN  logging_example] Timeout reading account record from database

So what's wrong with this output? The log messages are not very useful. They don't provide any context or actionable information. Based on the log, there is no way of telling what account record is being read or updated. Is one operation causing the other to fail? Is there a pattern to the failures? Are the failures related to the same account record? These are all questions that the log messages should answer.

Here's an improved version of the code that logs more useful information:

use std::thread;

use log::{debug, info, warn};

fn main() {
    env_logger::init();

    thread::scope(|s| {
        for i in 0..10 {
            s.spawn(move || fake_database_operation(i));
        }
    });
}

fn fake_database_operation(account_number: i32) {
    let start_time = std::time::Instant::now();
    info!("Reading account record from database for account number {account_number}");

    // randomly sleep up to 0.5 seconds
    let sleep_duration = std::time::Duration::from_millis((rand::random::<f64>() * 500.0) as u64);
    thread::sleep(sleep_duration);

    let elapsed_time = start_time.elapsed().as_millis();
    if sleep_duration.as_millis() > 400 {
        warn!("Timeout reading account record from database for account number {account_number}; operation took {elapsed_time} ms");
        return;
    } else {
        info!("Successfully read account record from database for account number {account_number}; operation took {elapsed_time} ms");
    }

    info!("Updating account record in database for account number {account_number}");

    // fail 50% of the time
    if rand::random::<f64>() > 0.5 {
        warn!("Failed to update account record in database for account number {account_number}; record locked");
    } else {
        info!("Successfully updated account record in database for account number {account_number}");
    }

    let elapsed_time = start_time.elapsed().as_millis();
    debug!("fake_database_operation for account number {account_number} completed in {elapsed_time} ms");
}

In case of a failure, the log messages now provide the account number, the elapsed time, and the operation that failed. We can use a tool like grep to filter the log messages based on the account number and keep the context of the log messages. This makes it easier to diagnose problems and track the behavior of the application.

... but there is a lot of repetition in the code.

Tracing

The Tokio tracing crate takes logging to the next level by providing structured logging. Structured logging allows you to log key-value pairs that can be easily searched and filtered. This makes it easier to track the behavior of your application and diagnose problems when they occur, without the code repetition.

To use tracing, add it to your Cargo.toml file:

[dependencies]
log = "0.4"
rand = "0.8"
tracing = "0.1"
tracing-subscriber = "0.3"

Let's update the previous example to use tracing:

use std::thread;

use tracing::{debug, info, instrument, warn};

fn main() {
    tracing_subscriber::fmt::init();
    thread::scope(|s| {
        for i in 0..10 {
            s.spawn(move || fake_database_operation(i));
        }
    });
}

#[instrument]
fn fake_database_operation(account_number: i32) {
    let start_time = std::time::Instant::now();
    info!("Reading account record from database");

    // randomly sleep up to 0.5 seconds
    let sleep_duration = std::time::Duration::from_millis((rand::random::<f64>() * 500.0) as u64);
    thread::sleep(sleep_duration);

    let elapsed_time = start_time.elapsed().as_millis();
    if sleep_duration.as_millis() > 400 {
        warn!("Timeout reading account record from database; operation took {elapsed_time} ms");
        return;
    } else {
        info!("Successfully read account record from database; operation took {elapsed_time} ms");
    }

    info!("Updating account record in database");

    // fail 50% of the time
    if rand::random::<f64>() > 0.5 {
        warn!("Failed to update account record in database; record locked");
    } else {
        info!("Successfully updated account record in database");
    }

    let elapsed_time = start_time.elapsed().as_millis();
    debug!("fake_database_operation completed in {elapsed_time} ms");
}

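There is no need to log the account number in every log message. The tracing crate provides a way to add context to log messages using the #[instrument] attribute. The #[instrument] attribute creates a span for every call to the function and records the function name and its arguments as fields of that span. Every message logged inside the function is tagged with this span. The return value is not recorded by default; you have to opt in with #[instrument(ret)]. This makes it easier to track the behavior of your application and diagnose problems when they occur.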

Let's look at the output of the tracing crate:

2024-03-29T11:16:47.819412Z  INFO fake_database_operation{account_number=6}: logging_example: Reading account record from database
2024-03-29T11:16:47.819422Z  INFO fake_database_operation{account_number=1}: logging_example: Reading account record from database
2024-03-29T11:16:47.819446Z  INFO fake_database_operation{account_number=3}: logging_example: Reading account record from database
2024-03-29T11:16:47.819453Z  INFO fake_database_operation{account_number=4}: logging_example: Reading account record from database
2024-03-29T11:16:47.819443Z  INFO fake_database_operation{account_number=7}: logging_example: Reading account record from database
2024-03-29T11:16:47.819489Z  INFO fake_database_operation{account_number=8}: logging_example: Reading account record from database
2024-03-29T11:16:47.819468Z  INFO fake_database_operation{account_number=5}: logging_example: Reading account record from database
2024-03-29T11:16:47.819410Z  INFO fake_database_operation{account_number=2}: logging_example: Reading account record from database
2024-03-29T11:16:47.819407Z  INFO fake_database_operation{account_number=0}: logging_example: Reading account record from database
2024-03-29T11:16:47.819518Z  INFO fake_database_operation{account_number=9}: logging_example: Reading account record from database
2024-03-29T11:16:47.896003Z  INFO fake_database_operation{account_number=9}: logging_example: Successfully read account record from database; operation took 76 ms
2024-03-29T11:16:47.896042Z  INFO fake_database_operation{account_number=9}: logging_example: Updating account record in database
2024-03-29T11:16:47.896052Z  INFO fake_database_operation{account_number=9}: logging_example: Successfully updated account record in database
2024-03-29T11:16:47.915817Z  INFO fake_database_operation{account_number=5}: logging_example: Successfully read account record from database; operation took 96 ms
2024-03-29T11:16:47.915848Z  INFO fake_database_operation{account_number=5}: logging_example: Updating account record in database
2024-03-29T11:16:47.915856Z  INFO fake_database_operation{account_number=5}: logging_example: Successfully updated account record in database
2024-03-29T11:16:47.925931Z  INFO fake_database_operation{account_number=0}: logging_example: Successfully read account record from database; operation took 106 ms
2024-03-29T11:16:47.925981Z  INFO fake_database_operation{account_number=0}: logging_example: Updating account record in database
2024-03-29T11:16:47.925992Z  INFO fake_database_operation{account_number=0}: logging_example: Successfully updated account record in database
2024-03-29T11:16:47.933133Z  INFO fake_database_operation{account_number=8}: logging_example: Successfully read account record from database; operation took 113 ms
2024-03-29T11:16:47.933157Z  INFO fake_database_operation{account_number=8}: logging_example: Updating account record in database
2024-03-29T11:16:47.933168Z  INFO fake_database_operation{account_number=8}: logging_example: Successfully updated account record in database
2024-03-29T11:16:48.004544Z  INFO fake_database_operation{account_number=7}: logging_example: Successfully read account record from database; operation took 185 ms
2024-03-29T11:16:48.004568Z  INFO fake_database_operation{account_number=7}: logging_example: Updating account record in database
2024-03-29T11:16:48.004579Z  INFO fake_database_operation{account_number=7}: logging_example: Successfully updated account record in database
2024-03-29T11:16:48.048530Z  INFO fake_database_operation{account_number=4}: logging_example: Successfully read account record from database; operation took 229 ms
2024-03-29T11:16:48.048558Z  INFO fake_database_operation{account_number=4}: logging_example: Updating account record in database
2024-03-29T11:16:48.048568Z  INFO fake_database_operation{account_number=4}: logging_example: Successfully updated account record in database
2024-03-29T11:16:48.239550Z  WARN fake_database_operation{account_number=3}: logging_example: Timeout reading account record from database; operation took 420 ms
2024-03-29T11:16:48.263141Z  WARN fake_database_operation{account_number=1}: logging_example: Timeout reading account record from database; operation took 443 ms
2024-03-29T11:16:48.263466Z  WARN fake_database_operation{account_number=6}: logging_example: Timeout reading account record from database; operation took 444 ms
2024-03-29T11:16:48.312712Z  WARN fake_database_operation{account_number=2}: logging_example: Timeout reading account record from database; operation took 493 ms

Note that the account_number is automatically added to the log messages. This makes it easier to filter the log messages based on the account number and keep the context of the log messages.

Conclusion

Logging is an important part of any application. It allows you to track the behavior of your application and diagnose problems as they occur. Make sure your log messages are contextual and actionable. Use structured logging to make it easier to construct contextualized log messages.

Session 7 - Test Drive Test-Driven Development

In this session, we'll look at the art of test-driven development (TDD). We'll start with a simple example and then extend the functionality using TDD. We'll also look at some best practices and tools to help you with TDD.

Objective

  • Understand the concept of test-driven development
  • Learn how to write tests in Rust

Non-Objective

Test-driven development is not about code coverage. It is about writing tests that help you design your code. Good coverage is a side effect of TDD. I hope that by practicing TDD you will find that your code is better structured and easier to maintain.

Test-driven development explained

Test-driven development is a practice that has been around for a long time. It's a simple concept: write tests before you write the code. This might sound counterintuitive, but it's a powerful way to ensure that your code is correct and that it does what it's supposed to do.

The basic idea is to write a test that fails, then write the code that makes the test pass. This is often referred to as the "Red-Green-Refactor" cycle. First, you write a test that fails (the "Red" part). Then you write the code that makes the test pass (the "Green" part). Finally, you refactor the code to make it better (the "Refactor" part).

This might sound like a lot of extra work, but it has several benefits. First, it forces you to think about what you want the code to do before you write it. This can help you avoid writing unnecessary code, and it can help you catch bugs early. Second, it gives you a safety net for making changes to your code. If you have a good set of tests, you can make changes to your code with confidence, knowing that the tests will catch any problems. Finally, it can help you write better code.

When applied correctly, TDD will lead to a codebase that is more maintainable, more reliable, and easier to work with.

Writing tests in Rust

Rust has a built-in testing framework that makes it easy to write tests. You can write tests in the same file as your code, and you can use the #[test] attribute to mark a function as a test.

There is a section in the Rust book that discusses the mechanics of testing in Rust in more detail: How to Write Tests.

Let's look at a simple example. Suppose we want a function that parses a comma-separated string into a Vec of Strings. There is also the requirement that empty fields should be skipped. We can start by writing a test for this function:

fn main() {}


#[cfg(test)]
mod tests {
    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
    }
}

What you will notice is that the parse_fields function does not exist yet. We can use the IDE to generate the function for us:

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    todo!()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
    }
}

The todo! macro is a placeholder that will cause the test to fail with a message that the function is not yet implemented. If we run the test, we will see that it fails.

As you can see, we start by testing the simplest case we can think of: an empty string in this case.
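To make the failing run a little more concrete, here is a minimal sketch of what the stub does when it is called: a function whose body is todo!() panics with the message "not yet implemented". The extra test and its name are hypothetical and only there for illustration; they are not part of the exercise.

```rust
// Stub as the IDE might generate it. Calling it panics because of `todo!()`.
fn parse_fields(_csv: &str) -> Vec<String> {
    todo!()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Hypothetical test: it passes only as long as the stub still panics.
    #[test]
    #[should_panic(expected = "not yet implemented")]
    fn stub_is_not_implemented_yet() {
        parse_fields("");
    }
}
```

In the actual TDD flow you would not keep such a test around; the point is only that the "Red" step comes from a panic inside the stub, not from a failed assertion.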

So let's implement the function with just enough code to make the test pass:

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    vec![]
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());
    }
}

Okay! We have a successful test. Now we continue by adding the next logical case; be careful not to get excited and jump right to the final test case! Add just enough code to make the test pass.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    if csv.is_empty() {
        return vec![];
    }

    vec![csv.to_string()]
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");
    }
}

Even though these steps seem trivial, you have already identified the special case of an empty string that does not need any further processing.

Let's add the next test.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    if csv.is_empty() {
        return vec![];
    }

    csv.split(',')
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
    }
}

To make this test pass, we had to add some extra logic and introduced the split method. Next, we'll introduce the special case of an empty field.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    if csv.is_empty() {
        return vec![];
    }

    csv.split(',')
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
    }
}

As you see, this will make our test fail. Adding the filter function will weed out the empty fields as required.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    if csv.is_empty() {
        return vec![];
    }

    csv.split(',')
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
    }
}

Now that we have some non-trivial logic, let's see if we can refactor our code without breaking the tests.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
    }
}

It turns out that we can now safely remove the check for the empty string, without breaking any of our test cases. Let's finish the exercise by adding the originally-requested input and any corner case we can think of.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }
}

The original input tested fine, but it looks like we missed a corner case: a field containing spaces. Let's make the final adjustment to our code and verify that all of our test cases pass.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }
}

I hope you see that by following this approach, your code gradually improved as more test cases were covered. Because the steps were small, there was no need to think about the final solution from the beginning.

Your code evolved, and you ended up with a function that met the requirements, and all the test cases to prove that the code worked correctly.

In the next chapter, we'll look at how test-driven development can help you structure your code better.

Reference material

Better Rust code with TDD

The most challenging aspect of TDD is to find the right "level" of testing. If you start writing tests for high-level functions, you'll end up with tests that are large, complicated and need a lot of "mocking" or dependency injection.

My suggestion is to write tests from the "inside out". Start with the smallest possible unit of code and write tests for that. Then move on to the next level of code and write tests for that. This way you'll end up with a suite of tests that are small, simple and fast to run.

The alternative approach would be to write tests for the high-level functions first. This is called "top-down" testing. This approach is often used when you have legacy code, and you want to add tests to it. This is not well-suited for TDD.

In this chapter, we'll explore what code looks like when written with TDD and with "top-down" testing. We'll start with the latter.

Top-down testing

Building on the previous chapter, imagine that we need to read a line from a file and then parse the comma-separated fields, and return these in a Vec.

As seasoned Rustaceans, we know that we can use the fs module to read the file and the split method to parse the fields. So we write the following. Add an "input.txt" file with the content "Tom,Dick,,Harry" to the project root. Run the program and confirm all is fine!

use std::fs;

fn main() {
    let filename = "input.txt";
    let fields = parse_fields_from_file(filename);
    println!("{fields:?}");
}

fn parse_fields_from_file(filename: &str) -> Vec<String> {
    let lines = fs::read_to_string(filename).unwrap();
    let first_line = lines.lines().next().unwrap();
    first_line.split(",").filter(|f| !f.is_empty()).map(|f| f.to_string()).collect()
}

Now the QA team reminds me that I need a test to make sure there is enough code coverage. As I start writing the test, I realize that in my unit test I don't have access to the "input.txt" file. As it turns out, there is no easy way to test this code without refactoring it. 🤔

Test-driven Development

Let's do the same exercise using TDD.

We'll start by testing & writing the smallest piece of logic we can think of, which in this case (coincidentally) is parsing the single line of comma-separated values. Exactly what we did in the previous chapter! As a reminder:

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }
}

The next piece of logic to test & write would be to fetch the first line of a newline-separated String. Let's add the test for the most trivial case.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        // ...
    }

    /// Confirm that we can retrieve the first line of a newline-separated String
    #[test]
    fn can_parse_lines() {
        let first_line = read_first_line("");
        assert_eq!(first_line, "");
    }
}

The first version of the read_first_line() function could be:

fn read_first_line(file_contents: &str) -> &str {
    file_contents
}

Using TDD, we'll extend the test coverage and function code, and end up with this:

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

fn read_first_line(file_contents: &str) -> &str {
    file_contents.split('\n').find(|s| !s.is_empty())
        .unwrap_or("")
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }

    /// Confirm that we can retrieve the first line of a newline-separated String
    #[test]
    fn can_parse_lines() {
        let first_line = read_first_line("");
        assert_eq!(first_line, "");

        let first_line = read_first_line("\n");
        assert_eq!(first_line, "");

        let first_line = read_first_line("a");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("a\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb\n");
        assert_eq!(first_line, "a");
    }
}

Another important aspect of writing tests is that we think carefully about what we are trying to test. And, perhaps more importantly, that we do not test code that is not part of our program, such as code from the standard Rust library.

So in our case, there is no need to verify that fs::read_to_string() works as expected. We can assume that it does. With this in mind, we can write the following code:

fn main() {
    let filename = "input.txt";
    let fields = parse_fields_from_file(filename);
    println!("{fields:?}");
}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

fn read_first_line(file_contents: &str) -> &str {
    file_contents.split('\n').find(|s| !s.is_empty())
        .unwrap_or("")
}

fn parse_fields_from_file(filename: &str) -> Vec<String> {
    let file_contents = std::fs::read_to_string(filename)
        .expect("Could not read file");
    let first_line = read_first_line(&file_contents);
    parse_fields(first_line)
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }

    /// Confirm that we can retrieve the first line of a newline-separated String
    #[test]
    fn can_parse_lines() {
        let first_line = read_first_line("");
        assert_eq!(first_line, "");

        let first_line = read_first_line("\n");
        assert_eq!(first_line, "");

        let first_line = read_first_line("a");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("a\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb\n");
        assert_eq!(first_line, "a");
    }
}

As you can see, our main function is identical to our other attempt. But our code is much better structured and has proper test coverage.

The importance of having properly structured code with good test coverage increases when it comes to extending the functionality of the code. As an exercise, try extending the program using TDD to retrieve the fields from all the lines in the file.
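If you want a starting point for the exercise, the sketch below shows one possible outcome. The function name parse_all_lines, its return type, and the decision to skip lines without any fields are assumptions made for this illustration; your own TDD steps may well lead to a different design.

```rust
fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

/// Hypothetical function for the exercise: returns the fields of every
/// line, skipping lines that contain no fields at all.
fn parse_all_lines(file_contents: &str) -> Vec<Vec<String>> {
    file_contents
        .lines()
        .map(parse_fields)
        .filter(|fields| !fields.is_empty())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Built up case by case, starting with the empty input.
    #[test]
    fn can_parse_all_lines() {
        assert!(parse_all_lines("").is_empty());
        assert_eq!(parse_all_lines("Tom,Dick"), vec![vec!["Tom", "Dick"]]);
        assert_eq!(
            parse_all_lines("Tom,Dick\n\nHarry\n"),
            vec![vec!["Tom", "Dick"], vec!["Harry"]]
        );
    }
}
```

Because parse_fields is already covered by its own tests, the new test only has to deal with what is new: splitting into lines and skipping the empty ones.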

Clean-up and refactoring

When you have a suite of tests that cover your code, you can refactor your code with confidence. You can change the implementation of a function, and as long as the tests pass, you can be sure that you haven't broken anything.

The test suite itself is also part of the code. It is important to keep the tests clean and well-structured. If the tests are messy, it will be difficult to understand what the code is supposed to do.

In the above example, we have a lot of repetition in the can_parse_fields tests. We can refactor the tests to make them more readable:

fn assert_fields(input: &str, expected: &[&str]) {
    let fields = parse_fields(input);
    assert_eq!(fields.len(), expected.len());
    for (i, field) in fields.iter().enumerate() {
        assert_eq!(field, expected.get(i).unwrap());
    }
}

/// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
/// a Vec of Strings. Empty fields should be skipped.
#[test]
fn can_parse_fields() {
    let fields = parse_fields("");
    assert!(fields.is_empty());

    assert_fields("Tom", &["Tom"]);
    assert_fields("Tom,Dick", &["Tom", "Dick"]);
    assert_fields("Tom,,Dick", &["Tom", "Dick"]);
    assert_fields("Tom,Dick,,Harry", &["Tom", "Dick", "Harry"]);
    assert_fields(",Tom, Dick,, ,Harry,", &["Tom", "Dick", "Harry"]);
}

Refactoring the tests in this way makes them easier to maintain, and each case now reads as a plain pair of input and expected output.
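As a possible next refactoring step, the helper itself can shrink. The sketch below relies on the fact that a Vec of Strings can be compared directly with a slice of &str, so a single assert_eq! covers both the length and the contents. The failure message is an addition for illustration and not part of the original helper.

```rust
fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

/// Variation on the helper: compare the whole result in one assertion and
/// report the offending input when the comparison fails.
fn assert_fields(input: &str, expected: &[&str]) {
    assert_eq!(parse_fields(input), expected, "unexpected fields for input {input:?}");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn can_parse_fields() {
        // The empty input now fits the same pattern as the other cases.
        assert_fields("", &[]);
        assert_fields("Tom", &["Tom"]);
        assert_fields("Tom,Dick", &["Tom", "Dick"]);
        assert_fields("Tom,,Dick", &["Tom", "Dick"]);
        assert_fields("Tom,Dick,,Harry", &["Tom", "Dick", "Harry"]);
        assert_fields(",Tom, Dick,, ,Harry,", &["Tom", "Dick", "Harry"]);
    }
}
```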

Conclusion

Using Test-Driven Development will result in better structured code. As a side effect, your code will have good test coverage.

The tests themselves are also a form of documentation. They show how the code should work. If you need to change the code, you can look at the tests to see what the code is supposed to do.

Session 8 - Migration from other languages

Since Rust is relatively new compared to other languages, you might be coming to Rust from another language. This chapter covers some common concepts in other languages and how they are implemented in Rust.

Migrate to Rust from C

If you're coming from C, you'll find that Rust has a lot of similarities. However, there are some differences that you'll need to get used to.

Organizing code

In C, you can organize your code into separate files using header files and source files. In Rust, you can organize your code into modules. Modules allow you to group related code together and control the visibility of code within the module.

Here's an example of organizing code into modules in Rust:

mod math {
    pub fn add(a: i32, b: i32) -> i32 {
        a + b
    }
}

fn main() {
    use math::add;
    let result = add(1, 2);
    println!("{result}");
}

Although the above example is valid, it's not idiomatic Rust. In Rust, you would typically organize your code into separate files and use the mod keyword to include them in your main file. So the contents of the mod math {} block would be in a separate file called math.rs or math/mod.rs. The main.rs file would then look like this:

use math::add;

mod math;

fn main() {
    let result = add(1, 2);
    println!("{result}");
}

In C you would have a header file math.h and a source file math.c that would look like this:

// math.h
int add(int a, int b);

// math.c
#include "math.h"

int add(int a, int b) {
    return a + b;
}

// main.c
#include <stdio.h>
#include "math.h"

int main() {
    int result = add(1, 2);
    printf("%d\n", result);
}
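The visibility control mentioned at the start of this section can be sketched as follows. Items in a Rust module are private unless marked pub, which is roughly comparable to keeping a function static in a C source file and leaving it out of the header. The names add_twice, double and sum_and_double are hypothetical and exist only for this illustration.

```rust
mod math {
    /// Public: callable from outside the module, like a function declared in math.h.
    pub fn add(a: i32, b: i32) -> i32 {
        a + b
    }

    /// Public function that relies on the private helper below.
    pub fn add_twice(a: i32, b: i32) -> i32 {
        double(add(a, b))
    }

    /// Private (no `pub`): only code inside `math` can call this.
    fn double(x: i32) -> i32 {
        x * 2
    }
}

/// Hypothetical caller outside the module.
fn sum_and_double() -> (i32, i32) {
    // `math::double(3)` would not compile here, because `double` is private.
    (math::add(1, 2), math::add_twice(1, 2))
}
```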

Optimizing code for production

In C, you can use compiler flags like -O2 to optimize your code for production. In Rust, you can use the --release flag with the cargo build command to optimize your code for production.

Here's an example of compiling a C program with optimizations:

$ gcc -O2 -o program program.c

In Rust, the equivalent command looks like this:

$ cargo build --release

The --release flag tells the Rust compiler to optimize the code for production. This includes inlining functions, removing debug symbols, and other optimizations that can improve performance.

Language features

Pointers

In C, you can create a pointer to a variable like this:

#include <stdio.h>

int main()
{
    int x = 10;
    int *ptr = &x;
    printf("Value of x: %d\n", *ptr);
}

In Rust, you can create a reference to a variable like this:

fn main() {
    let x = 10;
    let ptr = &x;
    println!("Value of x: {}", *ptr);
}

Lifetimes are a concept in Rust that ensures that references are valid for a certain period of time. This is a key difference between Rust and C. In C, you can have dangling pointers, which can lead to undefined behavior. In Rust, the compiler will ensure that references are valid for the lifetime of the reference. This is a blessing and a curse, as it can be a bit tricky to get used to at first.
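A small sketch may help to show what this means in practice. The function longest and the demo function longest_name are hypothetical examples, not code from earlier in the book. The lifetime 'a states that the returned reference may not outlive either argument.

```rust
/// Returns the longer of two string slices. The lifetime `'a` tells the
/// compiler that the result may not outlive either argument.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

/// Hypothetical demo: copies the result out before `inner` is dropped.
fn longest_name() -> String {
    let outer = String::from("Harry");
    let result;
    {
        let inner = String::from("Tom");
        // An owned copy stays valid after `inner` goes away.
        result = longest(&outer, &inner).to_string();
        // Storing the bare reference instead and using it after this block
        // would be rejected by the compiler, because `inner` is dropped at
        // the closing brace below.
    }
    result
}
```

In C, the equivalent mistake of keeping a pointer to a variable that has gone out of scope compiles without complaint and only shows up at runtime, if at all.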

For newcomers, using a smart pointer like Rc or Arc can be a good way to avoid dealing with lifetimes.

use std::rc::Rc;

fn main() {
    let x = 10;
    let ptr = Rc::new(x);
    println!("Value of x: {}", *ptr);
}

The overhead of using smart pointers is negligible in most cases, and it can save you a lot of headaches.
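Where that overhead comes from can be sketched as follows: an Rc keeps a reference count next to the value, and cloning the Rc only increments that count. The function share_value is a hypothetical demo written for this illustration.

```rust
use std::rc::Rc;

/// Hypothetical demo: returns the reference count while the value is shared
/// and after one of the owners has been dropped.
fn share_value() -> (usize, usize) {
    let name = Rc::new(String::from("Ferris"));
    // Rc::clone copies the pointer and bumps the count; the String is not copied.
    let alias = Rc::clone(&name);
    let while_shared = Rc::strong_count(&name);
    // Dropping one owner lowers the count; the value lives on until it reaches zero.
    drop(alias);
    let after_drop = Rc::strong_count(&name);
    (while_shared, after_drop)
}
```

Rc is meant for single-threaded code. Arc offers the same interface with an atomic counter, so it can be shared between threads.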

Memory Management

In C, you have to manage memory manually using malloc and free. This can be error-prone and lead to memory leaks and other issues. In Rust, memory management is handled by the compiler using the ownership system. This system ensures that memory is freed when it is no longer needed, and prevents common memory-related bugs like use-after-free and double-free errors.

Here's an example of manual memory management in C:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *ptr = malloc(sizeof(int));
    *ptr = 10;
    printf("Value of x: %d\n", *ptr);
    free(ptr);
}

In Rust the same code would look like this:

fn main() {
    let x = 10;
    let ptr = &x;
    println!("Value of x: {}", *ptr);
}

Note that x lives on the stack here, so no explicit allocation is needed at all. The x will be dropped when it goes out of scope, and the memory will be reclaimed automatically. In the above example, this happens at the end of main, right after the println! statement.
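If you do need a heap allocation, as malloc provides in the C example, the closest counterpart in the standard library is Box. The following is a minimal sketch; the function name heap_value is hypothetical.

```rust
/// Hypothetical demo: allocates an i32 on the heap and reads it back.
fn heap_value() -> i32 {
    // Box::new allocates on the heap, comparable to malloc plus the assignment.
    let ptr: Box<i32> = Box::new(10);
    // Dereferencing works just like with a plain reference.
    *ptr
    // No free() call: the Box releases its memory when `ptr` goes out of scope.
}
```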

Strings

In C, strings are represented as null-terminated arrays of characters. In Rust, strings are represented as String objects (owned and growable) or &str string slices (borrowed, read-only views into string data). Rust strings are UTF-8 encoded, which means they can contain any valid Unicode code point.

Here's an example of working with strings in C:

#include <stdio.h>

int main()
{
    char *name = "John";
    printf("Hello, %s!\n", name);
}

In Rust, the same code would look like this:

fn main() {
    let name = "John";
    println!("Hello, {name}!");
}

In Rust, you can also create a String object like this:

fn main() {
    let name = "John".to_string();
    println!("Hello, {name}!");
}
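The difference between the two string types, and the effect of UTF-8 encoding, can be sketched like this. The function names greeting and byte_and_char_len are hypothetical helpers for this illustration.

```rust
/// Builds a greeting by appending to an owned, growable String.
fn greeting(name: &str) -> String {
    let mut text = String::from("Hello, ");
    text.push_str(name); // a String can be modified in place
    text.push('!');
    text
}

/// Returns (bytes, characters) to show that `len` counts UTF-8 bytes.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}
```

A &str parameter accepts both string literals and borrowed Strings, so greeting("John") and greeting(&name) with name being a String both work. For a string like "h\u{e9}llo", the byte length is larger than the character count, because the accented letter takes two bytes in UTF-8.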

Arrays

In C, arrays are fixed-size collections of elements. In Rust, arrays are also fixed-size, but they are bounds-checked at runtime. This means that if you try to access an element outside the bounds of the array, your program will panic.

Here's an example of working with arrays in C:

#include <stdio.h>

int main()
{
    int arr[3] = {1, 2, 3};
    for (int i = 0; i < 3; i++) {
        printf("%d\n", arr[i]);
    }
}

In Rust, the same code would look like this:

fn main() {
    let arr = [1, 2, 3];
    for i in 0..3 {
        println!("{}", arr[i]);
    }
}

In Rust, you can also use the iter method to iterate over the elements of an array. This would be a more idiomatic way to write the above code:

fn main() {
    let arr = [1, 2, 3];
    arr.iter().for_each(|i| println!("{i}"));
}
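The bounds checking mentioned at the start of this section can be sketched as follows. Indexing with [] panics on an out-of-range index, while the get method returns an Option that you can handle yourself. The function names are hypothetical and only serve this illustration.

```rust
/// Indexing with `[]` panics when `index` is out of bounds.
fn element(arr: &[i32], index: usize) -> i32 {
    arr[index]
}

/// `get` returns None for an out-of-bounds index instead of panicking;
/// here a missing element falls back to 0.
fn element_or_zero(arr: &[i32], index: usize) -> i32 {
    arr.get(index).copied().unwrap_or(0)
}
```

In C, reading arr[5] from a three-element array is undefined behavior. In Rust, it is either a panic or a None, depending on which of the two forms you choose.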

Arrays vs. Vectors

In Rust, arrays are fixed-size collections of elements, while vectors are dynamic-size collections. Vectors are more flexible than arrays, but they store their elements on the heap, so creating and growing them involves memory allocation. If you need a collection that can grow or shrink at runtime, you should use a vector. If you know the size of the collection at compile time, you should use an array.

Here's an example of working with vectors in Rust:

fn main() {
    let mut list = vec![1, 2, 3];
    list.push(4);
    list.iter().for_each(|i| println!("{i}"));
}
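
Functions usually don't need to care whether they receive an array or a vector. Here's a minimal sketch (the sum helper is just for illustration) that accepts a slice, &[i32], which both arrays and vectors can be borrowed as:

```rust
// Hypothetical helper: works on any contiguous sequence of i32 values.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let arr = [1, 2, 3];
    let mut list = vec![1, 2, 3];
    list.push(4);

    // Both &[i32; 3] and &Vec<i32> coerce to &[i32].
    println!("{}", sum(&arr));
    println!("{}", sum(&list));
}
```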

Ternary operator

In C, you can use the ternary operator ? : to conditionally assign a value to a variable. In Rust, you can use the if expression to achieve the same result.

Here's an example of using the ternary operator in C:

#include <stdio.h>

int main()
{
    int x = 10;
    int y = x > 5 ? 1 : 0;
    printf("%d\n", y);
}

In Rust, the same code would look like this:

fn main() {
    let x = 10;
    let y = if x > 5 { 1 } else { 0 };
    println!("{y}");
}
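
Because if is an expression, it also scales to more than two branches, where C would need nested ternary operators. A small sketch with a made-up classify function:

```rust
// Hypothetical helper: the whole if / else if / else chain is one expression.
fn classify(x: i32) -> &'static str {
    if x < 0 {
        "negative"
    } else if x == 0 {
        "zero"
    } else {
        "positive"
    }
}

fn main() {
    let label = classify(10);
    println!("{label}");
}
```

Note that every branch must produce a value of the same type, and the else branch is required when the result is used.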

Dealing with file I/O

In C, you can read and write files using the fopen, fread, fwrite, and fclose functions. In Rust, you can use the std::fs module to read and write files.

Here's an example of reading a file in C:

#include <stdio.h>

int main()
{
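    FILE *file = fopen("file.txt", "r");
    if (file == NULL) {
        return 1; // unable to open file
    }
    char buffer[256];
    // Leave room for the terminating null byte; fread does not add one.
    size_t n = fread(buffer, 1, sizeof(buffer) - 1, file);
    buffer[n] = '\0';
    printf("%s\n", buffer);
    fclose(file);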
}

In Rust, the same code would look like this:

use std::fs::File;
use std::io::Read;

fn main() {
    let mut file = File::open("file.txt").expect("Unable to open file");
    let mut buffer = String::new();
    file.read_to_string(&mut buffer).expect("Unable to read file");
    println!("{buffer}");
}

In Rust, you can use the std::fs::write function to write to a file:

use std::fs;

fn main() {
    let data = "Hello, world!";
    fs::write("another_file.txt", data).expect("Unable to write file");
}

In C you would use fwrite to write to a file:

#include <stdio.h>

int main()
{
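    FILE *file = fopen("another_file.txt", "w");
    if (file == NULL) {
        return 1; // unable to open file
    }
    char data[] = "Hello, world!";
    // sizeof(data) includes the terminating null byte, so subtract one.
    fwrite(data, 1, sizeof(data) - 1, file);
    fclose(file);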
}
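
The Rust examples above use expect, which panics when something goes wrong. In real code you will usually hand the error back to the caller. Here's a minimal sketch of that, assuming a made-up read_text helper. It uses std::fs::read_to_string, which opens and reads the file in one call, and the ? operator to return early on failure:

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: returns the error to the caller instead of panicking.
fn read_text(path: &Path) -> io::Result<String> {
    let text = fs::read_to_string(path)?; // on error, return it to the caller
    Ok(text)
}

fn main() {
    match read_text(Path::new("file.txt")) {
        Ok(text) => println!("{text}"),
        Err(error) => eprintln!("Unable to read file: {error}"),
    }
}
```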

Functions

In C, functions are defined using the return_type function_name(parameters) syntax. In Rust, functions are defined using the fn function_name(parameters) -> return_type syntax. Rust functions can take parameters and return values, just like C functions.

Here's an example of defining a function in C:

#include <stdio.h>

int add(int a, int b)
{
    return a + b;
}

int main()
{
    int result = add(1, 2);
    printf("%d\n", result);
}

In Rust, the same code would look like this:

fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    let result = add(1, 2);
    println!("{result}");
}

Note: In Rust, the last expression in a function is implicitly returned, as long as it is not followed by a semicolon. This means that you don't need to use the return keyword to return a value at the end of a function.
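
The return keyword still exists, and it is the way to leave a function early. A small sketch that combines both styles (the divide helper is made up for this example):

```rust
// Hypothetical helper: returns None instead of dividing by zero.
fn divide(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        return None; // an early exit needs the return keyword
    }
    Some(a / b) // last expression, no semicolon: this is the return value
}

fn main() {
    println!("{:?}", divide(6, 3));
    println!("{:?}", divide(1, 0));
}
```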

Void functions

In C, you can define a function that doesn't return a value using the void keyword. In Rust, you can define a function that doesn't return a value using the () type. This is called the unit type, but it is often not explicitly written.

Here's an example of defining a void function in C:

#include <stdio.h>

void greet()
{
    printf("Hello, world!\n");
}

int main()
{
    greet();
}

In Rust, the same code would look like this:

fn greet() {
    println!("Hello, world!");
}

fn main() {
    greet();
}

We could have written the greet function like this:

fn greet() -> () {
    println!("Hello, world!");
}

But it is not idiomatic Rust to write it like this. And Clippy will warn you about it:

warning: unneeded unit return type
 --> src/main.rs:1:11
  |
1 | fn greet() -> () {
  |           ^^^^^^ help: remove the `-> ()`
  |
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_unit
  = note: `#[warn(clippy::unused_unit)]` on by default

For-loops

In C you can use a for loop to initialize and then iterate over a range of values. A typical C-style for loop looks like this:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int size = 10;
    int* arr = (int*)malloc(size * sizeof(int));
    if (arr == NULL) {
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < size; ++i) {
        arr[i] = i + 1;
    }
    for (int i = 0; i < size; ++i) {
        printf("%d\n", arr[i]);
    }
    free(arr);
}

What I've seen from people that move to Rust from C is that they often use the for loop in Rust in a C-style way:

fn main() {
    let size = 10;
    let mut arr = vec![0; size];

    for i in 0..size {
        arr[i] = i + 1;
    }

    for i in 0..size {
        println!("{}", arr[i]);
    }
}

This is not idiomatic Rust, and it can have performance implications. Rust checks every arr[i] access against the bounds of the collection. The optimizer can often remove these checks, but when it cannot, the indexed loop may be slower than the equivalent C code. The idiomatic way to write the above code in Rust would be like this:

fn main() {
    let size = 10;
    let arr: Vec<i32> = (1..=size).collect();
    arr.iter().for_each(|i| println!("{i}"));
}

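Using iterators is the idiomatic way to write loops in Rust. It is more concise, and because an iterator never steps outside the collection, there are no index bounds checks, so it is typically at least as fast as an indexed loop.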
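
Iterators really start to pay off when you chain adaptors. Here's a minimal sketch with a made-up function that would need a loop, a condition, and an accumulator variable in C:

```rust
// Hypothetical helper: sums the squares of all even numbers from 1 to `limit`.
fn sum_of_even_squares(limit: u32) -> u32 {
    (1..=limit)
        .filter(|n| n % 2 == 0) // keep the even numbers
        .map(|n| n * n)         // square each one
        .sum()                  // add them up
}

fn main() {
    println!("{}", sum_of_even_squares(10));
}
```

Nothing is computed until sum consumes the iterator; filter and map only describe the work to be done.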

Macros

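Rust has a powerful macro system that allows you to define custom syntax that gets expanded at compile time. This fills a similar role to the preprocessor in C, but Rust macros operate on the syntax tree rather than on raw text, and they are hygienic. Here's an example of a macro in C: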

#include <stdio.h>

#define LOG(message) printf("%s\n", message)

int main()
{
    LOG("Hello, world!");
}

In Rust, the same code would look like this:

macro_rules! log {
    ($message:expr) => {
        println!("{}", $message);
    };
}

fn main() {
    log!("Hello, world!");
}
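
One practical difference with the C preprocessor is that a Rust macro receives its arguments as parsed expressions, not as raw text. Here's a small sketch with a made-up square macro that shows why this matters:

```rust
// In C, `#define SQUARE(x) x * x` expands SQUARE(1 + 2) to 1 + 2 * 1 + 2, which is 5.
// In Rust, `$x` is matched as one complete expression, so no extra parentheses are needed.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    let result = square!(1 + 2);
    println!("{result}"); // prints 9
}
```

Just like in C, the argument is still evaluated twice here, so avoid passing expressions with side effects to a macro like this.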

Multi-threading

In C, you can create threads using the pthread library. In Rust, you can create threads using the std::thread module. A simple example in C that creates a thread looks like this:

#include <stdio.h>
#include <pthread.h>

void* print_message(void* message)
{
    printf("%s\n", (char*)message);
    return NULL;
}

int main()
{
    pthread_t thread;
    char* message = "Hello, world!";
    pthread_create(&thread, NULL, print_message, (void*)message);
    pthread_join(thread, NULL);
}

In Rust, the same code would look like this:

use std::thread;

fn print_message(message: &str) {
    println!("{message}");
}

fn main() {
    let message = "Hello, world!";
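    // `move` gives the closure its own copy of `message`; without it the closure
    // would borrow a local variable and the code would not compile.
    let handle = thread::spawn(move || print_message(message));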
    handle.join().expect("thread panicked");
}
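
In C you would pass a result back from a thread through the void pointer of pthread_join. In Rust, the value returned by the closure comes back through join. A minimal sketch, with a made-up sum_in_thread helper:

```rust
use std::thread;

// Hypothetical helper: sums the values on a separate thread and returns the result.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    // `move` transfers ownership of `values` into the new thread.
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    // join() waits for the thread and hands back the closure's return value.
    handle.join().expect("thread panicked")
}

fn main() {
    let total = sum_in_thread(vec![1, 2, 3]);
    println!("{total}");
}
```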

Stuff that you can't do in "safe" Rust

I came across a C example that let you modify a global static variable from a thread, without any synchronization or locking mechanism. Unsynchronized concurrent access like that is a data race, which is undefined behavior in both C and Rust. Safe Rust will not allow you to modify a global static variable from a thread without using a synchronization primitive like a Mutex or an Atomic type.

In C you would be allowed to do this:

#include <stdio.h>
#include <pthread.h>

int counter = 0;

void* increment_counter(void* arg)
{
    counter++;
    return NULL;
}

int main()
{
    pthread_t thread;
    pthread_create(&thread, NULL, increment_counter, NULL);
    pthread_join(thread, NULL);
    printf("%d\n", counter);
}

To the credit of the author of the original C example, they did mention that this is not a good practice and that you should use a synchronization primitive like a mutex to protect the global variable. But as you can see, C is not going to stop you from doing this.

In "safe" Rust this is not possible. The code will not compile! Let's look at the equivalent "unsafe" Rust code:

use std::thread;

static mut COUNTER: i32 = 0;

fn increment_counter() {
    unsafe {
        COUNTER += 1;
    }
}

fn main() {
    let handle = thread::spawn(|| increment_counter());
    handle.join().expect("thread panicked");
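    // Copy the value out first: passing COUNTER to println! would take a reference
    // to a `static mut`, which the static_mut_refs lint flags.
    let value = unsafe { COUNTER };
    println!("COUNTER: {value}");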
}

The use of unsafe is a big red flag. It tells the developer that the compiler can no longer guarantee memory safety for this code, and that it is now up to the developer to rule out Undefined Behavior. I suggest you read more about Undefined Behavior (UB) in Rust, and why it is important to avoid it. UB will bite you when you least expect it. Stay safe! 😁

Exercise

Try removing the unsafe block from the above code and see what happens.

So now let's build the Rust code the way it should be built; assuming that you still want a global counter:

use std::sync::Mutex;
use std::thread;

static COUNTER: Mutex<i32> = Mutex::new(0);

fn increment_counter() {
    let mut counter = COUNTER.lock().expect("mutex is poisoned");
    *counter += 1;
}

fn main() {
    let handle = thread::spawn(|| increment_counter());
    handle.join().expect("thread panicked");
    let counter = COUNTER.lock().expect("mutex is poisoned");
    println!("COUNTER: {counter}");
}

This is the idiomatic way to write the above code in Rust. It uses a Mutex to protect the global counter from being modified by multiple threads at the same time. The Mutex ensures that only one thread can access the counter at a time.

Notice that we do not need to unlock the Mutex after we are done with it. The lock method returns a MutexGuard, and when that guard goes out of scope, its Drop implementation automatically unlocks the mutex.

Also note that the inner value is moved into the Mutex when we create it with Mutex::new(0). This means that the Mutex takes ownership of the value and is responsible for dropping it. This is why we don't need to drop the inner value ourselves.

Because the Mutex owns the inner value, shared code can only reach it by locking the mutex first, so we don't need to worry about it being accessed without synchronization.
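
For a plain counter, an Atomic type is a lighter alternative to a Mutex. Here's a minimal sketch of the same global counter using AtomicI32. I've picked Ordering::Relaxed on the assumption that only the count itself matters and no other data is synchronized through it:

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

static COUNTER: AtomicI32 = AtomicI32::new(0);

fn increment_counter() {
    // fetch_add is a single atomic read-modify-write; no lock is needed.
    COUNTER.fetch_add(1, Ordering::Relaxed);
}

fn main() {
    let handles: Vec<_> = (0..10)
        .map(|_| thread::spawn(increment_counter))
        .collect();
    for handle in handles {
        handle.join().expect("thread panicked");
    }
    println!("COUNTER: {}", COUNTER.load(Ordering::Relaxed));
}
```

Atomics only work for simple values like integers and booleans. As soon as several fields have to change together, you are back to a Mutex.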

In Rust - and arguably in C as well - it is better to avoid global state as much as possible. What you will often see in Rust is that you will construct a struct that holds the state that you want to share between threads, and then pass a reference to that struct to the threads. This is a safer and more idiomatic way to write multithreaded code in Rust.

The mechanism to do this is to wrap the state in an Arc<Mutex<T>> where T is the type of the state. Arc is a thread-safe reference-counted pointer, and Mutex is a thread-safe mutual exclusion lock. This is a common pattern in Rust for sharing state between threads.

Let's redo the example once more using this pattern:

use std::sync::{Arc, Mutex};
use std::thread;

struct Counter {
    value: i32,
}

fn increment_counter(counter: &Arc<Mutex<Counter>>) {
    let mut counter = counter.lock().expect("mutex is poisoned");
    counter.value += 1;
}

fn main() {
    let counter = Arc::new(Mutex::new(Counter { value: 0 }));
    let counter_for_thread = counter.clone();
    let handle = thread::spawn(move || increment_counter(&counter_for_thread));
    handle.join().expect("thread panicked");
    let counter = counter.lock().expect("mutex is poisoned");
    println!("COUNTER: {}", counter.value);
}

Notice that the thread needs an owned copy of the Arc<Mutex<Counter>> to be able to use it. This is why we need to call clone on the Arc before passing it to the thread. The clone method on Arc only increments the reference count, so it is a cheap operation. The move keyword in the closure tells the compiler to move the Arc into the closure, so that the closure takes ownership of the Arc.

In this way, you can spawn many threads, each with its own reference to the shared state. Like so:

use std::sync::{Arc, Mutex};
use std::thread;

struct Counter {
    value: i32,
}

fn increment_counter(counter: &Arc<Mutex<Counter>>) {
    let mut counter = counter.lock().expect("mutex is poisoned");
    counter.value += 1;
}

fn main() {
    let counter = Arc::new(Mutex::new(Counter { value: 0 }));

    // The scope function is used to ensure that the spawned threads are joined before the main thread continues.
    thread::scope(|s| {
        for _ in 0..10 {
            let counter_for_thread = counter.clone();
            s.spawn(move || increment_counter(&counter_for_thread));
        }
    });

    let counter = counter.lock().expect("mutex is poisoned");
    println!("COUNTER: {}", counter.value);
}
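
Because thread::scope guarantees that all threads are joined before it returns, scoped threads are allowed to borrow from the enclosing function. That means the Arc is optional in this particular case. Here's a minimal sketch of that variation, with a made-up count_in_scope helper:

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical helper: lets every scoped thread increment a locally owned counter.
fn count_in_scope(nr_of_threads: usize) -> i32 {
    let counter = Mutex::new(0);

    thread::scope(|s| {
        for _ in 0..nr_of_threads {
            // No Arc and no `move`: the closure simply borrows `counter`.
            s.spawn(|| {
                let mut guard = counter.lock().expect("mutex is poisoned");
                *guard += 1;
            });
        }
    });

    // All threads are done, so we can take the value out of the Mutex.
    counter.into_inner().expect("mutex is poisoned")
}

fn main() {
    println!("COUNTER: {}", count_in_scope(10));
}
```

You still need Arc when the threads are created with thread::spawn, because those threads may outlive the function that started them.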

Migrate to Rust from C++

I'd recommend looking at the previous chapter, Migrating from C, before reading this chapter. It will give you a good understanding of the differences between C and Rust.

In this chapter, we will look at some of the differences between C++ and Rust.

Classes

Rust has no concept of classes. Instead, it has a concept of structs and impl blocks. Here's an example of a class in C++:

#include <iostream>

class MyClass {
public:
    int x;
    int y;

    MyClass(int x, int y) : x(x), y(y) {}

    int add() {
        return x + y;
    }
};

int main() {
    MyClass myClass(10, 20);
    std::cout << myClass.add() << std::endl;
}

In Rust, the same code would look like this:

struct MyClass {
    x: i32,
    y: i32,
}

impl MyClass {
    fn new(x: i32, y: i32) -> MyClass {
        MyClass { x, y }
    }

    fn add(&self) -> i32 {
        self.x + self.y
    }
}

fn main() {
    let my_class = MyClass::new(10, 20);
    println!("{:?}", my_class.add());
}

Although the code in Rust looks similar, there are a few differences:

  • In Rust, fields and methods are private by default. You need to use the pub keyword to make them public.
  • In Rust, you need to use the impl block to define methods for a struct.
  • In Rust, methods that borrow the struct take &self as their first parameter.
  • Rust has no concept of classes and inheritance. Instead, it has a concept of traits. We will look at traits in the next section.
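
Privacy in Rust works per module, not per struct. Here's a minimal sketch to illustrate the first point; the shapes module and Rectangle struct are made up for this example:

```rust
mod shapes {
    pub struct Rectangle {
        width: u32,  // private: only code inside `shapes` can touch it
        height: u32, // private as well
    }

    impl Rectangle {
        pub fn new(width: u32, height: u32) -> Rectangle {
            Rectangle { width, height }
        }

        pub fn area(&self) -> u32 {
            self.width * self.height
        }
    }
}

fn main() {
    let rectangle = shapes::Rectangle::new(2, 3);
    println!("{}", rectangle.area());
    // `rectangle.width` would not compile here: the field is private to `shapes`.
}
```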

Interfaces or Abstract Classes

In C++, you can define an interface as a class that contains only pure virtual functions. Here's an example of defining an interface in C++:

#include <iostream>

class GreeterInterface
{
public:
    virtual void greet() = 0;
};

class MyClass : public GreeterInterface
{
public:
    void greet() override
    {
        std::cout << "Hello, world!" << std::endl;
    }
};

int main()
{
    MyClass myClass;
    myClass.greet();
}

In Rust, the same code would look like this:

trait Greeter {
    fn greet(&self);
}

struct MyClass;

impl Greeter for MyClass {
    fn greet(&self) {
        println!("Hello, world!");
    }
}

fn main() {
    let my_class = MyClass;
    my_class.greet();
}

In Rust, you can define a trait using the trait keyword. You can then implement the trait for a struct using the impl block.

Passing Traits to Functions

Traits can be passed to functions in Rust in two ways:

  • Static dispatch: You can pass a trait to a function using a generic type parameter.
  • Dynamic dispatch: You can pass a trait to a function as a trait object: &dyn Greeter.

With static dispatch, the compiler generates a separate version of the function for each type that implements the trait. With dynamic dispatch, the compiler generates a single version of the function that works with any type that implements the trait. The dynamic dispatch version is (marginally) slower, but it's more flexible.

trait Greeter {
    fn greet(&self);
}

struct MyClass;

impl Greeter for MyClass {
    fn greet(&self) {
        println!("Hello, world!");
    }
}

fn greet<T: Greeter>(greeter: &T) {
    greeter.greet();
}

fn dyn_greet(greeter: &dyn Greeter) {
    greeter.greet();
}

fn main() {
    let my_class = MyClass;
    greet(&my_class);
    dyn_greet(&my_class);
}
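
The flexibility of dynamic dispatch becomes visible when you need to store different types in one collection. Here's a minimal sketch; the English and Dutch types and the greeting method are made up for this example:

```rust
trait Greeter {
    fn greeting(&self) -> String;
}

struct English;
struct Dutch;

impl Greeter for English {
    fn greeting(&self) -> String {
        "Hello, world!".to_string()
    }
}

impl Greeter for Dutch {
    fn greeting(&self) -> String {
        "Hallo, wereld!".to_string()
    }
}

// One function, one Vec, several concrete types behind `dyn Greeter`.
fn all_greetings(greeters: &[Box<dyn Greeter>]) -> Vec<String> {
    greeters.iter().map(|greeter| greeter.greeting()).collect()
}

fn main() {
    let greeters: Vec<Box<dyn Greeter>> = vec![Box::new(English), Box::new(Dutch)];
    for line in all_greetings(&greeters) {
        println!("{line}");
    }
}
```

With generics this would not work, because a Vec<T> can only hold a single concrete type T.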

Inheritance

Rust has no concept of inheritance. Instead, it has a concept of composition. You can use composition to create hierarchies of objects. Here's an example of inheritance in C++:

#include <iostream>

class Animal {
public:
    virtual void speak() = 0;
};

class LandAnimal {
public:
    virtual void walk() = 0;
    int nrOfLegs() {
        return 4;
    }
};

class Dog : public Animal, public LandAnimal {
public:
    void speak() override {
        std::cout << "Woof!" << std::endl;
    }

    void walk() override {
        std::cout << "Walking on land" << std::endl;
    }
};

int main() {
    Dog dog;    
    dog.speak();
    dog.walk();
    std::cout << dog.nrOfLegs() << std::endl;
}

In Rust, the same code would look like this:

trait Animal {
    fn speak(&self);
}

trait LandAnimal {
    fn walk(&self);
    fn nr_of_legs(&self) -> i8 {
        4
    }
}

struct Dog;

impl Animal for Dog {
    fn speak(&self) {
        println!("Woof!");
    }
}

impl LandAnimal for Dog {
    fn walk(&self) {
        println!("Walking on land");
    }
}

fn main() {
    let dog = Dog;
    dog.speak();
    dog.walk();
    println!("{:?}", dog.nr_of_legs());
}
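
The example above shares behavior through traits. Composition itself means that a struct contains another struct and forwards calls to it. A minimal sketch, with a made-up Legs type:

```rust
// Hypothetical building block that knows how to walk.
struct Legs {
    count: u8,
}

impl Legs {
    fn walk(&self) -> String {
        format!("Walking on {} legs", self.count)
    }
}

// A Dog *has* legs, rather than *being* a LandAnimal.
struct Dog {
    legs: Legs,
}

impl Dog {
    fn new() -> Dog {
        Dog { legs: Legs { count: 4 } }
    }

    // Delegate to the contained value.
    fn walk(&self) -> String {
        self.legs.walk()
    }
}

fn main() {
    let dog = Dog::new();
    println!("{}", dog.walk());
}
```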

Migrate to Rust from Go

In this chapter, we will look at some of the differences between Go and Rust. We will also look at some similarities between the two languages.

Both Go and Rust are modern programming languages that are designed to be fast, safe, and efficient. They are both statically typed and have a strong focus on concurrency and parallelism. There are some notable differences between the two languages, though, which we will explore in this chapter.

External packages

In Go, you can import external packages from the internet by adding them to your go.mod file. For example, to import the github.com/gorilla/mux package, you would add the following line to your go.mod file:

require github.com/gorilla/mux v1.8.0

Alternatively you can use the go get command to add a dependency to your go.mod file. For example, to add the github.com/gorilla/mux package as a dependency, you would write:

go get -u github.com/gorilla/mux

In Rust, you can import external crates from crates.io by adding them to your Cargo.toml file. For example, to import the rand crate, you would add the following lines to your Cargo.toml file:

[dependencies]
rand = "0.8.4"

You can also use the cargo add command to add a dependency to your Cargo.toml file. For example, to add the rand crate as a dependency, you would write:

cargo add rand

Importing packages

In Go, you import packages using the import keyword. For example, to import the fmt package, you would write:

import "fmt"

In Rust, you bring crates and modules into scope using the use keyword. For example, to import the std::io module, you would write:

use std::io;

To use the previously imported rand crate, you would write:

use rand;

Async

Go has built-in support for asynchronous programming using goroutines and channels. Goroutines are lightweight threads that are managed by the Go runtime, and channels are used to communicate between goroutines.

Rust also has built-in support for asynchronous programming using the async and await keywords. Rust's async/await syntax is similar to that of other languages like C# and JavaScript. However, unlike Go, Rust does not ship an asynchronous runtime, and the channels in its standard library (std::sync::mpsc) are meant for threads rather than async tasks. You need an asynchronous runtime like tokio or async-std to handle asynchronous programming in Rust. At the time of this writing tokio is the most popular asynchronous runtime for Rust.

A simple example of a goroutine in Go:

package main

import (
    "fmt"
    "time"
)

func main() {
    go func() {
        fmt.Println("Hello from goroutine")
    }()

    time.Sleep(1 * time.Second)
}

The same example in Rust using tokio:

use std::time::Duration;

use tokio::task;
use tokio::time::sleep;

#[tokio::main]
async fn main() {
    task::spawn(async {
        println!("Hello from tokio");
        sleep(Duration::from_secs(1)).await;
    }).await.unwrap();
}

Make sure to add the tokio dependency tokio = { version = "1", features = ["full"] } to your Cargo.toml file.

Channels

In Go, you can use channels to communicate between goroutines. Channels are a powerful feature of Go that allow you to send and receive messages between goroutines. Tokio provides similar functionality in Rust using the mpsc module.

A simple example of using channels in Go:

package main

import (
    "fmt"
)

func main() {
    ch := make(chan string)

    go func() {
        ch <- "Hello from goroutine"
    }()

    msg := <-ch
    fmt.Println(msg)
}

The same example in Rust using tokio:

use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel(32);

    tokio::spawn(async move {
        tx.send("Hello from tokio".to_string()).await.unwrap();
    });

    let msg = rx.recv().await.unwrap();
    println!("{msg}");
}
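
If you don't need async, the standard library has its own channel for regular threads in std::sync::mpsc. Here's a minimal sketch of the same idea without tokio; the receive_from_thread helper is made up for this example:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical helper: spawns a thread that sends one message back over a channel.
fn receive_from_thread() -> String {
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        tx.send("Hello from thread".to_string())
            .expect("receiver was dropped");
    });

    // recv() blocks the current thread until a message arrives.
    rx.recv().expect("sender was dropped")
}

fn main() {
    let msg = receive_from_thread();
    println!("{msg}");
}
```

Keep in mind that recv blocks the calling thread, so this variant should not be used inside async tasks.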

Enumerations in for loops

In Go, you can use the range keyword to iterate over the elements of an array, slice, string, map, or channel. For example:

package main

import "fmt"

func main() {
    numbers := []int{1, 2, 3, 4, 5}

    for i, number := range numbers {
        fmt.Println(i, number)
    }
}

In Rust, you can use the iter method to iterate over the elements of a collection, and to get the index of the element you can use the enumerate method. For example:

fn main() {
    let numbers = [1, 2, 3, 4, 5];

    for (i, number) in numbers.iter().enumerate() {
        println!("{i} {number}");
    }
}
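
Because enumerate is just another iterator adaptor, you can combine it with other iterator methods. A small sketch with a made-up helper that finds the position of the largest number:

```rust
// Hypothetical helper: returns the index of the largest element, or None for an empty slice.
fn index_of_max(numbers: &[i32]) -> Option<usize> {
    numbers
        .iter()
        .enumerate()                          // yields (index, &number) pairs
        .max_by_key(|&(_, number)| *number)   // compare the pairs by their number
        .map(|(index, _)| index)              // keep only the index
}

fn main() {
    let numbers = [3, 9, 4];
    println!("{:?}", index_of_max(&numbers));
}
```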

Traits

In Go, you can define interfaces to specify the behavior of a type. For example:

package main

import "fmt"

type Greeter interface {
    Greet()
}

type EnglishGreeter struct{}

func (eg EnglishGreeter) Greet() {
    fmt.Println("Hello, world!")
}

func main() {
    var greeter Greeter
    greeter = EnglishGreeter{}
    greeter.Greet()
}

In Rust, you can define traits to specify the behavior of a type. For example:

trait Greeter {
    fn greet(&self);
}

struct EnglishGreeter;

impl Greeter for EnglishGreeter {
    fn greet(&self) {
        println!("Hello, world!");
    }
}

fn main() {
    let greeter = EnglishGreeter;
    greeter.greet();
}

As you can see, traits in Rust are similar to interfaces in Go. You can define a trait using the trait keyword and implement the trait for a struct using the impl block. The key difference is that in Go a type implements an interface implicitly, simply by having the right methods, while in Rust a trait is implemented explicitly with an impl block.

Public & Private visibility of functions and properties

In Go, visibility is determined by the first letter of a name: functions and properties that start with an uppercase letter are exported (public), while those that start with a lowercase letter are private to the package. For example:

package main

import "fmt"

type Person struct {
    name string
}

func (p Person) GetName() string {
    return p.name
}

func main() {
    p := Person{name: "Alice"}
    fmt.Println(p.GetName())
}

In Rust, functions and properties are private by default. To make a function or property public, you need to use the pub keyword. For example:

struct Person {
    name: String,  // this field is private
}

impl Person {
    pub fn get_name(&self) -> &str {
        &self.name
    }
}

fn main() {
    let p = Person { name: "Alice".to_string() };
    println!("{}", p.get_name());
}

Building for production

In Go, there is only one way to build your application: you run go build and it produces a single binary that you can run for development or production. In Rust, cargo build will produce a binary that is optimized for development. To build a binary optimized for production, you run cargo build --release.

The development binary is not optimized for speed, but it includes debug information that can be useful for debugging your application. The release binary is optimized for speed and, by default, does not include debug information.

You find the resulting binary in the target/debug or target/release directory.

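In the Cargo.toml file you can further tune the release profile. For example, the following settings optimize for binary size (opt-level = "z") and enable link-time optimization (lto = true):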

[profile.release]
opt-level = "z"
lto = true

Migrate to Rust from Java

Moving to Rust from Java can honestly be a daunting task. The two languages are quite different in terms of syntax, paradigms, and tooling. However, the transition can be made easier by understanding the similarities and differences between the two languages.

The Java language is a high-level, object-oriented programming language that is designed to be platform-independent. It uses garbage collection to manage memory so developers don't have to worry about memory management.

Rust, on the other hand, is a systems programming language that is designed to be fast, but safe. It does not have a garbage collector, and instead uses a system of ownership, borrowing, and lifetimes to manage memory. Rust also has no concept of classes or inheritance, and instead uses traits and generics to achieve similar levels of code reuse.

main.java -> main.rs

Let's start by comparing a simple "Hello, World!" program in Java and Rust.

Java

public class Main {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}

Rust

fn main() {
    println!("Hello, World!");
}

As you can see, the Rust version is a bit more concise and does not require a class definition. The main function is the entry point of the program, and it does not require any arguments. So far, so good. 😄

Building and Running

In Java, you would compile the program using the javac command and run it using the java command.

javac Main.java
java Main

In Rust, you would use the rustc command to compile the program and then run the resulting binary.

rustc main.rs
./main

Well... In reality, you would use something like Maven or Gradle to build your Java project, and Cargo to build your Rust project. But you get the idea.

Another thing to note is that Rust is compiled ahead of time into machine code that runs directly on the target platform, while Java code is compiled into bytecode that runs on the Java Virtual Machine (JVM), which interprets it and compiles frequently used code paths to machine code at runtime.

Building with cargo

Cargo is the build system and package manager for Rust. It makes it easy to build, test, and run Rust projects. It also handles dependencies and provides a consistent way to build and package Rust code.

To create a new Rust project, you can use the cargo new command.

cargo new hello_world
cd hello_world

This will create a new directory called hello_world with a basic Rust project structure. You can then edit the src/main.rs file to add your code, and use the cargo build and cargo run commands to build and run your project.

The binary that is produced by cargo build is a debug binary by default. To build a release binary, you would use the --release flag. Debug binaries include debug information that can be useful for debugging, while release binaries are optimized for size and/or speed and do not include debug information.

cargo build --release

Classes and Objects

In Java, classes are the building blocks of the language. They define the structure and behavior of objects, and can be used to create instances of objects.

In Rust, there are no classes. Instead, Rust uses structs to define data structures and traits to define behavior. Structs are similar to classes in that they can have properties and methods, but they do not support inheritance.

Here is an example of a simple class in Java and its equivalent in Rust:

Java

public class Person {
    private String name;

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

Rust

struct Person {
    name: String,
}

impl Person {
    fn new(name: &str) -> Person {
        Person { name: name.to_string() }
    }

    fn get_name(&self) -> &str {
        &self.name
    }
}
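
Java developers will probably look for toString next. The closest equivalent in Rust is the Display trait. Here's a minimal sketch; the "Person(...)" output format is just something I picked for this example:

```rust
use std::fmt;

struct Person {
    name: String,
}

// Implementing Display is roughly what overriding toString() is in Java.
impl fmt::Display for Person {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Person({})", self.name)
    }
}

fn main() {
    let person = Person { name: "Alice".to_string() };
    println!("{person}");

    // Every type that implements Display gets a to_string method for free.
    let text = person.to_string();
    println!("{}", text.len());
}
```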

Unit tests

In Java, unit tests are typically written using JUnit or TestNG. These frameworks provide annotations and utilities for writing and running tests. Rust has built-in support for unit tests using the #[test] attribute. Unlike Java, where tests are typically written in separate classes, Rust unit tests are usually written in the same file as the code they are testing.

Let's look at an example of a unit test in Java and its equivalent in Rust:

Java

import org.junit.Test;

import static org.junit.Assert.assertEquals;

public class PersonTest {
    @Test
    public void testGetName() {
        Person person = new Person("Alice");
        assertEquals("Alice", person.getName());
    }
}

Rust

struct Person {
    name: String,
}

impl Person {
    fn new(name: &str) -> Person {
        Person { name: name.to_string() }
    }

    fn get_name(&self) -> &str {
        &self.name
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_get_name() {
        let person = Person::new("Alice");
        assert_eq!(person.get_name(), "Alice");
    }
}

The #[cfg(test)] attribute ensures that the tests module is only compiled when running tests. This allows you to keep your tests next to your production code, while ensuring that they do not end up in the final binary.
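
Two features that JUnit users often ask about are tests that may fail with an error and tests that expect an exception. Here's a minimal sketch of both, using a made-up parse_age function: a test can return a Result, and #[should_panic] marks a test that is expected to panic. Run it with cargo test.

```rust
use std::num::ParseIntError;

// Hypothetical function under test.
fn parse_age(input: &str) -> Result<u8, ParseIntError> {
    input.trim().parse::<u8>()
}

fn main() {
    println!("{:?}", parse_age("42"));
}

#[cfg(test)]
mod parse_age_tests {
    use super::*;

    // A test may return a Result; returning Err fails the test.
    #[test]
    fn accepts_a_number() -> Result<(), ParseIntError> {
        assert_eq!(parse_age(" 42 ")?, 42);
        Ok(())
    }

    // Comparable to an "expected exception" test in JUnit.
    #[test]
    #[should_panic]
    fn panics_when_unwrapping_invalid_input() {
        parse_age("abc").unwrap();
    }
}
```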

Advanced

This chapter covers various advanced topics that go beyond the scope of the Rust classes. I've included them for reference purposes. You may find this information useful while building your own Rust projects.

Circular references or Self-references

If you have read some blog posts on Rust, you might have come across the term "circular references" or "self-references", and the claim that it is impossible to have self-references in Rust. Although it is indeed more difficult to have self-references in Rust using traditional references, you can use Rc or Arc to more easily create self-references in Rust.

use std::rc::{Rc, Weak};

struct Node {
    value: usize,
    parent: Option<Weak<Node>>,
    children: Vec<Rc<Node>>,
}

fn build_tree(root_value: usize, nr_of_children: usize) -> Rc<Node> {
    Rc::new_cyclic(|parent| {
        let children = (1..=nr_of_children)
            .map(|i| Rc::new(Node {
                value: root_value + i,
                parent: Some(parent.clone()),
                children: vec![],
            }))
            .collect();

        Node {
            value: root_value,
            parent: None,
            children,
        }
    })
}

fn main() {
    let root = build_tree(10, 5);
    let nr_of_children = root.children.len();
    println!("Root has {nr_of_children} children");

    for child in root.children.iter() {
        if let Some(parent) = child.parent.as_ref().unwrap().upgrade() {
            println!("Root has one child with value {}; its parent value is {}", child.value, parent.value);
        }
    }
}

The Rc::new_cyclic function hands the closure a Weak reference to the parent node. This weak smart pointer is what we can store in the children, even before the parent is fully initialized. Unlike Rc references, Weak references do not increment the strong reference count. This is important because we want to avoid reference cycles, which would keep the nodes alive forever.

Later in the code, when the parent is fully initialized, we can upgrade the Weak reference to an Rc reference. This is what we do when we print the parent value. At this point, the strong reference count is incremented by one, and the parent is not deallocated until this upgraded Rc is dropped.
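
You can watch the reference counts change with Rc::strong_count and Rc::weak_count. Here's a minimal sketch, separate from the tree example; the is_alive helper is made up for this illustration:

```rust
use std::rc::{Rc, Weak};

// Hypothetical helper: a Weak can only be upgraded while the value is still alive.
fn is_alive<T>(weak: &Weak<T>) -> bool {
    weak.upgrade().is_some()
}

fn main() {
    let parent = Rc::new("parent");
    let weak = Rc::downgrade(&parent);
    println!("strong = {}, weak = {}", Rc::strong_count(&parent), Rc::weak_count(&parent));

    {
        // Upgrading creates a second strong reference for as long as it lives.
        let upgraded = weak.upgrade().expect("parent is still alive");
        println!("strong while upgraded = {}", Rc::strong_count(&upgraded));
    }

    // Dropping the last strong reference frees the value; the Weak cannot revive it.
    drop(parent);
    println!("alive after drop: {}", is_alive(&weak));
}
```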

Rust Arena

Another way to create self-references in Rust is by using an Arena. An Arena is a data structure that allows you to store a collection of objects in a single memory block. The Arena is responsible for managing the memory and lifetimes of the objects stored in it. This makes it possible to create self-references in Rust without using Rc or Arc.

Sample code

Add typed-arena to your Cargo.toml file.

Cargo.toml

[dependencies]
typed-arena = "2"

Let's create a simple tree structure using an Arena and Cell.

main.rs

use std::cell::Cell;

use typed_arena::Arena;

struct TreeNode<'a> {
    value: i32,
    parent: Cell<Option<&'a TreeNode<'a>>>,
}

fn main() {
    let arena = Arena::new();

    let root = arena.alloc(TreeNode { value: 1, parent: Cell::new(None) });
    let child_1 = arena.alloc(TreeNode { value: 2, parent: Cell::new(None) });
    let child_2 = arena.alloc(TreeNode { value: 3, parent: Cell::new(None) });
    let grandchild = arena.alloc(TreeNode { value: 4, parent: Cell::new(None) });

    child_1.parent.set(Some(root));
    child_2.parent.set(Some(root));
    grandchild.parent.set(Some(child_1));

    let parent = grandchild.parent.get().unwrap();
    println!("Value of grandchild's parent: {}", parent.value);

    let grandparent = parent.parent.get().unwrap();
    println!("Value of grandchild's grandparent: {}", grandparent.value);
}

In the example above, we create a simple tree structure using an Arena and Cell. Because every node lives in the arena, all nodes share the same lifetime 'a, so a node can hold a plain reference to another node. The Cell type allows us to update the parent field of a TreeNode after it has been created. This is called "interior mutability".
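
A related technique that needs no external crate is to keep all nodes in a Vec and refer to them by index instead of by reference. To be clear, this is not how typed-arena works internally; it is a simplified, std-only sketch of the same idea, and the Tree type is made up for this example:

```rust
struct TreeNode {
    value: i32,
    parent: Option<usize>, // index into Tree::nodes instead of a reference
}

// Hypothetical index-based "arena": the Vec owns every node.
struct Tree {
    nodes: Vec<TreeNode>,
}

impl Tree {
    fn new() -> Tree {
        Tree { nodes: Vec::new() }
    }

    // Stores a node and returns its index, which acts as a handle.
    fn add(&mut self, value: i32, parent: Option<usize>) -> usize {
        self.nodes.push(TreeNode { value, parent });
        self.nodes.len() - 1
    }

    fn parent_value(&self, index: usize) -> Option<i32> {
        let parent = self.nodes.get(index)?.parent?;
        Some(self.nodes[parent].value)
    }
}

fn main() {
    let mut tree = Tree::new();
    let root = tree.add(1, None);
    let child = tree.add(2, Some(root));
    let grandchild = tree.add(4, Some(child));
    println!("Value of grandchild's parent: {:?}", tree.parent_value(grandchild));
}
```

The trade-off is that an index is not checked by the borrow checker: nothing stops you from using an index with the wrong tree, which is why the lookup returns an Option.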

Reference material

Rust on Microcontrollers, or Embedded Rust

One of the more interesting aspects of Rust is that it runs everywhere: on the server, the desktop, the web and even on microcontrollers. Rust is a great language for embedded development because at its core it is a systems programming language that provides a lot of control over the hardware, while still being safe and expressive.

Over the years, the support for embedded development in Rust has grown significantly. There are now a number of libraries and frameworks that make it easier to develop embedded applications in Rust. In this chapter, we'll be looking at some basics of embedded development in Rust, and we'll be building a few projects to get a feel for how Rust can be used in embedded systems.

Embedded Rust

We'll be using the Embassy framework for this chapter. Embassy is a lightweight async/await runtime for embedded systems. It has many features that backend developers are familiar with, like tasks, timers, and channels. It's a great way to get started with embedded development in Rust.

Microcontrollers

A microcontroller is a small computer on a single integrated circuit. It contains a processor core, memory, and peripheral devices. Microcontrollers are used in embedded systems, which are systems that are designed to perform a specific task. They are found in a wide range of devices, from microwave ovens to cars to industrial machinery.

We'll be using three different microcontrollers in this chapter:

  • Raspberry Pi Pico,
  • STM32F103 (Blue Pill),
  • STM32F401 (Black Pill).

The projects we'll be building are straightforward, but they should give you a good idea of how to get started with embedded development in Rust. You should be able to combine the knowledge you gain from these projects with the documentation of the microcontroller you are using to build more complex projects.

Raspberry Pi Pico

The Raspberry Pi Pico is a microcontroller board based on the RP2040 microcontroller chip. It has a dual-core ARM Cortex-M0+ processor, 264KB of SRAM, and 2MB of flash memory. The Pico is a great board for beginners because it is inexpensive and easy to use.

STM32F103 (Blue Pill)

The STM32F103 (Blue Pill) is a development board based on the STM32F103C8T6 microcontroller. It has an ARM Cortex-M3 processor, 64KB of flash memory, and 20KB of SRAM. The Blue Pill is a popular board for hobbyists because it is cheap and powerful.

STM32F401 (Black Pill)

The STM32F401 (Black Pill) is a development board based on the STM32F401CCU6 microcontroller. It has an ARM Cortex-M4 processor, 256KB of flash memory, and 64KB of SRAM. The Black Pill is a more powerful board than the Blue Pill, and it is suitable for more complex projects.

Embedded Rust - STM32F103

In this chapter, we'll be looking at programming an STM32F103 (STM32F103C8T6), or Blue Pill board. This board is relatively cheap and packs a lot of power. The STM32 development board is equipped with an ARM Cortex-M3 processor. The full specifications can be found here: https://stm.com.

If you are looking for details on the STM32F401, or Black Pill board, check this chapter: Embedded Rust - STM32F401.

Getting prepared

Hardware

In order to build your first embedded project you need the following:

  • One or more STM32F103C8T6 development boards; these are also known as Blue Pill boards,
  • ST-Link V2 USB Debugger,
  • A breadboard with some common components; look for a bundle:
    • Jumper wire cables,
    • Some LEDs, resistors and push buttons.

It also helps to have some soldering gear.

If you are not soldering the pins to the STM32 board, do make sure the connections to the breadboard are solid and stable. You can easily lose a few hours debugging your source code before figuring out that one of the connections is broken or unstable.

Software

First we'll install probe-rs which is a tool to flash and debug embedded devices.

If you have cargo-binstall installed, you can install probe-rs with:

$ cargo binstall probe-rs

If not, you can build from source with:

$ cargo install probe-rs --locked --features cli

I had to restart my terminal to get the probe-rs command to work, i.e. to be found in the path. It should be in ~/.cargo/bin/probe-rs.

Rust ARM target

We'll also need the build toolchain for the ARM Cortex M3 processor. This includes the ARM cross-compile tools, OpenOCD and the Rust ARM target.

Make sure you have the thumbv7m-none-eabi target installed. We'll use the rustup tool to install it:

$ rustup target add thumbv7m-none-eabi

Connect the STM32

In order to program the STM32 board, you need to connect it to your PC using the ST-Link USB adapter. You should wire it like this:

STM32         | ST-Link v2
V3  = Red     | 3.3v (Pin 8)
IO  = Orange  | SWDIO (Pin 4)
CLK = Brown   | SWDCLK (Pin 2)
GND = Black   | GND (Pin 6)

Schematic:

Connect STM32


Once connected, you can use this command to establish the link and check that everything is working so far:

$ probe-rs list

The following debug probes were found:
[0]: STLink V2 (VID: 0483, PID: 3748, Serial: 49C3BF6D067578485335241067, StLink)
$ probe-rs info

Probing target via JTAG

ARM Chip:
Debug Port: Version 1, DP Designer: ARM Ltd
└── 0 MemoryAP
    └── ROM Table (Class 1)
        ├── Cortex-M3 SCS   (Generic IP component)
        │   └── CPUID
        │       ├── IMPLEMENTER: ARM Ltd
        │       ├── VARIANT: 1
        │       ├── PARTNO: 3107
        │       └── REVISION: 1
        ├── Cortex-M3 DWT   (Generic IP component)
        ├── Cortex-M3 FBP   (Generic IP component)
        ├── Cortex-M3 ITM   (Generic IP component)
        └── Cortex-M3 TPIU  (Coresight Component)

Unable to debug RISC-V targets using the current probe. RISC-V specific information cannot be printed.
Unable to debug Xtensa targets using the current probe. Xtensa specific information cannot be printed.

Probing target via SWD

ARM Chip:
Debug Port: Version 1, DP Designer: ARM Ltd
└── 0 MemoryAP
    └── ROM Table (Class 1)
        ├── Cortex-M3 SCS   (Generic IP component)
        │   └── CPUID
        │       ├── IMPLEMENTER: ARM Ltd
        │       ├── VARIANT: 1
        │       ├── PARTNO: 3107
        │       └── REVISION: 1
        ├── Cortex-M3 DWT   (Generic IP component)
        ├── Cortex-M3 FBP   (Generic IP component)
        ├── Cortex-M3 ITM   (Generic IP component)
        └── Cortex-M3 TPIU  (Coresight Component)

Debugging RISC-V targets over SWD is not supported. For these targets, JTAG is the only supported protocol. RISC-V specific information cannot be printed.
Debugging Xtensa targets over SWD is not supported. For these targets, JTAG is the only supported protocol. Xtensa specific information cannot be printed.

Quick start

We'll be using the excellent embassy crate to program the STM32. The embassy crate is a framework for building concurrent firmware for embedded systems. It is based on the async and await syntax, and is designed to be used in a no_std environment.

Create a new hello_stm32_world project with the following files; take care to put them in the correct directories:

cargo new hello_stm32_world
cd hello_stm32_world

In the project root:

Cargo.toml

[package]
name = "hello_stm32_world"
version = "0.1.0"
edition = "2021"

[dependencies]
embassy-stm32 = { version = "0.1.0", features = ["defmt", "stm32f103c8", "unstable-pac", "memory-x", "time-driver-any"] }
embassy-sync = { version = "0.5.0", features = ["defmt"] }
embassy-executor = { version = "0.5.0", features = ["arch-cortex-m", "executor-thread", "defmt", "integrated-timers"] }
embassy-time = { version = "0.3.0", features = ["defmt", "defmt-timestamp-uptime", "tick-hz-32_768"] }
embassy-usb = { version = "0.1.0", features = ["defmt"] }
embassy-futures = { version = "0.1.0" }

defmt = "0.3"
defmt-rtt = "0.4"

cortex-m = { version = "0.7.6", features = ["inline-asm", "critical-section-single-core"] }
cortex-m-rt = "0.7.0"
embedded-hal = "0.2.6"
panic-probe = { version = "0.3", features = ["print-defmt"] }
futures = { version = "0.3.17", default-features = false, features = ["async-await"] }
heapless = { version = "0.8", default-features = false }
nb = "1.0.0"

[profile.dev]
opt-level = "s"

[profile.release]
debug = 2

Next to the Cargo.toml, create a build script:

build.rs

fn main() {
    println!("cargo:rustc-link-arg-bins=--nmagic");
    println!("cargo:rustc-link-arg-bins=-Tlink.x");
    println!("cargo:rustc-link-arg-bins=-Tdefmt.x");
}

Then add a project-specific configuration file:

.cargo/config.toml

[target.'cfg(all(target_arch = "arm", target_os = "none"))']
# replace STM32F103C8 with your chip as listed in `probe-rs chip list`
runner = "probe-rs run --chip STM32F103C8"

[build]
target = "thumbv7m-none-eabi"

[env]
DEFMT_LOG = "trace"

And finally edit the src/main.rs file to look like:

src/main.rs

#![no_std]
#![no_main]

use defmt::info;
use embassy_executor::Spawner;
use embassy_stm32::Config;
use embassy_time::Timer;
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) -> ! {
    let config = Config::default();
    let _p = embassy_stm32::init(config);

    loop {
        info!("Hello STM32 World!");
        Timer::after_secs(1).await;
    }
}

Your project structure should look like this:

.
├── Cargo.toml
├── .cargo
│   └── config.toml
├── build.rs
└── src
    └── main.rs

First run

Now that you have the project set up, you can build and flash it to the STM32 board:

$ cargo run --release

After the standard build steps complete, you should see something like this:

      Erasing ✔ [00:00:00] [##########################################################################################################################################################] 14.00 KiB/14.00 KiB @ 24.64 KiB/s (eta 0s )
  Programming ✔ [00:00:00] [##########################################################################################################################################################] 14.00 KiB/14.00 KiB @ 18.79 KiB/s (eta 0s )    Finished in 1.332s

0.000000 TRACE BDCR configured: 00008200
└─ embassy_stm32::rcc::bd::{impl#2}::init @ /Users/mibes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/embassy-stm32-0.1.0/src/fmt.rs:117 
0.000000 DEBUG rcc: Clocks { sys: Hertz(8000000), pclk1: Hertz(8000000), pclk1_tim: Hertz(8000000), pclk2: Hertz(8000000), pclk2_tim: Hertz(8000000), hclk1: Hertz(8000000), adc: Some(Hertz(1000000)), rtc: Some(Hertz(40000)) }
└─ embassy_stm32::rcc::set_freqs @ /Users/mibes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/embassy-stm32-0.1.0/src/fmt.rs:130 
0.000030 INFO  Hello STM32 World!
└─ blinky_stm32::____embassy_main_task::{async_fn#0} @ src/main.rs:17  
1.000183 INFO  Hello STM32 World!
└─ blinky_stm32::____embassy_main_task::{async_fn#0} @ src/main.rs:17  

If you encounter any build issues, double-check that the config.toml and build.rs files are in the correct locations.

If you see the Hello STM32 World! message, you have successfully flashed the STM32 board with your first Rust program.

This project structure will be the foundation for the next steps in building more complex embedded projects. You may want to make a copy of this project structure to use as a template for future projects.

Running without the debugger.

Once loaded, your application is automatically executed when you power up the STM32 board, without the debugger attached. You can accomplish this in various ways:

  • Disconnect the brown and orange cables from the board (leaving the red and black attached). This will just draw power from the USB port, but won't establish the connection with the debugger.
  • Connect a USB charger to the USB port on the board. Make sure to disconnect the ST-LINK when doing this.
  • Connect a 3.3v battery to the V3 and GND pins (red & black).

Next steps

Next we'll be building our first Blinky project using the breadboard and a few components.

Hello STM32 World - or Blinky

Let's update the hello_stm32_world project from the previous chapter and build a "blinky" project. We'll start off with the blinky example from the embassy crate. This example will toggle the LED on the STM32F103C8T6 board every 300ms.

main.rs

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_stm32::gpio::{Level, Output, Speed};
use embassy_time::Timer;
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let p = embassy_stm32::init(Default::default());
    info!("Starting Blinky...");

    let mut led = Output::new(p.PC13, Level::High, Speed::Low);

    loop {
        info!("high");
        led.set_high();
        Timer::after_millis(300).await;

        info!("low");
        led.set_low();
        Timer::after_millis(300).await;
    }
}

Run the project with:

$ cargo run --release

And you should see the LED on the STM32F103C8T6 board blink every 300ms.

If that works, build a breadboard circuit with an LED and a 150Ω resistor and connect it to pin PB5 on the STM32F103C8T6 board.

Schematic:

Hello World schematic


Update the main.rs file to look like this:

main.rs

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_stm32::gpio::{Level, Output, Speed};
use embassy_time::Timer;
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let p = embassy_stm32::init(Default::default());
    info!("Starting Blinky...");

    let mut led = Output::new(p.PB5, Level::High, Speed::Low);

    loop {
        info!("high");
        led.set_high();
        Timer::after_millis(500).await;

        info!("low");
        led.set_low();
        Timer::after_millis(500).await;
    }
}

Run

$ cargo run --release

If everything is wired up correctly, you should see the LED blink every 500ms.

Summary

In this chapter, we built a "blinky" project using the embassy crate. We used the Output struct to control the LED on the STM32F103C8T6 board. We also used the Timer struct to create delays between turning the LED on and off.

Embedded Rust - Raspberry Pi Pico

In this chapter, we'll be looking at programming a Raspberry Pi Pico board. This board is relatively cheap and is great for learning embedded development. The Raspberry Pi Pico development board is equipped with a dual-core ARM Cortex-M0+ processor. The full specifications can be found here: raspberry-pi-pico.

Getting prepared

Hardware

In order to build your first embedded project you need the following:

  • At least two Raspberry Pi Pico development boards,
  • Two breadboards with some common components; look for a bundle:
    • Jumper wire cables,
    • Some LEDs, resistors and push buttons.
  • One Micro USB cable.

It also helps to have some soldering gear.

If you are not soldering the pins to the Pico board, do make sure the connections to the breadboard are solid and stable. You can easily lose a few hours debugging your source code before figuring out that one of the connections is broken or unstable.

Software

First we'll install probe-rs which is a tool to flash and debug embedded devices.

If you have cargo-binstall installed, you can install probe-rs with:

$ cargo binstall probe-rs

If not, you can build from source with:

$ cargo install probe-rs --locked --features cli

I had to restart my terminal to get the probe-rs command to work, i.e. to be found in the path. It should be in ~/.cargo/bin/probe-rs.

Rust ARM target

We'll also need the build toolchain for the ARM Cortex-M0+ processor. This includes the ARM cross-compile tools, OpenOCD and the Rust ARM target.

Make sure you have the thumbv6m-none-eabi target installed. We'll use the rustup tool to install it:

$ rustup target add thumbv6m-none-eabi

Connect the Raspberry Pi Pico

In order to program the Raspberry Pi Pico board, you need to connect it to your PC using the USB cable. You need two Pico boards for this task. The first board will be the programmer, and the second board will be the target.

Schematic:

The schematic for connecting the two Pico boards is as follows:

Connect RPI Pico

Pinout of the Raspberry Pi Pico

The pinout of the Raspberry Pi Pico is as follows:

Pinout RPI Pico

Wiring

  • White: Connect pin 4 (SPI0 SCK) on the programmer Pico to the SWDCLK pin on the target Pico,
  • Purple: Connect pin 5 (SPI0 TX) on the programmer Pico to the SWDIO pin on the target Pico,
  • Green: Connect pin 6 (UART1 TX) on the programmer Pico to pin 2 (UART0 RX) on the target Pico,
  • Yellow: Connect pin 7 (UART1 RX) on the programmer Pico to pin 1 (UART0 TX) on the target Pico,
  • Red: Connect pin 39 (VSYS) on the programmer Pico to pin 39 (VSYS) on the target Pico,
  • Black: Connect pin 38 (GND) on the programmer Pico to pin 38 (GND) on the target Pico,
  • The ground (GND) pin on the target Pico should be grounded; this is the black wire between the purple and white ones.

Flash the programmer Pico

Once you have everything wired up you need to install the debugprobe firmware on the programmer Pico board. This is done by downloading the debugprobe_on_pico.uf2 file from the Debugging using another Raspberry Pi Pico website.

Hold down the BOOTSEL button when you plug in your programmer Pico. You should now see the Pico appear as a USB drive. Drag and drop the debugprobe_on_pico.uf2 file onto the USB drive. The Pico will reboot and you should now be able to see the debug probe when you run the probe-rs command.

$ probe-rs list

The following debug probes were found:
[0]: Debugprobe on Pico _CMSIS_DAP_ (VID: 2e8a, PID: 000c, Serial: E6612483CB5A612A, CmsisDap)

Quick start

We'll be using the excellent embassy crate to program the Pico. The embassy crate is a framework for building concurrent firmware for embedded systems. It is based on the async and await syntax, and is designed to be used in a no_std environment.

Create a new hello_pico_world project with the following files; take care to put them in the correct directories:

cargo new hello_pico_world
cd hello_pico_world

In the project root:

Cargo.toml

[package]
name = "hello_pico_world"
version = "0.1.0"
edition = "2021"

[dependencies]
embassy-embedded-hal = { version = "0.1.0", git = "https://github.com/embassy-rs/embassy", features = ["defmt"] }
embassy-sync = { version = "0.5.0", git = "https://github.com/embassy-rs/embassy", features = ["defmt"] }
embassy-executor = { version = "0.5.0", git = "https://github.com/embassy-rs/embassy", features = ["task-arena-size-32768", "arch-cortex-m", "executor-thread", "executor-interrupt", "defmt", "integrated-timers"] }
embassy-time = { version = "0.3", git = "https://github.com/embassy-rs/embassy", features = ["defmt", "defmt-timestamp-uptime"] }
embassy-rp = { version = "0.1.0", git = "https://github.com/embassy-rs/embassy", features = ["defmt", "unstable-pac", "time-driver", "critical-section-impl"] }

cortex-m = { version = "0.7.6", features = ["inline-asm"] }
cortex-m-rt = "0.7.0"
panic-probe = { version = "0.3", features = ["print-defmt"] }
defmt = "0.3"
defmt-rtt = "0.4"

Next to the Cargo.toml, create a build script:

build.rs

//! This build script copies the `memory.x` file from the crate root into
//! a directory where the linker can always find it at build time.
//! For many projects this is optional, as the linker always searches the
//! project root directory -- wherever `Cargo.toml` is. However, if you
//! are using a workspace or have a more complicated build setup, this
//! build script becomes required. Additionally, by requesting that
//! Cargo re-run the build script whenever `memory.x` is changed,
//! updating `memory.x` ensures a rebuild of the application with the
//! new memory settings.

use std::env;
use std::fs::File;
use std::io::Write;
use std::path::PathBuf;

fn main() {
    // Put `memory.x` in our output directory and ensure it's
    // on the linker search path.
    let out = &PathBuf::from(env::var_os("OUT_DIR").unwrap());
    File::create(out.join("memory.x"))
        .unwrap()
        .write_all(include_bytes!("memory.x"))
        .unwrap();
    println!("cargo:rustc-link-search={}", out.display());

    // By default, Cargo will re-run a build script whenever
    // any file in the project changes. By specifying `memory.x`
    // here, we ensure the build script is only re-run when
    // `memory.x` is changed.
    println!("cargo:rerun-if-changed=memory.x");

    println!("cargo:rustc-link-arg-bins=--nmagic");
    println!("cargo:rustc-link-arg-bins=-Tlink.x");
    println!("cargo:rustc-link-arg-bins=-Tlink-rp.x");
    println!("cargo:rustc-link-arg-bins=-Tdefmt.x");
}

Add this memory.x file:

memory.x

MEMORY {
BOOT2 : ORIGIN = 0x10000000, LENGTH = 0x100
FLASH : ORIGIN = 0x10000100, LENGTH = 2048K - 0x100

/* Pick one of the two options for RAM layout     */

/* OPTION A: Use all RAM banks as one big block   */
/* Reasonable, unless you are doing something     */
/* really particular with DMA or other concurrent */
/* access that would benefit from striping        */
RAM   : ORIGIN = 0x20000000, LENGTH = 264K

/* OPTION B: Keep the unstriped sections separate */
/* RAM: ORIGIN = 0x20000000, LENGTH = 256K        */
/* SCRATCH_A: ORIGIN = 0x20040000, LENGTH = 4K    */
/* SCRATCH_B: ORIGIN = 0x20041000, LENGTH = 4K    */
}

Then add a project-specific configuration file:

.cargo/config.toml

[target.'cfg(all(target_arch = "arm", target_os = "none"))']
runner = "probe-rs run --chip RP2040"

[build]
target = "thumbv6m-none-eabi"        # Cortex-M0 and Cortex-M0+

[env]
DEFMT_LOG = "debug"

And finally edit the src/main.rs file to look like:

src/main.rs

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_time::Timer;
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let _p = embassy_rp::init(Default::default());

    loop {
        info!("Hello Pico World!");
        Timer::after_secs(1).await;
    }
}

Your project structure should look like this:

.
├── Cargo.toml
├── memory.x
├── .cargo
│   └── config.toml
├── build.rs
└── src
    └── main.rs

First run

Now that you have the project set up, you can build and flash it to the Pico board:

$ cargo run --release

After the standard build steps are complete, you should see something like this:

      Erasing ✔ [00:00:00] [##########################################################################################################################################################] 16.00 KiB/16.00 KiB @ 90.04 KiB/s (eta 0s )
  Programming ✔ [00:00:00] [##########################################################################################################################################################] 16.00 KiB/16.00 KiB @ 41.75 KiB/s (eta 0s )    Finished in 0.587s

0.002258 INFO  Hello Pico World!
└─ <mod path> @ └─ <invalid location: defmt frame-index: 4>:0   
1.002317 INFO  Hello Pico World!
└─ <mod path> @ └─ <invalid location: defmt frame-index: 4>:0   
2.002338 INFO  Hello Pico World!
└─ <mod path> @ └─ <invalid location: defmt frame-index: 4>:0   

If you encounter any build issues, double-check that the config.toml, memory.x and build.rs files are in the correct locations.

If you see the Hello Pico World! message, you have successfully flashed the Pico board with your first Rust program.

This project structure will be the foundation for the next steps in building more complex embedded projects. You may want to make a copy of this project structure to use as a template for future projects.

Running without the debugger.

Once loaded, your application is automatically executed when you power up the Pico board, without the debugger attached. You can accomplish this in various ways:

  • Connect a USB charger to the USB port on the board. Make sure to disconnect the programmer Pico board when doing this.
  • Connect a 3.3v battery to the VSYS and GND pins (red & black).

Next steps

Next we'll be building our first Blinky project using the breadboard and a few components.

Hello RPI Pico World - or Blinky

Let's update the hello_pico_world project from the previous chapter and build a "blinky" project. We'll start off with the blinky example from the embassy crate. This example will toggle the LED on the Pico board every 1s.

main.rs

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_rp::gpio;
use embassy_time::Timer;
use gpio::{Level, Output};
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let p = embassy_rp::init(Default::default());
    let mut led = Output::new(p.PIN_25, Level::Low);

    loop {
        info!("led on!");
        led.set_high();
        Timer::after_secs(1).await;

        info!("led off!");
        led.set_low();
        Timer::after_secs(1).await;
    }
}

Run the project with:

$ cargo run --release

And you should see the LED on the target board blink every 1s.

The code is pretty simple. We initialize the Pico board with embassy_rp::init(Default::default()) and create a new Output pin with the LED pin p.PIN_25. We then loop forever and toggle the LED on and off every 1s. GP25 is the onboard LED on the Pico board, as you can see in the Pico pinout.

If the above code works as expected, build a breadboard circuit with an LED and a 150Ω resistor and connect it to pin GP16 on the Pico board. Don't remove any of the programming wires!

Schematic:

Hello World schematic


Update the main.rs file to look like this:

main.rs

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_rp::gpio;
use embassy_time::Timer;
use gpio::{Level, Output};
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let p = embassy_rp::init(Default::default());
    let mut led = Output::new(p.PIN_16, Level::Low);

    loop {
        info!("led on!");
        led.set_high();
        Timer::after_secs(1).await;

        info!("led off!");
        led.set_low();
        Timer::after_secs(1).await;
    }
}

Run

$ cargo run --release

If everything is wired up correctly, you should see the LED blink every second. The example hardly changed: we only changed the pin from GP25 to GP16, which is the pin we connected the LED to.

The embassy crate is a great way to write (async) code for embedded devices. The abstractions it provides make coding for embedded devices feel much more like writing "regular" Rust code.

RPI Pico Analog Input

We'll continue to update our project and see if we can get some analog readings from a potentiometer. First, we'll update the breadboard to look like this:

Schematic:

Hello Input schematic


If you have wired up the breadboard correctly, you should be able to control the brightness of the LED by turning the potentiometer.

Notice that we replaced the resistor with a 68Ω one. This is to compensate for the resistance of the potentiometer.

Reading analog input

Ok, now that we have confirmed that the potentiometer is working, let's update the wiring one more time to connect the potentiometer to the Pico board. Connect the middle pin of the potentiometer to pin GP26 on the Pico board, and the other two pins to the 3V3 and GND pins.

Schematic:

Hello Input schematic


Notice that we switched the connections of the potentiometer.

Update main.rs to look like this:

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_rp::adc::{Adc, Async, Config, InterruptHandler};
use embassy_rp::gpio;
use embassy_rp::gpio::Pull;
use embassy_rp::{adc, bind_interrupts};
use embassy_sync::blocking_mutex::raw::ThreadModeRawMutex;
use embassy_sync::channel::{Channel, Sender};
use embassy_time::Timer;
use gpio::{Level, Output};
use {defmt_rtt as _, panic_probe as _};

bind_interrupts!(struct Irqs {
    ADC_IRQ_FIFO => InterruptHandler;
});

static CHANNEL: Channel<ThreadModeRawMutex, u16, 64> = Channel::new();

#[embassy_executor::main]
async fn main(spawner: Spawner) {
    let p = embassy_rp::init(Default::default());
    let mut led = Output::new(p.PIN_16, Level::Low);

    info!("Setting up ADC");
    let adc = Adc::new(p.ADC, Irqs, Config::default());
    let p26 = adc::Channel::new_pin(p.PIN_26, Pull::None);

    // spawn the task that reads the ADC value
    spawner
        .spawn(read_adc_value(adc, p26, CHANNEL.sender()))
        .unwrap();

    let rx_adc_value = CHANNEL.receiver();

    loop {
        let value = rx_adc_value.receive().await;
        // we should get a new value every 1s
        // the value we are getting will be somewhere between 0 and 4095

        info!("ADC value: {}", value);

        if value > 2048 {
            led.set_high();
        } else {
            led.set_low();
        }
    }
}

#[embassy_executor::task(pool_size = 2)]
async fn read_adc_value(
    mut adc: Adc<'static, Async>,
    mut p26: adc::Channel<'static>,
    tx_value: Sender<'static, ThreadModeRawMutex, u16, 64>,
) {
    let mut measurements = [0u16; 10];
    let mut pos = 0;

    loop {
        measurements[pos] = adc.read(&mut p26).await.unwrap();
        pos = (pos + 1) % 10;

        if pos == 0 {
            // compute average of measurements
            let average = measurements.iter().sum::<u16>() / 10;

            // send average to the main task
            tx_value.send(average).await;
        }

        Timer::after_millis(100).await;
    }
}

The code above is a bit more complex than the previous examples. We are now using the ADC peripheral to read the value from the potentiometer. The read_adc_value task reads the value from the ADC every 100ms and computes the average of 10 measurements. The average is then sent to the main task that will control the LED.
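The averaging logic does not depend on the hardware, so it can be pulled out and tried on your development machine. Below is a minimal sketch of that idea. The Averager type is a hypothetical helper that is not part of the project; it sums in u32, which leaves more headroom than the u16 sum in the task if you ever increase the number of samples.

```rust
/// Collects samples and yields their average after every `N` samples.
/// `N` must be greater than zero.
struct Averager<const N: usize> {
    samples: [u16; N],
    pos: usize,
}

impl<const N: usize> Averager<N> {
    fn new() -> Self {
        Self { samples: [0; N], pos: 0 }
    }

    /// Stores one sample and returns `Some(average)` after every N-th sample.
    fn push(&mut self, sample: u16) -> Option<u16> {
        self.samples[self.pos] = sample;
        self.pos = (self.pos + 1) % N;

        if self.pos == 0 {
            // Sum in u32 so that many large u16 samples cannot overflow.
            let sum: u32 = self.samples.iter().map(|&s| u32::from(s)).sum();
            Some((sum / N as u32) as u16)
        } else {
            None
        }
    }
}

fn main() {
    let mut averager = Averager::<10>::new();

    // Made-up readings, standing in for values returned by the ADC.
    for sample in [4000u16, 4010, 3990, 4005, 3995, 4000, 4002, 3998, 4001, 3999] {
        if let Some(average) = averager.push(sample) {
            println!("average: {}", average);
        }
    }
}
```

In the task you would call push with every ADC reading and only send a value to the channel when it returns Some.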

We'll use channels, similar to tokio channels, to communicate between the tasks. The Channel from embassy-sync is a bounded queue: a Sender puts messages in, a Receiver takes them out, and sending waits when the queue is full. In this example the queue holds up to 64 u16 values.
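If you have not worked with bounded channels before, the sketch below shows the same producer and consumer pattern on a desktop machine. Note the assumptions: it uses std::sync::mpsc::sync_channel and an OS thread as stand-ins for the embassy Channel and an async task, so it is only an analogy and not the embassy API. The relay function is a hypothetical helper.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Sends `values` through a bounded channel from a producer thread and
/// returns everything the consumer received, in order.
fn relay(values: Vec<u16>) -> Vec<u16> {
    // The capacity of 64 mirrors the `Channel<_, u16, 64>` in the firmware.
    let (tx, rx) = sync_channel::<u16>(64);

    let producer = thread::spawn(move || {
        for value in values {
            // send() blocks while the channel is full; in the firmware,
            // `send(..).await` suspends the task instead of blocking.
            tx.send(value).expect("receiver is still alive");
        }
        // `tx` is dropped here, which ends the iteration on the receiver.
    });

    let received: Vec<u16> = rx.iter().collect();
    producer.join().expect("producer thread panicked");
    received
}

fn main() {
    println!("{:?}", relay(vec![1024, 2048, 4095]));
}
```

One difference worth knowing: the static embassy Channel is never closed, which is why the firmware tasks simply loop forever.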

Whenever there is a new value from the ADC, the main task checks whether it is greater than 2048. If it is, the LED is turned on (set_high); otherwise it is turned off (set_low). In practice, this means that if the potentiometer is turned to the right, the LED will be turned on, and if it is turned to the left, the LED will be turned off.
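To get a feel for the numbers, here is a small sketch that can run on your development machine. It mirrors the value > 2048 check from the main loop and converts a raw 12-bit reading to millivolts. The 3.3V reference voltage is an assumption, and both function names are hypothetical helpers that are not part of the project.

```rust
/// Full-scale reading of a 12-bit ADC.
const ADC_MAX: u32 = 4095;
/// Assumed ADC reference voltage in millivolts; check this for your board.
const VREF_MV: u32 = 3300;

/// Mirrors the `value > 2048` check in the main loop:
/// `true` corresponds to `led.set_high()`, `false` to `led.set_low()`.
fn led_should_be_on(value: u16) -> bool {
    value > 2048
}

/// Converts a raw reading to millivolts, clamping out-of-range input.
fn adc_to_millivolts(value: u16) -> u32 {
    u32::from(value).min(ADC_MAX) * VREF_MV / ADC_MAX
}

fn main() {
    for value in [0u16, 2048, 2049, 4095] {
        println!(
            "raw {:>4} -> {:>4} mV, LED on: {}",
            value,
            adc_to_millivolts(value),
            led_should_be_on(value)
        );
    }
}
```

With these assumptions, the threshold of 2048 sits at roughly half of the reference voltage, which is about the middle position of the potentiometer.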

Run the project with:

$ cargo run --release

RPI Pico with the TM1637 display

In this chapter, we'll connect a TM1637 display to the Raspberry Pi Pico board. The TM1637 is a 4-digit 7-segment display with a built-in driver. It's a very popular display for small projects, and it's easy to use.

We'll add a new module to the project to handle the display, and we'll update the main program to read the value of the potentiometer and display it on the TM1637 display.

This makes it (hopefully) easy to reuse the display in other projects.

Getting prepared

In order to build this project, you need the following:

Wiring

Connect the TM1637 display to the Pico board like this:

TM1637 schematic


Update main.rs to look like this:

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_rp::{
    adc::{self, Adc, Async, Config, InterruptHandler},
    bind_interrupts,
    gpio::{Level, OutputOpenDrain, Pull},
};
use embassy_sync::blocking_mutex::raw::ThreadModeRawMutex;
use embassy_sync::channel::{Channel, Receiver, Sender};
use embassy_time::Timer;
use {defmt_rtt as _, panic_probe as _};

use crate::tm1637::{DIGITS, TM1637};

mod tm1637;

bind_interrupts!(struct Irqs {
    ADC_IRQ_FIFO => InterruptHandler;
});

static CHANNEL: Channel<ThreadModeRawMutex, u16, 64> = Channel::new();

#[embassy_executor::main]
async fn main(spawner: Spawner) {
    let p = embassy_rp::init(Default::default());

    info!("Setting up TM1637");
    let clock_pin = OutputOpenDrain::new(p.PIN_14, Level::Low);
    let dio_pin = OutputOpenDrain::new(p.PIN_15, Level::Low);
    spawner
        .spawn(display_numbers(
            TM1637::new(clock_pin, dio_pin),
            CHANNEL.receiver(),
        ))
        .unwrap();

    info!("Setting up ADC");
    let adc = Adc::new(p.ADC, Irqs, Config::default());
    let p26 = adc::Channel::new_pin(p.PIN_26, Pull::None);

    // spawn the task that reads the ADC value
    spawner
        .spawn(read_adc_value(adc, p26, CHANNEL.sender()))
        .unwrap();

    loop {
        Timer::after_millis(500).await;
    }
}

#[embassy_executor::task(pool_size = 2)]
async fn read_adc_value(
    mut adc: Adc<'static, Async>,
    mut p26: adc::Channel<'static>,
    tx_value: Sender<'static, ThreadModeRawMutex, u16, 64>,
) {
    let mut measurements = [0u16; 10];
    let mut pos = 0;

    loop {
        measurements[pos] = adc.read(&mut p26).await.unwrap();
        pos = (pos + 1) % 10;

        if pos == 0 {
            // compute average of measurements
            let average = measurements.iter().sum::<u16>() / 10;

            // send the average to the display task
            tx_value.send(average).await;
        }

        Timer::after_millis(100).await;
    }
}

#[embassy_executor::task(pool_size = 2)]
async fn display_numbers(
    mut tm: TM1637<'static, 'static>,
    rx_measurement: Receiver<'static, ThreadModeRawMutex, u16, 64>,
) {
    let mut on = true;

    loop {
        let mut measurement = rx_measurement.receive().await;

        // clamp the measurement to 4 digits
        measurement = measurement.min(9999);

        // decide the brightness level based on the measurement
        // and whether the display should be on or off; for the maximum values we'll blink the display
        // by toggling the on/off state
        let brightness;
        (brightness, on) = match measurement {
            0..=2000 => (0, true),
            2001..=4000 => (2, true),
            _ => (5, !on),
        };

        info!(
            "measured: {} and brightness: {} {}",
            measurement, brightness, on
        );

        // split the last 4 digits of the measurement into individual digits
        let thousands = measurement / 1000;
        measurement -= thousands * 1000;
        let hundreds = measurement / 100;
        measurement -= hundreds * 100;
        let tens = measurement / 10;
        measurement -= tens * 10;
        let ones = measurement;

        let digits: [u8; 4] = [
            DIGITS[thousands as usize],
            DIGITS[hundreds as usize],
            DIGITS[tens as usize],
            DIGITS[ones as usize],
        ];

        tm.display(digits, brightness, on).await;
        Timer::after_millis(500).await;
    }
}

And add the tm1637.rs file:

use embassy_rp::gpio::OutputOpenDrain;
use embassy_time::Timer;

// delay between signal edges, in microseconds
const DELAY_USECS: u64 = 100;
// data command: write to the display registers with automatic address increment
const ADDRESS_AUTO_INCREMENT_1_MODE: u8 = 0x40;
// address command: 0xc0 selects the first digit
const ADDRESS_COMMAND_BITS: u8 = 0xc0;
// display control command: brightness in bits 0-2, display on/off in bit 3
const ADDRESS_COMM_3: u8 = 0x80;

const DISPLAY_CONTROL_BRIGHTNESS_MASK: u8 = 0x07;

// SEG_A 0b00000001
// SEG_B 0b00000010
// SEG_C 0b00000100
// SEG_D 0b00001000
// SEG_E 0b00010000
// SEG_F 0b00100000
// SEG_G 0b01000000
// SEG_DP 0b10000000

//
//      A
//     ---
//  F |   | B
//     -G-
//  E |   | C
//     ---
//      D
pub(crate) const DIGITS: [u8; 16] = [
    // XGFEDCBA
    0b00111111, // 0
    0b00000110, // 1
    0b01011011, // 2
    0b01001111, // 3
    0b01100110, // 4
    0b01101101, // 5
    0b01111101, // 6
    0b00000111, // 7
    0b01111111, // 8
    0b01101111, // 9
    0b01110111, // A
    0b01111100, // b
    0b00111001, // C
    0b01011110, // d
    0b01111001, // E
    0b01110001, // F
];

pub(crate) struct TM1637<'clk, 'dio> {
    clk: OutputOpenDrain<'clk>,
    dio: OutputOpenDrain<'dio>,
}

impl<'clk, 'dio> TM1637<'clk, 'dio> {
    pub fn new(clk: OutputOpenDrain<'clk>, dio: OutputOpenDrain<'dio>) -> Self {
        Self { clk, dio }
    }

    async fn delay(&mut self) {
        Timer::after_micros(DELAY_USECS).await;
    }

    // combine the brightness level (0-7) with the on/off bit
    fn brightness(&self, level: u8, on: bool) -> u8 {
        (level & DISPLAY_CONTROL_BRIGHTNESS_MASK) | (if on { 0x08 } else { 0x00 })
    }

    pub async fn set_brightness(&mut self, level: u8, on: bool) {
        self.start().await;
        let brightness = self.brightness(level, on);
        self.write_cmd(ADDRESS_COMM_3 + (brightness & 0x0f)).await;
        self.stop().await;
    }

    // put one bit on the data line while the clock is low, then raise the clock
    async fn send_bit_and_delay(&mut self, bit: bool) {
        self.clk.set_low();
        self.delay().await;
        if bit {
            self.dio.set_high();
        } else {
            self.dio.set_low();
        }
        self.delay().await;
        self.clk.set_high();
        self.delay().await;
    }

    pub async fn write_byte(&mut self, data: u8) {
        // bits are sent least significant bit first
        for i in 0..8 {
            self.send_bit_and_delay((data >> i) & 0x01 != 0).await;
        }
        // release the data line and clock a ninth pulse,
        // then wait for the display to acknowledge by pulling the line low
        self.clk.set_low();
        self.delay().await;
        self.dio.set_high();
        self.delay().await;
        self.clk.set_high();
        self.delay().await;
        self.dio.wait_for_low().await;
    }

    // start condition: the data line goes low while the clock is high
    pub async fn start(&mut self) {
        self.clk.set_high();
        self.dio.set_high();
        self.delay().await;
        self.dio.set_low();
        self.delay().await;
        self.clk.set_low();
        self.delay().await;
    }

    // stop condition: the data line goes high while the clock is high
    pub async fn stop(&mut self) {
        self.clk.set_low();
        self.delay().await;
        self.dio.set_low();
        self.delay().await;
        self.clk.set_high();
        self.delay().await;
        self.dio.set_high();
        self.delay().await;
    }

    pub async fn write_cmd(&mut self, cmd: u8) {
        self.start().await;
        self.write_byte(cmd).await;
        self.stop().await;
    }

    pub async fn write_data(&mut self, addr: u8, data: u8) {
        self.start().await;
        self.write_byte(addr).await;
        self.write_byte(data).await;
        self.stop().await;
    }

    pub async fn display(&mut self, data: [u8; 4], brightness: u8, on: bool) {
        self.write_cmd(ADDRESS_AUTO_INCREMENT_1_MODE).await;
        self.start().await;
        // write each digit to its own address, starting at the first digit
        let mut address = ADDRESS_COMMAND_BITS;
        for data_item in data {
            self.write_data(address, data_item).await;
            address += 1;
        }
        self.set_brightness(brightness, on).await;
    }
}

In this example, we are using the TM1637 display to show the measurement of the potentiometer. The display will show the last 4 digits of the measurement, and the brightness of the display will be adjusted based on the measurement. At the maximum value, the display will blink.
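If you want to check the digit handling without flashing the board, the splitting logic can be tried on a desktop machine. The sketch below is not part of the project: split_digits and to_segments are hypothetical helper names, and the table is just the first ten entries of DIGITS. It uses division and remainder instead of the repeated subtraction in display_numbers, which should give the same digits.

```rust
/// Segment patterns for 0-9, copied from the DIGITS table (bit 0 = segment A).
const DIGITS: [u8; 10] = [
    0b00111111, 0b00000110, 0b01011011, 0b01001111, 0b01100110,
    0b01101101, 0b01111101, 0b00000111, 0b01111111, 0b01101111,
];

/// Split a value into thousands, hundreds, tens and ones.
/// Values above 9999 are clamped, as in display_numbers.
fn split_digits(value: u16) -> [u8; 4] {
    let v = value.min(9999);
    [
        (v / 1000) as u8,
        (v / 100 % 10) as u8,
        (v / 10 % 10) as u8,
        (v % 10) as u8,
    ]
}

/// Map each decimal digit to its 7-segment pattern.
fn to_segments(value: u16) -> [u8; 4] {
    split_digits(value).map(|d| DIGITS[d as usize])
}

fn main() {
    // 4095 is the largest value a 12-bit ADC reading can take
    println!("{:?}", split_digits(4095));
    println!("{:08b}", to_segments(4095)[0]);
}
```

Because the helpers are plain functions without any hardware access, they are easy to cover with ordinary unit tests.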

We'll launch two parallel tasks: one to read the ADC value and another to display the value on the TM1637 display. The two tasks communicate through a channel.
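The same producer/consumer idea can be tried on a desktop machine. The sketch below is only an analogy, under the assumption that standard library threads and a bounded std::sync::mpsc channel stand in for the Embassy tasks and the Embassy Channel; average and run_pipeline are hypothetical names. One side averages blocks of samples and sends the result, the other side receives it.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Average a block of readings, as read_adc_value does for every 10 samples.
/// The sum is widened to u32 so it cannot overflow.
fn average(samples: &[u16]) -> u16 {
    (samples.iter().map(|&s| s as u32).sum::<u32>() / samples.len() as u32) as u16
}

/// One thread averages blocks of samples and sends them through a bounded
/// channel; the caller collects whatever arrives. `block` must be non-zero.
fn run_pipeline(samples: Vec<u16>, block: usize) -> Vec<u16> {
    let (tx, rx) = sync_channel::<u16>(64); // same capacity as CHANNEL
    let producer = thread::spawn(move || {
        for chunk in samples.chunks_exact(block) {
            tx.send(average(chunk)).expect("receiver hung up");
        }
        // tx is dropped here, which ends the receiver's iteration
    });
    let received: Vec<u16> = rx.iter().collect();
    producer.join().expect("producer panicked");
    received
}

fn main() {
    // 20 made-up samples: 1000, 1010, ..., 1190
    let samples: Vec<u16> = (0..20).map(|i| 1000 + i * 10).collect();
    println!("{:?}", run_pipeline(samples, 10));
}
```

On the Pico there are no operating system threads; the Embassy executor switches between the tasks at each .await instead, but the channel plays the same role.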

RPI Pico with the DHT20 temperature and humidity sensor

Now we'll connect a DHT20 temperature and humidity sensor to the Raspberry Pi Pico board. The DHT20 is a digital temperature and humidity sensor with an I2C interface. I2C is a two-wire interface that is used to connect low-speed devices in embedded systems. It is quite a common interface for sensors and other peripherals.

This specific example covers the DHT20 sensor, but the same principles apply to other I2C sensors. So with the datasheet of your sensor, you should be able to adapt this example to work with your sensor.🤞

Getting prepared

To build this project, you need the following:

  • DHT20 temperature & humidity sensor; I got mine from Az-Delivery.

Wiring

Connect the DHT20 sensor to the Pico board like this:

DHT20 schematic

For the sake of simplicity, I've removed the programmer from the schematic. You still need it to flash the Pico board, though.

Your exact wiring might differ, depending on the sensor you have. The DHT20 sensor has a 4-pin connector. The pins are VCC, GND, SDA, and SCL. Connect the VCC pin to 3.3V, the GND pin to GND, the SDA pin to GP pin 2, and the SCL pin to GP pin 3.


Make main.rs look like this:

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_rp::{
    bind_interrupts,
    i2c::{self, Config, InterruptHandler},
    peripherals::I2C1,
};
use embassy_time::Timer;
use {defmt_rtt as _, panic_probe as _};

use crate::dht20::{initialize, read_temperature_and_humidity};

bind_interrupts!(struct Irqs {
    I2C1_IRQ => InterruptHandler<I2C1>;
});

/// DHT20 sensor: datasheet: https://cdn.sparkfun.com/assets/8/a/1/5/0/DHT20.pdf
mod dht20 {
    use defmt::debug;
    use embassy_rp::{
        i2c::{Async, I2c},
        peripherals::I2C1,
    };
    use embassy_time::Timer;
    use embedded_hal_async::i2c::I2c as I2cAsync;

    const DHT20_I2C_ADDR: u8 = 0x38;
    const DHT20_GET_STATUS: u8 = 0x71;
    const DHT20_READ_DATA: [u8; 3] = [0xAC, 0x33, 0x00];

    const DIVISOR: f32 = 2u32.pow(20) as f32;
    const TEMP_DIVISOR: f32 = DIVISOR / 200.0;

    pub async fn initialize(i2c: &mut I2c<'static, I2C1, Async>) -> bool {
        Timer::after_millis(100).await;
        let mut data = [0x0; 1];
        i2c.write_read(DHT20_I2C_ADDR, &[DHT20_GET_STATUS], &mut data)
            .await
            .expect("Can not read status");

        data[0] & 0x18 == 0x18
    }

    async fn read_data(i2c: &mut I2c<'static, I2C1, Async>) -> [u8; 6] {
        let mut data = [0x0; 6];

        for _ in 0..10 {
            i2c.write(DHT20_I2C_ADDR, &DHT20_READ_DATA)
                .await
                .expect("Can not write data");
            Timer::after_millis(80).await;

            i2c.read(DHT20_I2C_ADDR, &mut data)
                .await
                .expect("Can not read data");

            if data[0] >> 7 == 0 {
                break;
            }
        }

        data
    }

    pub async fn read_temperature_and_humidity(i2c: &mut I2c<'static, I2C1, Async>) -> (f32, f32) {
        let data = read_data(i2c).await;
        debug!("data = {:?}", data);

        let raw_hum_data =
            ((data[1] as u32) << 12) + ((data[2] as u32) << 4) + (((data[3] & 0xf0) >> 4) as u32);
        debug!("raw_humidity_data = {:x}", raw_hum_data);
        let humidity = (raw_hum_data as f32) / DIVISOR * 100.0;

        let raw_temp_data =
            (((data[3] as u32) & 0xf) << 16) + ((data[4] as u32) << 8) + (data[5] as u32);
        debug!("raw_temperature_data = {:x}", raw_temp_data);
        let temperature = (raw_temp_data as f32) / TEMP_DIVISOR - 50.0;

        (temperature, humidity)
    }
}

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let p = embassy_rp::init(Default::default());
    let sda = p.PIN_2;
    let scl = p.PIN_3;

    info!("set up i2c ");
    let mut i2c = i2c::I2c::new_async(p.I2C1, scl, sda, Irqs, Config::default());
    let ready = initialize(&mut i2c).await;
    info!("Ready: {}", ready);

    loop {
        let (temperature, humidity) = read_temperature_and_humidity(&mut i2c).await;
        info!("temperature = {}C", temperature);
        info!("humidity = {}%", humidity);

        Timer::after_millis(500).await;
    }
}

You need these dependencies in your Cargo.toml:

[dependencies]
embassy-embedded-hal = { version = "0.1.0", git = "https://github.com/embassy-rs/embassy", features = ["defmt"] }
embassy-sync = { version = "0.5.0", git = "https://github.com/embassy-rs/embassy", features = ["defmt"] }
embassy-executor = { version = "0.5.0", git = "https://github.com/embassy-rs/embassy", features = ["task-arena-size-32768", "arch-cortex-m", "executor-thread", "executor-interrupt", "defmt", "integrated-timers"] }
embassy-time = { version = "0.3", git = "https://github.com/embassy-rs/embassy", features = ["defmt", "defmt-timestamp-uptime"] }
embassy-rp = { version = "0.1.0", git = "https://github.com/embassy-rs/embassy", features = ["defmt", "unstable-pac", "time-driver", "critical-section-impl"] }

cortex-m = { version = "0.7.6", features = ["inline-asm"] }
cortex-m-rt = "0.7.0"
panic-probe = { version = "0.3", features = ["print-defmt"] }
defmt = "0.3"
defmt-rtt = "0.4"
embedded-hal-async = "1.0"

In this example, we read the temperature and humidity from the DHT20 sensor. The sensor is connected to the Pico board via I2C. The sensor is read every 500ms.

Run the project with:

$ cargo run --release

You should see the temperature and humidity values printed to the console every 500ms.

0.002239 INFO  set up i2c 
0.102840 INFO  Ready: true
0.184031 DEBUG data = [28, 155, 38, 117, 178, 148]
0.184077 DEBUG raw_humidity_data = 9b267
0.184141 DEBUG raw_temperature_data = 5b294
0.184230 INFO  temperature = 21.219635C
0.184256 INFO  humidity = 60.605526%
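To see how the six raw bytes turn into the printed values, the decoding can be replayed on a desktop machine with the bytes from the log above. This is a minimal sketch that mirrors read_temperature_and_humidity; decode is a hypothetical name, and it only covers the arithmetic, not the I2C traffic.

```rust
/// Decode the six bytes returned by the DHT20 into (temperature in °C, humidity in %).
fn decode(data: [u8; 6]) -> (f32, f32) {
    const FULL_SCALE: f32 = (1u32 << 20) as f32; // both raw values are 20 bits wide

    // humidity: data[1], data[2] and the upper nibble of data[3]
    let raw_hum = ((data[1] as u32) << 12) | ((data[2] as u32) << 4) | ((data[3] as u32) >> 4);
    // temperature: lower nibble of data[3], data[4] and data[5]
    let raw_temp = (((data[3] as u32) & 0x0f) << 16) | ((data[4] as u32) << 8) | (data[5] as u32);

    let humidity = raw_hum as f32 / FULL_SCALE * 100.0;
    let temperature = raw_temp as f32 / FULL_SCALE * 200.0 - 50.0;
    (temperature, humidity)
}

fn main() {
    // the bytes from the DEBUG line in the log
    let (t, h) = decode([28, 155, 38, 117, 178, 148]);
    println!("temperature = {t:.2}C, humidity = {h:.2}%");
}
```

Byte 3 is shared: its upper four bits belong to the humidity value (0x9b267) and its lower four bits to the temperature value (0x5b294), which matches the raw values in the log.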

Reference material

Embedded Rust - STM32F401

In this chapter, we'll be looking at programming an STM32F401 (STM32F401CCU6), or Black Pill board. This board is relatively cheap, comes with a USB-C connector and offers great development possibilities. The STM32 development board is equipped with an ARM Cortex-M4 processor. The full specifications can be found here: https://stm.com.

If you are looking for details on the STM32F103, or Blue Pill board, check this chapter: Embedded Rust - STM32F103.

Getting prepared

Hardware

In order to build your first embedded project, you need the following:

  • One or more STM32F401CCU6 development boards; these are also known as Black Pill boards,
  • ST-Link V2 USB Debugger,
  • A breadboard with some common components; look for a bundle:
    • Jumper wire cables,
    • Some LEDs, resistors and push buttons.

It also helps to have some soldering gear.

If you are not soldering the pins to the STM32 board, do make sure the connections to the breadboard are solid and stable. You can easily lose a few hours debugging your source code before figuring out that one of the connections is broken or unstable.

Software

First we'll install probe-rs which is a tool to flash and debug embedded devices.

If you have cargo-binstall installed, you can install probe-rs with:

$ cargo binstall probe-rs

If not, you can build from source with:

$ cargo install probe-rs --locked --features cli

I had to restart my terminal to get the probe-rs command to work, i.e. to be found in the path. It should be in ~/.cargo/bin/probe-rs.

Rust ARM target

We'll also need the build toolchain for the ARM Cortex-M4 processor. This includes the ARM cross-compile tools, OpenOCD and the Rust ARM target.

Make sure you have the thumbv7em-none-eabi target installed. We'll use the rustup tool to install it:

$ rustup target add thumbv7em-none-eabi

Connect the STM32

In order to program the STM32 board, you need to connect it to your PC using the ST-Link USB adapter. You should wire it like this:

STM32            ST-Link v2
V3 = Red         3.3v (Pin 8)
IO = Orange      SWDIO (Pin 4)
CLK = Brown      SWDCLK (Pin 2)
GND = Black      GND (Pin 6)

Schematic:

Connect STM32

Pinout:

Pinout STM32F401CCU6


Once connected, you can use these commands to establish the link and check that everything is working so far:

$ probe-rs list

The following debug probes were found:
[0]: STLink V2 (VID: 0483, PID: 3748, Serial: 49C3BF6D067578485335241067, StLink)
$ probe-rs info

Probing target via JTAG

ARM Chip:
Debug Port: Version 1, DP Designer: ARM Ltd
└── 0 MemoryAP
    └── ROM Table (Class 1)
        ├── Cortex-M4 SCS   (Generic IP component)
        │   └── CPUID
        │       ├── IMPLEMENTER: ARM Ltd
        │       ├── VARIANT: 0
        │       ├── PARTNO: 3108
        │       └── REVISION: 1
        ├── Cortex-M3 DWT   (Generic IP component)
        ├── Cortex-M3 FBP   (Generic IP component)
        ├── Cortex-M3 ITM   (Generic IP component)
        ├── Cortex-M4 TPIU  (Coresight Component)
        └── Cortex-M4 ETM   (Coresight Component)

Unable to debug RISC-V targets using the current probe. RISC-V specific information cannot be printed.
Unable to debug Xtensa targets using the current probe. Xtensa specific information cannot be printed.

Probing target via SWD

ARM Chip:
Debug Port: Version 1, DP Designer: ARM Ltd
└── 0 MemoryAP
    └── ROM Table (Class 1)
        ├── Cortex-M4 SCS   (Generic IP component)
        │   └── CPUID
        │       ├── IMPLEMENTER: ARM Ltd
        │       ├── VARIANT: 0
        │       ├── PARTNO: 3108
        │       └── REVISION: 1
        ├── Cortex-M3 DWT   (Generic IP component)
        ├── Cortex-M3 FBP   (Generic IP component)
        ├── Cortex-M3 ITM   (Generic IP component)
        ├── Cortex-M4 TPIU  (Coresight Component)
        └── Cortex-M4 ETM   (Coresight Component)

Debugging RISC-V targets over SWD is not supported. For these targets, JTAG is the only supported protocol. RISC-V specific information cannot be printed.
Debugging Xtensa targets over SWD is not supported. For these targets, JTAG is the only supported protocol. Xtensa specific information cannot be printed.

Quick start

We'll be using the excellent embassy crate to program the STM32. The embassy crate is a framework for building concurrent firmware for embedded systems. It is based on the async and await syntax, and is designed to be used with the no_std environment.

Create a new hello_stm32_world project with the following files; take care to put them in the correct directories:

cargo new hello_stm32_world
cd hello_stm32_world

In the project root:

Cargo.toml

[package]
edition = "2021"
name = "hello_stm32_world"
version = "0.1.0"
license = "MIT OR Apache-2.0"

[dependencies]
embassy-stm32 = { version = "0.1.0", features = ["defmt", "stm32f401cc", "unstable-pac", "memory-x", "time-driver-any", "exti", "chrono"] }
embassy-sync = { version = "0.5.0", features = ["defmt"] }
embassy-executor = { version = "0.5.0", features = ["task-arena-size-32768", "arch-cortex-m", "executor-thread", "executor-interrupt", "defmt", "integrated-timers"] }
embassy-time = { version = "0.3.0", features = ["defmt", "defmt-timestamp-uptime", "tick-hz-32_768"] }
embassy-usb = { version = "0.1.0", features = ["defmt"] }
embassy-net = { version = "0.4.0", features = ["defmt", "tcp", "dhcpv4", "medium-ethernet", ] }

defmt = "0.3"
defmt-rtt = "0.4"

cortex-m = { version = "0.7.6", features = ["inline-asm", "critical-section-single-core"] }
cortex-m-rt = "0.7.0"
embedded-hal = "0.2.6"
embedded-io = { version = "0.6.0" }
embedded-io-async = { version = "0.6.1" }
panic-probe = { version = "0.3", features = ["print-defmt"] }

[profile.release]
debug = 2

Next to the Cargo.toml, create a build script:

build.rs

fn main() {
    println!("cargo:rustc-link-arg-bins=--nmagic");
    println!("cargo:rustc-link-arg-bins=-Tlink.x");
    println!("cargo:rustc-link-arg-bins=-Tdefmt.x");
}

Then add a project-specific configuration file:

.cargo/config.toml

[target.'cfg(all(target_arch = "arm", target_os = "none"))']
# replace STM32F401CCUx with your chip as listed in `probe-rs chip list`
runner = "probe-rs run --chip STM32F401CCUx"

[build]
target = "thumbv7em-none-eabi"

[env]
DEFMT_LOG = "trace"

And finally edit the src/main.rs file to look like:

src/main.rs

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_stm32::gpio::{Level, Output, Speed};
use embassy_time::Timer;
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let p = embassy_stm32::init(Default::default());
    info!("Hello STM32 World!");

    let mut led = Output::new(p.PC13, Level::High, Speed::Low);

    loop {
        info!("high");
        led.set_high();
        Timer::after_millis(300).await;

        info!("low");
        led.set_low();
        Timer::after_millis(300).await;
    }
}

Your project structure should look like this:

.
├── Cargo.toml
├── .cargo
│   └── config.toml
├── build.rs
└── src
    └── main.rs

First run

Now that you have the project set up, you can build and flash it to the STM32 board:

$ cargo run --release

After the standard build steps complete, you should see something like this:

      Erasing ✔ [00:00:00] [############################################################################################################################################################################] 32.00 KiB/32.00 KiB @ 43.20 KiB/s (eta 0s )
  Programming ✔ [00:00:00] [############################################################################################################################################################################] 21.00 KiB/21.00 KiB @ 32.46 KiB/s (eta 0s )    Finished in 1.41s
      
0.000000 TRACE BDCR ok: 00008200
└─ embassy_stm32::rcc::bd::{impl#2}::init @ /Users/mibes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/embassy-stm32-0.1.0/src/fmt.rs:117 
0.000000 DEBUG flash: latency=0
└─ embassy_stm32::rcc::_version::init @ /Users/mibes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/embassy-stm32-0.1.0/src/fmt.rs:130 
0.000000 DEBUG rcc: Clocks { sys: Hertz(16000000), pclk1: Hertz(16000000), pclk1_tim: Hertz(16000000), pclk2: Hertz(16000000), pclk2_tim: Hertz(16000000), hclk1: Hertz(16000000), hclk2: Hertz(16000000), hclk3: Hertz(16000000), plli2s1_q: None, plli2s1_r: None, pll1_q: None, rtc: Some(Hertz(32000)) }
└─ embassy_stm32::rcc::set_freqs @ /Users/mibes/.cargo/registry/src/index.crates.io-6f17d22bba15001f/embassy-stm32-0.1.0/src/fmt.rs:130 
0.000000 INFO  Hello STM32 World!

If you encounter any build issues, double-check that the config.toml and build.rs files are in the correct locations.

If you see the Hello STM32 World! message, you have successfully flashed the STM32 board with your first Rust program.

This project structure will be the foundation for the next steps in building more complex embedded projects. You may want to make a copy of this project structure to use as a template for future projects.

Running without the debugger

Once loaded, your application is automatically executed when you power up the STM32 board, without the debugger attached. You can accomplish this in various ways:

  • Disconnect the brown and orange cables from the board (leaving the red and black attached). The board will still draw power from the USB port, but no connection with the debugger is established.
  • Connect a USB charger to the USB port on the board. Make sure to disconnect the ST-Link when doing this.
  • Connect a 3.3v battery to the V3 and GND pins (red & black).

Embedded Rust - Capacitive Soil Moisture Sensor and the STM32F401

In this chapter, we'll build a simple embedded project using the STM32F401 microcontroller and a capacitive soil moisture sensor. We'll include an LED to indicate the moisture level of the soil. The LED will be blinking when the soil is too dry.

Getting prepared

To build this project, you need the following:

  • Capacitive Soil Moisture Sensor; I got mine from Az-Delivery,
  • Orange LED,
  • 150Ω resistor.

Wiring

Connect the sensor to the STM32 board like this:

Capacitive Soil Moisture Sensor schematic

As you can see, the sensor needs 5V to power up. You can use an external power supply or connect the 5V pin of the ST-Link to the sensor, which is what I did.

The sensor has an analog output that we'll connect to the PA7 pin of the STM32F401. The PA7 pin is connected to the ADC1 channel 7.

The LED is connected to the PA0 pin of the STM32F401.


Make main.rs look like this:

#![no_std]
#![no_main]

use defmt::*;
use embassy_executor::Spawner;
use embassy_stm32::{
    adc::Adc,
    gpio::{Level, Output, Speed},
    peripherals::{ADC1, PA7},
};
use embassy_sync::{
    blocking_mutex::raw::ThreadModeRawMutex,
    channel::{Channel, Sender},
};
use embassy_time::{Delay, Timer};

use {defmt_rtt as _, panic_probe as _};

const DRYNESS: u16 = 3000;

static CHANNEL: Channel<ThreadModeRawMutex, u16, 64> = Channel::new();

#[embassy_executor::main]
async fn main(spawner: Spawner) {
    let p = embassy_stm32::init(Default::default());

    // orange LED
    let mut led = Output::new(p.PA0, Level::High, Speed::Low);

    info!("Setting up ADC");
    let mut delay = Delay;
    let adc = Adc::new(p.ADC1, &mut delay);
    let pin = p.PA7;

    spawner
        .spawn(read_moisture(adc, pin, CHANNEL.sender()))
        .unwrap();

    let rx_moisture = CHANNEL.receiver();

    loop {
        let alarm_on = rx_moisture.receive().await > DRYNESS;

        // Blink the LED if the moisture is too low
        if alarm_on {
            led.set_high();
            Timer::after_millis(500).await;
        }

        led.set_low();
    }
}

#[embassy_executor::task(pool_size = 2)]
async fn read_moisture(
    mut adc: Adc<'static, ADC1>,
    mut pin: PA7,
    tx_moisture: Sender<'static, ThreadModeRawMutex, u16, 64>,
) {
    loop {
        let v = adc.read(&mut pin);
        info!("ADC value: {}", v);
        tx_moisture.send(v).await;

        Timer::after_millis(1000).await;
    }
}

You need these dependencies in your Cargo.toml:

[dependencies]
embassy-stm32 = { version = "0.1.0", features = ["defmt", "stm32f401cc", "unstable-pac", "memory-x", "time-driver-any", "exti", "chrono"] }
embassy-sync = { version = "0.5.0", features = ["defmt"] }
embassy-executor = { version = "0.5.0", features = ["task-arena-size-32768", "arch-cortex-m", "executor-thread", "executor-interrupt", "defmt", "integrated-timers"] }
embassy-time = { version = "0.3.0", features = ["defmt", "defmt-timestamp-uptime", "tick-hz-32_768"] }
embassy-usb = { version = "0.1.0", features = ["defmt"] }
embassy-net = { version = "0.4.0", features = ["defmt", "tcp", "dhcpv4", "medium-ethernet", ] }

defmt = "0.3"
defmt-rtt = "0.4"

cortex-m = { version = "0.7.6", features = ["inline-asm", "critical-section-single-core"] }
cortex-m-rt = "0.7.0"
embedded-hal = "0.2.6"
embedded-io = { version = "0.6.0" }
embedded-io-async = { version = "0.6.1" }
panic-probe = { version = "0.3", features = ["print-defmt"] }

Run the project with:

$ cargo run --release

You should see the moisture level printed on the console. The LED should be blinking when the soil is too dry. You may need to adjust the DRYNESS value to match your sensor.
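One way to find a suitable DRYNESS value is to note the ADC reading with the sensor in dry air and again with it in water, and interpolate between the two. The sketch below runs on a desktop machine and is only an illustration: moisture_percent and dryness_threshold are hypothetical helpers, and the calibration readings are made up. It assumes, as the project code does, that a higher reading means drier soil.

```rust
/// Convert a raw ADC reading into a moisture percentage (0 = dry, 100 = wet)
/// by linear interpolation between two calibration readings.
/// Requires `dry > wet`, because the sensor reads higher in drier soil.
fn moisture_percent(raw: u16, dry: u16, wet: u16) -> u8 {
    assert!(dry > wet, "calibration requires dry > wet");
    let raw = raw.clamp(wet, dry);
    let span = u32::from(dry - wet);
    (u32::from(dry - raw) * 100 / span) as u8
}

/// Compute the raw reading that corresponds to a given moisture percentage;
/// the result can be used as the DRYNESS constant.
fn dryness_threshold(percent: u8, dry: u16, wet: u16) -> u16 {
    assert!(dry > wet, "calibration requires dry > wet");
    let percent = u32::from(percent.min(100));
    let span = u32::from(dry - wet);
    dry - (span * percent / 100) as u16
}

fn main() {
    // Made-up calibration readings; measure your own sensor in air and in water.
    let (dry, wet) = (3400, 1500);
    println!("{}%", moisture_percent(3000, dry, wet));
    println!("threshold at 20%: {}", dryness_threshold(20, dry, wet));
}
```

Readings outside the calibrated range are clamped, so a slightly noisy sensor never produces a percentage below 0 or above 100.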

Exercises

This chapter contains exercises that will help you to understand the concepts taught during the development classes. The idea is for you to revisit this chapter and apply what you've learned to the exercises.

Exercise - Driving School

We're going to help a local driving school with their student administration.

Start by creating a new project driving_school and add the chrono crate to your Cargo.toml.

Exercise 1: Students; after completing Session 2

Create a Student struct with the following fields:

  • name: String
  • date_of_birth: chrono::NaiveDate
  • has_id: bool
  • passed_eye_test: bool
  • lessons_completed: u16
  • car_type: CarType
  • exam_date: Option<chrono::NaiveDate>
  • passed_exam: bool

The CarType enum should have the following variants:

  • Manual
  • Automatic

Learning to drive a manual car is more difficult than learning to drive an automatic car. A student needs 5 extra lessons to learn to drive a manual car. The minimum number of lessons to learn to drive an automatic car is 20.

Use the builder pattern to create a Student instance. The Student struct should have a new method that takes the mandatory fields name: String, date_of_birth: chrono::NaiveDate, has_id: bool and passed_eye_test: bool. Then add a method with_car_type(mut self, car_type: CarType) -> Self that sets the car_type field.

In this particular country, students can only start their driving lessons if they turn 17 within the next 6 months. They also need to have an ID and a doctor's proof that they passed an eye test.

To implement this, create a method can_start_driving_lessons(&self) -> bool that returns true if the student can start driving lessons. I'd suggest creating a helper function is_seventeen_in_six_months(&self) -> bool that returns true if the student will turn 17 in the next 6 months. Maybe you can use test-driven development to implement this method.

The lessons_completed field should be incremented by 1 when the student completes a lesson. Create a method complete_lesson(&mut self) that increments the lessons_completed field.

Exercise 2: Student validations; after completing Session 3

Add the anyhow crate to your Cargo.toml.

Change the new method to return an anyhow::Result<Self>. The method should validate if the student's date of birth is valid. You cannot sign up as a student if you're not at least 16 1/2 years old. If the date of birth is invalid, the method should return an error.

Also add some other validations to the new method:

  • They must have a name.
  • The student must have an ID.

You can quickly create an error with anyhow! like this: anyhow!("Student must be at least 16 1/2 years old").

We also need a minimum_lessons_remaining(&self) -> u8 method that returns the number of lessons remaining, depending on the car type. Add a test that checks if the method returns the correct number of lessons remaining, for: 0, 1, 5, 20, 21, and 30 lessons.

Include the set_exam_date(&mut self, exam_date: NaiveDate) -> Result<()> function. Think about the validations you need to add to this method.

Finally, add the std::fmt::Display trait to the Student struct. The Display implementation should return a string with the student's name, type of car they're learning to drive, the number of lessons they've completed and the number of lessons remaining. Use this format:

Name: Alice; Car type: Manual; Lessons completed: 5; Lessons remaining: 20

What validations did you add to the set_exam_date method? Did you check the minimum_lessons_remaining() and the passed_eye_test field?

Exercise 3: Student lists; after completing Session 4

To help the driving school instructors, we'll include some functions to print a list of students. They are interested in:

  • Students who still need to submit their eye test.
  • Students who have completed all their lessons and are ready to take the exam.
  • Students with an upcoming exam in the next 7 days.

Complete these steps:

  • Add a DrivingSchool struct that contains a Vec<Student>. Add a new method that creates a new DrivingSchool instance with an empty list of students.

  • Add a method add_student(&mut self, student: Student) that adds a student to the list.

  • Add a method students_needing_eye_test(&self) -> Vec<&Student> that returns a list of students that still need to hand in their eye test.

  • Add a method students_ready_for_exam(&self) -> Vec<&Student> that returns a list of students who have completed all their lessons and are ready to take the exam.

  • Add a method students_with_upcoming_exam(&self, in_days: u8) -> Vec<&Student> that returns a list of students who have an upcoming exam in the next in_days days.

How did you organize your code? Did you create a separate file for the Student and DrivingSchool structs? Did you create a models module?

Have you added tests for the DrivingSchool operations? Did you remember to filter out the students who have already passed their exam? What about some test helpers to create students and initialize the driving school?

Exercise 4: Cleaning-up; after completing Session 5

Due to data privacy regulations, we need to remove all personal data of students who have passed their exam. Add a background task that runs every day at midnight to remove the personal information of students who have passed their exam.

Create a clean_up_students(&mut self) method that removes the personal information of students who have passed their exam.

Since the background task runs asynchronously, you must use the tokio runtime. Add the tokio crate to your Cargo.toml. Make sure you include the full feature set: tokio = { version = "1", features = ["full"] }.

You need to refactor your main function to use the tokio::main macro. Since the clean_up_students method is called from an asynchronous context, you need to make sure that the DrivingSchool struct is protected by a RwLock.

Did you find the retain method useful to remove students who have passed their exam?
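If retain is new to you, here is a minimal sketch of the idea on a desktop machine. It is not the exercise solution: Record and clean_up are hypothetical stand-ins for your own Student and clean_up_students, and it uses std::sync::RwLock so that it runs without extra dependencies. In the exercise you would take the write lock on the tokio RwLock with .await instead.

```rust
use std::sync::RwLock;

/// Hypothetical, stripped-down record; the real Student has more fields.
struct Record {
    name: String,
    passed_exam: bool,
}

/// Drop every record of a student who passed the exam.
/// Returns how many records were removed.
fn clean_up(records: &RwLock<Vec<Record>>) -> usize {
    let mut guard = records.write().expect("lock poisoned");
    let before = guard.len();
    // retain keeps only the elements for which the closure returns true
    guard.retain(|r| !r.passed_exam);
    before - guard.len()
}

fn main() {
    let records = RwLock::new(vec![
        Record { name: "Alice".into(), passed_exam: true },
        Record { name: "Bob".into(), passed_exam: false },
    ]);
    let removed = clean_up(&records);
    let left: Vec<String> = records
        .read()
        .expect("lock poisoned")
        .iter()
        .map(|r| r.name.clone())
        .collect();
    println!("removed {removed}, remaining {left:?}");
}
```

The write guard is released when clean_up returns, so readers are only blocked for the duration of the clean-up itself.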

Are you using the RwLock and sleep from the tokio crate? Note that you cannot use the std::thread::sleep function, because it blocks the thread.

How did you keep the main function alive?

Exercise 5: REST API; after completing Session 6

The driving school wants to create a REST API to manage their students. They want to have the following endpoints:

  • GET /students: Returns a list of all students.
  • GET /students/{id}: Returns a single student.
  • GET /students?pending_exam=true&days=7: Returns a list of students who have an upcoming exam in the next 7 days.
  • GET /students?ready_for_exam=true: Returns a list of students who are ready to take the exam.
  • GET /students?eye_test=false: Returns a list of students who still need to submit their eye test.
  • POST /students: Adds a new student.
  • PUT /students/{id}/exam_date: Updates the exam date of a student.
  • DELETE /students/{id}: Deletes a student.

To control access, create a login endpoint with basic authentication that returns a JWT token. You can use the jsonwebtoken crate to create and validate the JWT token. Make sure all endpoints are protected with the JWT token.

Can you think of a way to create two roles: instructor and owner? The instructor can only read the students and the owner can add, update and delete students.

Is it OK to store the role in the JWT token? How can you validate the role?

Did you add logging to the REST API?

Playground

Try your Rust snippets in this playground.

The top-100 crates are available for use in the playground!

use rand::prelude::*;

fn main() {
    let mut rng = rand::thread_rng();
    let hurray: u8 = rng.gen();
    let hurray = hurray.max(10);    // at least 10 times!! 
    println!("We love the Rust playground {hurray} times");
}

Resources