Functions & ownership (part 1)
In this chapter, we'll explore Rust functions and ownership of variables. This is where stuff gets 'interesting'. Let's look at ownership and lifetimes first.
Because the memory of a Rust program is not managed by a garbage collector at run-time, the Rust compiler needs to work out at compile time when memory can be freed. This can only be done deterministically when the compiler knows when a variable is no longer needed. This is where ownership comes in. Compiler-enforced ownership of variables is one of Rust's defining concepts.
Variables are scoped in Rust. This means that a variable is only valid within the scope it is defined in. The scope is defined by the curly braces {}. When a variable goes out of scope, it is dropped.
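A minimal sketch of scoping in action. The helper name inner_scope_length is made up for this illustration and is not part of the original text:

```rust
// Hypothetical helper: returns the length of a String that only
// lives inside an inner scope.
fn inner_scope_length() -> usize {
    let length;
    {
        // `greeting` comes into scope here.
        let greeting = String::from("hello");
        length = greeting.len();
    } // `greeting` goes out of scope here and is dropped.

    // Uncommenting the next line fails to compile, because `greeting`
    // does not exist in this outer scope:
    // println!("{greeting}");
    length
}

fn main() {
    println!("length = {}", inner_scope_length());
}
```

The plain number stored in length survives the inner scope, while the String it was computed from does not.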
Variables cannot outlive their scope. This is an essential concept in Rust, and it is enforced by the compiler. This concept is referred to as the "lifetime" of a variable.
All the lifetime constraints are checked at compile time. The compiler has no way of knowing the actual runtime behavior of the program, so it will only accept code guaranteed to be safe at runtime. This can be frustrating at first, especially when you're coming from a language that does not have these constraints, and you are sure, or believe you are sure, that your code is correct.
Let's dive into the concept of ownership.
Functions
When you define a function in Rust, you can pass variables to the function. The function can either take ownership of the variable or borrow the variable.
Rust functions are defined with the fn keyword, followed by the function name and the parameters in parentheses. The return type is defined after the parameters, with a -> followed by the type. The function body is enclosed in curly braces {}. Those braces define the scope of the function.
For example:
fn build_greeting(name: &str) -> String {
    format!("Welcome, {name} to the Rust world!")
}
The function build_greeting takes a &str as a parameter and returns a String. The function body uses the format! macro to build a string and returns it.
Note the lack of a ; at the end of the function body. This is because the last expression in a function is the return value. There is no need to use the return keyword for it in Rust.
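To make the expression-as-return-value rule concrete, here is a small sketch. The function describe is a hypothetical example, not something from the original text. It also shows that the return keyword still exists and is handy for leaving a function early:

```rust
// Hypothetical example: the final expression (no `;`) is the return value,
// while `return` remains available for an early exit.
fn describe(n: i32) -> &'static str {
    if n < 0 {
        return "negative"; // early exit, needs the `return` keyword
    }
    // No `;` here: the value of this if/else expression is returned.
    if n == 0 { "zero" } else { "positive" }
}

fn main() {
    for n in [-3, 0, 7] {
        println!("{n} is {}", describe(n));
    }
}
```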
Moving variables
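When you pass a variable to a function by value, the variable is moved to the function (unless its type implements the Copy trait, as simple types like integers do, in which case it is copied). Moving means that the function takes ownership of the variable, and the variable is dropped when the function returns.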
The calling scope no longer has access to the variable after it has been moved to the function. You can only move a variable once. If you try to use a variable after it has been moved, the compiler will complain.
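A short sketch of a move, under the assumption of two made-up helpers, consume and double. It also shows the exception for types that implement the Copy trait (such as i32), which are copied rather than moved:

```rust
// Hypothetical function that takes ownership of its argument.
fn consume(text: String) -> usize {
    text.len()
} // `text` is dropped here, at the end of `consume`.

// Hypothetical function taking a `Copy` type: the caller keeps its value.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let message = String::from("moved");
    let length = consume(message); // ownership moves into `consume`
    println!("length = {length}");
    // Uncommenting the next line fails with error[E0382], because
    // `message` was moved:
    // println!("{message}");

    let number = 21;
    let doubled = double(number); // `number` is copied, not moved
    println!("{number} doubled is {doubled}");
}
```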
Borrowing variables
An alternative way of passing variables to a function is by lending the variable to the function. This means that the function can use the variable, but the ownership remains with the calling function. The variable is not dropped when the function goes out of scope. This is called "borrowing". You use the & symbol to borrow a variable.
You can have many read-only borrows at the same time, or exactly one mutable borrow, but not both at once. This ensures that a variable is never modified while other code is reading or modifying it, which could otherwise lead to data races and undefined behavior.
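The sketch below illustrates both kinds of borrow. The function names length and shout are invented for this example:

```rust
// Hypothetical read-only borrow: the function may look, but not touch.
fn length(text: &String) -> usize {
    text.len()
}

// Hypothetical mutable borrow: the function may modify the caller's value.
fn shout(text: &mut String) {
    text.push('!');
}

fn main() {
    let mut greeting = String::from("hello");

    let n = length(&greeting); // lend read-only
    shout(&mut greeting);      // lend mutably

    // `main` still owns `greeting` after both calls.
    println!("{greeting} (was {n} characters long)");
}
```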
Cloning
If you are having trouble with ownership and borrowing, you can clone the variable. This creates a new variable with the same value, and you can pass this to the function. Cloning can be expensive, but in practice the actual cost is negligible for most situations.
When you need to clone a variable, you can use the clone() method. "Expensive" types, like large structs, can often be wrapped in a smart pointer like Rc or Arc. This allows you to clone the reference to the struct, instead of the struct itself. These smart pointers are also a great way of dealing with code that behaves correctly at runtime, but that the compiler cannot verify at compile time.
Don't fight the borrow checker. Use clone(), possibly with Rc or Arc, to get around lifetime issues. You can always revisit your code and search for these clones and optimize once your Rust toolset has expanded.
Enhance the Hello World with functions
Let's enhance our Hello World with a greeter function (how original!):
fn greet(name: &str) {
    println!("Welcome, {name} to the Rust world!");
}

fn main() {
    greet("Rusty");
}
We're using the variable type &str in the above code. As seen, the & means we are borrowing the variable. The &str type is a string slice, which is a reference to a string. "Rusty" is a string literal (implicitly typed as &'static str). We can assign string literals to &str without conversion. String slices have a fixed length and cannot be modified.
If you need to modify a string, you can use the String type. This is a growable, mutable string. You can convert a &str to a String by calling the to_string() method on the string slice. The String type offers many methods that allow you to manipulate the string.
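A small sketch of the conversion from &str to String and of growing the result. The helper exclaim is a made-up name for this illustration:

```rust
// Hypothetical helper: builds an owned, growable String from a borrowed slice.
fn exclaim(word: &str) -> String {
    let mut owned = word.to_string(); // &str -> String
    owned.push('!');                  // a String can grow; a &str cannot
    owned
}

fn main() {
    let literal: &str = "Rusty";
    let louder = exclaim(literal);
    println!("{louder}");

    // A reference to a String can be passed where a &str is expected.
    println!("{}", exclaim(&louder));
}
```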
If we re-write the above example to use String, it would look like this:
fn greet(name: String) {
    println!("Welcome, {name} to the Rust world!");
}

fn main() {
    greet("Rusty".to_string());
}
We can modify the greet function to greet many folks by passing a list (= vector) of strings:
fn greet(names: Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    greet(vec!["Rusty".to_string(), "Marcel".to_string()]);
}
As explained earlier, there are two things you should know about passing variables:
- In Rust, functions are greedy. When you pass a variable to a function, it will take the variable and not give it back, unless you tell it otherwise; i.e. the variable is moved to the function.
- In Rust, a variable has exactly one owner at any moment in time.
So with these two rules in mind, let's look at the following code:
fn greet(names: Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(names);
    names.push("John".to_string());
    greet(names);
}
The goal that I had in mind with this code is to greet a number of people: "Rusty" and "Marcel". Then add a person ("John") to the list and greet the lot again.
When you try to compile this, you get an error:
error[E0382]: borrow of moved value: `names`
--> src/main.rs:8:5
|
6 | let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
| --------- move occurs because `names` has type `std::vec::Vec<std::string::String>`, which does not implement the `Copy` trait
7 | greet(names);
| ----- value moved here
8 | names.push("John".to_string());
| ^^^^^ value borrowed here after move
What has happened is that we gave the "greet" function the "names" variable. It happily took this variable and, as noted before, did not return it to the calling function ("main").
In our case, after the first call to "greet" the names variable can no longer be used in "main". The "greet" function owns the "names" variable, and because there is no further use for it after println!, the variable is freed when "greet" returns.
We can change this behavior by lending the variable to the "greet" function. This is called borrowing and is done with the & operator, in this way:
fn greet(names: &Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(&names);
    names.push("John".to_string());
    greet(&names);
}
Please note that although this code compiles and runs, Clippy has an optimization recommendation for you!
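The original text does not spell out the recommendation. Most likely it is Clippy's ptr_arg lint, which prefers a slice (&[String]) over a reference to a vector (&Vec<String>). A sketch under that assumption, with a hypothetical format_greeting helper that returns the text instead of printing it:

```rust
// Assumption: the Clippy hint is the `ptr_arg` lint, which suggests taking
// a slice `&[String]` rather than `&Vec<String>`.
// `format_greeting` is a hypothetical helper, not from the original text.
fn format_greeting(names: &[String]) -> String {
    format!("Welcome, {names:?} to the Rust world!")
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    // `&names` is a `&Vec<String>`, which coerces to `&[String]`.
    println!("{}", format_greeting(&names));
    names.push("John".to_string());
    println!("{}", format_greeting(&names));
}
```

A slice parameter is more flexible: the same function also accepts arrays and parts of a vector, and the call sites stay unchanged.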
In the above code, we temporarily lend the "names" variable to the "greet" function, but keep the ownership in the "main" function. In this way, the "names" variable is freed at the end of the "main" function.
We'll revisit the topic of ownership and borrowing a few more times, because this can become 'painful' quickly when not understood 100%.
Borrowing and mutating
Imagine we want the "greet" function to clear the list of names after greeting (not debating whether this is good practice or not!). You could come up with something like this:
fn greet(names: &Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(&names);
    names.push("John".to_string());
    greet(&names);
}
The compiler won't let you run this code though:
error[E0596]: cannot borrow `*names` as mutable, as it is behind a `&` reference
--> src/main.rs:3:5
|
1 | fn greet(names: &Vec<String>) {
| ------------ help: consider changing this to be a mutable reference: `&mut std::vec::Vec<std::string::String>`
2 | println!("Welcome, {names:?} to the Rust world!");
3 | names.clear();
| ^^^^^ `names` is a `&` reference, so the data it refers to cannot be borrowed as mutable
Although we're lending the "names" to the "greet" function, we're not explicitly allowing the "greet" function to modify the list of names. If we want to do this, we need to pass it as a "mutable reference", like this:
fn greet(names: &mut Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(&mut names);
    names.push("John".to_string());
    greet(&mut names);
}
Why is there a difference? Although multiple functions can hold a read-only reference simultaneously, only one function can hold a mutable reference at any moment in time, and not while read-only references are still in use. This is more evident when we look at multi-threading. Just keep it in mind for now.
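To see the rule at work in single-threaded code, consider the sketch below. The add_guest function is a hypothetical helper. The read-only borrows are fine together, and the mutable borrow is accepted only because the read-only ones are no longer used at that point:

```rust
// Hypothetical helper that needs a mutable borrow of the list.
fn add_guest(names: &mut Vec<String>, guest: &str) -> usize {
    names.push(guest.to_string());
    names.len()
}

fn main() {
    let mut names = vec!["Rusty".to_string()];

    // Several read-only borrows may exist at the same time.
    let first = &names;
    let second = &names;
    println!("{first:?} and {second:?}"); // last use of both borrows

    // The mutable borrow is allowed because `first` and `second`
    // are not used again after this point.
    let count = add_guest(&mut names, "John");
    println!("{count} guests");

    // Uncommenting the next line would keep `first` alive across the
    // mutable borrow, and the compiler rejects that with error[E0502]:
    // println!("{first:?}");
}
```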
We can mix-and-match as needed:
fn greet_and_replace(names: &mut Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
}

fn greet(names: &Vec<String>) {
    println!("Welcome, {names:?} to the Rust world!");
}

fn main() {
    let mut names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet_and_replace(&mut names);
    greet(&names);
}
There is another way to pass and return ownership: actually taking and returning the variable as part of the function. Let's look at that.
Returning data from a function
We'll stick with the greeter, but modify the example slightly.
fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    let new_names = greet(names);
    greet(new_names);
}
In this example, we're not lending the "names" to "greet", but we give the ownership by passing the variable. The "greet" function modifies the list (that it owns) and returns the modified list to "main". By doing this, it also passes the ownership to "main"!
There are two things to notice:
- the "mut" keyword before "names" is mandatory, to allow the "greet" function to mutate the list. This has no relationship to ownership, as you can see from the fact that "names" in "main" is not mutable.
- the ; is missing on the last statement in the "greet" function.
In Rust, the value of the last expression in a function is returned when it is not followed by a ;. This means that the type of that last expression must match the return type of the function. Try this:
fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string())
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    let new_names = greet(names);
    greet(new_names);
}
You will see that the compiler will complain:
expected struct `std::vec::Vec`, found `()`
This is because the .push() function returns nothing, i.e. the unit type (), which does not match the Vec<String> that is expected. You can determine the signature of a function by holding down the "cmd" button (on a Mac) and hovering over the function. Often you can click (while holding the "cmd" button) to navigate to the source code of the function, and if you're lucky, there is actually some documentation and a usage example.
Note that adding a ; to the "names" statement gives a very similar error!
fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names;
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    let new_names = greet(names);
    greet(new_names);
}
Maybe obvious, but you cannot pass a variable to a function when you're no longer the owner:
fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(names);
    greet(names);
}
Nevertheless, you will see this type of error a lot during your first weeks with Rust!
error[E0382]: use of moved value: `names`
--> src/main.rs:11:11
|
9 | let names = vec!["Rusty".to_string(), "Marcel".to_string()];
| ----- move occurs because `names` has type `std::vec::Vec<std::string::String>`, which does not implement the `Copy` trait
10 | greet(names);
| ----- value moved here
11 | greet(names);
| ^^^^^ value used here after move
You can fix this in two ways:
- Lend the variable to the "greet" function as per the previous examples.
- If it is really your intention to pass the same data twice, make a clone.
fn greet(mut names: Vec<String>) -> Vec<String> {
    println!("Welcome, {names:?} to the Rust world!");
    names.clear();
    names.push("John".to_string());
    names
}

fn main() {
    let names = vec!["Rusty".to_string(), "Marcel".to_string()];
    greet(names.clone());
    greet(names);
}
Don't be afraid to use clone() to get out of these situations (initially). You can always revisit your code and search for these clones and optimize once your Rust toolset has expanded. Or as a colleague said: "sprinkle your code with these clones() until it compiles." :-)
The same happens in this example:
fn greet(name: String) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    let other_name = name;
    greet(name);
    greet(other_name);
}
The two solutions are to clone the value:
fn greet(name: String) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    let other_name = name.clone();
    greet(name);
    greet(other_name);
}
or to borrow it:
fn greet(name: &String) {
    println!("Welcome {name}");
}

fn main() {
    let name = "Marcel".to_string();
    let other_name = &name;
    greet(&name);
    greet(other_name);
}
Note that the type of "other_name" is different in the two examples: "String" vs. "&String".
Be careful with the second example. Although it is more efficient, it sets you up for another common error. Check these two examples:
fn greet(name: String) {
    println!("Welcome {name}");
}

fn main() {
    let mut name = "Marcel".to_string();
    let other_name = name.clone();
    name = "Horaci".to_string();
    greet(name);
    greet(other_name);
}
vs:
fn greet(name: &String) {
    println!("Welcome {name}");
}

fn main() {
    let mut name = "Marcel".to_string();
    let other_name = &name;
    name = "Horaci".to_string();
    greet(&name);
    greet(other_name);
}
You will curse at this one a few more times in the next weeks:
error[E0506]: cannot assign to `name` because it is borrowed
--> src/main.rs:8:5
|
7 | let other_name = &name;
| ----- borrow of `name` occurs here
8 | name = "Horaci".to_string();
| ^^^^ assignment to borrowed `name` occurs here
9 | greet(&name);
10 | greet(other_name);
| ---------- borrow later used here
Swapping the two lines will do away with the error for now, but this was clearly not the intention of the programmer:
fn greet(name: &String) {
    println!("Welcome {name}");
}

fn main() {
    let mut name = "Marcel".to_string();
    name = "Horaci".to_string();
    let other_name = &name;
    greet(&name);
    greet(other_name);
}
You can fix it as it was intended using interior mutability. This is a more advanced topic. For now, remember that you can use Rc and RefCell to get around these issues. We'll revisit this topic later.
Warning: advanced example upcoming, ignore at will.
use std::cell::RefCell;
use std::rc::Rc;

fn greet(name: Rc<RefCell<String>>) {
    println!("Welcome {}", name.borrow());
}

fn main() {
    let name = Rc::new(RefCell::new("Marcel".to_string()));
    let other_name = name.clone();
    greet(name.clone());
    greet(other_name.clone());
    name.replace("Horaci".to_string());
    greet(name);
    greet(other_name);
}
Although you can pretty much forget about this example for now, it is worth noting that the .clone() operations in this example are cloning the reference (Rc) to the String, not the String itself. The inner RefCell wrapper allows us to mutate the String. So for folks coming from the C-world, we are effectively creating a (smart) pointer to a string.