Migrate to Rust from C
If you're coming from C, you'll find that Rust has a lot of similarities. However, there are some differences that you'll need to get used to.
Organizing code
In C, you can organize your code into separate files using header files and source files. In Rust, you can organize your code into modules. Modules allow you to group related code together and control the visibility of code within the module.
Here's an example of organizing code into modules in Rust:
mod math {
    pub fn add(a: i32, b: i32) -> i32 {
        a + b
    }
}

fn main() {
    use math::add;

    let result = add(1, 2);
    println!("{result}");
}
Although the above example is valid, it's not idiomatic Rust. In Rust, you would typically organize your code into separate files and use the mod keyword to include them in your main file. So the contents of the mod math {} block would be in a separate file called math.rs or math/mod.rs. The main.rs file would then look like this:
mod math;

use math::add;

fn main() {
    let result = add(1, 2);
    println!("{result}");
}
In C you would have a header file math.h and a source file math.c that would look like this:
// math.h
int add(int a, int b);

// math.c
#include "math.h"

int add(int a, int b) {
    return a + b;
}

// main.c
#include <stdio.h>
#include "math.h"

int main() {
    int result = add(1, 2);
    printf("%d\n", result);
}
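To illustrate the visibility control that modules provide, here is a minimal sketch. The geometry module and its function names are made up for this example. Only items marked pub can be used from outside the module, which is roughly comparable to choosing what to declare in a C header.

```rust
mod geometry {
    // Private helper: only code inside `geometry` can call it,
    // much like a `static` function in a C source file.
    fn square(x: i32) -> i32 {
        x * x
    }

    // Public function: visible to code outside the module.
    pub fn squared_distance(x: i32, y: i32) -> i32 {
        square(x) + square(y)
    }
}

fn main() {
    println!("{}", geometry::squared_distance(3, 4));

    // Uncommenting the next line fails to compile because `square` is private:
    // println!("{}", geometry::square(3));
}
```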
Optimizing code for production
In C, you can use compiler flags like -O2 to optimize your code for production. In Rust, you can use the --release flag with the cargo build command to optimize your code for production.
Here's an example of compiling a C program with optimizations:
$ gcc -O2 -o program program.c
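In Rust, the equivalent command looks like this:
$ cargo build --release
The --release flag tells Cargo to build with the release profile. By default this profile enables the highest optimization level (opt-level 3), which includes optimizations such as function inlining. It also leaves out debug information and disables debug assertions and integer overflow checks. The optimized binary ends up in target/release instead of target/debug.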
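One way to observe the difference between a debug build and a release build from inside a program is the debug_assertions configuration flag, which Cargo's default dev profile enables and its default release profile disables. The following is a small sketch of that idea; build_kind is a made-up helper name.

```rust
// Reports which kind of build is running, based on `debug_assertions`.
// With Cargo's default profiles this is on for `cargo build`
// and off for `cargo build --release`.
fn build_kind() -> &'static str {
    if cfg!(debug_assertions) {
        "debug"
    } else {
        "release"
    }
}

fn main() {
    println!("This is a {} build", build_kind());

    // `debug_assert!` is only evaluated when debug assertions are enabled.
    debug_assert!(1 + 1 == 2, "only checked in debug builds");
}
```

Running this with cargo run and again with cargo run --release should show the two different messages, assuming the profiles have not been customized in Cargo.toml.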
Language features
Pointers
In C, you can create a pointer to a variable like this:
#include <stdio.h>

int main()
{
    int x = 10;
    int *ptr = &x;
    printf("Value of x: %d\n", *ptr);
}
In Rust, you can create a reference to a variable like this:
fn main() {
    let x = 10;
    let ptr = &x;
    println!("Value of x: {}", *ptr);
}
Lifetimes are how Rust tracks how long a reference stays valid, and this is a key difference between Rust and C. In C, nothing stops you from keeping a dangling pointer to memory that has already been freed or has gone out of scope, and using it is undefined behavior. In Rust, the borrow checker rejects at compile time any program in which a reference could outlive the value it points to. This is a blessing and a curse, as it can be a bit tricky to get used to at first.
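As a rough illustration of what lifetimes look like in practice, the sketch below uses a hypothetical function called longest. The 'a annotation tells the compiler that the returned reference must not outlive either input, and the commented-out part shows the kind of dangling reference that would be rejected.

```rust
// The lifetime parameter 'a says: the returned reference is valid only
// as long as both inputs are valid.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let outer = String::from("a long-lived string");
    {
        let inner = String::from("short");
        println!("{}", longest(&outer, &inner));
    } // `inner` is dropped here

    // The following would not compile, because the reference stored in
    // `dangling` could point at `inner` after it has been dropped:
    //
    // let dangling;
    // {
    //     let inner = String::from("short");
    //     dangling = longest(&outer, &inner);
    // }
    // println!("{dangling}");
}
```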
For newcomers, using a smart pointer like Rc (or Arc, when the value is shared between threads) can be a good way to avoid dealing with lifetimes.
use std::rc::Rc;

fn main() {
    let x = 10;
    let ptr = Rc::new(x);
    println!("Value of x: {}", *ptr);
}
The overhead of using smart pointers is negligible in most cases (a heap allocation, plus a reference count that is updated on each clone and drop), and it can save you a lot of headaches.
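To make the reference counting visible, here is a small sketch with a made-up Config struct. It uses Rc::strong_count to show that cloning an Rc only adds another owner of the same heap value.

```rust
use std::rc::Rc;

struct Config {
    name: String,
}

fn main() {
    let config = Rc::new(Config { name: String::from("demo") });
    println!("owners: {}", Rc::strong_count(&config));

    // Cloning an Rc copies the pointer and bumps the count;
    // the Config itself is not copied.
    let second = Rc::clone(&config);
    println!("owners: {}", Rc::strong_count(&config));
    println!("name via second handle: {}", second.name);

    // Dropping one handle lowers the count; the value is freed
    // only when the last owner goes away.
    drop(second);
    println!("owners: {}", Rc::strong_count(&config));
}
```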
Memory Management
In C, you have to manage memory manually using malloc and free. This can be error-prone and lead to memory leaks and other issues. In Rust, memory management is handled by the compiler using the ownership system. This system ensures that memory is freed when it is no longer needed, and prevents common memory-related bugs like use-after-free and double-free errors.
Here's an example of manual memory management in C:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *ptr = malloc(sizeof(int));
    if (ptr == NULL) {
        return 1; /* allocation failed */
    }
    *ptr = 10;
    printf("Value of x: %d\n", *ptr);
    free(ptr);
}
In Rust, the closest equivalent is a Box, which allocates its value on the heap:
fn main() {
    let ptr = Box::new(10);
    println!("Value of x: {}", *ptr);
}
The Box will be dropped when ptr goes out of scope, and its heap memory will be freed automatically. In the above example, this happens at the end of main.
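The point at which values are dropped can be made visible with a type that implements Drop. The sketch below is only an illustration: Tracked and drop_order are made-up names, and Box is used here as the heap-allocating counterpart to malloc. Each value records its name in a log at the moment it is dropped.

```rust
use std::cell::RefCell;

// Records its name in a shared log when it is dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Box::new(Tracked { name: "outer", log: &log });
        {
            let _inner = Box::new(Tracked { name: "inner", log: &log });
        } // `_inner` is dropped here, and its heap allocation is freed
        log.borrow_mut().push("after inner scope");
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

No call to anything like free appears in the code; the compiler inserts the cleanup at the end of each scope.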
Strings
In C, strings are represented as null-terminated arrays of characters. In Rust, the two main string types are String, an owned and growable buffer on the heap, and &str, a borrowed string slice that points into a String or into a string literal. Rust strings are always valid UTF-8, which means they can contain any Unicode text.
Here's an example of working with strings in C:
#include <stdio.h>

int main()
{
    const char *name = "John"; /* string literals must not be modified */
    printf("Hello, %s!\n", name);
}
In Rust, the same code would look like this:
fn main() {
    let name = "John";
    println!("Hello, {name}!");
}
In Rust, you can also create a String object like this:
fn main() {
    let name = "John".to_string();
    println!("Hello, {name}!");
}
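The following sketch shows the two string types working together. The shout function is a made-up example: it accepts a &str, so it can be called with a string literal as well as with a borrowed String. The last lines illustrate a consequence of UTF-8: len counts bytes, not characters.

```rust
// Takes a borrowed slice, so it accepts both `&String` and string literals.
fn shout(text: &str) -> String {
    let mut result = text.to_uppercase(); // a new, owned String
    result.push('!');
    result
}

fn main() {
    let literal: &str = "John"; // borrowed, stored in the binary
    let mut owned = String::from("Jo"); // owned, growable
    owned.push_str("hn");

    println!("{}", shout(literal));
    println!("{}", shout(&owned)); // &String coerces to &str

    // "\u{eb}" is the letter e with a diaeresis, which takes two bytes in UTF-8.
    let name = "Zo\u{eb}";
    println!("{} bytes, {} chars", name.len(), name.chars().count());
}
```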
Arrays
In C, arrays are fixed-size collections of elements. In Rust, arrays are also fixed-size, but they are bounds-checked at runtime. This means that if you try to access an element outside the bounds of the array, your program will panic.
Here's an example of working with arrays in C:
#include <stdio.h>

int main()
{
    int arr[3] = {1, 2, 3};
    for (int i = 0; i < 3; i++) {
        printf("%d\n", arr[i]);
    }
}
In Rust, the same code would look like this:
fn main() {
    let arr = [1, 2, 3];
    for i in 0..3 {
        println!("{}", arr[i]);
    }
}
In Rust, you can also use the iter method to iterate over the elements of an array. This would be a more idiomatic way to write the above code:
fn main() {
    let arr = [1, 2, 3];
    arr.iter().for_each(|i| println!("{i}"));
}
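When an index comes from outside the program and a panic is not acceptable, the get method is an alternative to arr[i]: it returns an Option instead of panicking. The sketch below uses a made-up helper, element_at, and also shows a plain for loop over the array, which many Rust programmers prefer for simple iteration.

```rust
// Returns the element at `index`, or `None` when the index is out of bounds.
fn element_at(arr: &[i32], index: usize) -> Option<i32> {
    arr.get(index).copied()
}

fn main() {
    let arr = [1, 2, 3];

    // A `for` loop over a reference to the array visits each element
    // without any explicit indexing.
    for value in &arr {
        println!("{value}");
    }

    println!("{:?}", element_at(&arr, 1));
    println!("{:?}", element_at(&arr, 10));
}
```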
Arrays vs. Vectors
In Rust, arrays are fixed-size collections of elements, while vectors are dynamic-size collections. Vectors are more flexible than arrays, but they come with a performance cost: their elements live on the heap, so creating or growing a vector may require an allocation. If you need a collection that can grow or shrink at runtime, you should use a vector. If you know the size of the collection at compile time, you should use an array.
Here's an example of working with vectors in Rust:
fn main() {
    let mut list = vec![1, 2, 3];
    list.push(4);
    list.iter().for_each(|i| println!("{i}"));
}
Ternary operator
In C, you can use the ternary operator ? : to conditionally assign a value to a variable. In Rust, you can use the if expression to achieve the same result.
Here's an example of using the ternary operator in C:
#include <stdio.h>

int main()
{
    int x = 10;
    int y = x > 5 ? 1 : 0;
    printf("%d\n", y);
}
In Rust, the same code would look like this:
fn main() {
    let x = 10;
    let y = if x > 5 { 1 } else { 0 };
    println!("{y}");
}
Dealing with file I/O
In C, you can read and write files using the fopen, fread, fwrite, and fclose functions. In Rust, you can use the std::fs module to read and write files.
Here's an example of reading a file in C:
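#include <stdio.h>

int main()
{
    FILE *file = fopen("file.txt", "r");
    if (file == NULL) {
        perror("Unable to open file");
        return 1;
    }
    char buffer[256];
    /* Leave room for the terminating null byte, which fread does not add. */
    size_t n = fread(buffer, 1, sizeof(buffer) - 1, file);
    buffer[n] = '\0';
    printf("%s\n", buffer);
    fclose(file);
}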
In Rust, the same code would look like this:
use std::fs::File;
use std::io::Read;

fn main() {
    let mut file = File::open("file.txt").expect("Unable to open file");
    let mut buffer = String::new();
    file.read_to_string(&mut buffer).expect("Unable to read file");
    println!("{buffer}");
}
In Rust, you can use the std::fs::write function to write to a file:
use std::fs;

fn main() {
    let data = "Hello, world!";
    fs::write("another_file.txt", data).expect("Unable to write file");
}
In C you would use fwrite to write to a file:
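#include <stdio.h>
#include <string.h>

int main()
{
    FILE *file = fopen("another_file.txt", "w");
    if (file == NULL) {
        return 1;
    }
    const char data[] = "Hello, world!";
    /* strlen excludes the terminating null byte, which should not be written. */
    fwrite(data, 1, strlen(data), file);
    fclose(file);
}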
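The Rust examples above use expect, which aborts the program when an I/O operation fails. In code that should recover from such failures, the error is usually passed to the caller instead. The following is a minimal sketch of that approach using the ? operator; write_then_read and the file names are made up for the example.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Writes `text` to `path` and reads it back, passing any I/O error to the caller.
fn write_then_read(path: &Path, text: &str) -> io::Result<String> {
    fs::write(path, text)?; // `?` returns early with the error on failure
    fs::read_to_string(path)
}

fn main() {
    let path = std::env::temp_dir().join("c_to_rust_example.txt");

    match write_then_read(&path, "Hello, world!") {
        Ok(contents) => println!("{contents}"),
        Err(error) => eprintln!("I/O failed: {error}"),
    }

    // A missing file yields an Err value rather than a NULL pointer
    // that is easy to forget to check.
    if let Err(error) = fs::read_to_string("definitely_missing_file.txt") {
        eprintln!("Expected failure: {error}");
    }
}
```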
Functions
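In C, a function definition starts with the return type, followed by the function name and the parameter list. In Rust, functions are defined using the fn function_name syntax, with the return type written after an arrow (->) at the end of the signature. Rust functions can take parameters and return values, just like C functions.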
Here's an example of defining a function in C:
#include <stdio.h>

int add(int a, int b)
{
    return a + b;
}

int main()
{
    int result = add(1, 2);
    printf("%d\n", result);
}
In Rust, the same code would look like this:
fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    let result = add(1, 2);
    println!("{result}");
}
Note: In Rust, the last expression in a function is implicitly returned. This means that you don't need to use the return keyword to return a value from a function.
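The return keyword still exists and is typically used for early exits. The sketch below, with a made-up safe_divide function, shows both forms side by side.

```rust
// Uses `return` for an early exit and a tail expression for the normal result.
fn safe_divide(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        return None; // an early return needs the keyword
    }
    Some(a / b) // no semicolon: this expression is the return value
}

fn main() {
    println!("{:?}", safe_divide(10, 2));
    println!("{:?}", safe_divide(1, 0));
}
```

Note that adding a semicolon after the last expression would turn it into a statement, and the function would no longer return that value.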
Void functions
In C, you can define a function that doesn't return a value using the void keyword. In Rust, you can define a function that doesn't return a value using the () type. This is called the unit type, but it is often not explicitly written.
Here's an example of defining a void function in C:
#include <stdio.h>

void greet()
{
    printf("Hello, world!\n");
}

int main()
{
    greet();
}
In Rust, the same code would look like this:
fn greet() {
    println!("Hello, world!");
}

fn main() {
    greet();
}
We could have written the greet function like this:
fn greet() -> () {
    println!("Hello, world!");
}
But it is not idiomatic Rust to write it like this. And Clippy will warn you about it:
warning: unneeded unit return type
 --> src/main.rs:1:11
  |
1 | fn greet() -> () {
  |           ^^^^^^ help: remove the `-> ()`
  |
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_unit
  = note: `#[warn(clippy::unused_unit)]` on by default
For-loops
In C you can use a for loop to initialize and then iterate over a range of values. A typical C-style for loop looks like this:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int size = 10;
    int* arr = (int*)malloc(size * sizeof(int));
    if (arr == NULL) {
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < size; ++i) {
        arr[i] = i + 1;
    }
    for (int i = 0; i < size; ++i) {
        printf("%d\n", arr[i]);
    }
    free(arr);
}
What I've seen from people that move to Rust from C is that they often use the for loop in Rust in a C-style way:
fn main() {
    let size = 10;
    let mut arr = vec![0; size];
    for i in 0..size {
        arr[i] = i + 1;
    }
    for i in 0..size {
        println!("{}", arr[i]);
    }
}
This is not idiomatic Rust, and it can have performance implications. Rust checks every arr[i] access against the length of the vector to ensure that you don't access elements outside its bounds. The optimizer can often remove these checks, but when it cannot, the indexed loop does extra work compared to the equivalent C code. The idiomatic way to write the above code in Rust would be like this:
fn main() {
    let size = 10;
    let arr: Vec<i32> = (1..=size).collect();
    arr.iter().for_each(|i| println!("{i}"));
}
Using iterators is the idiomatic way to write loops in Rust. It is more concise, and because an iterator never steps outside the collection, it does not need a bounds check per element, so it is usually at least as fast as a C-style for loop.
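Iterators also compose well. As a rough sketch, the made-up function below filters, transforms, and sums a slice in one chain, and the main function shows enumerate for the cases where the index is still needed.

```rust
// Sums the squares of the even numbers in `values`.
fn sum_of_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .sum()
}

fn main() {
    let arr: Vec<i32> = (1..=10).collect();

    // `enumerate` provides the index without writing `arr[i]`.
    for (i, value) in arr.iter().enumerate() {
        println!("arr[{i}] = {value}");
    }

    println!("{}", sum_of_even_squares(&arr));
}
```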
Macros
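Rust has a powerful macro system that allows you to define custom syntax that gets expanded at compile time. This is similar to the preprocessor in C, with the difference that Rust macros match syntax fragments such as expressions instead of substituting raw text. Here's an example of a macro in C: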
#include <stdio.h>

#define LOG(message) printf("%s\n", message)

int main()
{
    LOG("Hello, world!");
}
In Rust, the same code would look like this:
macro_rules! log {
    ($message:expr) => {
        println!("{}", $message);
    };
}

fn main() {
    log!("Hello, world!");
}
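Macros can also accept a variable number of arguments through repetition patterns. The sketch below defines a made-up max_of! macro. Because each argument is bound with let before it is compared, it is evaluated only once, which avoids the double-evaluation problem of a classic C macro such as #define MAX(a, b) ((a) > (b) ? (a) : (b)).

```rust
// Returns the largest of one or more expressions.
macro_rules! max_of {
    // Base case: a single expression is its own maximum.
    ($first:expr) => { $first };
    // Recursive case: compare the first expression with the maximum of the rest.
    ($first:expr, $($rest:expr),+) => {{
        let a = $first;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    println!("{}", max_of!(3));
    println!("{}", max_of!(1, 7, 4));
}
```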
Multi-threading
In C, you can create threads using the pthread library. In Rust, you can create threads using the std::thread module. A simple example in C that creates a thread looks like this:
#include <stdio.h>
#include <pthread.h>

void* print_message(void* message)
{
    printf("%s\n", (char*)message);
    return NULL;
}

int main()
{
    pthread_t thread;
    char* message = "Hello, world!";
    pthread_create(&thread, NULL, print_message, (void*)message);
    pthread_join(thread, NULL);
}
In Rust, the same code would look like this:
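use std::thread;

fn print_message(message: &str) {
    println!("{message}");
}

fn main() {
    let message = "Hello, world!";
    // `move` copies `message` into the closure, so the new thread does not
    // borrow a local variable of `main`.
    let handle = thread::spawn(move || print_message(message));
    handle.join().expect("thread panicked");
}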
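In C, getting a result back from a thread means passing pointers through void*. In Rust, the closure passed to thread::spawn can simply return a value, and join hands it back. Here is a small sketch of that, with a made-up sum_in_thread function.

```rust
use std::thread;

// Sums the numbers on a separate thread and hands the result back
// through the JoinHandle.
fn sum_in_thread(numbers: Vec<i32>) -> i32 {
    // `move` gives the new thread ownership of `numbers`.
    let handle = thread::spawn(move || numbers.iter().sum::<i32>());
    handle.join().expect("thread panicked")
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3, 4]));
}
```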
Stuff that you can't do in "safe" Rust
I came across a C example that let you modify a global static variable from a thread, without any synchronization or locking mechanism. Unsynchronized access to shared mutable data from multiple threads is a data race, which is undefined behavior. Safe Rust will not allow you to modify a global static variable from a thread without using a synchronization primitive like a Mutex or an atomic type such as AtomicI32.
In C you would be allowed to do this:
#include <stdio.h>
#include <pthread.h>

int counter = 0;

void* increment_counter(void* arg)
{
    counter++;
    return NULL;
}

int main()
{
    pthread_t thread;
    pthread_create(&thread, NULL, increment_counter, NULL);
    pthread_join(thread, NULL);
    printf("%d\n", counter);
}
To the credit of the author of the original C example, they did mention that this is not a good practice and that you should use a synchronization primitive like a mutex to protect the global variable. But as you can see, C is not going to stop you from doing this.
In "safe" Rust this is not possible. The code will not compile! Let's look at the equivalent "unsafe" Rust code:
use std::thread;

static mut COUNTER: i32 = 0;

fn increment_counter() {
    unsafe {
        COUNTER += 1;
    }
}

fn main() {
    let handle = thread::spawn(|| increment_counter());
    handle.join().expect("thread panicked");

    // Copy the value out instead of passing COUNTER to println!, which would
    // take a reference to the mutable static (newer Rust editions reject that).
    let value = unsafe { COUNTER };
    println!("COUNTER: {value}");
}
The use of unsafe is a big red flag. It tells the developer that the compiler can no longer verify that the code is free of Undefined Behavior, and that upholding Rust's safety rules is now the programmer's responsibility. I suggest you read more about Undefined Behavior (UB) in Rust, and why it is important to avoid it. UB will bite you when you least expect it. Stay safe! 😁
Exercise
Try removing the unsafe blocks from the above code and see what happens.
So now let's build the Rust code the way it should be built; assuming that you still want a global counter:
use std::sync::Mutex;
use std::thread;

static COUNTER: Mutex<i32> = Mutex::new(0);

fn increment_counter() {
    let mut counter = COUNTER.lock().expect("mutex is poisoned");
    *counter += 1;
}

fn main() {
    let handle = thread::spawn(|| increment_counter());
    handle.join().expect("thread panicked");

    let counter = COUNTER.lock().expect("mutex is poisoned");
    println!("COUNTER: {counter}");
}
This is the idiomatic way to write the above code in Rust. It uses a Mutex to protect the global counter from being modified by multiple threads at the same time. The Mutex ensures that only one thread can access the counter at a time.
Notice that we do not need to unlock the Mutex after we are done with it. Calling lock returns a MutexGuard, and when that guard goes out of scope, its Drop implementation automatically unlocks the mutex.
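The automatic unlocking can be observed with try_lock, which returns an error instead of blocking when the mutex is already held. The sketch below is only an illustration; guard_demo is a made-up name. It holds the guard returned by lock, checks that a second lock attempt fails, drops the guard, and checks again.

```rust
use std::sync::Mutex;

// Returns (locked while the guard is alive, locked after the guard is dropped).
fn guard_demo() -> (bool, bool) {
    let counter = Mutex::new(0);

    let guard = counter.lock().expect("mutex is poisoned");
    // While `guard` is alive, a second attempt to lock fails.
    let locked_while_held = counter.try_lock().is_err();

    // Dropping the guard explicitly has the same effect as letting it
    // go out of scope: the mutex is unlocked.
    drop(guard);

    let locked_after_drop = counter.try_lock().is_err();
    (locked_while_held, locked_after_drop)
}

fn main() {
    let (while_held, after_drop) = guard_demo();
    println!("locked while guard is alive: {while_held}");
    println!("locked after guard is dropped: {after_drop}");
}
```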
Also note that the inner value of the Mutex is 'consumed' by the Mutex when we create it with Mutex::new(0). This means that the Mutex takes ownership of the value and is responsible for dropping it. This is why we don't need to drop the inner value of the Mutex after we are done with it.
Because the Mutex has ownership of the inner value, shared code can only reach that value by going through lock, so we don't need to worry about it being used without synchronization.
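For a plain integer counter, an atomic type is an alternative to a Mutex. The following sketch rewrites the global counter with AtomicI32. The choice of Ordering::SeqCst is an assumption made for simplicity: it is the strictest ordering, and weaker orderings may be sufficient for a counter like this.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

static COUNTER: AtomicI32 = AtomicI32::new(0);

fn increment_counter() {
    // fetch_add performs the read-modify-write as one indivisible step.
    COUNTER.fetch_add(1, Ordering::SeqCst);
}

fn main() {
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(increment_counter)).collect();
    for handle in handles {
        handle.join().expect("thread panicked");
    }
    println!("COUNTER: {}", COUNTER.load(Ordering::SeqCst));
}
```

No unsafe block is needed here, because all access to the static goes through the atomic operations.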
In Rust - and arguably in C as well - it is better to avoid global state as much as possible. What you will often see in Rust is that you will construct a struct that holds the state that you want to share between threads, and then pass a reference to that struct to the threads. This is a safer and more idiomatic way to write multithreaded code in Rust.
The mechanism to do this is to wrap the state in an Arc<Mutex<T>> where T is the type of the state. Arc is a thread-safe reference-counted pointer, and Mutex is a thread-safe mutual exclusion lock. This is a common pattern in Rust for sharing state between threads.
Let's redo the example once more using this pattern:
use std::sync::{Arc, Mutex};
use std::thread;

struct Counter {
    value: i32,
}

fn increment_counter(counter: &Arc<Mutex<Counter>>) {
    let mut counter = counter.lock().expect("mutex is poisoned");
    counter.value += 1;
}

fn main() {
    let counter = Arc::new(Mutex::new(Counter { value: 0 }));
    let counter_for_thread = counter.clone();

    let handle = thread::spawn(move || increment_counter(&counter_for_thread));
    handle.join().expect("thread panicked");

    let counter = counter.lock().expect("mutex is poisoned");
    println!("COUNTER: {}", counter.value);
}
Notice that the thread needs an owned copy of the Arc<Mutex<Counter>> to be able to use it. This is why we need to call clone on the Arc before passing it to the thread. The clone method on Arc only increments the reference count, so it is a cheap operation. The move keyword in the closure tells the compiler to move the Arc into the closure, so that the closure takes ownership of the Arc.
In this way, you can spawn many tasks each with their own reference to the shared state. Like so:
use std::sync::{Arc, Mutex};
use std::thread;

struct Counter {
    value: i32,
}

fn increment_counter(counter: &Arc<Mutex<Counter>>) {
    let mut counter = counter.lock().expect("mutex is poisoned");
    counter.value += 1;
}

fn main() {
    let counter = Arc::new(Mutex::new(Counter { value: 0 }));

    // The scope function is used to ensure that the spawned threads are joined
    // before the main thread continues.
    thread::scope(|s| {
        for _ in 0..10 {
            let counter_for_thread = counter.clone();
            s.spawn(move || increment_counter(&counter_for_thread));
        }
    });

    let counter = counter.lock().expect("mutex is poisoned");
    println!("COUNTER: {}", counter.value);
}
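Because thread::scope guarantees that all spawned threads have finished when it returns, scoped threads are allowed to borrow local variables. As a variation on the example above, the sketch below drops the Arc and lets each thread borrow the Mutex directly. The function name count_with_scoped_threads is made up for this illustration.

```rust
use std::sync::Mutex;
use std::thread;

struct Counter {
    value: i32,
}

// Runs `threads` scoped threads that each increment the counter once.
fn count_with_scoped_threads(threads: usize) -> i32 {
    let counter = Mutex::new(Counter { value: 0 });

    thread::scope(|s| {
        for _ in 0..threads {
            // Each thread borrows `counter`. No Arc or clone is needed because
            // the scope ends only after all of its threads have finished.
            s.spawn(|| {
                let mut guard = counter.lock().expect("mutex is poisoned");
                guard.value += 1;
            });
        }
    });

    // All threads are done, so the Mutex can be consumed to get the value.
    counter.into_inner().expect("mutex is poisoned").value
}

fn main() {
    println!("COUNTER: {}", count_with_scoped_threads(10));
}
```

An Arc remains the usual choice when threads are created with thread::spawn and may outlive the function that started them.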