Migrate to Rust from C

If you're coming from C, you'll find that Rust has a lot of similarities. However, there are some differences that you'll need to get used to.

Organizing code

In C, you can organize your code into separate files using header files and source files. In Rust, you can organize your code into modules. Modules allow you to group related code together and control the visibility of code within the module.

Here's an example of organizing code into modules in Rust:

mod math {
    pub fn add(a: i32, b: i32) -> i32 {
        a + b
    }
}

fn main() {
    use math::add;
    let result = add(1, 2);
    println!("{result}");
}

Although the above example is valid, larger Rust projects typically put each module in its own file and use the mod keyword to declare it in the main file. The contents of the mod math {} block (without the surrounding mod math { } wrapper) would then live in a separate file called math.rs or math/mod.rs. The main.rs file would look like this:

use math::add;

mod math;

fn main() {
    let result = add(1, 2);
    println!("{result}");
}

In C you would have a header file math.h and a source file math.c that would look like this:

// math.h
int add(int a, int b);

// math.c
#include "math.h"

int add(int a, int b) {
    return a + b;
}

// main.c
#include <stdio.h>
#include "math.h"

int main() {
    int result = add(1, 2);
    printf("%d\n", result);
}
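
To illustrate the point about controlling visibility, here is a minimal sketch. The geometry module and its function names are made up for this example. Items are private to their module unless marked pub, which is roughly comparable to keeping a function static in a .c file and leaving it out of the header.

```rust
// Hypothetical `geometry` module: only items marked `pub` are visible outside it.
mod geometry {
    // Private helper: callable inside `geometry`, invisible to `main`.
    fn square(x: i32) -> i32 {
        x * x
    }

    // Public function: part of the module's interface, like a prototype in a header.
    pub fn area_of_square(side: i32) -> i32 {
        square(side)
    }
}

fn main() {
    println!("{}", geometry::area_of_square(3));
    // geometry::square(3); // would not compile: `square` is private
}
```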

Optimizing code for production

In C, you can use compiler flags like -O2 to optimize your code for production. In Rust, you can use the --release flag with the cargo build command to optimize your code for production.

Here's an example of compiling a C program with optimizations:

$ gcc -O2 -o program program.c

With Cargo, the equivalent command looks like this:

$ cargo build --release

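The --release flag tells Cargo to build with the release profile. By default, this profile enables a high optimization level (opt-level 3), omits debug information, and disables debug assertions and integer overflow checks. The resulting binary is placed in target/release instead of target/debug.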

Language features

Pointers

In C, you can create a pointer to a variable like this:

#include <stdio.h>

int main()
{
    int x = 10;
    int *ptr = &x;
    printf("Value of x: %d\n", *ptr);
}

In Rust, you can create a reference to a variable like this:

fn main() {
    let x = 10;
    let ptr = &x;
    println!("Value of x: {}", *ptr);
}

Lifetimes are Rust's way of tracking how long a reference remains valid. This is a key difference between Rust and C. In C, nothing stops you from keeping a pointer to memory that has already been freed or has gone out of scope, and using such a dangling pointer is undefined behavior. In Rust, the compiler rejects any program in which a reference could outlive the value it points to. This is a blessing and a curse, as it can be a bit tricky to get used to at first.
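
Most of the time the compiler infers lifetimes for you. When a function returns a reference that could come from more than one argument, you have to spell the relationship out. Here is a minimal sketch; the longest function is a made-up helper for illustration.

```rust
// Hypothetical helper: returns the longer of two string slices.
// The lifetime `'a` tells the compiler that the result borrows from the inputs,
// so it must not be used after either input has been dropped.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() {
        a
    } else {
        b
    }
}

fn main() {
    let first = String::from("migrate");
    let second = String::from("rust");
    let result = longest(&first, &second);
    println!("{result}");
}
```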

For newcomers, using a smart pointer like Rc or Arc can be a good way to avoid dealing with lifetimes at first.

use std::rc::Rc;

fn main() {
    let x = 10;
    let ptr = Rc::new(x);
    println!("Value of x: {}", *ptr);
}

The overhead of using smart pointers is negligible in most cases, and it can save you a lot of headaches.
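
To see what Rc does under the hood, here is a small sketch. The owner_counts function is a made-up helper. Cloning an Rc does not copy the value; it only increments a reference count, and the value is freed when the last owner is dropped. Note that Rc is for single-threaded code; Arc is the thread-safe variant.

```rust
use std::rc::Rc;

// Hypothetical helper: returns the owner count while two handles exist,
// and again after one of them has been dropped.
fn owner_counts() -> (usize, usize) {
    let shared = Rc::new(String::from("shared value"));
    let second_owner = Rc::clone(&shared); // bumps the count, no deep copy
    let while_shared = Rc::strong_count(&shared);
    drop(second_owner); // gives up one owner
    let after_drop = Rc::strong_count(&shared);
    (while_shared, after_drop)
}

fn main() {
    let (while_shared, after_drop) = owner_counts();
    println!("owners while shared: {while_shared}, after drop: {after_drop}");
}
```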

Memory Management

In C, you have to manage memory manually using malloc and free. This can be error-prone and lead to memory leaks and other issues. In Rust, memory management is handled by the compiler using the ownership system. This system ensures that memory is freed when it is no longer needed, and prevents common memory-related bugs like use-after-free and double-free errors.

Here's an example of manual memory management in C:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *ptr = malloc(sizeof(int));
    if (ptr == NULL) {
        return 1;
    }
    *ptr = 10;
    printf("Value of x: %d\n", *ptr);
    free(ptr);
}

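In Rust, the closest equivalent allocates the value on the heap with Box:

fn main() {
    // Box::new allocates on the heap, much like malloc in the C version.
    let ptr = Box::new(10);
    println!("Value of x: {}", *ptr);
} // ptr goes out of scope here and the heap memory is freed

The Box is dropped when it goes out of scope, and its heap memory is freed automatically, with no call to free. In the above example, this happens at the end of main.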
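
To make the automatic cleanup visible, here is a sketch that counts how often a value is released. The Resource type and the count_releases function are made up for this example. The Drop trait is the hook that Rust calls when a value goes out of scope.

```rust
use std::cell::Cell;

// Hypothetical resource that records when it is released.
struct Resource<'a> {
    released: &'a Cell<u32>,
}

impl Drop for Resource<'_> {
    // Called automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

// Creates two resources in an inner scope and reports how many were released.
fn count_releases() -> u32 {
    let released = Cell::new(0);
    {
        let _a = Resource { released: &released };
        let _b = Resource { released: &released };
    } // both resources are dropped here, without any explicit call
    released.get()
}

fn main() {
    println!("released: {}", count_releases());
}
```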

Strings

In C, strings are represented as null-terminated arrays of characters. In Rust, the two main string types are String, an owned and growable buffer, and &str, a borrowed string slice that points into a String or into a string literal. Rust strings are always valid UTF-8, which means they can contain any Unicode character.

Here's an example of working with strings in C:

#include <stdio.h>

int main()
{
    const char *name = "John";
    printf("Hello, %s!\n", name);
}

In Rust, the same code would look like this:

fn main() {
    let name = "John";
    println!("Hello, {name}!");
}

In Rust, you can also create a String object like this:

fn main() {
    let name = "John".to_string();
    println!("Hello, {name}!");
}
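
Here is a short sketch of how the two string types work together. The greeting function is a made-up helper. It takes a borrowed &str and returns an owned String that it builds up step by step. Because strings are UTF-8, len reports bytes, not characters.

```rust
// Hypothetical helper: builds a greeting by growing an owned String.
fn greeting(name: &str) -> String {
    let mut text = String::from("Hello, ");
    text.push_str(name); // append a borrowed slice
    text.push('!'); // append a single char
    text
}

fn main() {
    // "\u{eb}" is the character ë, which takes two bytes in UTF-8.
    let owned = greeting("Zo\u{eb}");
    // Borrow the first five bytes of the String as a &str.
    let borrowed: &str = &owned[..5];
    println!("{owned} / {borrowed}");
    println!("{} bytes, {} chars", owned.len(), owned.chars().count());
}
```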

Arrays

In C, arrays are fixed-size collections of elements. In Rust, arrays are also fixed-size, but they are bounds-checked at runtime. This means that if you try to access an element outside the bounds of the array, your program will panic.

Here's an example of working with arrays in C:

#include <stdio.h>

int main()
{
    int arr[3] = {1, 2, 3};
    for (int i = 0; i < 3; i++) {
        printf("%d\n", arr[i]);
    }
}

In Rust, the same code would look like this:

fn main() {
    let arr = [1, 2, 3];
    for i in 0..3 {
        println!("{}", arr[i]);
    }
}

In Rust, you can also use the iter method to iterate over the elements of an array. This would be a more idiomatic way to write the above code:

fn main() {
    let arr = [1, 2, 3];
    arr.iter().for_each(|i| println!("{i}"));
}
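
If a panic is not what you want for an out-of-bounds index, the get method returns an Option instead. Here is a minimal sketch; the element_or_default helper and the fallback value of -1 are arbitrary choices for illustration.

```rust
// Hypothetical helper: looks up an element without risking a panic.
// `get` returns None for an out-of-bounds index instead of panicking.
fn element_or_default(values: &[i32], index: usize) -> i32 {
    values.get(index).copied().unwrap_or(-1)
}

fn main() {
    let arr = [1, 2, 3];
    println!("{}", element_or_default(&arr, 1));
    println!("{}", element_or_default(&arr, 7));
}
```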

Arrays vs. Vectors

In Rust, arrays are fixed-size collections of elements, while vectors are dynamic-size collections. Vectors are more flexible than arrays, but they store their elements on the heap, so creating or growing one involves an allocation. If you need a collection that can grow or shrink at runtime, you should use a vector. If you know the size of the collection at compile time, you should use an array.

Here's an example of working with vectors in Rust:

fn main() {
    let mut list = vec![1, 2, 3];
    list.push(4);
    list.iter().for_each(|i| println!("{i}"));
}
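
When you know roughly how many elements a vector will hold, you can reserve the space up front, much like a single malloc in C. This is a sketch with a made-up squares helper:

```rust
// Hypothetical helper: collects the first `count` square numbers.
fn squares(count: usize) -> Vec<usize> {
    // Reserve room for `count` elements so that push does not need to reallocate.
    let mut list = Vec::with_capacity(count);
    for i in 0..count {
        list.push(i * i);
    }
    list
}

fn main() {
    let list = squares(4);
    println!("{list:?}, capacity at least {}", list.capacity());
}
```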

Ternary operator

In C, you can use the ternary operator ? : to conditionally assign a value to a variable. In Rust, you can use the if expression to achieve the same result.

Here's an example of using the ternary operator in C:

#include <stdio.h>

int main()
{
    int x = 10;
    int y = x > 5 ? 1 : 0;
    printf("%d\n", y);
}

In Rust, the same code would look like this:

fn main() {
    let x = 10;
    let y = if x > 5 { 1 } else { 0 };
    println!("{y}");
}

Dealing with file I/O

In C, you can read and write files using the fopen, fread, fwrite, and fclose functions. In Rust, you can use the std::fs module to read and write files.

Here's an example of reading a file in C:

#include <stdio.h>

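int main()
{
    FILE *file = fopen("file.txt", "r");
    if (file == NULL) {
        perror("Unable to open file");
        return 1;
    }
    char buffer[256];
    /* Leave room for the terminator: fread does not add one. */
    size_t count = fread(buffer, 1, sizeof(buffer) - 1, file);
    buffer[count] = '\0';
    printf("%s\n", buffer);
    fclose(file);
}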

In Rust, the same code would look like this:

use std::fs::File;
use std::io::Read;

fn main() {
    let mut file = File::open("file.txt").expect("Unable to open file");
    let mut buffer = String::new();
    file.read_to_string(&mut buffer).expect("Unable to read file");
    println!("{buffer}");
}

In Rust, you can use the std::fs::write function to write to a file:

use std::fs;

fn main() {
    let data = "Hello, world!";
    fs::write("another_file.txt", data).expect("Unable to write file");
}

In C you would use fwrite to write to a file:

#include <stdio.h>

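int main()
{
    FILE *file = fopen("another_file.txt", "w");
    if (file == NULL) {
        perror("Unable to open file");
        return 1;
    }
    char data[] = "Hello, world!";
    /* sizeof(data) includes the terminating '\0', which should not be written. */
    fwrite(data, 1, sizeof(data) - 1, file);
    fclose(file);
}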
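
The Rust examples above call expect, which panics when something goes wrong. In real programs you will usually want to pass the error back to the caller instead, much like checking the return value of fopen in C. Here is a sketch using the ? operator. The write_then_read helper and the file name are made up for this example.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: writes `data` to `path`, then reads it back.
// The `?` operator returns early with the error instead of panicking.
fn write_then_read(path: &Path, data: &str) -> io::Result<String> {
    fs::write(path, data)?;
    fs::read_to_string(path)
}

fn main() {
    // The file name is an arbitrary choice for this sketch.
    let path = std::env::temp_dir().join("migrate_example.txt");
    match write_then_read(&path, "Hello, world!") {
        Ok(contents) => println!("{contents}"),
        Err(error) => eprintln!("I/O failed: {error}"),
    }
}
```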

Functions

In C, functions are defined using the return_type function_name(parameters) syntax. In Rust, functions are defined using the fn function_name(parameters) -> return_type syntax. Rust functions can take parameters and return values, just like C functions.

Here's an example of defining a function in C:

#include <stdio.h>

int add(int a, int b)
{
    return a + b;
}

int main()
{
    int result = add(1, 2);
    printf("%d\n", result);
}

In Rust, the same code would look like this:

fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    let result = add(1, 2);
    println!("{result}");
}

Note: In Rust, the last expression in a function is implicitly returned. This means that you don't need to use the return keyword to return a value from a function.
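
The return keyword still exists and is typically used for early exits. Here is a small sketch that combines both styles; the absolute function is a made-up example.

```rust
// Hypothetical helper: returns the absolute value of `x`.
fn absolute(x: i32) -> i32 {
    if x < 0 {
        return -x; // early exit needs the `return` keyword
    }
    x // last expression, returned implicitly (note: no semicolon)
}

fn main() {
    println!("{} {}", absolute(-5), absolute(7));
}
```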

Void functions

In C, you can define a function that doesn't return a value using the void keyword. In Rust, you can define a function that doesn't return a value using the () type. This is called the unit type, but it is often not explicitly written.

Here's an example of defining a void function in C:

#include <stdio.h>

void greet()
{
    printf("Hello, world!\n");
}

int main()
{
    greet();
}

In Rust, the same code would look like this:

fn greet() {
    println!("Hello, world!");
}

fn main() {
    greet();
}

We could have written the greet function like this:

fn greet() -> () {
    println!("Hello, world!");
}

But it is not idiomatic Rust to write it like this. And Clippy will warn you about it:

warning: unneeded unit return type
 --> src/main.rs:1:11
  |
1 | fn greet() -> () {
  |           ^^^^^^ help: remove the `-> ()`
  |
  = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unused_unit
  = note: `#[warn(clippy::unused_unit)]` on by default

For-loops

In C you can use a for loop to initialize and then iterate over a range of values. A typical C-style for loop looks like this:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int size = 10;
    int *arr = malloc(size * sizeof(int));
    if (arr == NULL) {
        exit(EXIT_FAILURE);
    }
    for (int i = 0; i < size; ++i) {
        arr[i] = i + 1;
    }
    for (int i = 0; i < size; ++i) {
        printf("%d\n", arr[i]);
    }
    free(arr);
}

What I've seen from people that move to Rust from C is that they often use the for loop in Rust in a C-style way:

fn main() {
    let size = 10;
    let mut arr = vec![0; size];

    for i in 0..size {
        arr[i] = i + 1;
    }

    for i in 0..size {
        println!("{}", arr[i]);
    }
}

This is not idiomatic Rust, and it can have performance implications. Rust checks every index at runtime to ensure that you don't access elements outside the bounds of the collection. The optimizer can often remove these checks, but when it cannot, the above code may be slower than the equivalent C code. The idiomatic way to write the above code in Rust would be like this:

fn main() {
    let size = 10;
    let arr: Vec<i32> = (1..=size).collect();
    arr.iter().for_each(|i| println!("{i}"));
}

Using iterators is the idiomatic way to write loops in Rust. It is more concise, and because iterating does not involve indexing, it is usually at least as fast as a C-style for loop.
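
Iterators can also be chained, and enumerate gives you the index when you really need it. Here is a sketch; the sum_of_even_squares function is a made-up helper.

```rust
// Hypothetical helper: keeps the even values, squares them, and adds them up.
fn sum_of_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        .filter(|&&v| v % 2 == 0)
        .map(|&v| v * v)
        .sum()
}

fn main() {
    let arr = [1, 2, 3, 4];
    // `enumerate` yields (index, value) pairs, so no manual indexing is needed.
    for (index, value) in arr.iter().enumerate() {
        println!("{index}: {value}");
    }
    println!("{}", sum_of_even_squares(&arr));
}
```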

Macros

Rust has a powerful macro system that allows you to define custom syntax that is expanded at compile time. This plays a similar role to the preprocessor in C, but Rust macros operate on the token structure of the code rather than on raw text. Here's an example of a macro in C:

#include <stdio.h>

#define LOG(message) printf("%s\n", message)

int main()
{
    LOG("Hello, world!");
}

In Rust, the same code would look like this:

macro_rules! log {
    ($message:expr) => {
        println!("{}", $message);
    };
}

fn main() {
    log!("Hello, world!");
}
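
One practical difference from the C preprocessor is worth a sketch. In C, #define SQUARE(x) x * x expands SQUARE(1 + 2) to 1 + 2 * 1 + 2, which is 5, unless you add parentheses yourself. A Rust macro treats an expr argument as a single expression, so this classic pitfall does not occur. The square macro below is a made-up example.

```rust
// Hypothetical macro: multiplies an expression by itself.
// `$x:expr` is substituted as one complete expression, not as raw text.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

fn main() {
    // Expands to the equivalent of (1 + 2) * (1 + 2).
    println!("{}", square!(1 + 2));
}
```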

Multi-threading

In C, you can create threads using the pthread library. In Rust, you can create threads using the std::thread module. A simple example in C that creates a thread looks like this:

#include <stdio.h>
#include <pthread.h>

void* print_message(void* message)
{
    printf("%s\n", (char*)message);
    return NULL;
}

int main()
{
    pthread_t thread;
    char* message = "Hello, world!";
    pthread_create(&thread, NULL, print_message, (void*)message);
    pthread_join(thread, NULL);
}

In Rust, the same code would look like this:

use std::thread;

fn print_message(message: &str) {
    println!("{message}");
}

fn main() {
    let message = "Hello, world!";
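    // `move` copies the &'static str into the closure; without it the closure
    // would borrow the local variable and the code would not compile.
    let handle = thread::spawn(move || print_message(message));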
    handle.join().expect("thread panicked");
}
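
In C, a thread function hands back its result through a void pointer. In Rust, join returns whatever the closure returned, with its proper type. Here is a sketch with a made-up sum_in_thread helper:

```rust
use std::thread;

// Hypothetical helper: sums the values on a separate thread.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    // `move` transfers ownership of `values` into the new thread.
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    // `join` waits for the thread and yields the closure's return value.
    handle.join().expect("thread panicked")
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3]));
}
```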

Stuff that you can't do in "safe" Rust

I came across a C example that let you modify a global variable from a thread, without any synchronization or locking mechanism. When more than one thread accesses the variable this way, the result is a data race, which is undefined behavior in both C and Rust. Safe Rust will not allow you to modify a global static variable from a thread without using a synchronization primitive like a Mutex or an atomic type.

In C you would be allowed to do this:

#include <stdio.h>
#include <pthread.h>

int counter = 0;

void* increment_counter(void* arg)
{
    counter++;
    return NULL;
}

int main()
{
    pthread_t thread;
    pthread_create(&thread, NULL, increment_counter, NULL);
    pthread_join(thread, NULL);
    printf("%d\n", counter);
}

To the credit of the author of the original C example, they did mention that this is not a good practice and that you should use a synchronization primitive like a mutex to protect the global variable. But as you can see, C is not going to stop you from doing this.

In "safe" Rust this is not possible. The code will not compile! Let's look at the equivalent "unsafe" Rust code:

use std::thread;

static mut COUNTER: i32 = 0;

fn increment_counter() {
    unsafe {
        COUNTER += 1;
    }
}

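fn main() {
    let handle = thread::spawn(|| increment_counter());
    handle.join().expect("thread panicked");
    // Copy the value out first: println! would otherwise take a reference
    // to the mutable static, which newer Rust editions reject.
    let value = unsafe { COUNTER };
    println!("COUNTER: {value}");
}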

The use of unsafe is a big red flag. It tells the reader that the compiler can no longer verify the code, and that the developer is now responsible for avoiding Undefined Behavior. I suggest you read more about Undefined Behavior (UB) in Rust, and why it is important to avoid it. UB will bite you when you least expect it. Stay safe! 😁

Exercise

Try removing the unsafe blocks from the above code and see what happens.

So now let's build the Rust code the way it should be built; assuming that you still want a global counter:

use std::sync::Mutex;
use std::thread;

static COUNTER: Mutex<i32> = Mutex::new(0);

fn increment_counter() {
    let mut counter = COUNTER.lock().expect("mutex is poisoned");
    *counter += 1;
}

fn main() {
    let handle = thread::spawn(|| increment_counter());
    handle.join().expect("thread panicked");
    let counter = COUNTER.lock().expect("mutex is poisoned");
    println!("COUNTER: {counter}");
}

This is the idiomatic way to write the above code in Rust. It uses a Mutex to protect the global counter from being modified by multiple threads at the same time. The Mutex ensures that only one thread can access the counter at a time.

Notice that we do not need to unlock the Mutex after we are done with it. The lock method returns a MutexGuard, and when that guard goes out of scope, its Drop implementation automatically unlocks the mutex.

Also note that the inner value is moved into the Mutex when we create it with Mutex::new(0). This means that the Mutex takes ownership of the value and is responsible for dropping it when the Mutex itself is dropped. This is why we don't need to clean up the inner value ourselves.

Because the Mutex has ownership of the inner value, there is no way to access the inner value of a shared Mutex without locking it, so we don't need to worry about it being used incorrectly.
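
For a simple counter like this one, an atomic type is an alternative to a Mutex. The following is a sketch of the same global counter using AtomicI32. The thread count of four and the SeqCst ordering are choices made for this example; SeqCst is the strictest ordering and a safe default when in doubt.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::thread;

static COUNTER: AtomicI32 = AtomicI32::new(0);

fn increment_counter() {
    // fetch_add performs the read-modify-write as one indivisible step.
    COUNTER.fetch_add(1, Ordering::SeqCst);
}

fn main() {
    // Spawn a few threads that all increment the same global counter.
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(increment_counter)).collect();
    for handle in handles {
        handle.join().expect("thread panicked");
    }
    println!("COUNTER: {}", COUNTER.load(Ordering::SeqCst));
}
```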

In Rust - and arguably in C as well - it is better to avoid global state as much as possible. What you will often see in Rust is that you will construct a struct that holds the state that you want to share between threads, and then pass a reference to that struct to the threads. This is a safer and more idiomatic way to write multithreaded code in Rust.

The mechanism to do this is to wrap the state in an Arc<Mutex<T>> where T is the type of the state. Arc is a thread-safe reference-counted pointer, and Mutex is a thread-safe mutual exclusion lock. This is a common pattern in Rust for sharing state between threads.

Let's redo the example once more using this pattern:

use std::sync::{Arc, Mutex};
use std::thread;

struct Counter {
    value: i32,
}

fn increment_counter(counter: &Arc<Mutex<Counter>>) {
    let mut counter = counter.lock().expect("mutex is poisoned");
    counter.value += 1;
}

fn main() {
    let counter = Arc::new(Mutex::new(Counter { value: 0 }));
    let counter_for_thread = counter.clone();
    let handle = thread::spawn(move || increment_counter(&counter_for_thread));
    handle.join().expect("thread panicked");
    let counter = counter.lock().expect("mutex is poisoned");
    println!("COUNTER: {}", counter.value);
}

Notice that the thread needs an owned copy of the Arc<Mutex<Counter>> to be able to use it. This is why we need to call clone on the Arc before passing it to the thread. The clone method on Arc only increments the reference count, so it is a cheap operation. The move keyword in the closure tells the compiler to move the Arc into the closure, so that the closure takes ownership of the Arc.

In this way, you can spawn many threads, each with its own reference to the shared state. Like so:

use std::sync::{Arc, Mutex};
use std::thread;

struct Counter {
    value: i32,
}

fn increment_counter(counter: &Arc<Mutex<Counter>>) {
    let mut counter = counter.lock().expect("mutex is poisoned");
    counter.value += 1;
}

fn main() {
    let counter = Arc::new(Mutex::new(Counter { value: 0 }));

    // The scope function is used to ensure that the spawned threads are joined before the main thread continues.
    thread::scope(|s| {
        for _ in 0..10 {
            let counter_for_thread = counter.clone();
            s.spawn(move || increment_counter(&counter_for_thread));
        }
    });

    let counter = counter.lock().expect("mutex is poisoned");
    println!("COUNTER: {}", counter.value);
}
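
Because scoped threads are guaranteed to finish before scope returns, they are allowed to borrow local variables directly. That means that for this particular case the Arc is optional. Here is a sketch of that variant; the count_with_scoped_threads function is a made-up helper.

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical helper: increments a counter once per scoped thread.
fn count_with_scoped_threads(thread_count: usize) -> i32 {
    let counter = Mutex::new(0);

    thread::scope(|s| {
        for _ in 0..thread_count {
            // The closure only borrows `counter`; no Arc and no clone needed.
            s.spawn(|| {
                *counter.lock().expect("mutex is poisoned") += 1;
            });
        }
    }); // all spawned threads are joined here

    // No other thread can hold the lock anymore, so take the value out.
    counter.into_inner().expect("mutex is poisoned")
}

fn main() {
    println!("COUNTER: {}", count_with_scoped_threads(10));
}
```

Arc remains the right tool when threads may outlive the function that created the shared state, as with thread::spawn.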

Reference material