Back to the basics - A small database

In this chapter, we'll add some more people to the people.txt file, and explore how iterators can be used to perform various operations on this list of Person values.

First update main to read the data for 4 people:

use crate::utils::{read_people, write_people, Person};

mod utils;

fn main() {
    let mut people = vec![];
    for _ in 0..4 {
        let person = Person::new();
        people.push(person)
    }

    match write_people(people) {
        Ok(_) => println!("people.txt was written successfully"),
        Err(err) => println!("There was an error while writing people.txt: {err}"),
    }

    match read_people() {
        Ok(people) => print_people(people),
        Err(err) => println!("There was an error while reading people.txt: {err}"),
    }
}

fn print_people(people: Vec<Person>) {
    println!(
        "people.txt was read successfully. {} people found.",
        people.len()
    );
}

Exercise

Run the program and enter the information for the people. Include two children (age < 18). When done, check the contents of people.txt.

If all went well, your program should end with the line: people.txt was read successfully. 4 people found.

Now that we have some data in our database, we do not want to re-enter this data when we restart the program. This means we need to find a way to check if the database already has entries, and only show the data input if it does not. We'll make the following changes:

use crate::utils::{read_people, write_people, Person};

mod utils;

fn main() {
    match read_or_create_db() {
        Ok(people) => print_people(people),
        Err(err) => println!("There was an error while reading people.txt: {}", err),
    }
}

fn print_people(people: Vec<Person>) {
    println!(
        "people.txt was read successfully. {} people found.",
        people.len()
    );
}

fn create_db() -> std::io::Result<Vec<Person>> {
    let mut people = vec![];
    for _ in 0..4 {
        let person = Person::new();
        people.push(person)
    }

    write_people(people)?;
    read_people()
}

fn read_or_create_db() -> std::io::Result<Vec<Person>> {
    match read_people() {
        Ok(people) => {
            if people.len() < 4 {
                // not enough people in the database; let's re-create
                create_db()
            } else {
                Ok(people)
            }
        }
        Err(_err) => {
            // database error; let's re-create
            create_db()
        }
    }
}

Notice the lack of ; at the end of the match and if arms. This means we are returning either the result of create_db(), or Ok(people) as the result of read_or_create_db().
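The same rule can be seen in isolation in the minimal sketch below. The function describe_count is hypothetical and not part of our program; it only mirrors the shape of read_or_create_db, with an if nested inside a match arm and no ; after the final expressions.

```rust
// Hypothetical helper, only here to illustrate expression-based returns.
fn describe_count(count: usize) -> &'static str {
    // No `;` after the match: its value becomes the return value.
    match count {
        0 => "empty",
        n => {
            // No `;` after the if/else either: the value of the chosen
            // branch becomes the value of this match arm.
            if n < 4 {
                "incomplete"
            } else {
                "complete"
            }
        }
    }
}

fn main() {
    println!("0 people: {}", describe_count(0));
    println!("2 people: {}", describe_count(2));
    println!("4 people: {}", describe_count(4));
}
```

If you add a ; after the closing brace of the match, the function no longer ends in a value and the compiler will report a type mismatch.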

Because the code is getting more complex, with an if block nested inside a match, I've added some comments to help myself (and others) read the code.

You can add a comment by prepending // to a line. Comments are ignored by the compiler.

Exercise

Run the program again and notice that it skips the input and immediately prints the number of people found in the database. Now open the file people.txt and delete the last 3 lines. Save it and run the program again. Check that it asks for the people data. For completeness, you should also delete the people.txt file completely and confirm that you are asked to enter the people again.

At this stage we have created a small database with 4 people in it. The database is read on startup and if successful, the print_people function is called with the list of people read.

Filtering people

We've done all this groundwork in order to experiment with iterators. Iterators allow you to perform a series of tasks on a sequence of items, one item at a time. We'll start by implementing a filter that keeps only the children in our list of people:

fn print_people(people: Vec<Person>) {
    let kids: Vec<Person> = people.into_iter().filter(is_kid).collect();

    println!(
        "people.txt was read successfully. {} kids found.",
        kids.len()
    );
}

fn is_kid(person: &Person) -> bool {
    person.age < 18
}

Exercise

Run the program and check the output.

All the action is done with this one line:

let kids: Vec<Person> = people.into_iter().filter(is_kid).collect();

Let's focus on the second part of the statement. We see the following methods that are called in sequence on people:

  • into_iter()
  • filter()
  • collect()

into_iter and collect do opposite jobs. In our case, into_iter takes the vector of people and turns it into an iterator over those people, consuming the vector in the process. The methods that follow use the iterator to act on each element individually. The result is passed to collect, which gathers all the remaining items back into a vector.

The filter method runs the is_kid function on every item and keeps only the items for which it returns true. Put differently, it filters out any item for which is_kid returns false.
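To make the chain less magical, here is a sketch that compares it with a hand-written loop doing roughly the same work. The Person struct below is a stand-in for the one in utils: its field types are assumptions, the derived traits are only there so the example can copy, compare, and print values, and the sample data is made up.

```rust
// Stand-in for the Person type from utils; the field types are assumptions.
#[derive(Clone, Debug, PartialEq)]
struct Person {
    first_name: String,
    last_name: String,
    age: u8,
}

fn is_kid(person: &Person) -> bool {
    person.age < 18
}

// The iterator chain from the text.
fn kids_with_iterator(people: Vec<Person>) -> Vec<Person> {
    people.into_iter().filter(is_kid).collect()
}

// Roughly the same work written as an explicit loop.
fn kids_with_loop(people: Vec<Person>) -> Vec<Person> {
    let mut kids = Vec::new();
    for person in people {
        // filter hands the test a reference, so we borrow here as well.
        if is_kid(&person) {
            kids.push(person);
        }
    }
    kids
}

// Hypothetical helper to keep the sample data short.
fn make_person(first_name: &str, last_name: &str, age: u8) -> Person {
    Person {
        first_name: first_name.to_string(),
        last_name: last_name.to_string(),
        age,
    }
}

// Made-up sample data, for illustration only.
fn sample_people() -> Vec<Person> {
    vec![
        make_person("Ann", "Lee", 8),
        make_person("Bob", "Lee", 15),
        make_person("Cara", "Lee", 42),
        make_person("Dan", "Lee", 67),
    ]
}

fn main() {
    let people = sample_people();
    // Both functions consume their input, so the first call gets a copy.
    let from_chain = kids_with_iterator(people.clone());
    let from_loop = kids_with_loop(people);

    for kid in &from_chain {
        println!("{} {} ({})", kid.first_name, kid.last_name, kid.age);
    }
    println!("same result: {}", from_chain == from_loop);
}
```

Both versions hand back the same people; the iterator version simply states what we want in a single line.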

We had to explicitly tell Rust the type of kids:

let kids: Vec<Person>

Rust will automatically figure out the type of a variable in most cases. In this case, however, the collect method needs a little help: it can build many different kinds of collections, so we have to tell it which type we are collecting the results into.
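Annotating the variable is one way to give collect that hint. Another is to name the target type on collect itself, a notation often called the turbofish. The sketch below shows both side by side; it uses plain numbers instead of Person so that it stands on its own, and the function names are made up for this example.

```rust
fn is_minor(age: &u32) -> bool {
    *age < 18
}

fn minors_annotated(ages: Vec<u32>) -> Vec<u32> {
    // Option 1: annotate the variable, as in the text.
    let minors: Vec<u32> = ages.into_iter().filter(is_minor).collect();
    minors
}

fn minors_turbofish(ages: Vec<u32>) -> Vec<u32> {
    // Option 2: name the target type on collect itself.
    let minors = ages.into_iter().filter(is_minor).collect::<Vec<u32>>();
    minors
}

fn main() {
    let ages = vec![8, 15, 42, 67];
    println!("{:?}", minors_annotated(ages.clone()));
    println!("{:?}", minors_turbofish(ages));
}
```

Which of the two you use is mostly a matter of taste. If you leave out both, the compiler will ask for a type annotation.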

Exercise

Experiment with the is_kid function by changing the test condition. See what effect this has on the number of items that is returned.

Operations on iterators can be chained together. If we only want to display teenagers, we can add another filter to the chain of operations:

fn print_people(people: Vec<Person>) {
    let kids: Vec<Person> = people
        .into_iter()
        .filter(is_kid)
        .filter(is_a_teenager)
        .collect();

    println!(
        "people.txt was read successfully. {} teenagers found.",
        kids.len()
    );
}

fn is_kid(person: &Person) -> bool {
    person.age < 18
}

fn is_a_teenager(person: &Person) -> bool {
    person.age >= 10
}

Another common operation is map. map takes each item and converts it into something else, possibly of another type. In our case, I'd like to display the names of the teenagers, so I will use map to turn each Person into a String containing their full name.

fn print_people(people: Vec<Person>) {
    let kids: Vec<String> = people
        .into_iter()
        .filter(is_kid)
        .filter(is_a_teenager)
        .map(extract_name)
        .collect();

    println!(
        "people.txt was read successfully. These teenagers found: {:?}",
        kids
    );
}

fn is_kid(person: &Person) -> bool {
    person.age < 18
}

fn is_a_teenager(person: &Person) -> bool {
    person.age >= 10
}

fn extract_name(person: Person) -> String {
    format!("{} {}", person.first_name, person.last_name)
}

Note that the type of the vector has changed to Vec<String>.

The format! macro used in the extract_name function works similarly to print!. It formats the output in the same way, but instead of printing the result to the screen, it returns the resulting String.
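As a small sketch of what format! gives you, the two functions below build a name in two different layouts. The function names and the sample name are made up for this example, and the second function names its variables directly inside the braces, the same way {err} was used earlier in this chapter.

```rust
// Positional placeholders, as in extract_name.
fn full_name(first_name: &str, last_name: &str) -> String {
    format!("{} {}", first_name, last_name)
}

// Variables named directly inside the braces.
fn last_name_first(first_name: &str, last_name: &str) -> String {
    format!("{last_name}, {first_name}")
}

fn main() {
    // format! returns a String, so we can store it before printing it.
    let name = full_name("Ada", "Lovelace");
    println!("{}", name);
    println!("{}", last_name_first("Ada", "Lovelace"));
}
```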

Exercise

Experiment with format! to display the children in different ways.

The cool thing that we can do with a vector of strings is that we can use join to stitch these individual strings together into one big String, separated by a specific piece of text:

fn print_people(people: Vec<Person>) {
    let kids: Vec<String> = people
        .into_iter()
        .filter(is_kid)
        .filter(is_a_teenager)
        .map(extract_name)
        .collect();

    let kid_names = kids.join(", ");

    println!(
        "people.txt was read successfully. These teenagers found: {}",
        kid_names
    );
}
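To see join on its own, here is a minimal sketch with a hypothetical join_names helper and made-up names. It shows that the separator only appears between items, and that an empty list produces an empty String.

```rust
// Hypothetical helper that wraps join so we can try different separators.
fn join_names(names: &[String], separator: &str) -> String {
    names.join(separator)
}

fn main() {
    let names = vec![String::from("Ann Lee"), String::from("Bob Lee")];
    println!("{}", join_names(&names, ", "));
    println!("{}", join_names(&names, " and "));

    // With nothing to join, the result is an empty String.
    let nobody: Vec<String> = Vec::new();
    println!("[{}]", join_names(&nobody, ", "));
}
```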

Closures

In the examples above, we've created three functions that contain only a single line of code: is_kid(), is_a_teenager(), and extract_name(). This is a bit of a waste of space, especially if we don't use these functions anywhere else in our code.

Rust provides a way to quickly create an anonymous function: a closure. A closure is a function without a name, which is why closures are also called anonymous functions. Closures have a slightly different syntax than regular functions. In most cases you do not need to specify the input and output types of a closure; Rust will figure them out.

So the fn is_kid(person: &Person) -> bool can be rewritten as a closure like this:

|person| person.age < 18

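We use the pipe symbol | to enclose the input parameters. Again, you typically do not need to specify the types of the input parameters: the compiler infers them from how the closure is used. Here, filter expects a function that takes a reference to an item and returns a bool. Closures can also capture variables from their environment, which regular functions cannot do.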
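The sketch below puts three closures next to each other: one with inferred types, one with the types spelled out, and one that uses a variable from the surrounding function. It works on plain ages instead of Person so that it stands on its own. The function names are made up, and count is an iterator method that simply counts the items that are left.

```rust
fn count_kids(ages: Vec<u8>) -> usize {
    // Types are inferred: filter tells the compiler that `age` is a
    // reference to an item and that the closure returns a bool.
    ages.into_iter().filter(|age| *age < 18).count()
}

fn count_kids_annotated(ages: Vec<u8>) -> usize {
    // The same closure with its input and output types written out.
    ages.into_iter()
        .filter(|age: &u8| -> bool { *age < 18 })
        .count()
}

fn count_younger_than(ages: Vec<u8>, limit: u8) -> usize {
    // `limit` is not a parameter of the closure; it is captured from
    // the enclosing function.
    ages.into_iter().filter(|age| *age < limit).count()
}

fn main() {
    let ages = vec![8, 15, 42, 67];
    println!("kids: {}", count_kids(ages.clone()));
    println!("kids (annotated): {}", count_kids_annotated(ages.clone()));
    println!("younger than 65: {}", count_younger_than(ages, 65));
}
```

The * in *age is needed here because filter hands the closure a reference and we compare it with a plain number. With person.age, Rust follows the reference for us when we access the field.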

If we rewrite our iterator chain with closures, it looks like this:

fn print_people(people: Vec<Person>) {
    let kids: Vec<String> = people
        .into_iter()
        .filter(|person| person.age < 18)
        .filter(|person| person.age >= 10)
        .map(|person| format!("{} {}", person.first_name, person.last_name))
        .collect();

    let kid_names = kids.join(", ");

    println!(
        "people.txt was read successfully. These teenagers found: {}",
        kid_names
    );
}

We can now also delete the three functions we've replaced with closures. The combination of iterators and closures allows you to write very expressive code in a condensed way.

Exercise

Rewrite the iterator in the print_people function to display the last names of seniors (65+). If you need to amend your database, you can either modify it directly in RustRover (by double-clicking the people.txt file), or delete the file and re-run your code to input the data from scratch.