Better Rust code with TDD

The most challenging aspect of TDD is to find the right "level" of testing. If you start writing tests for high-level functions, you'll end up with tests that are large, complicated and need a lot of "mocking" or dependency injection.

My suggestion is to write tests from the "inside out". Start with the smallest possible unit of code and write tests for that. Then move on to the next level of code and write tests for that. This way you'll end up with a suite of tests that are small, simple and fast to run.

The alternative is to write tests for the high-level functions first, which is called "top-down" testing. This approach is common when adding tests to legacy code, but it is not well suited to TDD.

In this chapter, we'll explore what code looks like when written with TDD and with "top-down" testing. We'll start with the latter.

Top-down testing

Building on the previous chapter, imagine that we need to read a line from a file and then parse the comma-separated fields, and return these in a Vec.

As seasoned Rustaceans, we know that we can use the fs module to read the file and the split method to parse the fields. So we write the following. Add an "input.txt" file with the content "Tom,Dick,,Harry" to the project root. Run the program and confirm all is fine!

use std::fs;

fn main() {
    let filename = "input.txt";
    let fields = parse_fields_from_file(filename);
    println!("{fields:?}");
}

fn parse_fields_from_file(filename: &str) -> Vec<String> {
    let lines = fs::read_to_string(filename).unwrap();
    let first_line = lines.lines().next().unwrap();
    first_line
        .split(',')
        .filter(|f| !f.is_empty())
        .map(|f| f.to_string())
        .collect()
}

Now the QA team reminds us that we need a test to ensure sufficient code coverage. As we start writing the test, we realize that a unit test has no access to the "input.txt" file. As it turns out, there is no easy way to test this code without refactoring it. 🤔
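To make the difficulty concrete, here is a rough sketch of what a test for the unrefactored function might look like. It assumes the test is allowed to write to the system's temporary directory, and the file name "tdd_top_down_example.txt" is a made-up fixture name, not something from the project.

```rust
use std::fs;

fn parse_fields_from_file(filename: &str) -> Vec<String> {
    let lines = fs::read_to_string(filename).unwrap();
    let first_line = lines.lines().next().unwrap();
    first_line
        .split(',')
        .filter(|f| !f.is_empty())
        .map(|f| f.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn can_parse_fields_from_file() {
        // Hypothetical fixture: the test has to create its own input file...
        let path = std::env::temp_dir().join("tdd_top_down_example.txt");
        fs::write(&path, "Tom,Dick,,Harry\n").unwrap();

        let fields = parse_fields_from_file(path.to_str().unwrap());

        // ...and clean it up again before checking the result.
        fs::remove_file(&path).unwrap();
        assert_eq!(fields, ["Tom", "Dick", "Harry"]);
    }
}
```

Every additional case (empty fields, blank lines, stray whitespace) would need another file on disk, which is why this style of test tends to become slow and awkward.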

Test-driven Development

Let's do the same exercise using TDD.

We'll start by testing & writing the smallest piece of logic we can think of, which in this case (coincidentally) is parsing the single line of comma-separated values. Exactly what we did in the previous chapter! As a reminder:

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }
}

The next piece of logic to test & write would be to fetch the first line of a newline-separated String. Let's add the test for the most trivial case.

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        // ...
    }

    /// Confirm that we can retrieve the first line of a newline-separated String
    #[test]
    fn can_parse_lines() {
        let first_line = read_first_line("");
        assert_eq!(first_line, "");
    }
}

The first version of the read_first_line() function could be:

fn read_first_line(file_contents: &str) -> &str {
    file_contents
}
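The next red-green step might then look something like the sketch below: one new assertion that fails against the first version, followed by the smallest change that makes it pass. This is an assumed intermediate step for illustration, not the final implementation.

```rust
// Second iteration: just enough logic to return the text before the first newline.
fn read_first_line(file_contents: &str) -> &str {
    file_contents.split('\n').next().unwrap_or("")
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Confirm that we can retrieve the first line of a newline-separated String
    #[test]
    fn can_parse_lines() {
        assert_eq!(read_first_line(""), "");
        // New assertion: the first version would return "a\nb" here and fail.
        assert_eq!(read_first_line("a\nb"), "a");
    }
}
```

An input with a leading blank line, such as "\na\nb", still returns an empty string at this point. Adding an assertion for that case is what drives the next change to the function.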

Using TDD, we'll extend the test coverage and function code, and end up with this:

fn main() {}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

fn read_first_line(file_contents: &str) -> &str {
    file_contents.split('\n').find(|s| !s.is_empty())
        .unwrap_or("")
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }

    /// Confirm that we can retrieve the first line of a newline-separated String
    #[test]
    fn can_parse_lines() {
        let first_line = read_first_line("");
        assert_eq!(first_line, "");

        let first_line = read_first_line("\n");
        assert_eq!(first_line, "");

        let first_line = read_first_line("a");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("a\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb\n");
        assert_eq!(first_line, "a");
    }
}

Another important aspect of writing tests is that we think carefully about what we are trying to test. And, perhaps more importantly, that we do not test code that is not part of our program, such as code from the Rust standard library.

So in our case, there is no need to verify that fs::read_to_string() works as expected. We can assume that it does. With this in mind, we can write the following code:

fn main() {
    let filename = "input.txt";
    let fields = parse_fields_from_file(filename);
    println!("{fields:?}");
}

fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

fn read_first_line(file_contents: &str) -> &str {
    file_contents.split('\n').find(|s| !s.is_empty())
        .unwrap_or("")
}

fn parse_fields_from_file(filename: &str) -> Vec<String> {
    let file_contents = std::fs::read_to_string(filename)
        .expect("Could not read file");
    let first_line = read_first_line(&file_contents);
    parse_fields(first_line)
}

#[cfg(test)]
mod tests {
    use super::*;

    /// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
    /// a Vec of Strings. Empty fields should be skipped.
    #[test]
    fn can_parse_fields() {
        let fields = parse_fields("");
        assert!(fields.is_empty());

        let fields = parse_fields("Tom");
        assert_eq!(fields.len(), 1);
        assert_eq!(fields.first().unwrap(), "Tom");

        let fields = parse_fields("Tom,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,,Dick");
        assert_eq!(fields.len(), 2);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");

        let fields = parse_fields("Tom,Dick,,Harry");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");

        let fields = parse_fields(",Tom, Dick,, ,Harry,");
        assert_eq!(fields.len(), 3);
        assert_eq!(fields.first().unwrap(), "Tom");
        assert_eq!(fields.get(1).unwrap(), "Dick");
        assert_eq!(fields.get(2).unwrap(), "Harry");
    }

    /// Confirm that we can retrieve the first line of a newline-separated String
    #[test]
    fn can_parse_lines() {
        let first_line = read_first_line("");
        assert_eq!(first_line, "");

        let first_line = read_first_line("\n");
        assert_eq!(first_line, "");

        let first_line = read_first_line("a");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("a\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb");
        assert_eq!(first_line, "a");

        let first_line = read_first_line("\na\nb\n");
        assert_eq!(first_line, "a");
    }
}

As you can see, our main function is identical to the one in our first attempt. But our code is much better structured, and the parsing logic has proper test coverage.

The importance of having properly structured code with good test coverage increases when it comes to extending the functionality of the code. As an exercise, try extending the program using TDD to retrieve the fields from all the lines in the file.
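As a possible starting point for that exercise, the sketch below adds a function that reuses parse_fields for every non-empty line. The name parse_all_lines and the choice to skip blank lines are assumptions made for illustration; your own tests may lead you to a different design.

```rust
fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

/// Hypothetical helper: parse every non-empty line into its fields.
fn parse_all_lines(file_contents: &str) -> Vec<Vec<String>> {
    file_contents
        .split('\n')
        .filter(|line| !line.is_empty())
        .map(parse_fields)
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn can_parse_all_lines() {
        assert!(parse_all_lines("").is_empty());
        assert_eq!(
            parse_all_lines("Tom,Dick\n\nHarry\n"),
            vec![vec!["Tom", "Dick"], vec!["Harry"]]
        );
    }
}
```

Because parse_fields is already covered by its own tests, the new test only needs to check how lines are split and combined.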

Clean-up and refactoring

When you have a suite of tests that cover your code, you can refactor your code with confidence. You can change the implementation of a function, and as long as the tests pass, you can be reasonably sure that you haven't broken any behavior the tests cover.
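For instance, read_first_line could be rewritten to use the standard library's str::lines instead of splitting on '\n' by hand. The sketch below shows one such refactoring with the existing assertions kept unchanged. It is only an illustration of the idea, not a recommendation that this version is better.

```rust
// Refactored: `str::lines` replaces the manual split on '\n'.
fn read_first_line(file_contents: &str) -> &str {
    file_contents
        .lines()
        .find(|line| !line.is_empty())
        .unwrap_or("")
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Same cases as before; they should still pass after the refactoring.
    #[test]
    fn can_parse_lines() {
        assert_eq!(read_first_line(""), "");
        assert_eq!(read_first_line("\n"), "");
        assert_eq!(read_first_line("a"), "a");
        assert_eq!(read_first_line("a\nb"), "a");
        assert_eq!(read_first_line("\na\nb"), "a");
        assert_eq!(read_first_line("\na\nb\n"), "a");
    }
}
```

One caveat: lines also strips a carriage return before the newline, so "a\r\nb" now yields "a" rather than "a\r". None of the existing assertions cover that input, so if the difference matters, a new assertion should be written first.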

The test suite itself is also part of the code. It is important to keep the tests clean and well-structured. If the tests are messy, it will be difficult to understand what the code is supposed to do.

In the above example, the can_parse_fields test contains a lot of repetition. We can refactor it to make it more readable:

fn assert_fields(input: &str, expected: &[&str]) {
    let fields = parse_fields(input);
    assert_eq!(fields.len(), expected.len());
    for (i, field) in fields.iter().enumerate() {
        assert_eq!(field, expected.get(i).unwrap());
    }
}

/// This test validates that we can parse comma-separated input like: "Tom,Dick,,Harry" into
/// a Vec of Strings. Empty fields should be skipped.
#[test]
fn can_parse_fields() {
    assert_fields("", &[]);
    assert_fields("Tom", &["Tom"]);
    assert_fields("Tom,Dick", &["Tom", "Dick"]);
    assert_fields("Tom,,Dick", &["Tom", "Dick"]);
    assert_fields("Tom,Dick,,Harry", &["Tom", "Dick", "Harry"]);
    assert_fields(",Tom, Dick,, ,Harry,", &["Tom", "Dick", "Harry"]);
}

Refactoring the tests in this way makes them easier to read and maintain.
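A possible further simplification, sketched below, is to compare the whole result in a single assert_eq! call. This relies on the standard library's ability to compare a Vec<String> with a slice of &str. The extra message argument is an optional addition that reports which input failed.

```rust
fn parse_fields(csv: &str) -> Vec<String> {
    csv.split(',')
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    // Compares all fields at once; on failure, both lists and the input are printed.
    fn assert_fields(input: &str, expected: &[&str]) {
        assert_eq!(parse_fields(input), expected, "input: {input:?}");
    }

    #[test]
    fn can_parse_fields() {
        assert_fields("", &[]);
        assert_fields("Tom", &["Tom"]);
        assert_fields("Tom,,Dick", &["Tom", "Dick"]);
        assert_fields(",Tom, Dick,, ,Harry,", &["Tom", "Dick", "Harry"]);
    }
}
```

Comparing the whole collection also catches a length mismatch and a wrong element in one step, so the explicit length check and loop are no longer needed.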

Conclusion

Using Test-Driven Development will result in better structured code. As a side effect, your code will have good test coverage.

The tests themselves are also a form of documentation. They show how the code should work. If you need to change the code, you can look at the tests to see what the code is supposed to do.