class: center, middle

# Rust Crash Course

Rohan Kumar

Slides by Rahul Kumar and Edward Zeng

---

# Agenda

1. Basic syntax
2. The borrow checker
3. Idiomatic Rust
4. Advice

---

# Why Rust?

--

* Memory safe!
* Fast executables.
* Fewer runtime bugs.

---

# Disclaimers

* You won't become proficient in Rust just by watching this.

--

* Like C, all Rust statements must be enclosed in a function (eg. `main`).
* But for simplicity, we will show code that appears to be free standing.

---

class: center, middle

# Basic syntax

---

# Defining variables

--

Define a variable named `x` of type `i32`:

--

```rust
let x: i32 = 2;
```

--

If you omit the type, the compiler will try to guess:

--

```rust
// The default integer type is i32.
let x = 2;
```

--

Variables are **immutable** by default.

--

```rust
// This won't compile.
let x = 2;
x = 3;
```

--

Must declare variables **mutable** if you want to modify them:

--

```rust
// This is OK.
let mut x = 2;
x = 3;
```

---

# Re-defining variables

Does this work?

```rust
let x = 2;
let x = 3;
```

--

Yes, you are defining a new variable.

--

The variables can even be of different types:

```rust
let x = 2;
let x = "Hello!";
```

--

This is called **shadowing**.

---

# References

Rust has **references**. They are like pointers in C, but are:

--

* Always valid (no pointers to freed memory)
* Never null

--

The Rust compiler checks these properties at **compile time**!

--

Reference with `&`, dereference with `*`.

--

```rust
let x = 2;
let ptr = &x;
assert_eq!(*ptr, 2);
```

---

# Mutable References

By default, references are **immutable**.

--

This won't work:

```rust
let mut x = 2;
let ptr = &x;
**ptr = 3; // Cannot mutate through an immutable reference.
```

--

If you want to modify a variable through a pointer, you must have a **mutable reference**.

--

```rust
let mut x = 2;
let ptr = &mut x;
**ptr = 3; // This is OK.
```

--

To obtain a mutable reference, the variable itself must be mutable:

```rust
let x = 2;
let ptr = &mut x; // Cannot mutably reference an immutable variable.
```

--

In Rust lingo, obtaining a reference is called **borrowing**.

---

# Primitive types

* Signed integers: `i8, i16, i32, i64, i128, isize`
* Unsigned integers: `u8, u16, u32, u64, u128, usize`
* Floating point: `f32, f64`
* Boolean: `bool`
* Character: `char` (use single quotes, e.g. `'a'`)

---

# Compound types

##### Tuples:

```rust
// The type annotation is unnecessary
let my_tuple: (i32, char, bool) = (162, 'X', true);
let second: char = my_tuple.1; // 'X'
```

--

##### Arrays:

```rust
// Arrays have a fixed size, as indicated in the (optional) type annotation
let nums: [i32; 5] = [1, 2, 3, 4, 5];
let second = nums[1];
```

--

##### Structs:

```rust
struct Coordinate {
    x: i32,
    y: i32,
}

// ...

let point = Coordinate {
    x: 5,
    y: 3,
};
```

???

No need to remember exact syntax.

---

# Functions

--

A function that squares the input:

```rust
fn square(x: i32) -> i32 {
    return x * x;
}
```

--

Equivalent to:

```rust
fn square(x: i32) -> i32 {
    x * x
}
```

--

If a function does not return a value, just omit the return type:

```rust
fn add_one(x: &mut i32) {
    *x += 1;
}
```

---

# Hello World!

```rust
fn main() {
    println!("Hello world!");
}
```

--

`println!` is a **macro**. You can tell by the exclamation mark.

--

The compiler expands macros during compilation. The macro will be replaced by "regular" Rust code.

The distinction isn't too important when you're getting started, but just be aware that `println!` is not a typical function.

---

# If statements

Rust has `if` statements. Parentheses around the condition are unnecessary and discouraged.

```rust
let x = 4;
if x > 5 {
    println!("Greater than 5");
} else if x > 3 {
    println!("Greater than 3");
} else {
    println!("x was not big enough");
}
```

--

In Rust, `if` is an expression, so it can evaluate to a value.
--

This code is equivalent:

```rust
let x = 4;
let message = if x > 5 {
    "Greater than 5"
} else if x > 3 {
    "Greater than 3"
} else {
    "x was not big enough"
};
println!("{}", message);
```

???

Note the placement of semicolons.

---

# Loops

```rust
loop {
    println!("stuck in a loop!");
    // This will repeat until the program is stopped.
}
```

--

Use `break` to exit:

```rust
// A loop that exits immediately.
loop {
    break;
}
```

---

# While loops

Rust `while` loops are fairly straightforward:

```rust
let mut count = 5;
while count > 0 {
    count -= 1;
}
```

---

# For loops

For loops can iterate over a collection (more like Python than C).

```rust
let nums = [0, 1, 2, 3, 4];
for num in nums {
    print!("{} ", num);
}
// Prints 0 1 2 3 4
```

--

Shorthand range notation:

```rust
for num in 0..5 {
    print!("{} ", num);
}
// Prints 0 1 2 3 4
```

--

This code is equivalent:

```rust
for num in 0..=4 {
    print!("{} ", num);
}
// Prints 0 1 2 3 4
```

---

# Enums

Useful when a type should have only a few possible values:

```
enum Coin {
    Head,
    Tail,
}

let a = Coin::Head;
let b = Coin::Tail;
```

--

Each value in an enum is called a **variant**.

---

# Enums

Enum for different operating systems:

```
enum OperatingSystem {
    Mac,
    Windows,
    Linux,
    Other,
}
```

--

Even better:

```
enum OperatingSystem {
    Mac,
    Windows,
    Linux,
    Other(String)
}

let a = OperatingSystem::Linux;
let b = OperatingSystem::Other("Redox OS".to_string());
```

Enum variants can store data!

---

# Matching

Rust **match expressions** are like C `switch` statements.

However, they must _always be exhaustive_.

--

What's wrong?

```
let num = 162;
match num {
    160 => println!("160"),
    161 => println!("161"),
    162 => println!("162"),
    168 => println!("168"),
}
```

--

The match is not exhaustive! What if `num` was 164?
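---

# Matching

The exhaustiveness rule applies to enums as well. As a minimal sketch (it reuses the `Coin` enum from the earlier slide; the `describe` function is a hypothetical name introduced only for illustration), a match with one arm per variant is accepted by the compiler:

```rust
// Same shape as the `Coin` enum from the Enums slide.
enum Coin {
    Head,
    Tail,
}

// `describe` is a hypothetical helper, not part of any library.
// Every variant of `Coin` has an arm, so the match is exhaustive.
fn describe(coin: Coin) -> &'static str {
    match coin {
        Coin::Head => "heads",
        Coin::Tail => "tails",
    }
}
```

Calling `describe(Coin::Head)` evaluates to `"heads"`. If a third variant were later added to `Coin`, the compiler would reject `describe` until the match handled it.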
---

# Matching

This **is** valid:

```
let num = 162;
match num {
    160 => println!("160"),
    161 => println!("161"),
    162 => println!("162"),
    168 => println!("168"),
    _ => println!("another course"),
}
```

The underscore matches anything that was not already matched.

--

Each `pattern => expression` entry in a `match` is called a **match arm**.

---

# Matching

Matching is very useful in combination with enums.

Match expressions can also evaluate to a value (just like any other expression).

--

```
enum OperatingSystem {
    Mac,
    Windows,
    Linux,
    Other(String)
}

fn os_name(os: OperatingSystem) -> String {
    match os {
        OperatingSystem::Mac => "mac".to_string(),
        OperatingSystem::Windows => "windows".to_string(),
        OperatingSystem::Linux => "linux".to_string(),
        OperatingSystem::Other(s) => s,
    }
}
```

???

Pop quiz, why does `os_name` not have a return statement?

---

# `Impl` blocks

--

Suppose we have the following struct definition (not to be confused with `Vec`):

```
struct Vector {
    x: f64,
    y: f64,
}
```

--

We might want to add two `Vector`s elementwise. Here's one way to do that:

```
fn add(v1: &Vector, v2: &Vector) -> Vector {
    Vector {
        x: v1.x + v2.x,
        y: v1.y + v2.y,
    }
}

// let (v1, v2) = ...;
let sum = add(&v1, &v2);
```

---

# `Impl` blocks

--

We can also do the same thing using `impl` blocks, which define methods related to a struct (or enum).

```
impl Vector {
    fn add(&self, other: &Self) -> Self {
        Self {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

// let (v1, v2) = ...;
let sum = v1.add(&v2);
// Can also be called like this:
let sum = Vector::add(&v1, &v2);
```

--

`Self` is a shorthand for the type of the `impl` block (in this case, `Vector`).

--

We'll now go into more detail about what the `&self` argument does.

---

# Arguments to `impl` block functions

First argument is `&self`: the compiler will immutably borrow the object.
--

This:

```
let v1 = ...;
let sum = v1.add(&v2);
```

is approximately equivalent to:

--

```
let v1 = ...;
*let reference = &v1;
let sum = Vector::add(reference, &v2);
```

---

# Arguments to `impl` block functions

First argument is `&mut self`: the compiler will mutably borrow the object.

--

```
impl Vector {
    fn double(&mut self) {
        // The fields are f64, so the literal must be a float.
        self.x *= 2.0;
        self.y *= 2.0;
    }
}
```

--

```
let mut v1 = ...;
v1.double();
```

is approximately equivalent to:

--

```
let mut v1 = ...;
// Automatically generated by the compiler.
let reference = &mut v1;
Vector::double(reference);
```

--

Normal borrow checking rules apply!

---

class: center, middle

# The Borrow Checker

---

# Why have a borrow checker?

We are trying to solve the problem of when to allocate and deallocate memory.

* In C, you have to do this manually via `malloc` and `free`.
* In Go, a garbage collector runs periodically to free values that are no longer usable.

In Rust, a value is freed as soon as its owner goes out of scope, and the borrow checker verifies at compile time that no reference outlives it!

---

# Borrow checker

The **borrow checker** is what makes Rust _very_ different from other languages.

The borrow checker verifies a set of rules at compile time. It does the magic of making sure your references are always valid (among other things).

The borrow checker rules can initially seem mysterious. But they become easier with practice.

---

# Basic rules

Here's one set of borrow checking rules:

* Every value has one and only one owner.
* When a value's owner goes out of **scope**, the value is **dropped** (Rust lingo for "freed").
* To drop a value `v` _early_, call `drop(v)`.

---

# Scopes

**Scopes** are enclosed in curly braces:

```rust
fn do_stuff() {
    let a = String::from("hello");
    {
        let b = String::from("goodbye");
        // Can access a and b
        // ...
        // b goes out of scope and is dropped
    }
    // Cannot access b here: it is out of scope
    // a is dropped at the end of the function
}
```

Functions, loops, if statements, etc. have their own scope.
You can also create nested scopes using curly braces.

---

# Moves

Every value has one owner. Sometimes that owner can change. This is called a **move**.

Assignment moves values. This is invalid:

```rust
let s1 = String::from("my string");
let s2 = s1; // Ownership of the string moves from s1 to s2.
// s1 no longer owns the string, so we can't access data via s1.
println!("{}", s1); // This is an error; data has been moved out of s1.
```

--

This is fine:

```rust
let s1 = String::from("my string");
let s2 = s1; // Ownership of the string moves from s1 to s2.
println!("{}", s2); // This is okay; s2 owns the string now.
```

---

# Cloning

If you need multiple variables to own data, you can `clone` a value:

```rust
let s1 = String::from("my string");
let s2 = s1.clone(); // s2 is a clone of s1
println!("{}", s2); // This is okay; s2 owns its data.
println!("{}", s1); // This is okay; s1 also owns its data.
```

--

Note that cloning is usually _expensive_. In this case, we are allocating memory for a string _twice_. In the previous examples, we only allocated space for the string once.

--

Cloned values are completely independent of the value they were cloned from. If `s1` is modified, code using `s2` will not see those changes.

---

# Copy

Certain types (like integers) are copied implicitly instead of being moved. These types are said to implement `Copy`.

--

For example:

```rust
let x = 5;
let y = x;
println!("{} {}", x, y);
```

This is fine because the value in `x` (5) is copied, not moved. So `x` and `y` both own their values and can be accessed.

--

In general, types that require heap allocations are not `Copy`.

--

`Copy`: integers, floats, booleans, chars, immutable references, and compound types containing only `Copy` types.

Not `Copy`: `String`s, `Vec`s, mutable references, and compound types containing at least one non-`Copy` type.

---

# Deriving `Copy`

Consider the following struct:

```
struct Person {
    id: u64,
    age: u32,
}
```

Is it `Copy`?
--

**No.** But shouldn't it be `Copy`, since it only contains `Copy` types?

--

We must explicitly tell the Rust compiler we wish to make it `Copy`:

```
*#[derive(Clone, Copy)]
struct Person {
    id: u64,
    age: u32,
}
```

`Copy` requires `Clone`, so both are derived. To make a struct cloneable without making it `Copy`, add `#[derive(Clone)]` alone.

---

# More moves

Passing a value to a function moves the value:

```rust
fn main() {
    let s = String::from("hello");
    do_stuff(s);
    // s no longer accessible; it was moved into do_stuff
}

fn do_stuff(s: String) {
    // do stuff with s
}
```

--

Returning a value from a function moves the value to the caller:

```rust
fn main() {
    let s = get_string();
    // We can now use s, which owns the string
}

fn get_string() -> String {
    String::from("hello")
}
```

---

# Cool features of the borrow checker

You can have multiple immutable pointers to the same variable.

```rust
let x = 162;
let p1 = &x;
let p2 = p1;
```

`x` is said to be **aliased**: multiple variables can read (not modify) the variable.

--

On the other hand, this is _forbidden_:

```rust
let mut x = 162;
let p1 = &mut x;
let p2 = &mut x; // Error: x is already mutably borrowed.
println!("{}", p1); // p1 is still in use here.
```

--

This is also forbidden:

```rust
let mut x = 162;
let p1 = &mut x;
let p2 = &x; // Error: x is already mutably borrowed.
println!("{}", p1); // p1 is still in use here.
```

---

# Remember this bug?

```c
int count_words(WordCount** wclist, FILE* infile) {
  char inbuf[MAX_WORD_LEN];
  size_t len;
  while ((len = get_word(infile, inbuf, MAX_WORD_LEN))) {
    if (len > 1) {
      add_word(wclist, inbuf);
    }
  }
  return 0;
}
```

```c
int add_word(WordCount** wclist, char* word) {
  WordCount* wc = find_word(*wclist, word);
  if (wc) {
    wc->count++;
  } else {
    // ...
    wc->word = word;
    // ...
  }
  return 0;
}
```

---

# Let's try making a similar mistake with Rust

```rust
fn count_words(infile: &mut File) -> usize {
    let mut infile = BufReader::new(infile);
    let mut wclist: Vec<&Vec<u8>> = Vec::new();
    let mut inbuf = Vec::new();
    while infile
        .read_until(b'\n', &mut inbuf)
        .expect("read_until failed")
        > 0
    {
        wclist.push(&inbuf);
    }
    return wclist.len();
}
```

--

Get the following compiler error:

```
error[E0502]: cannot borrow `inbuf` as mutable because it is also borrowed as immutable
  --> src/main.rs:10:32
   |
10 |             .read_until(b'\n', &mut inbuf)
   |                                ^^^^^^^^^^ mutable borrow occurs here
...
14 |         wclist.push(&inbuf);
   |         -------------------
   |         |           |
   |         |           immutable borrow occurs here
   |         immutable borrow later used here

For more information about this error, try `rustc --explain E0502`.
error: could not compile `count` due to previous error
```

---

# No dangling pointers!

Rust ensures that you don't create dangling references _at compile time_.

This code won't compile:

```rust
fn get_string() -> &String {
    let s = String::from("hi");
    &s
}
// s is dropped at the end of this function,
// so &s would be a dangling pointer.
// The Rust compiler won't allow this.
```

--

Key point: a reference can never outlive the value it points to!

---

# Remember that bug?

```c
char* get_word(FILE* infile) {
  char inbuf[MAX_WORD_LEN];
  // Populate `inbuf` with the first word in `infile`.
  return inbuf;
}
```

---

# Let's try making the same mistake with Rust

```rust
fn get_word() -> &str {
    let s = String::from("hi");
    &s
}
```

--

Get the following compiler error:

```
error[E0106]: missing lifetime specifier
 --> src/main.rs:4:18
  |
4 | fn get_word() -> &str {
  |                  ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but there is no value for it to be borrowed from
help: consider using the `'static` lifetime
  |
4 | fn get_word() -> &'static str {
  |                   +++++++
```

---

# Let's try making the same mistake with Rust

```rust
fn get_word() -> &'static str {
    let s = String::from("hi");
    &s
}
```

--

Get the following compiler error:

```
error[E0515]: cannot return reference to local variable `s`
 --> src/main.rs:6:5
  |
6 |     &s
  |     ^^ returns a reference to data owned by the current function
```

---

# Summary of borrowing rules

* References are always valid and non-null.
* Every value has one owner.
* Values are freed when their owner goes out of scope.
* Assignment moves values (unless the value is `Copy`).
* Values that allocate memory on the heap are usually not `Copy`.
* You can have one mutable reference, or multiple immutable references. But not both.
* A reference can never outlive its value.

---

class: center, middle

# Rust idioms

Rust has many, many convenience types.

---

# Vectors

Dynamically sized arrays.

Create a `Vec`:

```rust
// The type annotation is needed if the compiler can't determine the element type.
let x: Vec<i32> = Vec::new();
```

--

Or use the `vec!` macro:

```rust
let x = vec![1, 2, 3];
let y = vec![162; 3]; // Equivalent to vec![162, 162, 162].
```

--

Operations on a `Vec`:

```rust
let mut x = vec![1, 2, 3];
assert_eq!(x.len(), 3);
x.push(4);
```

--

Read the [docs](https://doc.rust-lang.org/std/vec/struct.Vec.html)! There are _many_ convenience methods.

---

# Options

References are never null! What if you actually _want_ something to be null?

--

Use an `Option<T>`! Here's the definition of `Option`, from the standard library:

```
pub enum Option<T> {
    None,
    Some(T),
}
```

--

Two possible cases: the option is either `None`, or it is `Some`.

If an `Option` is `Some`, the value in the `Some` variant will always be a valid value of type `T`.

---

# Options

Here's how you can use an option:

```
let x = Some(4);
assert!(x.is_some());
let y = x.unwrap();
assert_eq!(y, 4);

// This type annotation IS necessary!
let z: Option<i32> = None;
assert!(z.is_none());

let w = Some(String::from("hello"));
match w {
    Some(s) => println!("{} world!", s),
    None => panic!("didn't expect to get here"),
}
```

Again, there are _many_ convenience methods. Read the [docs](https://doc.rust-lang.org/std/option/)!

---

# Error handling

The idiomatic way to handle errors in Rust is via `Result<T, E>`:

```
pub enum Result<T, E> {
    Ok(T),
    Err(E),
}
```

--

Functions that may not complete successfully should return a `Result`.

--

If the result is `Ok`, the caller can access the returned value (of generic type `T`).

--

If the result is `Err`, additional information (of generic type `E`) about the error is returned.

--

If a program cannot recover from an error, you can `panic!` instead of returning a `Result`. By default, a panic on the main thread unwinds the stack and terminates the program, much like an uncaught exception in C++.

---

# Error handling

Example:

```
use std::fs::File;

fn main() {
    let f = File::open("hello.txt");
    let f = match f {
        Ok(file) => file,
        Err(error) => panic!("Problem opening the file: {:?}", error),
    };
}
```

`File::open` returns a `Result<File, std::io::Error>`.

--

We check if the `open` was successful; if not, we panic and exit. Otherwise, we extract the file itself from the result.

---

# Error handling

Matching on `Result`s all the time can be tedious.

If we know we are going to panic on an `Err`, we can use `unwrap()` instead:

```
use std::fs::File;

fn main() {
    let f = File::open("hello.txt").unwrap();
}
```

--

`unwrap()` panics on error; otherwise, it returns the data contained in the `Ok` variant.

---

# Error handling

Another shortcut is the `?` operator:

```
use std::fs::File;
use std::io::{self, Read};

fn read_file() -> Result<String, io::Error> {
    let mut f = File::open("hello.txt")?;
    let mut s = String::new();
    f.read_to_string(&mut s)?;
    Ok(s)
}
```

--

The approximately equivalent code:

```
fn read_file() -> Result<String, io::Error> {
    let mut f = match File::open("hello.txt") {
        Ok(f) => f,
        Err(e) => return Err(e),
    };
    let mut s = String::new();
    match f.read_to_string(&mut s) {
        Ok(_) => {},
        Err(e) => return Err(e),
    };
    Ok(s)
}
```

---

# Error handling

If you don't care about returning an easy-to-inspect error type, you can return

```
Result<T, Box<dyn std::error::Error>>;
```

The `?` operator can convert _most_ error types into `Box<dyn std::error::Error>`.

--

```
/// Return the first line of the given file.
pub fn firstline(filename: &str) -> Result<String, Box<dyn std::error::Error>> {
    // `read_until` is a method of the `BufRead` trait, which must be in scope.
    use std::io::BufRead;

    let file = std::fs::File::open(filename)?; // Potential IO Error
    let mut reader = std::io::BufReader::new(file);
    let mut buf = vec![];
    reader.read_until(b'\n', &mut buf)?; // Potential IO Error
    let result = String::from_utf8(buf)?; // Potential FromUtf8Error
    Ok(result)
}
```

---

# Further reading

There's a lot we haven't covered!

Here are some things that we strongly suggest you read about:

* [Collections](https://doc.rust-lang.org/book/ch08-00-common-collections.html)
* [Smart Pointers](https://doc.rust-lang.org/book/ch15-00-smart-pointers.html)
* [Concurrency](https://doc.rust-lang.org/book/ch16-00-concurrency.html)
* [Generics, Traits, and Lifetimes](https://doc.rust-lang.org/book/ch10-00-generics.html)

Some of this material is crucial to understand for the assignments in this course (if you plan to do them in Rust, of course).

---

# Possibly incorrect advice

* Writing a Rust program is harder than writing a C program.
* Writing a _correct_ Rust program is easier than writing a _correct_ C program.
* Learn by doing!
* The compiler usually prints very helpful error messages. Read and understand them!

---

# Other resources

* [The Rust Book](https://doc.rust-lang.org/book)
* [Rust by Example](https://doc.rust-lang.org/rust-by-example)
* [Rustlings](https://github.com/rust-lang/rustlings)
* [A half hour to learn Rust](https://fasterthanli.me/articles/a-half-hour-to-learn-rust)

---

# Acknowledgements

Some of the content and examples in these slides were drawn from:

* [The Rust Book](https://doc.rust-lang.org/book/)
* [Effective Rust](https://www.lurklurk.org/effective-rust/cover.html)

---

class: center, middle

# Rustlings

Let's dive into some coding!

---

class: center, middle

# Appendix

Extra information that may be helpful.

---

# Modules

* Rust code can be organized into **modules**. Modules serve as namespaces.

--

* Code is generally **private** by default, meaning it can only be accessed within the current module (and its child modules).
--

* To make something visible outside a module, declare it with the `pub` (for public) keyword.

--

* Aside: there are levels _between_ private and public; see `pub(crate)` and `pub(super)` for examples.

---

# Modules

Modules are defined using the `mod` keyword:

```
mod inner {
    // `pub` is required so that code outside `inner` can call this.
    pub fn add_two(x: &mut i32) {
        *x += 2;
    }
}

fn outer() {
    let mut x = 160;
    inner::add_two(&mut x);
    assert_eq!(x, 162);
}
```

---

# Modules

You can move a module to a separate file, but you must declare that the module exists.

Suppose this is in `src/lib.rs`:

```
mod inner;

fn outer() {
    let mut x = 160;
    inner::add_two(&mut x);
    assert_eq!(x, 162);
}
```

--

Rust will look for a file called `src/inner.rs` _or_ `src/inner/mod.rs`, which should contain the contents of the module:

```
pub fn add_two(x: &mut i32) {
    *x += 2;
}
```

Note: no `mod` keyword here.

--

Modules can be nested in other modules. Modules form a tree.

---

# Use declarations

You can always use any (visible) code by using the fully qualified name (e.g. `inner::add_two`).

--

Use declarations let you "import" items so you don't have to use the full name each time:

```
mod inner;
*use inner::add_two;

fn outer() {
    let mut x = 160;
    add_two(&mut x);
    assert_eq!(x, 162);
}
```

--

Can rename items when you `use` them:

```
use inner::add_two as add2;

add2(...)
```

---

# Crates

Crates (~ packages) contain modules.

The root of the module tree of a library crate is `src/lib.rs` (for a binary crate, it is `src/main.rs`).

--

You can import crates to leverage code that other people write.

Do this by adding the crate as a dependency in your `Cargo.toml` file.

--

You'll then be able to use all public items from that crate.

For example, to import the popular `serde` crate:

```
use serde::{Serialize, Deserialize};
```

(Items can be referenced using their full name if you do not want to `use` them.)
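
---

# Modules, `pub`, and `use` together

The appendix slides on modules, visibility, and use declarations can be combined into one small sketch. All names here (`shapes`, `square`, `area`, `side_squared`, `demo`) are hypothetical and chosen only for illustration:

```rust
// A nested module tree: `shapes` contains `square`.
mod shapes {
    pub mod square {
        // `pub` makes `area` callable from outside `square`.
        pub fn area(side: u32) -> u32 {
            side_squared(side)
        }

        // Private helper: not visible outside the `square` module.
        fn side_squared(side: u32) -> u32 {
            side * side
        }
    }
}

// Import `area` so it can be called without the full path.
use shapes::square::area;

fn demo() -> u32 {
    // The short name and the fully qualified name refer to the same function.
    area(3) + shapes::square::area(4)
}
```

Here `demo()` returns `25`. Calling `shapes::square::side_squared(3)` from `demo` would be rejected by the compiler, because `side_squared` is private to `square`.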