From dfcfc6eab34a0d946055b91cccd7ea799475ec03 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Sun, 13 Aug 2017 16:18:20 -0400 Subject: [PATCH 01/18] Edits to 15-00 --- second-edition/src/ch15-00-smart-pointers.md | 127 ++++++++++++++----- 1 file changed, 92 insertions(+), 35 deletions(-) diff --git a/second-edition/src/ch15-00-smart-pointers.md b/second-edition/src/ch15-00-smart-pointers.md index 33e5a92d32..f4ce90e239 100644 --- a/second-edition/src/ch15-00-smart-pointers.md +++ b/second-edition/src/ch15-00-smart-pointers.md @@ -1,41 +1,98 @@ # Smart Pointers -*Pointer* is a generic programming term for something that refers to a location -that stores some other data. We learned about Rust’s references in Chapter 4; -they’re a plain sort of pointer indicated by the `&` symbol and borrow the -value that they point to. *Smart pointers* are data structures that act like a -pointer, but also have additional metadata and capabilities, such as reference -counting. The smart pointer pattern originated in C++. In Rust, an additional -difference between plain references and smart pointers is that references are a -kind of pointer that only borrow data; by contrast, in many cases, smart -pointers *own* the data that they point to. - -We’ve actually already encountered a few smart pointers in this book, even -though we didn’t call them that by name at the time. For example, in a certain -sense, `String` and `Vec` from Chapter 8 are both smart pointers. They own -some memory and allow you to manipulate it, and have metadata (like their -capacity) and extra capabilities or guarantees (`String` data will always be -valid UTF-8). The characteristics that distinguish a smart pointer from an -ordinary struct are that smart pointers implement the `Deref` and `Drop` -traits, and in this chapter we’ll be discussing both of those traits and why -they’re important to smart pointers. 
+A *pointer* is the generic programming concept for an address to a location +that stores some data. The most common kind of pointer in Rust is a +*reference*, which we learned about in Chapter 4. References are indicated by +the `&` symbol and borrow the value that they point to. They don't have any +special abilities other than referring to data, but they also don't have any +more overhead than they need to do straightforward referencing, so they're used +the most often. + +*Smart pointers*, on the other hand, are data structures that act like a +pointer, but they also have additional metadata and capabilities. The concept +of smart pointers isn't unique to Rust; it originated in C++ and exists in +other languages as well. The different smart pointers defined in Rust's +standard library provide extra functionality beyond what references provide. +One example that we'll explore in this chapter is the *reference counting* +smart pointer type, which enables you to have multiple owners of data. The +reference counting smart pointer keeps track of how many owners there are, and +when there aren't any remaining, the smart pointer takes care of cleaning up +the data. + + + + + + + +In Rust, where we have the concept of ownership and borrowing, an additional +difference between references and smart pointers is that references are a kind +of pointer that only borrow data; by contrast, in many cases, smart pointers +*own* the data that they point to. + +We've actually already encountered a few smart pointers in this book, such as +`String` and `Vec` from Chapter 8, though we didn't call them smart pointers +at the time. Both these types count as smart pointers because they own some +memory and allow you to manipulate it. They also have metadata (such as their +capacity) and extra capabilities or guarantees (such as `String` ensuring its +data will always be valid UTF-8). + + + + +Smart pointers are usually implemented using structs. 
The characteristics that +distinguish a smart pointer from an ordinary struct are that smart pointers +implement the `Deref` and `Drop` traits. The `Deref` trait allows an instance +of the smart pointer struct to behave like a reference so that we can write +code that works with either references or smart pointers. The `Drop` trait +allows us to customize the code that gets run when an instance of the smart +pointer goes out of scope. In this chapter, we’ll be discussing both of those +traits and demonstrating why they’re important to smart pointers. Given that the smart pointer pattern is a general design pattern used -frequently in Rust, this chapter won’t cover every smart pointer that exists. -Many libraries have their own and you may write some yourself. The ones we -cover here are the most common ones from the standard library: - -* `Box`, for allocating values on the heap -* `Rc`, a reference counted type so data can have multiple owners -* `RefCell`, which isn’t a smart pointer itself, but manages access to the - smart pointers `Ref` and `RefMut` to enforce the borrowing rules at runtime - instead of compile time - -Along the way, we’ll also cover: - -* The *interior mutability* pattern where an immutable type exposes an API for - mutating an interior value, and the borrowing rules apply at runtime instead - of compile time -* Reference cycles, how they can leak memory, and how to prevent them +frequently in Rust, this chapter won't cover every smart pointer that exists. +Many libraries have their own smart pointers and you can even write some +yourself. 
We'll just cover the most common smart pointers from the standard +library: + + + + +* `Box` for allocating values on the heap +* `Rc`, a reference counted type that enables multiple ownership +* `Ref` and `RefMut`, accessed through `RefCell`, a type that enforces the + borrowing rules at runtime instead of compile time + + + + +Along the way, we’ll cover the *interior mutability* pattern where an immutable +type exposes an API for mutating an interior value. We'll also discuss +*reference cycles*, how they can leak memory, and how to prevent them. Let’s dive in! From 27addf4b2becc30ac267e095ddcc7f778bbdf312 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Tue, 15 Aug 2017 11:34:52 -0400 Subject: [PATCH 02/18] Edits to 15-01 --- second-edition/src/ch15-01-box.md | 317 ++++++++++++++++++++++-------- 1 file changed, 232 insertions(+), 85 deletions(-) diff --git a/second-edition/src/ch15-01-box.md b/second-edition/src/ch15-01-box.md index df2d2197e5..893b1dd92b 100644 --- a/second-edition/src/ch15-01-box.md +++ b/second-edition/src/ch15-01-box.md @@ -1,9 +1,41 @@ ## `Box` Points to Data on the Heap and Has a Known Size The most straightforward smart pointer is a *box*, whose type is written -`Box`. Boxes allow you to put a single value on the heap (we talked about -the stack vs. the heap in Chapter 4). Listing 15-1 shows how to use a box to -store an `i32` on the heap: +`Box`. Boxes allow you to store a value on the heap rather than the stack. +The box data consisting of the pointer to the heap will be stored on the stack. +Refer back to Chapter 4 if you'd like to review the difference between the +stack and the heap. + + + + +Boxes don't have a lot of performance overhead, but they don't have a lot of +extra abilities either. 
They're most often used in these situations: + +- When you have a type whose size can't be known at compile time, and you want + to use a value of that type in a context that needs to know an exact size +- When you have a large amount of data and you want to transfer ownership but + ensure the data won't be copied when you do so +- When you want to own a value and only care that it's a type that implements a + particular trait rather than knowing the concrete type itself + +We're going to demonstrate the first case in the rest of this section. To +elaborate on the other two situations a bit more: in the second case when you +have a lot of data that you don't want to be copied when you move the value to +be owned by another part of code, boxes make it so that the data stays in one +place on the heap and only the pointer data in the box is copied around on the +stack. The third case is known as a *trait object*, and Chapter 17 has an entire +section devoted just to that topic. So know that what you learn here will be +applied again in Chapter 17! + +### Using a `Box` to Store Data on the Heap + +Before we get into a use case for `Box`, let's get familiar with the syntax and +how to interact with values stored within a `Box`. + +Listing 15-1 shows how to use a box to store an `i32` on the heap: Filename: src/main.rs @@ -17,34 +49,63 @@ fn main() { Listing 15-1: Storing an `i32` value on the heap using a box -This will print `b = 5`. In this case, we can access the data in the box in a -similar way as we would if this data was on the stack. Just like any value that -has ownership of data, when a box goes out of scope like `b` does at the end of -`main`, it will be deallocated. The deallocation happens for both the box -(stored on the stack) and the data it points to (stored on the heap). +We define the variable `b` to have the value of a `Box` that points to the +value `5`, which is allocated on the heap. 
This program will print `b = 5`; in +this case, we can access the data in the box in a similar way as we would if +this data was on the stack. Just like any value that has ownership of data, +when a box goes out of scope like `b` does at the end of `main`, it will be +deallocated. The deallocation happens for both the box (stored on the stack) +and the data it points to (stored on the heap). Putting a single value on the heap isn’t very useful, so you won’t use boxes by -themselves in the way that Listing 15-1 does very often. A time when boxes are -useful is when you want to ensure that your type has a known size. For example, -consider Listing 15-2, which contains an enum definition for a *cons list*, a -type of data structure that comes from functional programming. Note that this -won’t compile quite yet: - -Filename: src/main.rs - -```rust,ignore -enum List { - Cons(i32, List), - Nil, -} -``` - -Listing 15-2: The first attempt of defining an enum to -represent a cons list data structure of `i32` values - -We’re implementing a cons list that holds only `i32` values. We -could have also chosen to implement a cons list independent of the -type of value by using generics as discussed in Chapter 10. +themselves in the way that Listing 15-1 does very often. Having values like a +single `i32` on the stack, where they're stored by default is more appropriate +in the majority of cases. Let's get into a case where boxes allow us to define +types that we wouldn't be allowed to if we didn't have boxes. + + + + +### Boxes Enable Recursive Types + + + + + + +Rust needs to know at compile time how much space a type takes up. One kind of +type whose size can't be known at compile time is a *recursive type* where a +value can have as part of itself another value of the same type. This nesting +of values could theoretically continue infinitely, so Rust doesn't know how +much space a value of a recursive type needs. 
Boxes have a known size, however, +so by inserting a box in a recursive type definition, we are allowed to have +recursive types. + +Let's explore the *cons list*, a data type common in functional programming +languages, to illustrate this concept. The cons list type we're going to define +is straightforward except for the recursion, so the concepts in this example +will be useful any time you get into more complex situations involving +recursive types. + + + + +A cons list is a list where each item in the list contains two things: the +value of the current item and the next item. The last item in the list contains +only a value called `Nil` without a next item. > #### More Information About the Cons List > @@ -63,14 +124,57 @@ type of value by using generics as discussed in Chapter 10. > announces the end of the list. Note that this is not the same as the “null” > or “nil” concept from Chapter 6, which is an invalid or absent value. -A cons list is a list where each element contains both a single value as well -as the remains of the list at that point. The remains of the list are defined -by nested cons lists. The end of the list is signified by the value `Nil`. Cons -lists aren’t used very often in Rust; `Vec` is usually a better choice. -Implementing this data structure is a good example of a situation where -`Box` is useful, though. Let’s find out why! +Note that while functional programming languages use cons lists frequently, +this isn't a commonly used data structure in Rust. Most of the time when you +have a list of items in Rust, `Vec` is a better choice. Other, more complex +recursive data types *are* useful in various situations in Rust, but by +starting with the cons list, we can explore how boxes let us define a recursive +data type without much distraction. + + + + +Listing 15-2 contains an enum definition for a cons list. 
Note that this +won’t compile quite yet because this type doesn't have a known size, which +we'll demonstrate: + + + -Using a cons list to store the list `1, 2, 3` would look like this: +Filename: src/main.rs + +```rust,ignore +enum List { + Cons(i32, List), + Nil, +} +``` + +Listing 15-2: The first attempt of defining an enum to +represent a cons list data structure of `i32` values + +> Note: We're choosing to implement a cons list that only holds `i32` values +> for the purposes of this example. We could have implemented it using +> generics, as we discussed in Chapter 10, in order to define a cons list type +> that could store values of any type. + + + + +Using our cons list type to store the list `1, 2, 3` would look like the code +in Listing 15-3: + +Filename: src/main.rs ```rust,ignore use List::{Cons, Nil}; @@ -80,12 +184,15 @@ fn main() { } ``` +Listing 15-3: Using the `List` enum to store the list `1, +2, 3` + The first `Cons` value holds `1` and another `List` value. This `List` value is another `Cons` value that holds `2` and another `List` value. This is one more `Cons` value that holds `3` and a `List` value, which is finally `Nil`, the non-recursive variant that signals the end of the list. -If we try to compile the above code, we get the error shown in Listing 15-3: +If we try to compile the above code, we get the error shown in Listing 15-4: ```text error[E0072]: recursive type `List` has infinite size @@ -100,14 +207,26 @@ error[E0072]: recursive type `List` has infinite size make `List` representable ``` -Listing 15-3: The error we get when attempting to define +Listing 15-4: The error we get when attempting to define a recursive enum -The error says this type ‘has infinite size’. Why is that? It’s because we’ve -defined `List` to have a variant that is recursive: it holds another value of -itself. This means Rust can’t figure out how much space it needs in order to -store a `List` value.
Let’s break this down a bit: first let’s look at how Rust -decides how much space it needs to store a value of a non-recursive type. + + + +The error says this type 'has infinite size'. The reason is the way we've +defined `List` is with a variant that is recursive: it holds another value of +itself directly. This means Rust can’t figure out how much space it needs in +order to store a `List` value. Let’s break this down a bit: first let’s look at +how Rust decides how much space it needs to store a value of a non-recursive +type. + +### Computing the Size of a Non-Recursive Type + Recall the `Message` enum we defined in Listing 6-2 when we discussed enum definitions in Chapter 6: @@ -120,29 +239,31 @@ enum Message { } ``` -When Rust needs to know how much space to allocate for a `Message` value, it -can go through each of the variants and see that `Message::Quit` does not need -any space, `Message::Move` needs enough space to store two `i32` values, and so -forth. Therefore, the most space a `Message` value will need is the space it -would take to store the largest of its variants. - -Contrast this to what happens when the Rust compiler looks at a recursive type -like `List` in Listing 15-2. The compiler tries to figure out how much memory -is needed to store a value of the `List` enum, and starts by looking at the `Cons` +To determine how much space to allocate for a `Message` value, Rust goes +through each of the variants to see which variant needs the most space. Rust +sees that `Message::Quit` does not need any space, `Message::Move` needs enough +space to store two `i32` values, and so forth. Since only one variant will end +up being used, the most space a `Message` value will need is the space it would +take to store the largest of its variants. + +Contrast this to what happens when Rust tries to determine how much space a +recursive type like the `List` enum in Listing 15-2 needs. 
The compiler starts +by looking at the `Cons` variant, which holds a value of type `i32` and a value +of type `List`. Therefore, `Cons` needs an amount of space equal to the size of +an `i32` plus the size of a `List`. To figure out how much memory the `List` +type needs, the compiler looks at the variants, starting with the `Cons` variant. The `Cons` variant holds a value of type `i32` and a value of type -`List`, so `Cons` needs an amount of space equal to the size of an `i32` plus -the size of a `List`. To figure out how much memory a `List` needs, it looks at -its variants, starting with the `Cons` variant. The `Cons` variant holds a -value of type `i32` and a value of type `List`, and this continues infinitely, -as shown in Figure 15-4. +`List`, and this continues infinitely, as shown in Figure 15-5. An infinite Cons list -Figure 15-4: An infinite `List` consisting of infinite +Figure 15-5: An infinite `List` consisting of infinite `Cons` variants +### Using `Box` to Get a Recursive Type with a Known Size + Rust can’t figure out how much space to allocate for recursively defined types, -so the compiler gives the error in Listing 15-3. The error did include this +so the compiler gives the error in Listing 15-4. The error does include this helpful suggestion: ```text @@ -150,12 +271,24 @@ helpful suggestion: make `List` representable ``` -Because a `Box` is a pointer, we always know how much space it needs: a -pointer takes up a `usize` amount of space. The value of the `usize` will be -the address of the heap data. The heap data can be any size, but the address to -the start of that heap data will always fit in a `usize`. We can change our -definition from Listing 15-2 to look like the definition in Listing 15-5 by -changing `main` to use `Box::new` for the values inside the `Cons` variants: +In this suggestion, "indirection" means that instead of storing a value +directly, we're going to store the value indirectly by storing a pointer to +the value instead. 
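To make "indirection" concrete, here is a minimal sketch (not one of the chapter's listings) that uses `std::mem::size_of` to check that a box is one pointer wide no matter how large the value behind it is. The `List` definition is the boxed form the compiler's suggestion leads to; the 1024-byte array is an arbitrary example type.

```rust
use std::mem::size_of;

// Recursive list type; the `Box` supplies the indirection.
// The variants are never constructed in this sketch, hence the allow.
#[allow(dead_code)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

fn main() {
    // A `Box` pointing at a sized type occupies exactly one pointer.
    assert_eq!(size_of::<Box<List>>(), size_of::<usize>());
    assert_eq!(size_of::<Box<[u8; 1024]>>(), size_of::<usize>());

    // The enum itself now has a finite size the compiler can compute.
    println!("Box<List>: {} bytes", size_of::<Box<List>>());
    println!("List:      {} bytes", size_of::<List>());
}
```

The exact byte counts printed depend on the target platform, so they are not listed here.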
+ +Because a `Box` is a pointer, Rust always knows how much space a `Box` +needs: a pointer takes up a `usize` amount of space. The value of the `usize` +will be the address of the heap data. The heap data can be any size, but the +address to the start of that heap data will always fit in a `usize`. + +So we can put a `Box` inside the `Cons` variant instead of another `List` value +directly. The `Box` will point to the next `List` value that will be on the +heap, rather than inside the `Cons` variant. Conceptually, we still have a list +created by lists "holding" other lists, but the way this concept is implemented +is now more like the items being next to one another rather than inside one +another. + +We can change the definition of the `List` enum from Listing 15-2 and the usage +of the `List` from Listing 15-3 to the code in Listing 15-6, which will compile: Filename: src/main.rs @@ -175,30 +308,44 @@ fn main() { } ``` -Listing 15-5: Definition of `List` that uses `Box` in +Listing 15-6: Definition of `List` that uses `Box` in order to have a known size -The compiler will now be able to figure out the size it needs to store a `List` -value. Rust will look at `List`, and again start by looking at the `Cons` -variant. The `Cons` variant will need the size of `i32` plus the space to store -a `usize`, since a box always has the size of a `usize`, no matter what it’s -pointing to. Then Rust looks at the `Nil` variant, which does not store a -value, so `Nil` doesn’t need any space. We’ve broken the infinite, recursive -chain by adding in a box. Figure 15-6 shows what the `Cons` variant looks like -now: +The `Cons` variant will need the size of an `i32` plus the space to store a +`usize`, since a box always has the size of a `usize`, no matter what it's +pointing to. The `Nil` variant stores no values and doesn't need any space. By +using a box, we've broken the infinite, recursive chain so the compiler is able +to figure out the size it needs to store a `List` value. 
Figure 15-7 shows what +the `Cons` variant looks like now: A finite Cons list -Figure 15-6: A `List` that is not infinitely sized since +Figure 15-7: A `List` that is not infinitely sized since `Cons` holds a `Box` -This is the main area where boxes are useful: breaking up an infinite data -structure so that the compiler can know what size it is. We’ll look at another -case where Rust has data of unknown size in Chapter 17 when we discuss trait -objects. - -Even though you won’t be using boxes very often, they are a good way to -understand the smart pointer pattern. Two of the aspects of `Box` that are -commonly used with smart pointers are its implementations of the `Deref` trait -and the `Drop` trait. Let’s investigate how these traits work and how smart -pointers use them. + + + +Boxes only provide the indirection and heap allocation; they don't have any +other special abilities like those we'll see with the other smart pointer +types. They also don't have any performance overhead that these special +abilities incur, so they can be useful in cases like the cons list where the +indirection is the only feature we need. We'll look at more use cases for boxes +in Chapter 17, too. + +The `Box` type is a smart pointer because it implements the `Deref` trait, +which allows `Box` values to be treated like references. When a `Box` +value goes out of scope, the heap data that the box is pointing to is cleaned +up as well because of the `Box` type's `Drop` trait implementation. Let's +explore these two traits in more detail; these traits are going to be even more +important to the functionality provided by the other smart pointer types we'll +be discussing in the rest of this chapter.
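For reference, a self-contained sketch of the boxed cons list follows, with the generic parameter on `Box` written out. The `sum` function is a hypothetical helper added for illustration and does not appear in the chapter; it assumes a recent Rust toolchain, where matching on a reference binds the fields by reference.

```rust
use List::{Cons, Nil};

enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Hypothetical helper: walk the list by following each box to the next item.
fn sum(list: &List) -> i32 {
    match list {
        // `rest` is a `&Box<List>`; it coerces to `&List` at the call.
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    assert_eq!(sum(&list), 6);
}
```

Passing `rest` straight to `sum` already relies on the `Deref` implementation of `Box`, which is one reason that trait matters for smart pointers.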
+ + + From 8be16664c68df30fee68a0a36bee81b6a0a3989f Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Wed, 16 Aug 2017 22:07:41 -0400 Subject: [PATCH 03/18] Edits to 15-02 --- second-edition/dictionary.txt | 1 + second-edition/src/ch15-02-deref.md | 538 ++++++++++++++++++++-------- 2 files changed, 396 insertions(+), 143 deletions(-) diff --git a/second-edition/dictionary.txt b/second-edition/dictionary.txt index c60ec821c7..d077ce7e4a 100644 --- a/second-edition/dictionary.txt +++ b/second-edition/dictionary.txt @@ -224,6 +224,7 @@ Mutex mutexes Mutexes MutexGuard +MyBox namespace namespaced namespaces diff --git a/second-edition/src/ch15-02-deref.md b/second-edition/src/ch15-02-deref.md index 9713ab58b4..3374addab9 100644 --- a/second-edition/src/ch15-02-deref.md +++ b/second-edition/src/ch15-02-deref.md @@ -1,193 +1,445 @@ -## The `Deref` Trait Allows Access to the Data Through a Reference - -The first important smart pointer-related trait is `Deref`, which allows us to -override `*`, the dereference operator (as opposed to the multiplication -operator or the glob operator). Overriding `*` for smart pointers makes -accessing the data behind the smart pointer convenient, and we’ll talk about -what we mean by convenient when we get to deref coercions later in this section. - -We briefly mentioned the dereference operator in Chapter 8, in the hash map -section titled “Update a Value Based on the Old Value”. We had a mutable -reference, and we wanted to change the value that the reference was pointing -to. In order to do that, first we had to dereference the reference. Here’s -another example using references to `i32` values: +## Treating Smart Pointers like Regular References with the `Deref` Trait + +Implementing the `Deref` trait allows us to customize the behavior of the +*dereference operator* `*` (as opposed to the multiplication or glob operator).
+By implementing `Deref` in such a way that a smart pointer can be treated like +a regular reference, we can write code that is able to operate on either smart +pointers or regular references. + + + + + + + +Let's first take a look at how `*` works with regular references, then try and +define our own type like `Box` and see why `*` doesn't work like a +reference. We'll explore how implementing the `Deref` trait makes it possible +for smart pointers to work in a similar way as references. Finally, we'll look +at the *deref coercion* feature of Rust and how that lets us work with either +references or smart pointers. + +### Following the Pointer to the Value with `*` + + + + + + + +A regular reference is a type of pointer, and one way to think of a pointer is +that it's an arrow to a value stored somewhere else. In Listing 15-8, let's +create a reference to an `i32` value then use the dereference operator to +follow the reference to the data: + + + + + + + +Filename: src/main.rs ```rust -let mut x = 5; -{ - let y = &mut x; +fn main() { + let x = 5; + let y = &x; - *y += 1 + assert_eq!(5, x); + assert_eq!(5, *y); } +``` -assert_eq!(6, x); +Listing 15-8: Using the dereference operator to follow a +reference to an `i32` value + +The variable `x` holds an `i32` value, `5`. We set `y` equal to a reference to +`x`. We can assert that `x` is equal to `5`. However, if we want to make an +assertion about the value in `y`, we have to use `*y` to follow the reference +to the value that the reference is pointing to (hence *de-reference*). Once we +de-reference `y`, we have access to the integer value `y` is pointing to that +we can compare with `5`. + +If we try to write `assert_eq!(5, y);` instead, we'll get this compilation +error: + +```text +error[E0277]: the trait bound `{integer}: std::cmp::PartialEq<&{integer}>` is +not satisfied + --> :5:19 + | +5 | if ! 
( * left_val == * right_val ) { + | ^^ can't compare `{integer}` with `&{integer}` + | + = help: the trait `std::cmp::PartialEq<&{integer}>` is not implemented for + `{integer}` ``` -We use `*y` to access the data that the mutable reference in `y` refers to, -rather than the mutable reference itself. We can then modify that data, in this -case by adding 1. - -With references that aren’t smart pointers, there’s only one value that the -reference is pointing to, so the dereference operation is straightforward. -Smart pointers can also store metadata about the pointer or the data. When -dereferencing a smart pointer, we only want the data, not the metadata, since -dereferencing a regular reference only gives us data and not metadata. We want -to be able to use smart pointers in the same places that we can use regular -references. To enable that, we can override the behavior of the `*` operator by -implementing the `Deref` trait. - -Listing 15-7 has an example of overriding `*` using `Deref` on a struct we’ve -defined to hold mp3 data and metadata. `Mp3` is, in a sense, a smart pointer: -it owns the `Vec` data containing the audio. In addition, it holds some -optional metadata, in this case the artist and title of the song in the audio -data. We want to be able to conveniently access the audio data, not the -metadata, so we implement the `Deref` trait to return the audio data. -Implementing the `Deref` trait requires implementing one method named `deref` -that borrows `self` and returns the inner data: +Comparing a reference to a number with a number isn't allowed because they're +different types. We have to use `*` to follow the reference to the value it's +pointing to. 
-Filename: src/main.rs +### Using `Box` Like a Reference + +We can rewrite the code in Listing 15-8 to use a `Box` instead of a +reference, and the de-reference operator will work the same way as shown in +Listing 15-9: + +Filename: src/main.rs ```rust -use std::ops::Deref; +fn main() { + let x = 5; + let y = Box::new(x); -struct Mp3 { - audio: Vec, - artist: Option, - title: Option, + assert_eq!(5, x); + assert_eq!(5, *y); } +``` -impl Deref for Mp3 { - type Target = Vec; +Listing 15-9: Using the dereference operator on a +`Box` - fn deref(&self) -> &Vec { - &self.audio +The only part of Listing 15-8 that we changed was to set `y` to be an instance +of a box pointing to the value in `x` rather than a reference pointing to the +value of `x`. In the last assertion, we can use the dereference operator to +follow the box's pointer in the same way that we did when `y` was a reference. +Let's explore what is special about `Box` that enables us to do this by +defining our own box type. + +### Defining Our Own Smart Pointer + +Let's build a smart pointer similar to the `Box` type that the standard +library has provided for us, to see that smart pointers don't +behave like references by default. Then we'll learn about how to add the +ability to use the dereference operator. + +`Box` is ultimately defined as a tuple struct with one element, so Listing +15-10 defines a `MyBox` type in the same way. We'll also define a `new` +function to match the `new` function defined on `Box`: + +Filename: src/main.rs + +```rust +struct MyBox<T>(T); + +impl<T> MyBox<T> { + fn new(x: T) -> MyBox<T> { + MyBox(x) } } +``` +Listing 15-10: Defining a `MyBox` type + +We define a struct named `MyBox` and declare a generic parameter `T`, since we +want our type to be able to hold values of any type. `MyBox` is a tuple struct +with one element of type `T`. The `MyBox::new` function takes one parameter of +type `T` and returns a `MyBox` instance that holds the value passed in.
+ +Let's try adding the code from Listing 15-9 to the code in Listing 15-10 and +changing `main` to use the `MyBox` type we've defined instead of `Box`. +The code in Listing 15-11 won't compile because Rust doesn't know how to +dereference `MyBox`: + +Filename: src/main.rs + +```rust,ignore fn main() { - let my_favorite_song = Mp3 { - // we would read the actual audio data from an mp3 file - audio: vec![1, 2, 3], - artist: Some(String::from("Nirvana")), - title: Some(String::from("Smells Like Teen Spirit")), - }; - - assert_eq!(vec![1, 2, 3], *my_favorite_song); + let x = 5; + let y = MyBox::new(x); + + assert_eq!(5, x); + assert_eq!(5, *y); +} +``` + +Listing 15-11: Attempting to use `MyBox` in the same +way we were able to use references and `Box` + +The compilation error we get is: + +```text +error: type `MyBox<{integer}>` cannot be dereferenced + --> src/main.rs:14:19 + | +14 | assert_eq!(5, *y); + | ^^ +``` + +Our `MyBox` type can't be dereferenced because we haven't implemented that +ability on our type. To enable dereferencing with the `*` operator, we can +implement the `Deref` trait. + +### Implementing the `Deref` Trait Defines How To Treat a Type Like a Reference + +As we discussed in Chapter 10, in order to implement a trait, we need to +provide implementations for the trait's required methods. The `Deref` trait, +provided by the standard library, requires implementing one method named +`deref` that borrows `self` and returns a reference to the inner data. Listing +15-12 contains an implementation of `Deref` to add to the definition of `MyBox`: + +Filename: src/main.rs + +```rust +use std::ops::Deref; + +# struct MyBox<T>(T); +impl<T> Deref for MyBox<T> { + type Target = T; + + fn deref(&self) -> &T { + &self.0 + } } ``` -Listing 15-7: An implementation of the `Deref` trait on a -struct that holds mp3 file data and metadata - -Most of this should look familiar: a struct, a trait implementation, and a -main function that creates an instance of the struct.
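As a quick check that `MyBox` really is generic over the value it holds, the sketch below wraps an integer and a `String`. The values in `main` are hypothetical and chosen only for illustration. Until `Deref` is implemented, the only way to reach the wrapped value is through the tuple field `.0`.

```rust
// Tuple struct wrapping one value of any type `T`.
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

fn main() {
    let n = MyBox::new(5);
    let s = MyBox::new(String::from("hello"));

    // Without `Deref`, the wrapped value is reachable only via field `.0`.
    assert_eq!(5, n.0);
    assert_eq!("hello", s.0.as_str());
}
```

Unlike `Box`, this `MyBox` stores its value inline rather than on the heap; that difference doesn't matter for exploring how dereferencing works.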
There is one part we -haven’t explained thoroughly yet: similarly to Chapter 13 when we looked at the -Iterator trait with the `type Item`, the `type Target = T;` syntax is defining -an associated type, which is covered in more detail in Chapter 19. Don’t worry -about that part of the example too much; it is a slightly different way of -declaring a generic parameter. - -In the `assert_eq!`, we’re verifying that `vec![1, 2, 3]` is the result we get -when dereferencing the `Mp3` instance with `*my_favorite_song`, which is what -happens since we implemented the `deref` method to return the audio data. If -we hadn’t implemented the `Deref` trait for `Mp3`, Rust wouldn’t compile the -code `*my_favorite_song`: we’d get an error saying type `Mp3` cannot be -dereferenced. - -Without the `Deref` trait, the compiler can only dereference `&` references, -which `my_favorite_song` is not (it is an `Mp3` struct). With the `Deref` -trait, the compiler knows that types implementing the `Deref` trait have a -`deref` method that returns a reference (in this case, `&self.audio` because of -our definition of `deref` in Listing 15-7). So in order to get a `&` reference -that `*` can dereference, the compiler expands `*my_favorite_song` to this: +Listing 15-12: Implementing `Deref` on `MyBox` + +The `type Target = T;` syntax defines an associated type for this trait to use. +Associated types are a slightly different way of declaring a generic parameter +that you don't need to worry about too much for now; we'll cover it in more +detail in Chapter 19. + + + + +We filled in the body of the `deref` method with `&self.0` so that `deref` +returns a reference to the value we want to access with the `*` operator. The +`main` function from Listing 15-11 that calls `*` on the `MyBox` value now +compiles and the assertions pass! + +Without the `Deref` trait, the compiler can only dereference `&` references. 
+The `Deref` trait's `deref` method gives the compiler the ability to take a +value of any type that implements `Deref` and call the `deref` method in order +to get a `&` reference that it knows how to dereference. + +When we typed `*y` in Listing 15-11, what Rust actually ran behind the scenes +was this code: ```rust,ignore -*(my_favorite_song.deref()) +*(y.deref()) ``` -The result is the value in `self.audio`. The reason `deref` returns a reference -that we then have to dereference, rather than just returning a value directly, -is because of ownership: if the `deref` method directly returned the value -instead of a reference to it, the value would be moved out of `self`. We don’t -want to take ownership of `my_favorite_song.audio` in this case and most cases -where we use the dereference operator. + + + +Rust substitutes the `*` operator with a call to the `deref` method and then a +plain dereference so that we don't have to think about when we have to call the +`deref` method or not. This feature of Rust lets us write code that functions +identically whether we have a regular reference or a type that implements +`Deref`. + +The reason the `deref` method returns a reference to a value, and why the plain +dereference outside the parentheses in `*(y.deref())` is still necessary, is +because of ownership. If the `deref` method returned the value directly instead +of a reference to the value, the value would be moved out of `self`. We don’t +want to take ownership of the inner value inside `MyBox` in this case and in +most cases where we use the dereference operator. Note that replacing `*` with a call to the `deref` method and then a call to -`*` happens once, each time the `*` is used. The substitution of `*` does not -recurse infinitely. That’s how we end up with data of type `Vec`, which -matches the `vec![1, 2, 3]` in the `assert_eq!` in Listing 15-7. +`*` happens once, each time we type a `*` in our code. The substitution of `*` +does not recurse infinitely. 
That’s how we end up with data of type `i32`, +which matches the `5` in the `assert_eq!` in Listing 15-11. ### Implicit Deref Coercions with Functions and Methods -Rust tends to favor explicitness over implicitness, but one case where this -does not hold true is *deref coercions* of arguments to functions and methods. -A deref coercion will automatically convert a reference to any pointer into a -reference to that pointer’s contents. A deref coercion happens when the -reference type of the argument passed into the function differs from the -reference type of the parameter defined in that function’s signature. Deref -coercion was added to Rust to make calling functions and methods not need as -many explicit references and dereferences with `&` and `*`. + + + +Rust tends to favor explicitness over implicitness, but one exception is deref +coercions of arguments to functions and methods. Rust performs *deref coercion* +to convert a reference to a type that implements `Deref` into a reference to a +type that `Deref` can convert the original type into. Deref coercion happens +automatically when we pass a reference to a value of a particular type as an +argument to a function or method that doesn't match the type of the parameter +in the function or method definition, and there's a sequence of calls to the +`deref` method that will convert the type we provided into the type that the +parameter needs. + +Deref coercion was added to Rust so that programmers writing function and +method calls don't need to add as many explicit references and dereferences +with `&` and `*`. This feature also lets us write more code that can work for +either references or smart pointers. + +To illustrate deref coercion in action, let's use the `MyBox` type we +defined in Listing 15-10 as well as the implementation of `Deref` that we added +in Listing 15-12. 
Listing 15-13 shows the definition of a function that has a +string slice parameter: -Using our `Mp3` struct from Listing 15-7, here’s the signature of a function to -compress mp3 audio data that takes a slice of `u8`: +Filename: src/main.rs -```rust,ignore -fn compress_mp3(audio: &[u8]) -> Vec { - // the actual implementation would go here +```rust +fn hello(name: &str) { + println!("Hello, {}!", name); } ``` -If Rust didn’t have deref coercion, in order to call this function with the -audio data in `my_favorite_song`, we’d have to write: +Listing 15-13: A `hello` function that has the parameter +`name` of type `&str` -```rust,ignore -compress_mp3(my_favorite_song.audio.as_slice()) +We can call the `hello` function with a string slice as an argument, like +`hello("Rust");` for example. Deref coercion makes it possible for us to call +`hello` with a reference to a value of type `MyBox`, as shown in +Listing 15-14: + +Filename: src/main.rs + +```rust +# use std::ops::Deref; +# +# struct MyBox(T); +# +# impl MyBox { +# fn new(x: T) -> MyBox { +# MyBox(x) +# } +# } +# +# impl Deref for MyBox { +# type Target = T; +# +# fn deref(&self) -> &T { +# &self.0 +# } +# } +# +# fn hello(name: &str) { +# println!("Hello, {}!", name); +# } +# +fn main() { + let m = MyBox::new(String::from("Rust")); + hello(&m); +} ``` -That is, we’d have to explicitly say that we want the data in the `audio` field -of `my_favorite_song` and that we want a slice referring to the whole -`Vec`. If there were a lot of places where we’d want to process the `audio` -data in a similar manner, `.audio.as_slice()` would be wordy and repetitive. 
+Listing 15-14: Calling `hello` with a reference to a +`MyBox`, which works because of deref coercion -However, because of deref coercion and our implementation of the `Deref` trait -on `Mp3`, we can call this function with the data in `my_favorite_song` by -using this code: +Here we're calling the `hello` function with the argument `&m`, which is a +reference to a `MyBox` value. Because we implemented the `Deref` trait +on `MyBox` in Listing 15-12, Rust can turn `&MyBox` into `&String` +by calling `deref`. The standard library provides an implementation of `Deref` +on `String` that returns a string slice, which we can see in the API +documentation for `Deref`. Rust calls `deref` again to turn the `&String` into +`&str`, which matches the `hello` function's definition. -```rust,ignore -let result = compress_mp3(&my_favorite_song); +If Rust didn't implement deref coercion, in order to call `hello` with a value +of type `&MyBox`, we'd have to write the code in Listing 15-15 instead +of the code in Listing 15-14: + +Filename: src/main.rs + +```rust +# use std::ops::Deref; +# +# struct MyBox(T); +# +# impl MyBox { +# fn new(x: T) -> MyBox { +# MyBox(x) +# } +# } +# +# impl Deref for MyBox { +# type Target = T; +# +# fn deref(&self) -> &T { +# &self.0 +# } +# } +# +# fn hello(name: &str) { +# println!("Hello, {}!", name); +# } +# +fn main() { + let m = MyBox::new(String::from("Rust")); + hello(&(*m)[..]); +} ``` -Just an `&` and the instance, nice! We can treat our smart pointer as if it was -a regular reference. Deref coercion means that Rust can use its knowledge of -our `Deref` implementation, namely: Rust knows that `Mp3` implements the -`Deref` trait and returns `&Vec` from the `deref` method. Rust also knows -the standard library implements the `Deref` trait on `Vec` to return `&[T]` -from the `deref` method (and we can find that out too by looking at the API -documentation for `Vec`). 
So, at compile time, Rust will see that it can use -`Deref::deref` twice to turn `&Mp3` into `&Vec` and then into `&[T]` to -match the signature of `compress_mp3`. That means we get to do less typing! -Rust will analyze types through `Deref::deref` as many times as it needs to in -order to get a reference to match the parameter’s type, when the `Deref` trait -is defined for the types involved. This indirection is resolved at compile time, -so there is no run-time penalty for taking advantage of deref coercion! - -Similar to how we use the `Deref` trait to override `*` on `&T`s, there is also -a `DerefMut` trait for overriding `*` on `&mut T`. +Listing 15-15: The code we'd have to write if Rust didn't +have deref coercion + +The `(*m)` is dereferencing the `MyBox` into a `String`. Then the `&` +and `[..]` are taking a string slice of the `String` that is equal to the whole +string to match the signature of `hello`. The code without deref coercions is +harder to read, write, and understand with all of these symbols involved. Deref +coercion makes it so that Rust takes care of these conversions for us +automatically. + +When the `Deref` trait is defined for the types involved, Rust will analyze the +types and use `Deref::deref` as many times as it needs in order to get a +reference to match the parameter's type. This is resolved at compile time, so +there is no run-time penalty for taking advantage of deref coercion! + +### How Deref Coercion Interacts with Mutability + + + + +Similar to how we use the `Deref` trait to override `*` on immutable +references, Rust provides a `DerefMut` trait for overriding `*` on mutable +references. Rust does deref coercion when it finds types and trait implementations in three cases: + + + * From `&T` to `&U` when `T: Deref`. * From `&mut T` to `&mut U` when `T: DerefMut`. * From `&mut T` to `&U` when `T: Deref`. 
-The first two are the same, except for mutability: if you have a `&T`, and -`T` implements `Deref` to some type `U`, you can get a `&U` transparently. Same -for mutable references. The last one is more tricky: if you have a mutable -reference, it will also coerce to an immutable one. The other case is _not_ -possible though: immutable references will never coerce to mutable ones. - -The reason that the `Deref` trait is important to the smart pointer pattern is -that smart pointers can then be treated like regular references and used in -places that expect regular references. We don’t have to redefine methods and -functions to take smart pointers explicitly, for example. +The first two cases are the same except for mutability. The first case says +that if you have a `&T`, and `T` implements `Deref` to some type `U`, you can +get a `&U` transparently. The second case states that the same deref coercion +happens for mutable references. + +The last case is trickier: Rust will also coerce a mutable reference to an +immutable one. The reverse is *not* possible though: immutable references will +never coerce to mutable ones. Because of the borrowing rules, if you have a +mutable reference, that mutable reference must be the only reference to that +data (otherwise, the program wouldn't compile). Converting one mutable +reference to one immutable reference will never break the borrowing rules. +Converting an immutable reference to a mutable reference would require that +there was only one immutable reference to that data, and the borrowing rules +don't guarantee that. Therefore, Rust can't make the assumption that converting +an immutable reference to a mutable reference is possible. 
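The three coercion cases listed above can be sketched with a small wrapper type. This is only an illustrative sketch: the `MyBox<T>` here is a stand-in for the chapter's type, and the `DerefMut` implementation and the helper functions `len_of` and `shout` are assumptions added for the example, not part of the chapter's listings.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical wrapper for illustration, mirroring the chapter's `MyBox<T>`.
struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

// Assumed for this sketch: the chapter itself only implements `Deref`.
impl<T> DerefMut for MyBox<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

// Hypothetical helper that takes an immutable string slice.
fn len_of(s: &str) -> usize {
    s.len()
}

// Hypothetical helper that takes a mutable reference to a `String`.
fn shout(s: &mut String) {
    s.push('!');
}

fn demo() -> (usize, usize, String) {
    let mut m = MyBox(String::from("Rust"));

    // Case 1: `&MyBox<String>` -> `&String` -> `&str` through `Deref`.
    let before = len_of(&m);

    // Case 2: `&mut MyBox<String>` -> `&mut String` through `DerefMut`.
    shout(&mut m);

    // Case 3: a mutable reference may coerce to an immutable one,
    // so `&mut MyBox<String>` is accepted where `&str` is expected.
    let after = len_of(&mut m);

    (before, after, m.0)
}

fn main() {
    let (before, after, text) = demo();
    println!("{} {} {}", before, after, text);
}
```

The reverse of the third case has no counterpart here: passing `&m` to `shout` would be rejected by the compiler, because an immutable reference never coerces to a mutable one.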
+ + + From c2014ea50da7d69cd85616ca545175478cb15a00 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Wed, 23 Aug 2017 14:07:43 -0400 Subject: [PATCH 04/18] Edits to 15-03 --- second-edition/dictionary.txt | 1 + second-edition/src/ch15-03-drop.md | 260 +++++++++++++++++++---------- 2 files changed, 177 insertions(+), 84 deletions(-) diff --git a/second-edition/dictionary.txt b/second-edition/dictionary.txt index d077ce7e4a..fe21fa5649 100644 --- a/second-edition/dictionary.txt +++ b/second-edition/dictionary.txt @@ -70,6 +70,7 @@ ctrl Ctrl customizable CustomSmartPointer +deallocate deallocated deallocating deallocation diff --git a/second-edition/src/ch15-03-drop.md b/second-edition/src/ch15-03-drop.md index d5c1d5d30a..fc3488a946 100644 --- a/second-edition/src/ch15-03-drop.md +++ b/second-edition/src/ch15-03-drop.md @@ -1,32 +1,53 @@ ## The `Drop` Trait Runs Code on Cleanup -The other trait that’s important to the smart pointer pattern is the `Drop` -trait. `Drop` lets us run some code when a value is about to go out of scope. -Smart pointers perform important cleanup when being dropped, like deallocating -memory or decrementing a reference count. More generally, data types can manage -resources beyond memory, like files or network connections, and use `Drop` to -release those resources when our code is done with them. We’re discussing -`Drop` in the context of smart pointers, though, because the functionality of -the `Drop` trait is almost always used when implementing smart pointers. - -In some other languages, we have to remember to call code to free the memory or -resource every time we finish using an instance of a smart pointer. If we -forget, the system our code is running on might get overloaded and crash. In -Rust, we can specify that some code should be run when a value goes out of -scope, and the compiler will insert this code automatically. 
That means we don’t -need to remember to put this code everywhere we’re done with an instance of -these types, but we still won’t leak resources! - -The way we specify code should be run when a value goes out of scope is by -implementing the `Drop` trait. The `Drop` trait requires us to implement one -method named `drop` that takes a mutable reference to `self`. - -Listing 15-8 shows a `CustomSmartPointer` struct that doesn’t actually do -anything, but we’re printing out `CustomSmartPointer created.` right after we -create an instance of the struct and `Dropping CustomSmartPointer!` when the -instance goes out of scope so that we can see when each piece of code gets run. -Instead of a `println!` statement, you’d fill in `drop` with whatever cleanup -code your smart pointer needs to run: +The second trait important to the smart pointer pattern is `Drop`, which lets +us customize what happens when a value is about to go out of scope. We can +provide an implementation for the `Drop` trait on any type, and the code we +specify can be used to release resources like files or network connections. +We're introducing `Drop` in the context of smart pointers because the +functionality of the `Drop` trait is almost always used when implementing a +smart pointer. For example, `Box` customizes `Drop` in order to deallocate +the space on the heap that the box points to. + +In some languages, the programmer must call code to free memory or resources +every time they finish using an instance of a smart pointer. If they forget, +the system might become overloaded and crash. In Rust, we can specify that a +particular bit of code should be run whenever a value goes out of scope, and +the compiler will insert this code automatically. + + + + +This means we don't need to be careful about placing cleanup code everywhere in a +program where we're finished with an instance of a particular type, but we still +won't leak resources!
+ +We specify the code to run when a value goes out of scope by implementing the +`Drop` trait. The `Drop` trait requires us to implement one method named `drop` +that takes a mutable reference to `self`. In order to be able to see when Rust +calls `drop`, let's implement `drop` with `println!` statements for now. + + + + +Listing 15-8 shows a `CustomSmartPointer` struct whose only custom +functionality is that it will print out `Dropping CustomSmartPointer!` when the +instance goes out of scope. This will demonstrate when Rust runs the `drop` +function: + + + Filename: src/main.rs @@ -44,40 +65,66 @@ impl Drop for CustomSmartPointer { fn main() { let c = CustomSmartPointer { data: String::from("some data") }; println!("CustomSmartPointer created."); - println!("Wait for it..."); } ``` Listing 15-8: A `CustomSmartPointer` struct that -implements the `Drop` trait, where we could put code that would clean up after -the `CustomSmartPointer`. +implements the `Drop` trait, where we would put our clean up code. -The `Drop` trait is in the prelude, so we don’t need to import it. The `drop` -method implementation calls the `println!`; this is where you’d put the actual -code needed to close the socket. In `main`, we create a new instance of -`CustomSmartPointer` then print out `CustomSmartPointer created.` to be able to -see that our code got to that point at runtime. At the end of `main`, our -instance of `CustomSmartPointer` will go out of scope. Note that we didn’t call -the `drop` method explicitly. +The `Drop` trait is included in the prelude, so we don't need to import it. We +implement the `Drop` trait on `CustomSmartPointer`, and provide an +implementation for the `drop` method that calls `println!`. The body of the +`drop` function is where you'd put any logic that you wanted to run when an +instance of your type goes out of scope. We're choosing to print out some text +here in order to demonstrate when Rust will call `drop`. 
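To make the timing of `drop` a little more concrete, here is a minimal sketch that records drops in a log instead of printing. The `Noisy` type and the `drop_order` function are hypothetical names invented for this example, and it borrows `RefCell` from the standard library purely so the log can be shared; neither is part of the chapter's listing.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: pushes its name onto a shared log
// when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        // Cleanup code would normally go here; we only record the event.
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // `_inner` goes out of scope here, so its `drop` runs first.
        let _b = Noisy { name: "b", log: &log };
    } // `_b` is dropped before `_a`: locals are dropped in reverse order of declaration.
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

Nothing in `drop_order` calls `drop` by hand; each entry in the log comes from Rust running the `Drop` implementation when the value's scope ends.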
-When we run this program, we’ll see: + + + +In `main`, we create a new instance of `CustomSmartPointer` and then print out +`CustomSmartPointer created.`. At the end of `main`, our instance of +`CustomSmartPointer` will go out of scope, and Rust will call the code we put +in the `drop` method, printing our final message. Note that we didn't need to +call the `drop` method explicitly. + +When we run this program, we'll see the following output: ```text CustomSmartPointer created. -Wait for it... Dropping CustomSmartPointer! ``` -printed to the screen, which shows that Rust automatically called `drop` for us -when our instance went out of scope. +Rust automatically called `drop` for us when our instance went out of scope, +calling the code we specified. This is just to give you a visual guide to how +the drop method works, but usually you would specify the cleanup code that your +type needs to run rather than a print message. + + + + +#### Dropping a Value Early with `std::mem::drop` -We can use the `std::mem::drop` function to drop a value earlier than when it -goes out of scope. This isn’t usually necessary; the whole point of the `Drop` -trait is that it’s taken care of automatically for us. We’ll see an example of -a case when we’ll need to drop a value earlier than when it goes out of scope -in Chapter 16 when we’re talking about concurrency. For now, let’s just see -that it’s possible, and `std::mem::drop` is in the prelude so we can just call -`drop` as shown in Listing 15-9: + + + +Rust inserts the call to `drop` automatically when a value goes out of scope, +and there's no way to disable this functionality if we want to force a value to +clean itself up early. This isn't usually necessary; the whole point of the +`Drop` trait is that it's taken care of automatically for us. Occasionally you +may find that you want to clean up a value early. 
One example is when using +smart pointers that manage locks; you may want to force the `drop` method that +releases the lock to run so that other code in the same scope can acquire the +lock. First, let's see what happens if we try to call the `Drop` trait's `drop` +method ourselves by modifying the `main` function from Listing 15-8 as shown in +Listing 15-9: + + + Filename: src/main.rs @@ -85,56 +132,101 @@ that it’s possible, and `std::mem::drop` is in the prelude so we can just call fn main() { let c = CustomSmartPointer { data: String::from("some data") }; println!("CustomSmartPointer created."); - drop(c); - println!("Wait for it..."); + c.drop(); + println!("CustomSmartPointer dropped before the end of main."); } ``` -Listing 15-9: Calling `std::mem::drop` to explicitly drop -a value before it goes out of scope +Listing 15-9: Attempting to call the `drop` method from +the `Drop` trait manually to clean up early -Running this code will print the following, showing that the destructor code is -called since `Dropping CustomSmartPointer!` is printed between -`CustomSmartPointer created.` and `Wait for it...`: +If we try to compile this, we'll get this error: ```text -CustomSmartPointer created. -Dropping CustomSmartPointer! -Wait for it... +error[E0040]: explicit use of destructor method + --> src/main.rs:15:7 + | +15 | c.drop(); + | ^^^^ explicit destructor calls not allowed ``` -Note that we aren’t allowed to call the `drop` method that we defined directly: -if we replaced `drop(c)` in Listing 15-9 with `c.drop()`, we’ll get a compiler -error that says `explicit destructor calls not allowed`. We’re not allowed to -call `Drop::drop` directly because when Rust inserts its call to `Drop::drop` -automatically when the value goes out of scope, then the value would get -dropped twice. Dropping a value twice could cause an error or corrupt memory, -so Rust doesn’t let us. 
Instead, we can use `std::mem::drop`, whose definition -is: +This error message says we're not allowed to explicitly call `drop`. The error +message uses the term *destructor*, which is the general programming term for a +function that cleans up an instance. A *destructor* is analogous to a +*constructor* that creates an instance. The `drop` function in Rust is one +particular destructor. + +Rust doesn't let us call `drop` explicitly because Rust would still +automatically call `drop` on the value at the end of `main`, and this would be +a *double free* error since Rust would be trying to clean up the same value +twice. + +Because we can't disable the automatic insertion of `drop` when a value goes +out of scope, and we can't call the `drop` method explicitly, if we need to +force a value to be cleaned up early, we can use the `std::mem::drop` function. + +The `std::mem::drop` function is different than the `drop` method in the `Drop` +trait. We call it by passing the value we want to force to be dropped early as +an argument. `std::mem::drop` is in the prelude, so we can modify `main` from +Listing 15-8 to call the `drop` function as shown in Listing 15-10: + +Filename: src/main.rs ```rust -pub mod std { - pub mod mem { - pub fn drop(x: T) { } - } +# struct CustomSmartPointer { +# data: String, +# } +# +# impl Drop for CustomSmartPointer { +# fn drop(&mut self) { +# println!("Dropping CustomSmartPointer!"); +# } +# } +# +fn main() { + let c = CustomSmartPointer { data: String::from("some data") }; + println!("CustomSmartPointer created."); + drop(c); + println!("CustomSmartPointer dropped before the end of main."); } ``` -This function is generic over any type `T`, so we can pass any value to it. The -function doesn’t actually have anything in its body, so it doesn’t use its -parameter. 
The reason this empty function is useful is that `drop` takes -ownership of its parameter, which means the value in `x` gets dropped at the -end of this function when `x` goes out of scope. +Listing 15-10: Calling `std::mem::drop` to explicitly +drop a value before it goes out of scope + +Running this code will print the following: + +```text +CustomSmartPointer created. +Dropping CustomSmartPointer! +CustomSmartPointer dropped before the end of main. +``` + + + -Code specified in a `Drop` trait implementation can be used for many reasons to +The `Dropping CustomSmartPointer!` is printed between `CustomSmartPointer +created.` and `CustomSmartPointer dropped before the end of main.`, showing +that the `drop` method code is called to drop `c` at that point. + + + + +Code specified in a `Drop` trait implementation can be used in many ways to make cleanup convenient and safe: we could use it to create our own memory -allocator, for instance! By using the `Drop` trait and Rust’s ownership system, -we don’t have to remember to clean up after ourselves since Rust takes care of -it automatically. We’ll get compiler errors if we write code that would clean -up a value that’s still in use, since the ownership system that makes sure +allocator, for instance! With the `Drop` trait and Rust's ownership system, you +don't have to remember to clean up after yourself; Rust takes care of it +automatically. + +We also don't have to worry about accidentally cleaning up values still in use +because that would cause a compiler error: the ownership system that makes sure references are always valid will also make sure that `drop` only gets called -one time when the value is no longer being used. +once when the value is no longer being used. -Now that we’ve gone over `Box` and some of the characteristics of smart -pointers, let’s talk about a few other smart pointers defined in the standard -library that add different kinds of useful functionality.
+Now that we've gone over `Box` and some of the characteristics of smart +pointers, let's talk about a few other smart pointers defined in the standard +library. From 88ae5d945519c0ba07945201cd826717a842d369 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Wed, 23 Aug 2017 15:40:26 -0400 Subject: [PATCH 05/18] Edits to 15-04 --- second-edition/src/ch15-04-rc.md | 233 +++++++++++++++++++------------ 1 file changed, 143 insertions(+), 90 deletions(-) diff --git a/second-edition/src/ch15-04-rc.md b/second-edition/src/ch15-04-rc.md index fd32277b34..927d1d99e2 100644 --- a/second-edition/src/ch15-04-rc.md +++ b/second-edition/src/ch15-04-rc.md @@ -1,50 +1,55 @@ ## `Rc`, the Reference Counted Smart Pointer -In the majority of cases, ownership is very clear: you know exactly which -variable owns a given value. However, this isn’t always the case; sometimes, -you may actually need multiple owners. For this, Rust has a type called -`Rc`. Its name is an abbreviation for *reference counting*. Reference -counting means keeping track of the number of references to a value in order to -know if a value is still in use or not. If there are zero references to a -value, we know we can clean up the value without any references becoming -invalid. - -To think about this in terms of a real-world scenario, it’s like a TV in a -family room. When one person comes in the room to watch TV, they turn it on. -Others can also come in the room and watch the TV. When the last person leaves -the room, they’ll turn the TV off since it’s no longer being used. If someone -turns off the TV while others are still watching it, though, the people -watching the TV would get mad! - -`Rc` is for use when we want to allocate some data on the heap for multiple -parts of our program to read, and we can’t determine at compile time which part -of our program using this data will finish using it last. 
If we knew which part -would finish last, we could make that part the owner of the data and the normal -ownership rules enforced at compile time would kick in. - -Note that `Rc` is only for use in single-threaded scenarios; the next -chapter on concurrency will cover how to do reference counting in -multithreaded programs. If you try to use `Rc` with multiple threads, -you’ll get a compile-time error. +In the majority of cases, ownership is clear: you know exactly which variable +owns a given value. However, there are cases when a single value may have +multiple owners. For example, in graph data structures, multiple edges may +point to the same node, and that node is conceptually owned by all of the edges +that point to it. A node shouldn't be cleaned up unless it doesn't have any +edges pointing to it. + + + + +In order to enable multiple ownership, Rust has a type called `Rc`. Its name +is an abbreviation for reference counting. *Reference counting* means keeping +track of the number of references to a value in order to know if a value is +still in use or not. If there are zero references to a value, the value can be +cleaned up without any references becoming invalid. + +Imagine it like a TV in a family room. When one person enters to watch TV, they +turn it on. Others can come into the room and watch the TV. When the last +person leaves the room, they turn the TV off because it's no longer being used. +If someone turns the TV off while others are still watching it, there'd be +uproar from the remaining TV watchers! + +`Rc` is used when we want to allocate some data on the heap for multiple +parts of our program to read, and we can't determine at compile time which part +will finish using the data last. If we did know which part would finish last, +we could just make that the owner of the data and the normal ownership rules +enforced at compile time would kick in. 
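As a small sketch of the reference-counting idea described above, the following uses `Rc::new`, `Rc::clone`, and `Rc::strong_count` from the standard library to watch the count change as owners come and go. The function name `shared_counts` and the `String` payload are assumptions chosen only for illustration.

```rust
use std::rc::Rc;

// Returns the reference count observed at three points in time.
fn shared_counts() -> (usize, usize, usize) {
    // One owner so far.
    let data = Rc::new(String::from("shared"));
    let one_owner = Rc::strong_count(&data);

    // A second owner of the same heap data; the `String` itself is not copied.
    let second = Rc::clone(&data);
    let two_owners = Rc::strong_count(&data);

    // Dropping one owner decrements the count; the data stays alive
    // because `data` still owns it.
    drop(second);
    let after_drop = Rc::strong_count(&data);

    (one_owner, two_owners, after_drop)
}

fn main() {
    println!("{:?}", shared_counts());
}
```

The data is only cleaned up once the last owner goes away, which in this sketch happens when `data` goes out of scope at the end of `shared_counts`.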
+ +Note that `Rc` is only for use in single-threaded scenarios; Chapter 16 on +concurrency will cover how to do reference counting in multithreaded programs. ### Using `Rc` to Share Data -Let’s return to our cons list example from Listing 15-5. In Listing 15-11, we’re -going to try to use `List` as we defined it using `Box`. First we’ll create -one list instance that contains 5 and then 10. Next, we want to create two more -lists: one that starts with 3 and continues on to our first list containing 5 -and 10, then another list that starts with 4 and *also* continues on to our -first list containing 5 and 10. In other words, we want two lists that both -share ownership of the third list, which conceptually will be something like -Figure 15-10: +Let's return to our cons list example from Listing 15-6, as we defined it using +`Box`. This time, we want to create two lists that both share ownership of a +third list, which conceptually will look something like Figure 15-11: Two lists that share ownership of a third list -Figure 15-10: Two lists, `b` and `c`, sharing ownership +Figure 15-11: Two lists, `b` and `c`, sharing ownership of a third list, `a` -Trying to implement this using our definition of `List` with `Box` won’t -work, as shown in Listing 15-11: +We'll create list `a` that contains 5 and then 10, then make two more lists: +`b` that starts with 3 and `c` that starts with 4. Both `b` and `c` lists will +then continue on to the first `a` list containing 5 and 10. In other words, +both lists will try to share the first list containing 5 and 10. 
+ +Trying to implement this using our definition of `List` with `Box` won't +work, as shown in Listing 15-12: Filename: src/main.rs @@ -65,8 +70,8 @@ fn main() { } ``` -Listing 15-11: Having two lists using `Box` that try -to share ownership of a third list won’t work +Listing 15-12: Demonstrating we're not allowed to have +two lists using `Box` that try to share ownership of a third list If we compile this, we get this error: @@ -83,17 +88,32 @@ error[E0382]: use of moved value: `a` implement the `Copy` trait ``` -The `Cons` variants own the data they hold, so when we create the `b` list it -moves `a` to be owned by `b`. Then when we try to use `a` again when creating -`c`, we’re not allowed to since `a` has been moved. +The `Cons` variants own the data they hold, so when we create the `b` list, `a` +is moved into `b` and `b` owns `a`. Then, when we try to use `a` again when +creating `c`, we're not allowed to because `a` has been moved. We could change the definition of `Cons` to hold references instead, but then -we’d have to specify lifetime parameters and we’d have to construct elements of -a list such that every element lives at least as long as the list itself. -Otherwise, the borrow checker won’t even let us compile the code. - -Instead, we can change our definition of `List` to use `Rc` instead of -`Box` as shown here in Listing 15-12: +we'd have to specify lifetime parameters. By specifying lifetime parameters, +we'd be specifying that every element in the list will live at least as long as +the list itself. The borrow checker wouldn't let us compile `let a = Cons(10, +&Nil);` for example, since the temporary `Nil` value would be dropped before +`a` could take a reference to it. + +Instead, we'll change our definition of `List` to use `Rc` in place of +`Box` as shown here in Listing 15-13. Each `Cons` variant now holds a value +and an `Rc` pointing to a `List`. 
When we create `b`, instead of taking +ownership of `a`, we clone the `Rc` that `a` is holding, which increases the +number of references from 1 to 2 and lets `a` and `b` share ownership of the +data in that `Rc`. We also clone `a` when creating `c`, which increases the +number of references from 2 to 3. Every time we call `Rc::clone`, the reference +count to the data within the `Rc` is increased, and the data won't be cleaned +up unless there are zero references to it: + + + Filename: src/main.rs @@ -108,29 +128,55 @@ use std::rc::Rc; fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); - let b = Cons(3, a.clone()); - let c = Cons(4, a.clone()); + let b = Cons(3, Rc::clone(&a)); + let c = Cons(4, Rc::clone(&a)); } ``` -Listing 15-12: A definition of `List` that uses +Listing 15-13: A definition of `List` that uses `Rc` -Note that we need to add a `use` statement for `Rc` because it’s not in the -prelude. In `main`, we create the list holding 5 and 10 and store it in a new -`Rc` in `a`. Then when we create `b` and `c`, we call the `clone` method on `a`. +We need to add a `use` statement to bring `Rc` into scope because it's not in +the prelude. In `main`, we create the list holding 5 and 10 and store it in a +new `Rc` in `a`. Then when we create `b` and `c`, we call the `Rc::clone` +function and pass a reference to the `Rc` in `a` as an argument. + +We could have called `a.clone()` rather than `Rc::clone(&a)`, but Rust +convention is to use `Rc::clone` in this case. The implementation of `clone` +doesn't make a deep copy of all the data like most types' implementations of +`clone` do. `Rc::clone` only increments the reference count, which doesn't take +very much time. 
Deep copies of data can take a lot of time, so by using +`Rc::clone` for reference counting, we can visually distinguish between the +deep copy kinds of clones that might have a large impact on runtime performance +and memory usage and the types of clones that increase the reference count that +have a comparatively small impact on runtime performance and don't allocate new +memory. ### Cloning an `Rc` Increases the Reference Count -We’ve seen the `clone` method previously, where we used it for making a -complete copy of some data. With `Rc`, though, it doesn’t make a full copy. -`Rc` holds a *reference count*, that is, a count of how many clones exist. -Let’s change `main` as shown in Listing 15-13 to have an inner scope around -where we create `c`, and to print out the results of the `Rc::strong_count` -associated function at various points. `Rc::strong_count` returns the reference -count of the `Rc` value we pass to it, and we’ll talk about why this function -is named `strong_count` in the section later in this chapter about preventing -reference cycles. +Let's change our working example from Listing 15-13 so that we can see the +reference counts changing as we create and drop references to the `Rc` in `a`. + + + + +In Listing 15-14, we'll change `main` so that it has an inner scope around list +`c`, so that we can see how the reference count changes when `c` goes out of +scope. At each point in the program where the reference count changes, we'll +print out the reference count, which we can get by calling the +`Rc::strong_count` function. We'll talk about why this function is named +`strong_count` rather than `count` in the section later in this chapter about +preventing reference cycles. + + + Filename: src/main.rs @@ -145,43 +191,50 @@ reference cycles. 
# fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); - println!("rc = {}", Rc::strong_count(&a)); + println!("count after creating a = {}", Rc::strong_count(&a)); let b = Cons(3, a.clone()); - println!("rc after creating b = {}", Rc::strong_count(&a)); + println!("count after creating b = {}", Rc::strong_count(&a)); { let c = Cons(4, a.clone()); - println!("rc after creating c = {}", Rc::strong_count(&a)); + println!("count after creating c = {}", Rc::strong_count(&a)); } - println!("rc after c goes out of scope = {}", Rc::strong_count(&a)); + println!("count after c goes out of scope = {}", Rc::strong_count(&a)); } ``` -Listing 15-13: Printing out the reference count +Listing 15-14: Printing out the reference count This will print out: ```text -rc = 1 -rc after creating b = 2 -rc after creating c = 3 -rc after c goes out of scope = 2 +count after creating a = 1 +count after creating b = 2 +count after creating c = 3 +count after c goes out of scope = 2 ``` -We’re able to see that `a` has an initial reference count of one. Then each -time we call `clone`, the count goes up by one. When `c` goes out of scope, the -count is decreased by one, which happens in the implementation of the `Drop` -trait for `Rc`. What we can’t see in this example is that when `b` and then -`a` go out of scope at the end of `main`, the count of references to the list -containing 5 and 10 is then 0, and the list is dropped. This strategy lets us -have multiple owners, as the count will ensure that the value remains valid as -long as any of the owners still exist. - -In the beginning of this section, we said that `Rc` only allows you to share -data for multiple parts of your program to read through immutable references to -the `T` value the `Rc` contains. If `Rc` let us have a mutable reference, -we’d run into the problem that the borrowing rules disallow that we discussed -in Chapter 4: two mutable borrows to the same place can cause data races and -inconsistencies. 
But mutating data is very useful! In the next section, we’ll -discuss the interior mutability pattern and the `RefCell` type that we can -use in conjunction with an `Rc` to work with this restriction on -immutability. + + + +We're able to see that the `Rc` in `a` has an initial reference count of one, +then each time we call `clone`, the count goes up by one. When `c` goes out of +scope, the count goes down by one. We don't have to call a function to decrease +the reference count like we have to call `Rc::clone` to increase the reference +count; the implementation of the `Drop` trait decreases the reference count +automatically when an `Rc` value goes out of scope. + +What we can't see from this example is that when `b` and then `a` go out of +scope at the end of `main`, the count is then 0, and the `Rc` is cleaned up +completely at that point. Using `Rc` allows a single value to have multiple +owners, and the count will ensure that the value remains valid as long as any +of the owners still exist. + +`Rc` allows us to share data between multiple parts of our program for +reading only, via immutable references. If `Rc` allowed us to have multiple +mutable references too, we'd be able to violate one of the borrowing rules +that we discussed in Chapter 4: multiple mutable borrows to the same place can +cause data races and inconsistencies. But being able to mutate data is very +useful! In the next section, we'll discuss the interior mutability pattern and +the `RefCell` type that we can use in conjunction with an `Rc` to work +with this restriction on immutability.
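
The counting behavior described in this section can also be observed with an explicit `drop` rather than an inner scope. The following is a minimal sketch, not one of the chapter's listings; the `Announce` type and the `observe_counts` function are hypothetical names made up for this illustration:

```rust
use std::rc::Rc;

// Hypothetical type for this sketch: it announces when it is cleaned up.
struct Announce(&'static str);

impl Drop for Announce {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

// Returns the strong counts observed after each step.
fn observe_counts() -> Vec<usize> {
    let a = Rc::new(Announce("shared value"));
    let mut counts = vec![Rc::strong_count(&a)]; // only `a` owns the value

    let b = Rc::clone(&a);
    counts.push(Rc::strong_count(&a)); // `a` and `b` both own the value

    drop(b); // give up one owner early instead of waiting for the end of scope
    counts.push(Rc::strong_count(&a)); // the value is still alive through `a`

    counts
    // `a` goes out of scope here: the count reaches 0 and `Announce::drop` runs once
}

fn main() {
    println!("{:?}", observe_counts());
}
```

The `Announce` value is only cleaned up when the last `Rc` pointing to it goes away, which is the part the listing with `strong_count` alone can't show.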
From 494002748f278d315ffbb539b7490e09437fcc87 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Fri, 25 Aug 2017 18:03:50 -0400 Subject: [PATCH 06/18] Edits to 15-05 --- second-edition/dictionary.txt | 2 + .../src/ch15-05-interior-mutability.md | 594 +++++++++++++----- 2 files changed, 429 insertions(+), 167 deletions(-) diff --git a/second-edition/dictionary.txt b/second-edition/dictionary.txt index fe21fa5649..7f1dbeea5d 100644 --- a/second-edition/dictionary.txt +++ b/second-edition/dictionary.txt @@ -194,6 +194,7 @@ librarys libreoffice libstd lifecycle +LimitTracker lobally locators login @@ -211,6 +212,7 @@ Mibbit minigrep mixup mkdir +MockMessenger modifiability modularity monomorphization diff --git a/second-edition/src/ch15-05-interior-mutability.md b/second-edition/src/ch15-05-interior-mutability.md index 4e2b6c2853..e14c6af409 100644 --- a/second-edition/src/ch15-05-interior-mutability.md +++ b/second-edition/src/ch15-05-interior-mutability.md @@ -1,191 +1,447 @@ ## `RefCell` and the Interior Mutability Pattern + + + + + + *Interior mutability* is a design pattern in Rust for allowing you to mutate -data even though there are immutable references to that data, which would -normally be disallowed by the borrowing rules. The interior mutability pattern -involves using `unsafe` code inside a data structure to bend Rust’s usual rules -around mutation and borrowing. We haven’t yet covered unsafe code; we will in -Chapter 19. The interior mutability pattern is used when you can ensure that -the borrowing rules will be followed at runtime, even though the compiler can’t +data even when there are immutable references to that data, normally disallowed +by the borrowing rules. To do so, the pattern uses `unsafe` code inside a data +structure to bend Rust's usual rules around mutation and borrowing. We haven't +yet covered unsafe code; we will in Chapter 19. 
We can choose to use types that +make use of the interior mutability pattern when we can ensure that the +borrowing rules will be followed at runtime, even though the compiler can't ensure that. The `unsafe` code involved is then wrapped in a safe API, and the outer type is still immutable. -Let’s explore this by looking at the `RefCell` type that follows the +Let's explore this by looking at the `RefCell` type that follows the interior mutability pattern. -### `RefCell` has Interior Mutability +### Enforcing Borrowing Rules at Runtime with `RefCell` Unlike `Rc`, the `RefCell` type represents single ownership over the data -that it holds. So, what makes `RefCell` different than a type like `Box`? -Let’s recall the borrowing rules we learned in Chapter 4: +it holds. So, what makes `RefCell` different than a type like `Box`? +Let's recall the borrowing rules we learned in Chapter 4: 1. At any given time, you can have *either* but not both of: * One mutable reference. * Any number of immutable references. 2. References must always be valid. -With references and `Box`, the borrowing rules’ invariants are enforced at +With references and `Box`, the borrowing rules' invariants are enforced at compile time. With `RefCell`, these invariants are enforced *at runtime*. -With references, if you break these rules, you’ll get a compiler error. With -`RefCell`, if you break these rules, you’ll get a `panic!`. - -Static analysis, like the Rust compiler performs, is inherently conservative. -There are properties of code that are impossible to detect by analyzing the -code: the most famous is the Halting Problem, which is out of scope of this -book but an interesting topic to research if you’re interested. - -Because some analysis is impossible, the Rust compiler does not try to even -guess if it can’t be sure, so it’s conservative and sometimes rejects correct -programs that would not actually violate Rust’s guarantees. 
Put another way, if -Rust accepts an incorrect program, people would not be able to trust in the -guarantees Rust makes. If Rust rejects a correct program, the programmer will -be inconvenienced, but nothing catastrophic can occur. `RefCell` is useful -when you know that the borrowing rules are respected, but the compiler can’t -understand that that’s true. - -Similarly to `Rc`, `RefCell` is only for use in single-threaded -scenarios. We’ll talk about how to get the functionality of `RefCell` in a -multithreaded program in the next chapter on concurrency. For now, all you -need to know is that if you try to use `RefCell` in a multithreaded -context, you’ll get a compile time error. - -With references, we use the `&` and `&mut` syntax to create references and -mutable references, respectively. But with `RefCell`, we use the `borrow` -and `borrow_mut` methods, which are part of the safe API that `RefCell` has. -`borrow` returns the smart pointer type `Ref`, and `borrow_mut` returns the -smart pointer type `RefMut`. These two types implement `Deref` so that we can -treat them as if they’re regular references. `Ref` and `RefMut` track the -borrows dynamically, and their implementation of `Drop` releases the borrow -dynamically. - -Listing 15-14 shows what it looks like to use `RefCell` with functions that -borrow their parameters immutably and mutably. Note that the `data` variable is -declared as immutable with `let data` rather than `let mut data`, yet -`a_fn_that_mutably_borrows` is allowed to borrow the data mutably and make -changes to the data! +With references, if you break these rules, you'll get a compiler error. With +`RefCell`, if you break these rules, you'll get a `panic!`. + + + + +The advantages to checking the borrowing rules at compile time are that errors +will be caught sooner in the development process and there is no impact on +runtime performance since all the analysis is completed beforehand. 
For those +reasons, checking the borrowing rules at compile time is the best choice for +the majority of cases, which is why this is Rust's default. + +The advantage to checking the borrowing rules at runtime instead is that +certain memory safe scenarios are then allowed, whereas they are disallowed by +the compile time checks. Static analysis, like the Rust compiler, is inherently +conservative. Some properties of code are impossible to detect by analyzing the +code: the most famous example is the Halting Problem, which is out of scope of +this book but an interesting topic to research if you're interested. + + + + +Because some analysis is impossible, if the Rust compiler can't be sure the +code complies with the ownership rules, it may reject a correct program; in +this way, it is conservative. If Rust were to accept an incorrect program, +users would not be able to trust in the guarantees Rust makes, but if Rust +rejects a correct program, the programmer will be inconvenienced, but nothing +catastrophic can occur. `RefCell` is useful when you yourself are sure that +your code follows the borrowing rules, but the compiler is not able to +understand and guarantee that. + +Similarly to `Rc`, `RefCell` is only for use in single-threaded scenarios +and will give you a compile time error if you try to use it in a multithreaded context. +We'll talk about how to get the functionality of `RefCell` in a +multithreaded program in Chapter 16. + + + + +To recap the reasons to choose `Box`, `Rc`, or `RefCell`: + +- `Rc` enables multiple owners of the same data; `Box` and `RefCell` + have single owners. +- `Box` allows immutable or mutable borrows checked at compile time; `Rc` + only allows immutable borrows checked at compile time; `RefCell` allows + immutable or mutable borrows checked at runtime. +- Because `RefCell` allows mutable borrows checked at runtime, we can mutate + the value inside the `RefCell` even when the `RefCell` is itself + immutable.
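
The three points in the recap above can be put side by side in a few lines. This is only a sketch to make the comparison concrete; the function name `recap` and the values used are assumptions made for this example and don't appear in the chapter's listings:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Returns the final values so the effect of each step can be checked.
fn recap() -> (i32, usize, i32) {
    // Box<T>: one owner; a mutable borrow needs a `mut` binding and is
    // checked at compile time.
    let mut boxed = Box::new(1);
    *boxed += 1;

    // Rc<T>: several owners, but only immutable access to the value.
    let shared = Rc::new(10);
    let _other_owner = Rc::clone(&shared);
    let owners = Rc::strong_count(&shared);

    // RefCell<T>: one owner; the binding is immutable, yet the value inside
    // can change because the mutable borrow is checked at runtime.
    let cell = RefCell::new(100);
    *cell.borrow_mut() += 1;
    let cell_value = *cell.borrow();

    (*boxed, owners, cell_value)
}

fn main() {
    println!("{:?}", recap());
}
```

Note that `cell` is declared without `mut`, while `boxed` has to be declared with `mut` before we can change the value it holds.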
+ +The last reason is the *interior mutability* pattern. Let's look at a case when +interior mutability is useful and discuss how this is possible. + +### Interior Mutability: A Mutable Borrow to an Immutable Value + +A consequence of the borrowing rules is that when we have an immutable value, +we can't borrow it mutably. For example, this code won't compile: -Filename: src/main.rs +```rust,ignore +fn main() { + let x = 5; + let y = &mut x; +} +``` -```rust -use std::cell::RefCell; +If we try to compile this, we'll get this error: -fn a_fn_that_immutably_borrows(a: &i32) { - println!("a is {}", a); -} +```text +error[E0596]: cannot borrow immutable local variable `x` as mutable + --> src/main.rs:3:18 + | +2 | let x = 5; + | - consider changing this to `mut x` +3 | let y = &mut x; + | ^ cannot borrow mutably +``` + +However, there are situations where it would be useful for a value to be able +to mutate itself in its methods, but to other code, the value would appear to +be immutable. Code outside the value's methods would not be able to mutate the +value. `RefCell` is one way to get the ability to have interior mutability. +`RefCell` isn't getting around the borrowing rules completely, but the +borrow checker in the compiler allows this interior mutability and the +borrowing rules are checked at runtime instead. If we violate the rules, we'll +get a `panic!` instead of a compiler error. + +Let's work through a practical example where we can use `RefCell` to make it +possible to mutate an immutable value and see why that's useful. + +#### A Use Case for Interior Mutability: Mock Objects + +A *mock object* is the general programming concept for a type that stands in +the place of another type during testing. Mock objects simulate real objects, +and they can record what happens during a test so that we can assert that the +correct actions took place. 
+ +While Rust doesn't have objects in the exact same sense that other languages +have objects, and Rust doesn't have mock object functionality built into the +standard library like some other languages do, we can definitely create a +struct that will serve the same purposes as a mock object. + +Here's the scenario we'd like to test: we're creating a library that tracks a +value against a maximum value, and sends messages based on how close to the +maximum value the current value is. This could be used for keeping track of a +user's quota for the number of API calls they're allowed to make, for example. + +Our library is only going to provide the functionality of tracking how close to +the maximum a value is, and what the messages should be at what times. +Applications that use our library will be expected to provide the actual +mechanism for sending the messages: the application could choose to put a +message in the application, send an email, send a text message, or something +else. Our library doesn't need to know about that detail; all it needs is +something that implements a trait we'll provide called `Messenger`. 
Listing +15-15 shows our library code: + +Filename: src/lib.rs -fn a_fn_that_mutably_borrows(b: &mut i32) { - *b += 1; +```rust +pub trait Messenger { + fn send(&self, msg: &str); } -fn demo(r: &RefCell<i32>) { - a_fn_that_immutably_borrows(&r.borrow()); - a_fn_that_mutably_borrows(&mut r.borrow_mut()); - a_fn_that_immutably_borrows(&r.borrow()); +pub struct LimitTracker<'a, T: 'a + Messenger> { + messenger: &'a T, + value: usize, + max: usize, } -fn main() { - let data = RefCell::new(5); - demo(&data); +impl<'a, T> LimitTracker<'a, T> + where T: Messenger { + pub fn new(messenger: &T, max: usize) -> LimitTracker<T> { + LimitTracker { + messenger, + value: 0, + max, + } + } + + pub fn set_value(&mut self, value: usize) { + self.value = value; + + let percentage_of_max = self.value as f64 / self.max as f64; + + if percentage_of_max >= 0.75 && percentage_of_max < 0.9 { + self.messenger.send("Warning: You've used up over 75% of your quota!"); + } else if percentage_of_max >= 0.9 && percentage_of_max < 1.0 { + self.messenger.send("Urgent warning: You've used up over 90% of your quota!"); + } else if percentage_of_max >= 1.0 { + self.messenger.send("Error: You are over your quota!"); + } + } } ``` -Listing 15-14: Using `RefCell`, `borrow`, and -`borrow_mut` +Listing 15-15: A library to keep track of how close to a +maximum value a value is, and warn when the value is at certain levels + +One important part of this code is that the `Messenger` trait has one method, +`send`, that takes an immutable reference to `self` and the text of the message. +This is the interface our mock object will need to have. The other important +part is that we want to test the behavior of the `set_value` method on the +`LimitTracker`. We can change what we pass in for the `value` parameter, but +`set_value` doesn't return anything for us to make assertions on.
What we want +to be able to say is that if we create a `LimitTracker` with something that +implements the `Messenger` trait and a particular value for `max`, when we pass +different numbers for `value`, the messenger gets told to send the appropriate +messages. + +What we need is a mock object that, instead of actually sending an email or +text message when we call `send`, will only keep track of the messages it's +told to send. We can create a new instance of the mock object, create a +`LimitTracker` that uses the mock object, call the `set_value` method on +`LimitTracker`, then check that the mock object has the messages we expect. +Listing 15-16 shows an attempt to implement a mock object to do just that, +but the borrow checker won't allow it: + +Filename: src/lib.rs + +```rust +#[cfg(test)] +mod tests { + use super::*; + + struct MockMessenger { + sent_messages: Vec<String>, + } + + impl MockMessenger { + fn new() -> MockMessenger { + MockMessenger { sent_messages: vec![] } + } + } + + impl Messenger for MockMessenger { + fn send(&self, message: &str) { + self.sent_messages.push(String::from(message)); + } + } + + #[test] + fn it_sends_an_over_75_percent_warning_message() { + let mock_messenger = MockMessenger::new(); + let mut limit_tracker = LimitTracker::new(&mock_messenger, 100); + + limit_tracker.set_value(80); + + assert_eq!(mock_messenger.sent_messages.len(), 1); + } +} +``` + +Listing 15-16: An attempt to implement a `MockMessenger` +that isn't allowed by the borrow checker + +This test code defines a `MockMessenger` struct that has a `sent_messages` +field with a `Vec` of `String` values to keep track of the messages it's told +to send. We also define an associated function `new` to make it convenient to +create new `MockMessenger` values that start with an empty list of messages. We +then implement the `Messenger` trait for `MockMessenger` so that we can give a +`MockMessenger` to a `LimitTracker`.
In the definition of the `send` method, we +take the message passed in as a parameter and store it in the `MockMessenger` +list of `sent_messages`. + +In the test, we're testing what happens when the `LimitTracker` is told to set +`value` to something that's over 75% of the `max` value. First, we create a new +`MockMessenger`, which will start with an empty list of messages. Then we +create a new `LimitTracker` and give it a reference to the new `MockMessenger` +and a `max` value of 100. We call the `set_value` method on the `LimitTracker` +with a value of 80, which is more than 75% of 100. Then we assert that the list +of messages that the `MockMessenger` is keeping track of should now have one +message in it. + +There's one problem with this test, however: ```text -a is 5 -a is 6 +error[E0596]: cannot borrow immutable field `self.sent_messages` as mutable + --> src/lib.rs:46:13 + | +45 | fn send(&self, message: &str) { + | ----- use `&mut self` here to make mutable +46 | self.sent_messages.push(String::from(message)); + | ^^^^^^^^^^^^^^^^^^ cannot mutably borrow immutable field ``` -In `main`, we’ve created a new `RefCell` containing the value 5, and stored -in the variable `data`, declared without the `mut` keyword. We then call the -`demo` function with an immutable reference to `data`: as far as `main` is -concerned, `data` is immutable! +We can't modify the `MockMessenger` to keep track of the messages because the +`send` method takes an immutable reference to `self`. We also can't take the +suggestion from the error text to use `&mut self` instead because then the +signature of `send` wouldn't match the signature in the `Messenger` trait +definition (feel free to try and see what error message you get). -In the `demo` function, we get an immutable reference to the value inside the -`RefCell` by calling the `borrow` method, and we call -`a_fn_that_immutably_borrows` with that immutable reference. 
More -interestingly, we can get a *mutable* reference to the value inside the -`RefCell` with the `borrow_mut` method, and the function -`a_fn_that_mutably_borrows` is allowed to change the value. We can see that the -next time we call `a_fn_that_immutably_borrows` that prints out the value, it’s -6 instead of 5. +This is where interior mutability can help! We're going to store the +`sent_messages` within a `RefCell`, and then the `send` method will be able to +modify `sent_messages` to store the messages we've seen. Listing 15-17 shows +what that looks like: -### Borrowing Rules are Checked at Runtime on `RefCell` +Filename: src/lib.rs -Recall from Chapter 4 that because of the borrowing rules, this code using -regular references that tries to create two mutable borrows in the same scope -won’t compile: +```rust +#[cfg(test)] +mod tests { + use super::*; + use std::cell::RefCell; + + struct MockMessenger { + sent_messages: RefCell<Vec<String>>, + } + + impl MockMessenger { + fn new() -> MockMessenger { + MockMessenger { sent_messages: RefCell::new(vec![]) } + } + } + + impl Messenger for MockMessenger { + fn send(&self, message: &str) { + self.sent_messages.borrow_mut().push(String::from(message)); + } + } + + #[test] + fn it_sends_an_over_75_percent_warning_message() { + // ...snip... +# let mock_messenger = MockMessenger::new(); +# let mut limit_tracker = LimitTracker::new(&mock_messenger, 100); +# limit_tracker.set_value(75); + + assert_eq!(mock_messenger.sent_messages.borrow().len(), 1); + } +} +``` -```rust,ignore -let mut s = String::from("hello"); +Listing 15-17: Using `RefCell` to be able to mutate an +inner value while the outer value is considered immutable -let r1 = &mut s; -let r2 = &mut s; -``` +The `sent_messages` field is now of type `RefCell<Vec<String>>` instead of +`Vec<String>`. In the `new` function, we create a new `RefCell` instance around +the empty vector.
-We’ll get this compiler error: +For the implementation of the `send` method, the first parameter is still an +immutable borrow of `self`, which matches the trait definition. We call +`borrow_mut` on the `RefCell` in `self.sent_messages` to get a mutable +reference to the value inside the `RefCell`, which is the vector. Then we can +call `push` on the mutable reference to the vector in order to keep track of +the messages seen during the test. -```text -error[E0499]: cannot borrow `s` as mutable more than once at a time - --> - | -5 | let r1 = &mut s; - | - first mutable borrow occurs here -6 | let r2 = &mut s; - | ^ second mutable borrow occurs here -7 | } - | - first borrow ends here -``` +The last change we have to make is in the assertion: in order to see how many +items are in the inner vector, we call `borrow` on the `RefCell` to get an +immutable reference to the vector. -In contrast, using `RefCell` and calling `borrow_mut` twice in the same -scope *will* compile, but it’ll panic at runtime instead. This code: +Now that we've seen how to use `RefCell`, let's dig into how it works! -```rust,should_panic -use std::cell::RefCell; +#### `RefCell` Keeps Track of Borrows at Runtime -fn main() { - let s = RefCell::new(String::from("hello")); +When creating immutable and mutable references we use the `&` and `&mut` +syntax, respectively. With `RefCell`, we use the `borrow` and `borrow_mut` +methods, which are part of the safe API that belongs to `RefCell`. The +`borrow` method returns the smart pointer type `Ref`, and `borrow_mut` returns +the smart pointer type `RefMut`. Both types implement `Deref` so we can treat +them like regular references. + + + + +The `RefCell` keeps track of how many `Ref` and `RefMut` smart pointers are +currently active. Every time we call `borrow`, the `RefCell` increases its +count of how many immutable borrows are active. When a `Ref` value goes out of +scope, the count of immutable borrows goes down by one. 
Just like the compile +time borrowing rules, `RefCell` lets us have many immutable borrows or one +mutable borrow at any point in time. + +If we try to violate these rules, rather than getting a compiler error like we +would with references, the implementation of `RefCell` will `panic!` at +runtime. Listing 15-18 shows a modification to the implementation of `send` +from Listing 15-17 where we're deliberately trying to create two mutable +borrows active for the same scope in order to illustrate that `RefCell` +prevents us from doing this at runtime: + +Filename: src/lib.rs - let r1 = s.borrow_mut(); - let r2 = s.borrow_mut(); +```rust,ignore +impl Messenger for MockMessenger { + fn send(&self, message: &str) { + let mut one_borrow = self.sent_messages.borrow_mut(); + let mut two_borrow = self.sent_messages.borrow_mut(); + + one_borrow.push(String::from(message)); + two_borrow.push(String::from(message)); + } } ``` -compiles but panics with the following error when we `cargo run`: +Listing 15-18: Creating two mutable references in the +same scope to see that `RefCell` will panic + +We create a variable `one_borrow` for the `RefMut` smart pointer returned from +`borrow_mut`. Then we create another mutable borrow in the same way in the +variable `two_borrow`. This makes two mutable references in the same scope, +which isn't allowed. If we run the tests for our library, this code will +compile without any errors, but the test will fail: ```text - Finished dev [unoptimized + debuginfo] target(s) in 0.83 secs - Running `target/debug/refcell` -thread 'main' panicked at 'already borrowed: BorrowMutError', -/stable-dist-rustc/build/src/libcore/result.rs:868 +---- tests::it_sends_an_over_75_percent_warning_message stdout ---- + thread 'tests::it_sends_an_over_75_percent_warning_message' panicked at + 'already borrowed: BorrowMutError', src/libcore/result.rs:906:4 note: Run with `RUST_BACKTRACE=1` for a backtrace. 
``` -This runtime `BorrowMutError` is similar to the compiler error: it says we’ve -already borrowed `s` mutably once, so we’re not allowed to borrow it again. We -aren’t getting around the borrowing rules, we’re just choosing to have Rust -enforce them at runtime instead of compile time. You could choose to use -`RefCell` everywhere all the time, but in addition to having to type -`RefCell` a lot, you’d find out about possible problems later (possibly in -production rather than during development). Also, checking the borrowing rules -while your program is running has a performance penalty. - -### Multiple Owners of Mutable Data by Combining `Rc` and `RefCell` - -So why would we choose to make the tradeoffs that using `RefCell` involves? -Well, remember when we said that `Rc` only lets you have an immutable -reference to `T`? Given that `RefCell` is immutable, but has interior -mutability, we can combine `Rc` and `RefCell` to get a type that’s both -reference counted and mutable. Listing 15-15 shows an example of how to do -that, again going back to our cons list from Listing 15-5. In this example, -instead of storing `i32` values in the cons list, we’ll be storing -`Rc>` values. We want to store that type so that we can have an -owner of the value that’s not part of the list (the multiple owners -functionality that `Rc` provides), and so we can mutate the inner `i32` -value (the interior mutability functionality that `RefCell` provides): +We can see that the code panicked with the message `already borrowed: +BorrowMutError`. This is how `RefCell` handles violations of the borrowing +rules at runtime. + +Catching borrowing errors at runtime rather than compile time means that we'd +find out that we made a mistake in our code later in the development process-- +and possibly not even until our code was deployed to production. 
There's also a +small runtime performance penalty our code will incur as a result of keeping +track of the borrows at runtime rather than compile time. However, using +`RefCell` made it possible for us to write a mock object that can modify itself +to keep track of the messages it has seen while we're using it in a context +where only immutable values are allowed. We can choose to use `RefCell` +despite its tradeoffs to get more abilities than regular references give us. + +### Having Multiple Owners of Mutable Data by Combining `Rc` and `RefCell` + +A common way to use `RefCell` is in combination with `Rc`. Recall that +`Rc` lets us have multiple owners of some data, but it only gives us +immutable access to that data. If we have an `Rc` that holds a `RefCell`, +then we can get a value that can have multiple owners *and* that we can mutate! + + + + +For example, recall the cons list example from Listing 15-13 where we used +`Rc` to let us have multiple lists share ownership of another list. Because +`Rc` only holds immutable values, we aren't able to change any of the values +in the list once we've created them. Let's add in `RefCell` to get the +ability to change the values in the lists. 
Listing 15-19 shows that by using a +`RefCell` in the `Cons` definition, we're allowed to modify the value stored +in all the lists: Filename: src/main.rs @@ -203,54 +459,58 @@ use std::cell::RefCell; fn main() { let value = Rc::new(RefCell::new(5)); - let a = Cons(value.clone(), Rc::new(Nil)); - let shared_list = Rc::new(a); + let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); - let b = Cons(Rc::new(RefCell::new(6)), shared_list.clone()); - let c = Cons(Rc::new(RefCell::new(10)), shared_list.clone()); + let b = Cons(Rc::new(RefCell::new(6)), Rc::clone(&a)); + let c = Cons(Rc::new(RefCell::new(10)), Rc::clone(&a)); *value.borrow_mut() += 10; - println!("shared_list after = {:?}", shared_list); + println!("a after = {:?}", a); println!("b after = {:?}", b); println!("c after = {:?}", c); } ``` -Listing 15-15: Using `Rc>` to create a +Listing 15-19: Using `Rc>` to create a `List` that we can mutate -We’re creating a value, which is an instance of `Rc>`. We’re -storing it in a variable named `value` because we want to be able to access it -directly later. Then we create a `List` in `a` that has a `Cons` variant that -holds `value`, and `value` needs to be cloned since we want `value` to also -have ownership in addition to `a`. Then we wrap `a` in an `Rc` so that we -can create lists `b` and `c` that start differently but both refer to `a`, -similarly to what we did in Listing 15-12. +We create a value that's an instance of `Rc` and store it in a +variable named `value` so we can access it directly later. Then we create a +`List` in `a` with a `Cons` variant that holds `value`. We need to clone +`value` so that both `a` and `value` have ownership of the inner `5` value, +rather than transferring ownership from `value` to `a` or having `a` borrow +from `value`. + + + + +We wrap the list `a` in an `Rc` so that when we create lists `b` and +`c`, they can both refer to `a`, the same as we did in Listing 15-13. 
-Once we have the lists in `shared_list`, `b`, and `c` created, then we add 10 -to the 5 in `value` by dereferencing the `Rc` and calling `borrow_mut` on -the `RefCell`. +Once we have the lists in `a`, `b`, and `c` created, we add 10 to the value in +`value` by dereferencing the `Rc` and calling `borrow_mut` on the `RefCell`. -When we print out `shared_list`, `b`, and `c`, we can see that they all have -the modified value of 15: +When we print out `a`, `b`, and `c`, we can see that they all have the modified +value of 15 rather than 5: ```text -shared_list after = Cons(RefCell { value: 15 }, Nil) +a after = Cons(RefCell { value: 15 }, Nil) b after = Cons(RefCell { value: 6 }, Cons(RefCell { value: 15 }, Nil)) c after = Cons(RefCell { value: 10 }, Cons(RefCell { value: 15 }, Nil)) ``` -This is pretty neat! By using `RefCell`, we can have an outwardly immutable +This is pretty neat! By using `RefCell`, we have an outwardly immutable `List`, but we can use the methods on `RefCell` that provide access to its -interior mutability to be able to modify our data when we need to. The runtime -checks of the borrowing rules that `RefCell` does protect us from data -races, and we’ve decided that we want to trade a bit of speed for the -flexibility in our data structures. - -`RefCell` is not the only standard library type that provides interior -mutability. `Cell` is similar but instead of giving references to the inner -value like `RefCell` does, the value is copied in and out of the `Cell`. -`Mutex` offers interior mutability that is safe to use across threads, and -we’ll be discussing its use in the next chapter on concurrency. Check out the -standard library docs for more details on the differences between these types. +interior mutability so we can modify our data when we need to. The runtime +checks of the borrowing rules protect us from data races, and it's sometimes +worth trading a bit of speed for this flexibility in our data structures. 
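To see both halves of that tradeoff in one place, here's a minimal sketch that is separate from Listing 15-19. The `shared_update` function and its variable names are our own, chosen only for illustration. It shares one `Rc<RefCell<i32>>` between two owners, mutates the value through one of them, and then uses `try_borrow_mut` to show the runtime check refusing a mutable borrow while a shared borrow is still alive:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper: returns the value seen through the second owner and
// whether a conflicting mutable borrow was refused at runtime.
fn shared_update() -> (i32, bool) {
    let value = Rc::new(RefCell::new(5));
    // A second owner of the same `RefCell`, not a copy of the number.
    let other_owner = Rc::clone(&value);

    // Mutate through the first owner; the temporary `RefMut` is released
    // at the end of this statement.
    *value.borrow_mut() += 10;

    // Hold a shared borrow through the second owner...
    let reader = other_owner.borrow();
    // ...and ask for a mutable borrow while it is still alive.
    // `try_borrow_mut` reports the conflict as an `Err` instead of panicking.
    let second_writer_refused = value.try_borrow_mut().is_err();

    (*reader, second_writer_refused)
}

fn main() {
    let (seen, refused) = shared_update();
    println!("value seen through the other owner = {}", seen);
    println!("conflicting mutable borrow refused = {}", refused);
}
```

We use `try_borrow_mut` here rather than `borrow_mut` so that the conflict shows up as an `Err` value we can inspect, instead of the panic we saw earlier.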
+ +The standard library has other types that provide interior mutability, too, +like `Cell`, which is similar except that instead of giving references to +the inner value, the value is copied in and out of the `Cell`. There's also +`Mutex`, which offers interior mutability that's safe to use across threads, +and we'll be discussing its use in the next chapter on concurrency. Check out +the standard library docs for more details on the differences between these +types. From 12d1fce2e48e456b8b3af886fe9b3236d68810d2 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Thu, 7 Sep 2017 20:50:59 -0400 Subject: [PATCH 07/18] Edits to 15-06 --- .../src/ch15-06-reference-cycles.md | 541 ++++++++++++------ 1 file changed, 355 insertions(+), 186 deletions(-) diff --git a/second-edition/src/ch15-06-reference-cycles.md b/second-edition/src/ch15-06-reference-cycles.md index df52d48c94..37881b8688 100644 --- a/second-edition/src/ch15-06-reference-cycles.md +++ b/second-edition/src/ch15-06-reference-cycles.md @@ -1,29 +1,27 @@ ## Creating Reference Cycles and Leaking Memory is Safe -Rust makes a number of guarantees that we’ve talked about, for example that -we’ll never have a null value, and data races will be disallowed at compile -time. Rust’s memory safety guarantees make it more difficult to create memory -that never gets cleaned up, which is known as a *memory leak*. Rust does not -make memory leaks *impossible*, however: preventing memory leaks is *not* one -of Rust’s guarantees. In other words, memory leaks are memory safe. - -By using `Rc` and `RefCell`, it is possible to create cycles of -references where items refer to each other in a cycle. This is bad because the -reference count of each item in the cycle will never reach 0, and the values -will never be dropped. Let’s take a look at how that might happen and how to -prevent it. - -In Listing 15-16, we’re going to use another variation of the `List` definition -from Listing 15-5. 
We’re going back to storing an `i32` value as the first -element in the `Cons` variant. The second element in the `Cons` variant is now -`RefCell>`: instead of being able to modify the `i32` value this time, -we want to be able to modify which `List` a `Cons` variant is pointing to. -We’ve also added a `tail` method to make it convenient for us to access the -second item, if we have a `Cons` variant: +Rust's memory safety guarantees make it *difficult* to accidentally create +memory that's never cleaned up, known as a *memory leak*, but not impossible. +Entirely preventing memory leaks is not one of Rust's guarantees in the same +way that disallowing data races at compile time is, meaning memory leaks are +memory safe in Rust. We can see this with `Rc` and `RefCell`: it's +possible to create references where items refer to each other in a cycle. This +creates memory leaks because the reference count of each item in the cycle will +never reach 0, and the values will never be dropped. + +### Creating a Reference Cycle + +Let's take a look at how a reference cycle might happen and how to prevent it, +starting with the definition of the `List` enum and a `tail` method in Listing +15-20: Filename: src/main.rs ```rust,ignore +use std::rc::Rc; +use std::cell::RefCell; +use List::{Cons, Nil}; + #[derive(Debug)] enum List { Cons(i32, RefCell>), @@ -40,18 +38,42 @@ impl List { } ``` -Listing 15-16: A cons list definition that holds a +Listing 15-20: A cons list definition that holds a `RefCell` so that we can modify what a `Cons` variant is referring to -Next, in Listing 15-17, we’re going to create a `List` value in the variable -`a` that initially is a list of `5, Nil`. Then we’ll create a `List` value in -the variable `b` that is a list of the value 10 and then points to the list in -`a`. Finally, we’ll modify `a` so that it points to `b` instead of `Nil`, which -will then create a cycle: +We're using another variation of the `List` definition from Listing 15-6. 
The +second element in the `Cons` variant is now `RefCell>`, meaning that +instead of having the ability to modify the `i32` value like we did in Listing +15-19, we want to be able to modify which `List` a `Cons` variant is pointing +to. We've also added a `tail` method to make it convenient for us to access the +second item, if we have a `Cons` variant. + + + + +In listing 15-21, we're adding a `main` function that uses the definitions from +Listing 15-20. This code creates a list in `a`, a list in `b` that points to +the list in `a`, and then modifies the list in `a` to point to `b`, which +creates a reference cycle. There are `println!` statements along the way to +show what the reference counts are at various points in this process. + + + Filename: src/main.rs ```rust +# use List::{Cons, Nil}; +# use std::rc::Rc; +# use std::cell::RefCell; # #[derive(Debug)] # enum List { # Cons(i32, RefCell>), @@ -67,25 +89,20 @@ will then create a cycle: # } # } # -use List::{Cons, Nil}; -use std::rc::Rc; -use std::cell::RefCell; - fn main() { - let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil)))); println!("a initial rc count = {}", Rc::strong_count(&a)); println!("a next item = {:?}", a.tail()); - let b = Rc::new(Cons(10, RefCell::new(a.clone()))); + let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a)))); println!("a rc count after b creation = {}", Rc::strong_count(&a)); println!("b initial rc count = {}", Rc::strong_count(&b)); println!("b next item = {:?}", b.tail()); if let Some(ref link) = a.tail() { - *link.borrow_mut() = b.clone(); + *link.borrow_mut() = Rc::clone(&b); } println!("b rc count after changing a = {}", Rc::strong_count(&b)); @@ -97,73 +114,154 @@ fn main() { } ``` -Listing 15-17: Creating a reference cycle of two `List` +Listing 15-21: Creating a reference cycle of two `List` values pointing to each other -We use the `tail` method to get a reference to the `RefCell` in `a`, which we -put in the variable `link`. 
Then we use the `borrow_mut` method on the -`RefCell` to change the value inside from an `Rc` that holds a `Nil` value to -the `Rc` in `b`. We’ve created a reference cycle that looks like Figure 15-18: +We create an `Rc` instance holding a `List` value in the variable `a` with an +initial list of `5, Nil`. We then create an `Rc` instance holding another +`List` value in the variable `b` that contains the value 10, then points to the +list in `a`. + +Finally, we modify `a` so that it points to `b` instead of `Nil`, which creates +a cycle. We do that by using the `tail` method to get a reference to the +`RefCell` in `a`, which we put in the variable `link`. Then we use the +`borrow_mut` method on the `RefCell` to change the value inside from an `Rc` +that holds a `Nil` value to the `Rc` in `b`. -Reference cycle of lists +If we run this code, keeping the last `println!` commented out for the moment, +we'll get this output: -Figure 15-18: A reference cycle of lists `a` and `b` +```text +a initial rc count = 1 +a next item = Some(RefCell { value: Nil }) +a rc count after b creation = 2 +b initial rc count = 1 +b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) }) +b rc count after changing a = 2 +a rc count after changing a = 2 +``` + +We can see that the reference count of the `Rc` instances in both `a` and `b` +are 2 after we change the list in `a` to point to `b`. At the end of `main`, +Rust will try and drop `b` first, which will decrease the count in each of the +`Rc` instances in `a` and `b` by one. + + + + + + + +However, because `a` is still referencing the `Rc` that was in `b`, that `Rc` +has a count of 1 rather than 0, so the memory the `Rc` has on the heap won't be +dropped. The memory will just sit there with a count of one, forever. 
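One way to watch the leak happen is to count how many values actually get dropped. The following is a sketch under our own assumptions rather than part of the listings: the `Item` struct, its `drops` counter, and the `dropped_items` function are hypothetical names, and we use an `Option<Rc<Item>>` link in place of the `List` enum to keep the example short.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Hypothetical stand-in for a list element. Each item bumps a shared
// counter when it is dropped.
struct Item {
    drops: Rc<Cell<usize>>,
    next: RefCell<Option<Rc<Item>>>,
}

impl Drop for Item {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Builds `b -> a`, optionally closes the loop with `a -> b`, lets both
// variables go out of scope, and reports how many items were dropped.
fn dropped_items(make_cycle: bool) -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(Item {
            drops: Rc::clone(&drops),
            next: RefCell::new(None),
        });
        let b = Rc::new(Item {
            drops: Rc::clone(&drops),
            next: RefCell::new(Some(Rc::clone(&a))),
        });
        if make_cycle {
            // Point `a` back at `b`, so each item keeps the other alive.
            *a.next.borrow_mut() = Some(Rc::clone(&b));
        }
    }
    drops.get()
}

fn main() {
    println!("without a cycle: {} items dropped", dropped_items(false));
    println!("with a cycle: {} items dropped", dropped_items(true));
}
```

Without the cycle, both items are dropped when the inner scope ends. With the cycle, each strong count only falls from 2 to 1, so neither `drop` implementation ever runs.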
+ +To visualize this, we've created a reference cycle that looks like Figure 15-22: + +Reference cycle of lists + +Figure 15-22: A reference cycle of lists `a` and `b` pointing to each other -If you uncomment the last `println!`, Rust will try and print this cycle out -with `a` pointing to `b` pointing to `a` and so forth until it overflows the -stack. - -Looking at the results of the `println!` calls before the last one, we’ll see -that the reference count of both `a` and `b` are 2 after we change `a` to point -to `b`. At the end of `main`, Rust will try and drop `b` first, which will -decrease the count of the `Rc` by one. However, because `a` is still -referencing that `Rc`, its count is 1 rather than 0, so the memory the `Rc` has -on the heap won’t be dropped. It’ll just sit there with a count of one, -forever. In this specific case, the program ends right away, so it’s not a -problem, but in a more complex program that allocates lots of memory in a cycle -and holds onto it for a long time, this would be a problem. The program would -be using more memory than it needs to be, and might overwhelm the system and -cause it to run out of memory available to use. - -Now, as you can see, creating reference cycles is difficult and inconvenient in -Rust. But it’s not impossible: preventing memory leaks in the form of reference -cycles is not one of the guarantees Rust makes. If you have `RefCell` values -that contain `Rc` values or similar nested combinations of types with -interior mutability and reference counting, be aware that you’ll have to ensure -that you don’t create cycles. In the example in Listing 15-14, the solution -would probably be to not write code that could create cycles like this, since -we do want `Cons` variants to own the list they point to. 
- -With data structures like graphs, it’s sometimes necessary to have references -that create cycles in order to have parent nodes point to their children and -children nodes point back in the opposite direction to their parents, for -example. If one of the directions is expressing ownership and the other isn’t, -one way of being able to model the relationship of the data without creating -reference cycles and memory leaks is using `Weak`. Let’s explore that next! - -### Prevent Reference Cycles: Turn an `Rc` into a `Weak` - -The Rust standard library provides `Weak`, a smart pointer type for use in -situations that have cycles of references but only one direction expresses -ownership. We’ve been showing how cloning an `Rc` increases the -`strong_count` of references; `Weak` is a way to reference an `Rc` that -does not increment the `strong_count`: instead it increments the `weak_count` -of references to an `Rc`. When an `Rc` goes out of scope, the inner value will -get dropped if the `strong_count` is 0, even if the `weak_count` is not 0. To -be able to get the value from a `Weak`, we first have to upgrade it to an -`Option>` by using the `upgrade` method. The result of upgrading a -`Weak` will be `Some` if the `Rc` value has not been dropped yet, and `None` -if the `Rc` value has been dropped. Because `upgrade` returns an `Option`, we -know Rust will make sure we handle both the `Some` case and the `None` case and -we won’t be trying to use an invalid pointer. - -Instead of the list in Listing 15-17 where each item knows only about the -next item, let’s say we want a tree where the items know about their children -items *and* their parent items. - -Let’s start just with a struct named `Node` that holds its own `i32` value as -well as references to its children `Node` values: +If you uncomment the last `println!` and run the program, Rust will try and +print this cycle out with `a` pointing to `b` pointing to `a` and so forth +until it overflows the stack. 
+ + + + +In this specific case, right after we create the reference cycle, the program +ends. The consequences of this cycle aren't so dire. If a more complex program +allocates lots of memory in a cycle and holds onto it for a long time, the +program would be using more memory than it needs, and might overwhelm the +system and cause it to run out of available memory. + +Creating reference cycles is not easily done, but it's not impossible either. +If you have `RefCell` values that contain `Rc` values or similar nested +combinations of types with interior mutability and reference counting, be aware +that you have to ensure you don't create cycles yourself; you can't rely on +Rust to catch them. Creating a reference cycle would be a logic bug in your +program that you should use automated tests, code reviews, and other software +development practices to minimize. + + + + +Another solution is reorganizing your data structures so that some references +express ownership and some references don't. In this way, we can have cycles +made up of some ownership relationships and some non-ownership relationships, +and only the ownership relationships affect whether a value may be dropped or +not. In Listing 15-20, we always want `Cons` variants to own their list, so +reorganizing the data structure isn't possible. Let's look at an example using +graphs made up of parent nodes and child nodes to see when non-ownership +relationships are an appropriate way to prevent reference cycles. + +### Preventing Reference Cycles: Turn an `Rc` into a `Weak` + +So far, we've shown how calling `Rc::clone` increases the `strong_count` of an +`Rc` instance, and that an `Rc` instance is only cleaned up if its +`strong_count` is 0. We can also create a *weak reference* to the value within +an `Rc` instance by calling `Rc::downgrade` and passing a reference to the +`Rc`. When we call `Rc::downgrade`, we get a smart pointer of type `Weak`. 
+Instead of increasing the `strong_count` in the `Rc` instance by one, calling +`Rc::downgrade` increases the `weak_count` by one. The `Rc` type uses +`weak_count` to keep track of how many `Weak` references exist, similarly to +`strong_count`. The difference is the `weak_count` does not need to be 0 in +order for the `Rc` instance to be cleaned up. + + + + +Strong references are how we can share ownership of an `Rc` instance. Weak +references don't express an ownership relationship. They won't cause a +reference cycle since any cycle involving some weak references will be broken +once the strong reference count of values involved is 0. + + + + +Because the value that `Weak` references might have been dropped, in order +to do anything with the value that a `Weak` is pointing to, we have to check +to make sure the value is still around. We do this by calling the `upgrade` +method on a `Weak` instance, which will return an `Option>`. We'll get +a result of `Some` if the `Rc` value has not been dropped yet, and `None` if +the `Rc` value has been dropped. Because `upgrade` returns an `Option`, we can +be sure that Rust will handle both the `Some` case and the `None` case, and +there won't be an invalid pointer. + +As an example, rather than using a list whose items know only about the next +item, we'll create a tree whose items know about their children items *and* +their parent items. + +#### Creating a Tree Data Structure: a `Node` with Child Nodes + +To start building this tree, we'll create a struct named `Node` that holds its +own `i32` value as well as references to its children `Node` values: Filename: src/main.rs @@ -178,17 +276,28 @@ struct Node { } ``` -We want to be able to have a `Node` own its children, and we also want to be -able to have variables own each node so we can access them directly. That’s why -the items in the `Vec` are `Rc` values. 
We want to be able to modify what -nodes are another node’s children, so that’s why we have a `RefCell` in -`children` around the `Vec`. In Listing 15-19, let’s create one instance of -`Node` named `leaf` with the value 3 and no children, and another instance -named `branch` with the value 5 and `leaf` as one of its children: +We want a `Node` to own its children, and we want to be able to share that +ownership with variables so we can access each `Node` in the tree directly. To +do this, we define the `Vec` items to be values of type `Rc`. We also +want to be able to modify which nodes are children of another node, so we have +a `RefCell` in `children` around the `Vec`. + +Next, let's use our struct definition and create one `Node` instance named +`leaf` with the value 3 and no children, and another instance named `branch` +with the value 5 and `leaf` as one of its children, as shown in Listing 15-23: Filename: src/main.rs -```rust,ignore +```rust +# use std::rc::Rc; +# use std::cell::RefCell; +# +# #[derive(Debug)] +# struct Node { +# value: i32, +# children: RefCell>>, +# } +# fn main() { let leaf = Rc::new(Node { value: 3, @@ -197,30 +306,41 @@ fn main() { let branch = Rc::new(Node { value: 5, - children: RefCell::new(vec![leaf.clone()]), + children: RefCell::new(vec![Rc::clone(&leaf)]), }); } ``` -Listing 15-19: Creating a `leaf` node and a `branch` node -where `branch` has `leaf` as one of its children but `leaf` has no reference to -`branch` +Listing 15-23: Creating a `leaf` node with no children +and a `branch` node with `leaf` as one of its children -The `Node` in `leaf` now has two owners: `leaf` and `branch`, since we clone -the `Rc` in `leaf` and store that in `branch`. The `Node` in `branch` knows -it’s related to `leaf` since `branch` has a reference to `leaf` in -`branch.children`. However, `leaf` doesn’t know that it’s related to `branch`, -and we’d like `leaf` to know that `branch` is its parent. 
+We clone the `Rc` in `leaf` and store that in `branch`, meaning the `Node` in +`leaf` now has two owners: `leaf` and `branch`. We can get to `leaf` from +`branch` through `branch.children`, but there's no way to get from `leaf` to +`branch`. `leaf` has no reference to `branch` and doesn't know they are +related. We'd like `leaf` to know that `branch` is its parent. -To do that, we’re going to add a `parent` field to our `Node` struct -definition, but what should the type of `parent` be? We know it can’t contain -an `Rc`, since `leaf.parent` would point to `branch` and `branch.children` -contains a pointer to `leaf`, which makes a reference cycle. Neither `leaf` nor -`branch` would get dropped since they would always refer to each other and -their reference counts would never be zero. +#### Adding a Reference from a Child to its Parent -So instead of `Rc`, we’re going to make the type of `parent` use `Weak`, -specifically a `RefCell>`: +To make the child node aware of its parent, we need to add a `parent` field to +our `Node` struct definition. The trouble is in deciding what the type of +`parent` should be. We know it can't contain an `Rc` because that would +create a reference cycle, with `leaf.parent` pointing to `branch` and +`branch.children` pointing to `leaf`, which would cause their `strong_count` +values to never be zero. + +Thinking about the relationships another way, a parent node should own its +children: if a parent node is dropped, its child nodes should be dropped as +well. However, a child should not own its parent: if we drop a child node, the +parent should still exist. This is a case for weak references! + +So instead of `Rc`, we'll make the type of `parent` use `Weak`, specifically +a `RefCell>`. Now our `Node` struct definition looks like this: + + + Filename: src/main.rs @@ -236,14 +356,35 @@ struct Node { } ``` -This way, a node will be able to refer to its parent node if it has one, -but it does not own its parent. 
A parent node will be dropped even if -it has child nodes referring to it, as long as it doesn’t have a parent -node as well. Now let’s update `main` to look like Listing 15-20: + + + +This way, a node will be able to refer to its parent node, but does not own its +parent. In Listing 15-24, let's update `main` to use this new definition so +that the `leaf` node will have a way to refer to its parent, `branch`: + + + Filename: src/main.rs -```rust,ignore +```rust +# use std::rc::{Rc, Weak}; +# use std::cell::RefCell; +# +# #[derive(Debug)] +# struct Node { +# value: i32, +# parent: RefCell>, +# children: RefCell>>, +# } +# fn main() { let leaf = Rc::new(Node { value: 3, @@ -265,30 +406,45 @@ fn main() { } ``` -Listing 15-20: A `leaf` node and a `branch` node where -`leaf` has a `Weak` reference to its parent, `branch` +Listing 15-24: A `leaf` node with a `Weak` reference to +its parent node, `branch` + + -Creating the `leaf` node looks similar; since it starts out without a parent, -we create a new `Weak` reference instance. When we try to get a reference to -the parent of `leaf` by using the `upgrade` method, we’ll get a `None` value, -as shown by the first `println!` that outputs: +Creating the `leaf` node looks similar to how creating the `leaf` node looked +in Listing 15-23, with the exception of the `parent` field: `leaf` starts out +without a parent, so we create a new, empty `Weak` reference instance. + +At this point, when we try to get a reference to the parent of `leaf` by using +the `upgrade` method, we get a `None` value. We see this in the output from the +first `println!`: ```text leaf parent = None ``` -Similarly, `branch` will also have a new `Weak` reference, since `branch` does -not have a parent node. We still make `leaf` be one of the children of -`branch`. Once we have a new `Node` instance in `branch`, we can modify `leaf` -to have a `Weak` reference to `branch` for its parent. 
We use the `borrow_mut` -method on the `RefCell` in the `parent` field of `leaf`, then we use the -`Rc::downgrade` function to create a `Weak` reference to `branch` from the `Rc` -in `branch.` + + + +When we create the `branch` node, it will also have a new `Weak` reference, +since `branch` does not have a parent node. We still have `leaf` as one of the +children of `branch`. Once we have the `Node` instance in `branch`, we can +modify `leaf` to give it a `Weak` reference to its parent. We use the +`borrow_mut` method on the `RefCell` in the `parent` field of `leaf`, then we +use the `Rc::downgrade` function to create a `Weak` reference to `branch` from +the `Rc` in `branch.` + + + -When we print out the parent of `leaf` again, this time we’ll get a `Some` -variant holding `branch`. Also notice we don’t get a cycle printed out that -eventually ends in a stack overflow like we did in Listing 15-14: the `Weak` -references are just printed as `(Weak)`: +When we print out the parent of `leaf` again, this time we'll get a `Some` +variant holding `branch`: `leaf` can now access its parent! When we print out +`leaf`, we also avoid the cycle that eventually ended in a stack overflow like +we had in Listing 15-21: the `Weak` references are printed as `(Weak)`: ```text leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) }, @@ -296,12 +452,17 @@ children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) }, children: RefCell { value: [] } }] } }) ``` -The fact that we don’t get infinite output (or at least until the stack -overflows) is one way we can see that we don’t have a reference cycle in this -case. Another way we can tell is by looking at the values we get from calling -`Rc::strong_count` and `Rc::weak_count`. 
In Listing 15-21, let’s create a new -inner scope and move the creation of `branch` in there, so that we can see what -happens when `branch` is created and then dropped when it goes out of scope: +The lack of infinite output indicates that this code didn't create a reference +cycle. We can also tell this by looking at the values we get from calling +`Rc::strong_count` and `Rc::weak_count`. + +#### Visualizing Changes to `strong_count` and `weak_count` + +Let's look at how the `strong_count` and `weak_count` values of the `Rc` +instances change by creating a new inner scope and moving the creation of +`branch` into that scope. This will let us see what happens when `branch` is +created and then dropped when it goes out of scope. The modifications are shown +in Listing 15-25: Filename: src/main.rs @@ -323,7 +484,7 @@ fn main() { let branch = Rc::new(Node { value: 5, parent: RefCell::new(Weak::new()), - children: RefCell::new(vec![leaf.clone()]), + children: RefCell::new(vec![Rc::clone(&leaf)]), }); *leaf.parent.borrow_mut() = Rc::downgrade(&branch); @@ -349,53 +510,61 @@ fn main() { } ``` -Listing 15-21: Creating `branch` in an inner scope and -examining strong and weak reference counts of `leaf` and `branch` - -Right after creating `leaf`, its strong count is 1 (for `leaf` itself) and its -weak count is 0. In the inner scope, after we create `branch` and associate -`leaf` and `branch`, `branch` will have a strong count of 1 (for `branch` -itself) and a weak count of 1 (for `leaf.parent` pointing to `branch` with a -`Weak`). `leaf` will have a strong count of 2, since `branch` now has a -clone the `Rc` of `leaf` stored in `branch.children`. `leaf` still has a weak -count of 0. - -When the inner scope ends, `branch` goes out of scope, and its strong count -decreases to 0, so its `Node` gets dropped. The weak count of 1 from -`leaf.parent` has no bearing on whether `Node` gets dropped or not, so we don’t -have a memory leak! 
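As a compact check of this behavior, here's a sketch that condenses the `leaf` and `branch` example into a function we can assert against. The `parent_value` and `parent_lifecycle` helpers are our own names, and the `Node` definition is assumed to match the three-field struct used in the listings:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Assumed to mirror the `Node` struct from the listings.
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

// Hypothetical helper: the parent's value if the parent is still alive,
// or `None` if there is no parent or it has already been dropped.
fn parent_value(node: &Node) -> Option<i32> {
    node.parent.borrow().upgrade().map(|parent| parent.value)
}

// Records what `leaf` can see of its parent before `branch` exists,
// while `branch` is in scope, and after `branch` has been dropped.
fn parent_lifecycle() -> (Option<i32>, Option<i32>, Option<i32>) {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });
    let before = parent_value(&leaf);

    let during;
    {
        let branch = Rc::new(Node {
            value: 5,
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![Rc::clone(&leaf)]),
        });
        // The child points at its parent weakly, so it does not own it.
        *leaf.parent.borrow_mut() = Rc::downgrade(&branch);
        assert_eq!(branch.children.borrow().len(), 1);
        during = parent_value(&leaf);
    } // `branch` is dropped here: its strong count reaches 0.

    let after = parent_value(&leaf);
    (before, during, after)
}

fn main() {
    let (before, during, after) = parent_lifecycle();
    println!("before = {:?}, during = {:?}, after = {:?}", before, during, after);
}
```

Because `upgrade` returns an `Option`, the dropped parent shows up as `None` rather than as a dangling pointer.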
- -If we try to access the parent of `leaf` after the end of the scope, we’ll get -`None` again like we did before `leaf` had a parent. At the end of the program, -`leaf` has a strong count of 1 and a weak count of 0, since `leaf` is now the -only thing pointing to it again. - -All of the logic managing the counts and whether a value should be dropped or -not was managed by `Rc` and `Weak` and their implementations of the `Drop` -trait. By specifying that the relationship from a child to its parent should be -a `Weak` reference in the definition of `Node`, we’re able to have parent -nodes point to child nodes and vice versa without creating a reference cycle -and memory leaks. +Listing 15-25: Creating `branch` in an inner scope and +examining strong and weak reference counts + +Once `leaf` is created, its `Rc` has a strong count of 1 and a weak count of 0. +In the inner scope we create `branch` and associate it with `leaf`, at which +point the `Rc` in `branch` will have a strong count of 1 and a weak count of 1 +(for `leaf.parent` pointing to `branch` with a `Weak`). Here `leaf` will +have a strong count of 2, because `branch` now has a clone of the `Rc` of +`leaf` stored in `branch.children`, but will still have a weak count of 0. + +When the inner scope ends, `branch` goes out of scope and the strong count of +the `Rc` decreases to 0, so its `Node` gets dropped. The weak count of 1 from +`leaf.parent` has no bearing on whether `Node` is dropped or not, so we don't +get any memory leaks! + +If we try to access the parent of `leaf` after the end of the scope, we'll get +`None` again. At the end of the program, the `Rc` in `leaf` has a strong count +of 1 and a weak count of 0, because the variable `leaf` is now the only +reference to the `Rc` again. + + + + +All of the logic that manages the counts and value dropping is built in to +`Rc` and `Weak` and their implementations of the `Drop` trait. 
By specifying +that the relationship from a child to its parent should be a `Weak` +reference in the definition of `Node`, we're able to have parent nodes point to +child nodes and vice versa without creating a reference cycle and memory leaks. + + + ## Summary -We’ve now covered how you can use different kinds of smart pointers to choose -different guarantees and tradeoffs than those Rust makes with regular +This chapter covered how you can use smart pointers to make different +guarantees and tradeoffs than those Rust makes by default with regular references. `Box` has a known size and points to data allocated on the heap. `Rc` keeps track of the number of references to data on the heap so that data can have multiple owners. `RefCell` with its interior mutability gives -us a type that can be used where we need an immutable type, and enforces the -borrowing rules at runtime instead of at compile time. +us a type that can be used when we need an immutable type but need the ability +to change an inner value of that type, and enforces the borrowing rules at +runtime instead of at compile time. -We’ve also discussed the `Deref` and `Drop` traits that enable a lot of smart -pointers’ functionality. We explored how it’s possible to create a reference -cycle that would cause a memory leak, and how to prevent reference cycles by -using `Weak`. +We also discussed the `Deref` and `Drop` traits that enable a lot of the +functionality of smart pointers. We explored reference cycles that can cause +memory leaks, and how to prevent them using `Weak`. -If this chapter has piqued your interest and you now want to implement your own +If this chapter has piqued your interest and you want to implement your own smart pointers, check out [The Nomicon] for even more useful information. [The Nomicon]: https://doc.rust-lang.org/stable/nomicon/ -Next, let’s talk about concurrency in Rust. We’ll even learn about a few new -smart pointers that can help us with it. 
+Next, let's talk about concurrency in Rust. We'll even learn about a few new +smart pointers. From e53f22ab69b7fb5df70ef8f6c8801ce6b21c69d5 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 11 Sep 2017 13:50:49 -0400 Subject: [PATCH 08/18] Whoops missed a change to Rc::clone --- second-edition/src/ch15-06-reference-cycles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/second-edition/src/ch15-06-reference-cycles.md b/second-edition/src/ch15-06-reference-cycles.md index 37881b8688..a8ee8d14f8 100644 --- a/second-edition/src/ch15-06-reference-cycles.md +++ b/second-edition/src/ch15-06-reference-cycles.md @@ -397,7 +397,7 @@ fn main() { let branch = Rc::new(Node { value: 5, parent: RefCell::new(Weak::new()), - children: RefCell::new(vec![leaf.clone()]), + children: RefCell::new(vec![Rc::clone(&leaf)]), }); *leaf.parent.borrow_mut() = Rc::downgrade(&branch); From 92e2630e7802ea109326a0e827129898e25754c3 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 13:37:00 -0400 Subject: [PATCH 09/18] Make some more edits to ch 15 --- second-edition/src/ch15-00-smart-pointers.md | 17 ++++----- second-edition/src/ch15-01-box.md | 37 ++++++++++--------- second-edition/src/ch15-02-deref.md | 21 +++++------ second-edition/src/ch15-03-drop.md | 23 ++++++------ .../src/ch15-05-interior-mutability.md | 12 +++--- .../src/ch15-06-reference-cycles.md | 4 +- 6 files changed, 56 insertions(+), 58 deletions(-) diff --git a/second-edition/src/ch15-00-smart-pointers.md b/second-edition/src/ch15-00-smart-pointers.md index f4ce90e239..e65347f6d2 100644 --- a/second-edition/src/ch15-00-smart-pointers.md +++ b/second-edition/src/ch15-00-smart-pointers.md @@ -1,12 +1,11 @@ # Smart Pointers -A *pointer* is the generic programming concept for an address to a location -that stores some data. The most common kind of pointer in Rust is a -*reference*, which we learned about in Chapter 4. 
References are indicated by -the `&` symbol and borrow the value that they point to. They don't have any -special abilities other than referring to data, but they also don't have any -more overhead than they need to do straightforward referencing, so they're used -the most often. +A *pointer* is a general concept for a variable that contains an address in +memory. This address refers to, or "points at", some other data. The most +common kind of pointer in Rust is a *reference*, which we learned about in +Chapter 4. References are indicated by the `&` symbol and borrow the value that +they point to. They don't have any special abilities other than referring to +data. They also don't have any overhead, so they're used the most often. *Smart pointers*, on the other hand, are data structures that act like a pointer, but they also have additional metadata and capabilities. The concept @@ -84,8 +83,8 @@ http://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon01Concurrent.pdf * `Box` for allocating values on the heap * `Rc`, a reference counted type that enables multiple ownership -* `Ref` and `RefMut`, accessed through `RefCell`, a type that enforces the - borrowing rules at runtime instead of compile time +* `Ref` and `RefMut`, accessed through `RefCell`, a type that enforces + the borrowing rules at runtime instead of compile time -Boxes don't have a lot of performance overhead, but they don't have a lot of -extra abilities either. They're most often used in these situations: +Boxes don't have performance overhead other than their data being on the heap +instead of on the stack, but they don't have a lot of extra abilities either. +They're most often used in these situations: - When you have a type whose size can't be known at compile time, and you want to use a value of that type in a context that needs to know an exact size @@ -22,18 +22,19 @@ extra abilities either. 
They're most often used in these situations: particular trait rather than knowing the concrete type itself We're going to demonstrate the first case in the rest of this section. To -elaborate on the other two situations a bit more: in the second case when you -have a lot of data that you don't want to be copied when you move the value to -be owned by another part of code, boxes make it so that the data stays in one -place on the heap and only the pointer data in the box is copied around on the -stack. The third case is known as a *trait object*, and Chapter 17 has an entire -section devoted just to that topic. So know that what you learn here will be -applied again in Chapter 17! - -### Using a `Box` to Store Data on the Heap - -Before we get into a use case for `Box`, let's get familiar with the syntax and -how to interact with values stored within a `Box`. +elaborate on the other two situations a bit more: in the second case, +transfering ownership of a large amount of data can take a long time because +the data gets copied around on the stack. To improve performance in this +situation, we can store the large amount of data on the heap in a box. Then, +only the small amount of pointer data is copied around on the stack, and the +data stays in one place on the heap. The third case is known as a *trait +object*, and Chapter 17 has an entire section devoted just to that topic. So +know that what you learn here will be applied again in Chapter 17! + +### Using a `Box` to Store Data on the Heap + +Before we get into a use case for `Box`, let's get familiar with the syntax +and how to interact with values stored within a `Box`. 
Listing 15-1 shows how to use a box to store an `i32` on the heap: diff --git a/second-edition/src/ch15-02-deref.md b/second-edition/src/ch15-02-deref.md index 3374addab9..4459034925 100644 --- a/second-edition/src/ch15-02-deref.md +++ b/second-edition/src/ch15-02-deref.md @@ -3,8 +3,8 @@ Implementing `Deref` trait allows us to customize the behavior of the *dereference operator* `*`(as opposed to the multiplication or glob operator). By implementing `Deref` in such a way that a smart pointer can be treated like -a regular reference, we can write code that is able to operate on either smart -pointers or regular references. +a regular reference, we can write code that operates on references and use that +code with smart pointers too. @@ -268,15 +268,14 @@ not, can you change this to an active tone? --> -Rust tends to favor explicitness over implicitness, but one exception is deref -coercions of arguments to functions and methods. Rust performs *deref coercion* -to convert a reference to a type that implements `Deref` into a reference to a -type that `Deref` can convert the original type into. Deref coercion happens -automatically when we pass a reference to a value of a particular type as an -argument to a function or method that doesn't match the type of the parameter -in the function or method definition, and there's a sequence of calls to the -`deref` method that will convert the type we provided into the type that the -parameter needs. +*Deref coercion* is a convenience that Rust performs on arguments to functions +and methods. Deref coercion converts a reference to a type that implements +`Deref` into a reference to a type that `Deref` can convert the original type +into. 
Deref coercion happens automatically when we pass a reference to a value +of a particular type as an argument to a function or method that doesn't match +the type of the parameter in the function or method definition, and there's a +sequence of calls to the `deref` method that will convert the type we provided +into the type that the parameter needs. Deref coercion was added to Rust so that programmers writing function and method calls don't need to add as many explicit references and dereferences diff --git a/second-edition/src/ch15-03-drop.md b/second-edition/src/ch15-03-drop.md index fc3488a946..d5685437f5 100644 --- a/second-edition/src/ch15-03-drop.md +++ b/second-edition/src/ch15-03-drop.md @@ -20,9 +20,9 @@ up, or that this code that can be run is specifically always for clean up? --> -This means we don't need be careful about placing clean up code everywhere in a -program that an instance of a particular type is finished with, but we still -won't leak resources! +This means we don't need to be careful about placing clean up code everywhere +in a program that an instance of a particular type is finished with, but we +still won't leak resources! We specify the code to run when a value goes out of scope by implementing the `Drop` trait. The `Drop` trait requires us to implement one method named `drop` @@ -111,15 +111,14 @@ drop method and why?--> Rust inserts the call to `drop` automatically when a value goes out of scope, -and there's no way to disable this functionality if we want to force a value to -clean itself up early. This isn't usually necessary; the whole point of the -`Drop` trait is that it's taken care of automatically for us. Occasionally you -may find that you want to clean up a value early. One example is when using -smart pointers that manage locks; you may want to force the `drop` method that -releases the lock to run so that other code in the same scope can acquire the -lock. 
First, let's see what happens if we try to call the `Drop` trait's `drop` -method ourselves by modifying the `main` function from Listing 15-8 as shown in -Listing 15-9: +and it's not straightforward to disable this functionality. Disabling `drop` +isn't usually necessary; the whole point of the `Drop` trait is that it's taken +care of automatically for us. Occasionally you may find that you want to clean +up a value early. One example is when using smart pointers that manage locks; +you may want to force the `drop` method that releases the lock to run so that +other code in the same scope can acquire the lock. First, let's see what +happens if we try to call the `Drop` trait's `drop` method ourselves by +modifying the `main` function from Listing 15-8 as shown in Listing 15-9: Because some analysis is impossible, if the Rust compiler can't be sure the code complies with the ownership rules, it may reject a correct program; in this way, it is conservative. If Rust were to accept an incorrect program, -users would not be able to trust in the guarantees Rust makes, but if Rust +users would not be able to trust in the guarantees Rust makes. However, if Rust rejects a correct program, the programmer will be inconvenienced, but nothing catastrophic can occur. `RefCell` is useful when you yourself are sure that your code follows the borrowing rules, but the compiler is not able to @@ -134,10 +134,10 @@ possible to mutate an immutable value and see why that's useful. #### A Use Case for Interior Mutability: Mock Objects -A *mock object* is the general programming concept for a type that stands in -the place of another type during testing. Mock objects simulate real objects, -and they can record what happens during a test so that we can assert that the -correct actions took place. +A *test double* is the general programming concept for a type that stands in +the place of another type during testing. 
*Mock objects* are specific types of +test doubles that record what happens during a test so that we can assert that +the correct actions took place. While Rust doesn't have objects in the exact same sense that other languages have objects, and Rust doesn't have mock object functionality built into the diff --git a/second-edition/src/ch15-06-reference-cycles.md b/second-edition/src/ch15-06-reference-cycles.md index a8ee8d14f8..b267b8ed72 100644 --- a/second-edition/src/ch15-06-reference-cycles.md +++ b/second-edition/src/ch15-06-reference-cycles.md @@ -315,8 +315,8 @@ fn main() { and a `branch` node with `leaf` as one of its children We clone the `Rc` in `leaf` and store that in `branch`, meaning the `Node` in -`leaf` now has two owners: `leaf` and `branch`. We can get to `leaf` from -`branch` through `branch.children`, but there's no way to get from `leaf` to +`leaf` now has two owners: `leaf` and `branch`. We can get from `branch` to +`leaf` through `branch.children`, but there's no way to get from `leaf` to `branch`. `leaf` has no reference to `branch` and doesn't know they are related. We'd like `leaf` to know that `branch` is its parent. From 3864b384d49640d79a59e93eaf4857c5c29302b0 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 13:44:57 -0400 Subject: [PATCH 10/18] Clarify size of enum varants; fixes #886 --- second-edition/src/ch15-01-box.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/second-edition/src/ch15-01-box.md b/second-edition/src/ch15-01-box.md index a898eefbb2..7281610ecc 100644 --- a/second-edition/src/ch15-01-box.md +++ b/second-edition/src/ch15-01-box.md @@ -242,7 +242,7 @@ enum Message { To determine how much space to allocate for a `Message` value, Rust goes through each of the variants to see which variant needs the most space. 
Rust -sees that `Message::Quit` does not need any space, `Message::Move` needs enough +sees that `Message::Quit` doesn't need any space, `Message::Move` needs enough space to store two `i32` values, and so forth. Since only one variant will end up being used, the most space a `Message` value will need is the space it would take to store the largest of its variants. @@ -313,11 +313,13 @@ fn main() { order to have a known size The `Cons` variant will need the size of an `i32` plus the space to store a -`usize`, since a box always has the size of a `usize`, no matter what it's -pointing to. The `Nil` variant stores no values and doesn't need any space. By -using a box, we've broken the infinite, recursive chain so the compiler is able -to figure out the size it needs to store a `List` value. Figure 15-7 shows what -the `Cons` variant looks like now: +`usize`, since a box is a pointer that is always a `usize`, no matter what it's +pointing to. The `Nil` variant stores no values, so it needs less space than +the `Cons` variant. We now know that any `List` value will take up the size of +an `i32` plus the size of a `usize` amount of data. By using a box, we've +broken the infinite, recursive chain so the compiler is able to figure out the +size it needs to store a `List` value. Figure 15-7 shows what the `Cons` +variant looks like now: A finite Cons list From d0abf1f2165bd1ef5543bb9e64eeb2d5970595f4 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 13:53:19 -0400 Subject: [PATCH 11/18] Take care of all the .clones called on smart pointers Fixes #729. 
--- second-edition/src/ch15-04-rc.md | 4 ++-- second-edition/src/ch16-02-message-passing.md | 2 +- second-edition/src/ch16-03-shared-state.md | 4 ++-- second-edition/src/ch20-05-sending-requests-via-channels.md | 2 +- second-edition/src/ch20-06-graceful-shutdown-and-cleanup.md | 2 +- 5 files changed, 7 insertions(+), 7 deletions(-) diff --git a/second-edition/src/ch15-04-rc.md b/second-edition/src/ch15-04-rc.md index 927d1d99e2..8f7e928c80 100644 --- a/second-edition/src/ch15-04-rc.md +++ b/second-edition/src/ch15-04-rc.md @@ -192,10 +192,10 @@ a bit. /Carol --> fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); println!("count after creating a = {}", Rc::strong_count(&a)); - let b = Cons(3, a.clone()); + let b = Cons(3, Rc::clone(&a)); println!("count after creating b = {}", Rc::strong_count(&a)); { - let c = Cons(4, a.clone()); + let c = Cons(4, Rc::clone(&a)); println!("count after creating c = {}", Rc::strong_count(&a)); } println!("count after c goes out of scope = {}", Rc::strong_count(&a)); diff --git a/second-edition/src/ch16-02-message-passing.md b/second-edition/src/ch16-02-message-passing.md index a3401735bc..6b79c6e5ef 100644 --- a/second-edition/src/ch16-02-message-passing.md +++ b/second-edition/src/ch16-02-message-passing.md @@ -264,7 +264,7 @@ cloning the transmitting half of the channel, as shown in Listing 16-11: // ...snip... 
let (tx, rx) = mpsc::channel(); -let tx1 = tx.clone(); +let tx1 = mpsc::Sender::clone(&tx); thread::spawn(move || { let vals = vec![ String::from("hi"), diff --git a/second-edition/src/ch16-03-shared-state.md b/second-edition/src/ch16-03-shared-state.md index d5aa110cd1..2c2be03fa8 100644 --- a/second-edition/src/ch16-03-shared-state.md +++ b/second-edition/src/ch16-03-shared-state.md @@ -291,7 +291,7 @@ fn main() { let mut handles = vec![]; for _ in 0..10 { - let counter = counter.clone(); + let counter = Rc::clone(&counter); let handle = thread::spawn(move || { let mut num = counter.lock().unwrap(); @@ -379,7 +379,7 @@ fn main() { let mut handles = vec![]; for _ in 0..10 { - let counter = counter.clone(); + let counter = Arc::clone(&counter); let handle = thread::spawn(move || { let mut num = counter.lock().unwrap(); diff --git a/second-edition/src/ch20-05-sending-requests-via-channels.md b/second-edition/src/ch20-05-sending-requests-via-channels.md index 10fb20bf75..798a48e351 100644 --- a/second-edition/src/ch20-05-sending-requests-via-channels.md +++ b/second-edition/src/ch20-05-sending-requests-via-channels.md @@ -203,7 +203,7 @@ impl ThreadPool { let mut workers = Vec::with_capacity(size); for id in 0..size { - workers.push(Worker::new(id, receiver.clone())); + workers.push(Worker::new(id, Arc::clone(&receiver))); } ThreadPool { diff --git a/second-edition/src/ch20-06-graceful-shutdown-and-cleanup.md b/second-edition/src/ch20-06-graceful-shutdown-and-cleanup.md index c05a316b3c..7541bf1383 100644 --- a/second-edition/src/ch20-06-graceful-shutdown-and-cleanup.md +++ b/second-edition/src/ch20-06-graceful-shutdown-and-cleanup.md @@ -494,7 +494,7 @@ impl ThreadPool { let mut workers = Vec::with_capacity(size); for id in 0..size { - workers.push(Worker::new(id, receiver.clone())); + workers.push(Worker::new(id, Arc::clone(&receiver))); } ThreadPool { From 90af43b9902a72b0f46c6deb9b9ec123a33ded97 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: 
Mon, 18 Sep 2017 14:00:30 -0400 Subject: [PATCH 12/18] Document and show drop order Fixes #717. --- second-edition/src/ch15-03-drop.md | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/second-edition/src/ch15-03-drop.md b/second-edition/src/ch15-03-drop.md index d5685437f5..aa0a376438 100644 --- a/second-edition/src/ch15-03-drop.md +++ b/second-edition/src/ch15-03-drop.md @@ -58,13 +58,14 @@ struct CustomSmartPointer { impl Drop for CustomSmartPointer { fn drop(&mut self) { - println!("Dropping CustomSmartPointer!"); + println!("Dropping CustomSmartPointer with data `{}`!", self.data); } } fn main() { - let c = CustomSmartPointer { data: String::from("some data") }; - println!("CustomSmartPointer created."); + let c = CustomSmartPointer { data: String::from("my stuff") }; + let d = CustomSmartPointer { data: String::from("other stuff") }; + println!("CustomSmartPointers created."); } ``` @@ -92,14 +93,17 @@ call the `drop` method explicitly. When we run this program, we'll see the following output: ```text -CustomSmartPointer created. -Dropping CustomSmartPointer! +CustomSmartPointers created. +Dropping CustomSmartPointer with data `other stuff`! +Dropping CustomSmartPointer with data `my stuff`! ``` Rust automatically called `drop` for us when our instance went out of scope, -calling the code we specified. This is just to give you a visual guide to how -the drop method works, but usually you would specify the cleanup code that your -type needs to run rather than a print message. +calling the code we specified. Variables are dropped in the reverse order of +the order in which they were created, so `d` was dropped before `c`. This is +just to give you a visual guide to how the drop method works, but usually you +would specify the cleanup code that your type needs to run rather than a print +message. 
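Review note on the drop-order change above: the claim that variables are dropped in the reverse order of their creation can also be checked with an assertion instead of by reading printed output. The following is a minimal sketch, not code from the book. The `Tracked` type and the `drop_order` function are hypothetical names introduced here for illustration; they record each drop into a shared log rather than printing.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper type for this sketch (not part of the book's listing):
// it appends its label to a shared log when it is dropped.
struct Tracked {
    data: String,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.data.clone());
    }
}

// Creates two values in an inner scope and returns the labels in the order
// in which the values were dropped.
fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _c = Tracked { data: String::from("my stuff"), log: Rc::clone(&log) };
        let _d = Tracked { data: String::from("other stuff"), log: Rc::clone(&log) };
        // When this scope ends, `_d` is dropped first, then `_c`.
    }
    // Bind the result first so the `Ref` from `borrow()` is released
    // before `log` itself goes out of scope.
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```

Binding the values to `_c` and `_d` (rather than to `_`) keeps them alive until the end of the inner scope, which matches the behavior of `c` and `d` in the listing.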
From aa13bc1ecf85fbb4a8eb9dfc42f55034c65e10be Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 14:24:50 -0400 Subject: [PATCH 13/18] Dance around the exact size of a box Fixes #606. --- second-edition/src/ch15-01-box.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/second-edition/src/ch15-01-box.md b/second-edition/src/ch15-01-box.md index 7281610ecc..a7c21fa285 100644 --- a/second-edition/src/ch15-01-box.md +++ b/second-edition/src/ch15-01-box.md @@ -277,9 +277,8 @@ directly, we're going to store the value indirectly by storing a pointer to the value instead. Because a `Box` is a pointer, Rust always knows how much space a `Box` -needs: a pointer takes up a `usize` amount of space. The value of the `usize` -will be the address of the heap data. The heap data can be any size, but the -address to the start of that heap data will always fit in a `usize`. +needs: a pointer's size doesn't change based on the amount of data it's +pointing to. So we can put a `Box` inside the `Cons` variant instead of another `List` value directly. The `Box` will point to the next `List` value that will be on the @@ -312,11 +311,10 @@ fn main() { Listing 15-6: Definition of `List` that uses `Box` in order to have a known size -The `Cons` variant will need the size of an `i32` plus the space to store a -`usize`, since a box is a pointer that is always a `usize`, no matter what it's -pointing to. The `Nil` variant stores no values, so it needs less space than -the `Cons` variant. We now know that any `List` value will take up the size of -an `i32` plus the size of a `usize` amount of data. By using a box, we've +The `Cons` variant will need the size of an `i32` plus the space to store the +box's pointer data. The `Nil` variant stores no values, so it needs less space +than the `Cons` variant. We now know that any `List` value will take up the +size of an `i32` plus the size of a box's pointer data. 
By using a box, we've broken the infinite, recursive chain so the compiler is able to figure out the size it needs to store a `List` value. Figure 15-7 shows what the `Cons` variant looks like now: From 398ad3b7d55595bf1de54d3b301157a307f030bf Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 14:27:03 -0400 Subject: [PATCH 14/18] Tweak reference cycle section title to be less misleading Fixes #575. --- second-edition/src/ch15-06-reference-cycles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/second-edition/src/ch15-06-reference-cycles.md b/second-edition/src/ch15-06-reference-cycles.md index b267b8ed72..38fa57535c 100644 --- a/second-edition/src/ch15-06-reference-cycles.md +++ b/second-edition/src/ch15-06-reference-cycles.md @@ -1,4 +1,4 @@ -## Creating Reference Cycles and Leaking Memory is Safe +## Reference Cycles Can Leak Memory Rust's memory safety guarantees make it *difficult* to accidentally create memory that's never cleaned up, known as a *memory leak*, but not impossible. From 066622810206cc424a6f8974e48fee1727ffd17f Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 14:49:31 -0400 Subject: [PATCH 15/18] Fix what is dereferencing what how, ref auto dereferencing on methods Connects to #538. --- second-edition/src/ch15-05-interior-mutability.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/second-edition/src/ch15-05-interior-mutability.md b/second-edition/src/ch15-05-interior-mutability.md index 406baa883a..4c61a73379 100644 --- a/second-edition/src/ch15-05-interior-mutability.md +++ b/second-edition/src/ch15-05-interior-mutability.md @@ -490,7 +490,11 @@ We wrap the list `a` in an `Rc` so that when we create lists `b` and `c`, they can both refer to `a`, the same as we did in Listing 15-13. 
Once we have the lists in `a`, `b`, and `c` created, we add 10 to the value in -`value` by dereferencing the `Rc` and calling `borrow_mut` on the `RefCell`. +`value`. We do this by calling `borrow_mut` on `value`, which uses the +automatic dereferencing feature we discussed in Chapter 5 ("Where's the `->` +Operator?") to dereference the `Rc` to the inner `RefCell` value. The +`borrow_mut` method returns a `RefMut` smart pointer, and we use the +dereference operator on it and change the inner value. When we print out `a`, `b`, and `c`, we can see that they all have the modified value of 15 rather than 5: From b06188623adc74eacc91997dca22046a31c367b8 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 16:18:57 -0400 Subject: [PATCH 16/18] Spellingz --- second-edition/dictionary.txt | 1 + second-edition/src/ch15-01-box.md | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/second-edition/dictionary.txt b/second-edition/dictionary.txt index 7f1dbeea5d..4ae9ca8af0 100644 --- a/second-edition/dictionary.txt +++ b/second-edition/dictionary.txt @@ -70,6 +70,7 @@ ctrl Ctrl customizable CustomSmartPointer +CustomSmartPointers deallocate deallocated deallocating diff --git a/second-edition/src/ch15-01-box.md b/second-edition/src/ch15-01-box.md index a7c21fa285..20ec476a3a 100644 --- a/second-edition/src/ch15-01-box.md +++ b/second-edition/src/ch15-01-box.md @@ -23,7 +23,7 @@ They're most often used in these situations: We're going to demonstrate the first case in the rest of this section. To elaborate on the other two situations a bit more: in the second case, -transfering ownership of a large amount of data can take a long time because +transferring ownership of a large amount of data can take a long time because the data gets copied around on the stack. To improve performance in this situation, we can store the large amount of data on the heap in a box. 
Then, only the small amount of pointer data is copied around on the stack, and the From 68267b982a226fa252e9afa1a5029396ccf5fa03 Mon Sep 17 00:00:00 2001 From: "Carol (Nichols || Goulding)" Date: Mon, 18 Sep 2017 16:47:45 -0400 Subject: [PATCH 17/18] fancy quotes --- second-edition/src/ch15-00-smart-pointers.md | 24 ++-- second-edition/src/ch15-01-box.md | 68 +++++------ second-edition/src/ch15-02-deref.md | 68 +++++------ second-edition/src/ch15-03-drop.md | 46 ++++---- second-edition/src/ch15-04-rc.md | 54 ++++----- .../src/ch15-05-interior-mutability.md | 108 +++++++++--------- .../src/ch15-06-reference-cycles.md | 82 ++++++------- 7 files changed, 225 insertions(+), 225 deletions(-) diff --git a/second-edition/src/ch15-00-smart-pointers.md b/second-edition/src/ch15-00-smart-pointers.md index e65347f6d2..d1417a5f0a 100644 --- a/second-edition/src/ch15-00-smart-pointers.md +++ b/second-edition/src/ch15-00-smart-pointers.md @@ -1,21 +1,21 @@ # Smart Pointers A *pointer* is a general concept for a variable that contains an address in -memory. This address refers to, or "points at", some other data. The most +memory. This address refers to, or “points at”, some other data. The most common kind of pointer in Rust is a *reference*, which we learned about in Chapter 4. References are indicated by the `&` symbol and borrow the value that -they point to. They don't have any special abilities other than referring to -data. They also don't have any overhead, so they're used the most often. +they point to. They don’t have any special abilities other than referring to +data. They also don’t have any overhead, so they’re used the most often. *Smart pointers*, on the other hand, are data structures that act like a pointer, but they also have additional metadata and capabilities. The concept -of smart pointers isn't unique to Rust; it originated in C++ and exists in -other languages as well. 
The different smart pointers defined in Rust's +of smart pointers isn’t unique to Rust; it originated in C++ and exists in +other languages as well. The different smart pointers defined in Rust’s standard library provide extra functionality beyond what references provide. -One example that we'll explore in this chapter is the *reference counting* +One example that we’ll explore in this chapter is the *reference counting* smart pointer type, which enables you to have multiple owners of data. The reference counting smart pointer keeps track of how many owners there are, and -when there aren't any remaining, the smart pointer takes care of cleaning up +when there aren’t any remaining, the smart pointer takes care of cleaning up the data. Along the way, we’ll cover the *interior mutability* pattern where an immutable -type exposes an API for mutating an interior value. We'll also discuss +type exposes an API for mutating an interior value. We’ll also discuss *reference cycles*, how they can leak memory, and how to prevent them. Let’s dive in! diff --git a/second-edition/src/ch15-01-box.md b/second-edition/src/ch15-01-box.md index 20ec476a3a..f49e1602a1 100644 --- a/second-edition/src/ch15-01-box.md +++ b/second-edition/src/ch15-01-box.md @@ -3,25 +3,25 @@ The most straightforward smart pointer is a *box*, whose type is written `Box`. Boxes allow you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data. Refer back to Chapter 4 -if you'd like to review the difference between the stack and the heap. +if you’d like to review the difference between the stack and the heap. -Boxes don't have performance overhead other than their data being on the heap -instead of on the stack, but they don't have a lot of extra abilities either. 
-They're most often used in these situations: +Boxes don’t have performance overhead other than their data being on the heap +instead of on the stack, but they don’t have a lot of extra abilities either. +They’re most often used in these situations: -- When you have a type whose size can't be known at compile time, and you want +- When you have a type whose size can’t be known at compile time, and you want to use a value of that type in a context that needs to know an exact size - When you have a large amount of data and you want to transfer ownership but - ensure the data won't be copied when you do so -- When you want to own a value and only care that it's a type that implements a + ensure the data won’t be copied when you do so +- When you want to own a value and only care that it’s a type that implements a particular trait rather than knowing the concrete type itself -We're going to demonstrate the first case in the rest of this section. To +We’re going to demonstrate the first case in the rest of this section. To elaborate on the other two situations a bit more: in the second case, transferring ownership of a large amount of data can take a long time because the data gets copied around on the stack. To improve performance in this @@ -33,7 +33,7 @@ know that what you learn here will be applied again in Chapter 17! ### Using a `Box` to Store Data on the Heap -Before we get into a use case for `Box`, let's get familiar with the syntax +Before we get into a use case for `Box`, let’s get familiar with the syntax and how to interact with values stored within a `Box`. Listing 15-1 shows how to use a box to store an `i32` on the heap: @@ -60,9 +60,9 @@ and the data it points to (stored on the heap). Putting a single value on the heap isn’t very useful, so you won’t use boxes by themselves in the way that Listing 15-1 does very often. Having values like a -single `i32` on the stack, where they're stored by default is more appropriate -in the majority of cases. 
Let's get into a case where boxes allow us to define -types that we wouldn't be allowed to if we didn't have boxes. +single `i32` on the stack, where they’re stored by default is more appropriate +in the majority of cases. Let’s get into a case where boxes allow us to define +types that we wouldn’t be allowed to if we didn’t have boxes. @@ -80,15 +80,15 @@ finding it hard to visualize --> Rust needs to know at compile time how much space a type takes up. One kind of -type whose size can't be known at compile time is a *recursive type* where a +type whose size can’t be known at compile time is a *recursive type* where a value can have as part of itself another value of the same type. This nesting -of values could theoretically continue infinitely, so Rust doesn't know how +of values could theoretically continue infinitely, so Rust doesn’t know how much space a value of a recursive type needs. Boxes have a known size, however, so by inserting a box in a recursive type definition, we are allowed to have recursive types. -Let's explore the *cons list*, a data type common in functional programming -languages, to illustrate this concept. The cons list type we're going to define +Let’s explore the *cons list*, a data type common in functional programming +languages, to illustrate this concept. The cons list type we’re going to define is straightforward except for the recursion, so the concepts in this example will be useful any time you get into more complex situations involving recursive types. @@ -126,7 +126,7 @@ only a value called `Nil` without a next item. > or “nil” concept from Chapter 6, which is an invalid or absent value. Note that while functional programming languages use cons lists frequently, -this isn't a commonly used data structure in Rust. Most of the time when you +this isn’t a commonly used data structure in Rust. Most of the time when you have a list of items in Rust, `Vec` is a better choice. 
Other, more complex recursive data types *are* useful in various situations in Rust, but by starting with the cons list, we can explore how boxes let us define a recursive @@ -142,8 +142,8 @@ realistic example would be quite a bit more complicated and obscure why a box is useful even more. /Carol --> Listing 15-2 contains an enum definition for a cons list. Note that this -won’t compile quite yet because this is type doesn't have a known size, which -we'll demonstrate: +won’t compile quite yet because this is type doesn’t have a known size, which +we’ll demonstrate: @@ -161,7 +161,7 @@ enum List { Listing 15-2: The first attempt of defining an enum to represent a cons list data structure of `i32` values -> Note: We're choosing to implement a cons list that only holds `i32` values +> Note: We’re choosing to implement a cons list that only holds `i32` values > for the purposes of this example. We could have implemented it using > generics, as we discussed in Chapter 10, in order to define a cons list type > that could store values of any type. @@ -219,7 +219,7 @@ to a Rust type, but it's not allowed in Rust. We have to use box to make the variant hold a pointer to the next value, not the actual value itself. We've tried to clarify throughout this section. /Carol --> -The error says this type 'has infinite size'. The reason is the way we've +The error says this type ‘has infinite size’. The reason is the way we’ve defined `List` is with a variant that is recursive: it holds another value of itself directly. This means Rust can’t figure out how much space it needs in order to store a `List` value. Let’s break this down a bit: first let’s look at @@ -242,7 +242,7 @@ enum Message { To determine how much space to allocate for a `Message` value, Rust goes through each of the variants to see which variant needs the most space. 
Rust -sees that `Message::Quit` doesn't need any space, `Message::Move` needs enough +sees that `Message::Quit` doesn’t need any space, `Message::Move` needs enough space to store two `i32` values, and so forth. Since only one variant will end up being used, the most space a `Message` value will need is the space it would take to store the largest of its variants. @@ -272,18 +272,18 @@ helpful suggestion: make `List` representable ``` -In this suggestion, "indirection" means that instead of storing a value -directly, we're going to store the value indirectly by storing a pointer to +In this suggestion, “indirection” means that instead of storing a value +directly, we’re going to store the value indirectly by storing a pointer to the value instead. Because a `Box` is a pointer, Rust always knows how much space a `Box` -needs: a pointer's size doesn't change based on the amount of data it's +needs: a pointer’s size doesn’t change based on the amount of data it’s pointing to. So we can put a `Box` inside the `Cons` variant instead of another `List` value directly. The `Box` will point to the next `List` value that will be on the heap, rather than inside the `Cons` variant. Conceptually, we still have a list -created by lists "holding" other lists, but the way this concept is implemented +created by lists “holding” other lists, but the way this concept is implemented is now more like the items being next to one another rather than inside one another. @@ -312,9 +312,9 @@ fn main() { order to have a known size The `Cons` variant will need the size of an `i32` plus the space to store the -box's pointer data. The `Nil` variant stores no values, so it needs less space +box’s pointer data. The `Nil` variant stores no values, so it needs less space than the `Cons` variant. We now know that any `List` value will take up the -size of an `i32` plus the size of a box's pointer data. By using a box, we've +size of an `i32` plus the size of a box’s pointer data. 
By using a box, we’ve broken the infinite, recursive chain so the compiler is able to figure out the size it needs to store a `List` value. Figure 15-7 shows what the `Cons` variant looks like now: @@ -329,19 +329,19 @@ pointer? --> -Boxes only provide the indirection and heap allocation; they don't have any -other special abilities like those we'll see with the other smart pointer -types. They also don't have any performance overhead that these special +Boxes only provide the indirection and heap allocation; they don’t have any +other special abilities like those we’ll see with the other smart pointer +types. They also don’t have any performance overhead that these special abilities incur, so they can be useful in cases like the cons list where the -indirection is the only feature we need. We'll look at more use cases for boxes +indirection is the only feature we need. We’ll look at more use cases for boxes in Chapter 17, too. The `Box` type is a smart pointer because it implements the `Deref` trait, which allows `Box` values to be treated like references. When a `Box` value goes out of scope, the heap data that the box is pointing to is cleaned -up as well because of the `Box` type's `Drop` trait implementation. Let's +up as well because of the `Box` type’s `Drop` trait implementation. Let’s explore these two types in more detail; these traits are going to be even more -important to the functionality provided by the other smart pointer types we'll +important to the functionality provided by the other smart pointer types we’ll be discussing in the rest of this chapter. -Let's first take a look at how `*` works with regular references, then try and -define our own type like `Box` and see why `*` doesn't work like a -reference. We'll explore how implementing the `Deref` trait makes it possible -for smart pointers to work in a similar way as references. 
Finally, we'll look +Let’s first take a look at how `*` works with regular references, then try and +define our own type like `Box` and see why `*` doesn’t work like a +reference. We’ll explore how implementing the `Deref` trait makes it possible +for smart pointers to work in a similar way as references. Finally, we’ll look at the *deref coercion* feature of Rust and how that lets us work with either references or smart pointers. @@ -37,7 +37,7 @@ more like we're following an arrow (the pointer) to find the value. Let us know if this explanation is still unclear. /Carol --> A regular reference is a type of pointer, and one way to think of a pointer is -that it's an arrow to a value stored somewhere else. In Listing 15-8, let's +that it’s an arrow to a value stored somewhere else. In Listing 15-8, let’s create a reference to an `i32` value then use the dereference operator to follow the reference to the data: @@ -75,7 +75,7 @@ to the value that the reference is pointing to (hence *de-reference*). Once we de-reference `y`, we have access to the integer value `y` is pointing to that we can compare with `5`. -If we try to write `assert_eq!(5, y);` instead, we'll get this compilation +If we try to write `assert_eq!(5, y);` instead, we’ll get this compilation error: ```text @@ -90,8 +90,8 @@ not satisfied `{integer}` ``` -Comparing a reference to a number with a number isn't allowed because they're -different types. We have to use `*` to follow the reference to the value it's +Comparing a reference to a number with a number isn’t allowed because they’re +different types. We have to use `*` to follow the reference to the value it’s pointing to. ### Using `Box` Like a Reference @@ -118,19 +118,19 @@ fn main() { The only part of Listing 15-8 that we changed was to set `y` to be an instance of a box pointing to the value in `x` rather than a reference pointing to the value of `x`. 
In the last assertion, we can use the dereference operator to -follow the box's pointer in the same way that we did when `y` was a reference. -Let's explore what is special about `Box` that enables us to do this by +follow the box’s pointer in the same way that we did when `y` was a reference. +Let’s explore what is special about `Box` that enables us to do this by defining our own box type. ### Defining Our Own Smart Pointer -Let's build a smart pointer similar to the `Box` type that the standard -library has provided for us, in order to experience that smart pointers don't -behave like references by default. Then we'll learn about how to add the +Let’s build a smart pointer similar to the `Box` type that the standard +library has provided for us, in order to experience that smart pointers don’t +behave like references by default. Then we’ll learn about how to add the ability to use the dereference operator. `Box` is ultimately defined as a tuple struct with one element, so Listing -15-10 defines a `MyBox` type in the same way. We'll also define a `new` +15-10 defines a `MyBox` type in the same way. We’ll also define a `new` function to match the `new` function defined on `Box`: Filename: src/main.rs @@ -152,9 +152,9 @@ want our type to be able to hold values of any type. `MyBox` is a tuple struct with one element of type `T`. The `MyBox::new` function takes one parameter of type `T` and returns a `MyBox` instance that holds the value passed in. -Let's try adding the code from Listing 15-9 to the code in Listing 15-10 and -changing `main` to use the `MyBox` type we've defined instead of `Box`. -The code in Listing 15-11 won't compile because Rust doesn't know how to +Let’s try adding the code from Listing 15-9 to the code in Listing 15-10 and +changing `main` to use the `MyBox` type we’ve defined instead of `Box`. 
+The code in Listing 15-11 won’t compile because Rust doesn’t know how to dereference `MyBox`: Filename: src/main.rs @@ -182,14 +182,14 @@ error: type `MyBox<{integer}>` cannot be dereferenced | ^^ ``` -Our `MyBox` type can't be dereferenced because we haven't implemented that +Our `MyBox` type can’t be dereferenced because we haven’t implemented that ability on our type. To enable dereferencing with the `*` operator, we can implement the `Deref` trait. ### Implementing the `Deref` Trait Defines How To Treat a Type Like a Reference As we discussed in Chapter 10, in order to implement a trait, we need to -provide implementations for the trait's required methods. The `Deref` trait, +provide implementations for the trait’s required methods. The `Deref` trait, provided by the standard library, requires implementing one method named `deref` that borrows `self` and returns a reference to the inner data. Listing 15-12 contains an implementation of `Deref` to add to the definition of `MyBox`: @@ -213,7 +213,7 @@ impl Deref for MyBox { The `type Target = T;` syntax defines an associated type for this trait to use. Associated types are a slightly different way of declaring a generic parameter -that you don't need to worry about too much for now; we'll cover it in more +that you don’t need to worry about too much for now; we’ll cover it in more detail in Chapter 19. Rust substitutes the `*` operator with a call to the `deref` method and then a -plain dereference so that we don't have to think about when we have to call the +plain dereference so that we don’t have to think about when we have to call the `deref` method or not. This feature of Rust lets us write code that functions identically whether we have a regular reference or a type that implements `Deref`. @@ -272,17 +272,17 @@ describe it as implicit. /Carol --> and methods. Deref coercion converts a reference to a type that implements `Deref` into a reference to a type that `Deref` can convert the original type into. 
Deref coercion happens automatically when we pass a reference to a value -of a particular type as an argument to a function or method that doesn't match -the type of the parameter in the function or method definition, and there's a +of a particular type as an argument to a function or method that doesn’t match +the type of the parameter in the function or method definition, and there’s a sequence of calls to the `deref` method that will convert the type we provided into the type that the parameter needs. Deref coercion was added to Rust so that programmers writing function and -method calls don't need to add as many explicit references and dereferences +method calls don’t need to add as many explicit references and dereferences with `&` and `*`. This feature also lets us write more code that can work for either references or smart pointers. -To illustrate deref coercion in action, let's use the `MyBox` type we +To illustrate deref coercion in action, let’s use the `MyBox` type we defined in Listing 15-10 as well as the implementation of `Deref` that we added in Listing 15-12. Listing 15-13 shows the definition of a function that has a string slice parameter: @@ -337,16 +337,16 @@ fn main() { Listing 15-14: Calling `hello` with a reference to a `MyBox`, which works because of deref coercion -Here we're calling the `hello` function with the argument `&m`, which is a +Here we’re calling the `hello` function with the argument `&m`, which is a reference to a `MyBox` value. Because we implemented the `Deref` trait on `MyBox` in Listing 15-12, Rust can turn `&MyBox` into `&String` by calling `deref`. The standard library provides an implementation of `Deref` on `String` that returns a string slice, which we can see in the API documentation for `Deref`. Rust calls `deref` again to turn the `&String` into -`&str`, which matches the `hello` function's definition. +`&str`, which matches the `hello` function’s definition. 
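
A minimal sketch of that two-step chain, assuming a `MyBox<T>` shaped like the one from Listings 15-10 and 15-12. The `greeting` function is a hypothetical variant of `hello` that returns its text instead of printing it, so that the implicit and explicit forms can be compared:

```rust
use std::ops::Deref;

// Assumed to match the chapter's `MyBox<T>`: a tuple struct with one element.
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

// Hypothetical helper: like `hello`, but returns the greeting.
fn greeting(name: &str) -> String {
    format!("Hello, {}!", name)
}

fn main() {
    let m = MyBox::new(String::from("Rust"));

    // Implicit: Rust turns &MyBox<String> into &String, then into &str.
    let implicit = greeting(&m);

    // Explicit: the same two `deref` steps written out by hand.
    let as_string: &String = m.deref();
    let as_str: &str = as_string.deref();
    let explicit = greeting(as_str);

    assert_eq!(implicit, explicit);
    println!("{}", implicit);
}
```

Both calls produce the same string; the only difference is who writes the conversions, us or the compiler.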
-If Rust didn't implement deref coercion, in order to call `hello` with a value -of type `&MyBox`, we'd have to write the code in Listing 15-15 instead +If Rust didn’t implement deref coercion, in order to call `hello` with a value +of type `&MyBox`, we’d have to write the code in Listing 15-15 instead of the code in Listing 15-14: Filename: src/main.rs @@ -380,7 +380,7 @@ fn main() { } ``` -Listing 15-15: The code we'd have to write if Rust didn't +Listing 15-15: The code we’d have to write if Rust didn’t have deref coercion The `(*m)` is dereferencing the `MyBox` into a `String`. Then the `&` @@ -392,7 +392,7 @@ automatically. When the `Deref` trait is defined for the types involved, Rust will analyze the types and use `Deref::deref` as many times as it needs in order to get a -reference to match the parameter's type. This is resolved at compile time, so +reference to match the parameter’s type. This is resolved at compile time, so there is no run-time penalty for taking advantage of deref coercion! ### How Deref Coercion Interacts with Mutability @@ -432,11 +432,11 @@ The last case is trickier: Rust will also coerce a mutable reference to an immutable one. The reverse is *not* possible though: immutable references will never coerce to mutable ones. Because of the borrowing rules, if you have a mutable reference, that mutable reference must be the only reference to that -data (otherwise, the program wouldn't compile). Converting one mutable +data (otherwise, the program wouldn’t compile). Converting one mutable reference to one immutable reference will never break the borrowing rules. Converting an immutable reference to a mutable reference would require that there was only one immutable reference to that data, and the borrowing rules -don't guarantee that. Therefore, Rust can't make the assumption that converting +don’t guarantee that. Therefore, Rust can’t make the assumption that converting an immutable reference to a mutable reference is possible. 
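
To make the asymmetry concrete, here is a small sketch using only `String` from the standard library. The function names `len_of`, `shout`, and `coercion_demo` are made up for this illustration:

```rust
// Takes an immutable string slice.
fn len_of(s: &str) -> usize {
    s.len()
}

// Takes a mutable reference and modifies the string in place.
fn shout(s: &mut String) {
    s.push('!');
}

fn coercion_demo() -> (usize, String) {
    let mut owned = String::from("hi");
    let m: &mut String = &mut owned;

    // Allowed: a `&mut String` is coerced to `&str` for this call.
    let n = len_of(m);

    // Allowed: `m` already is a mutable reference.
    shout(m);

    // Not allowed (would not compile if uncommented): an immutable
    // reference is never coerced to a mutable one.
    // shout(&owned);

    (n, owned)
}

fn main() {
    let (n, s) = coercion_demo();
    println!("length before = {}, string after = {}", n, s);
}
```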
-This means we don't need to be careful about placing clean up code everywhere +This means we don’t need to be careful about placing clean up code everywhere in a program that an instance of a particular type is finished with, but we -still won't leak resources! +still won’t leak resources! We specify the code to run when a value goes out of scope by implementing the `Drop` trait. The `Drop` trait requires us to implement one method named `drop` that takes a mutable reference to `self`. In order to be able to see when Rust -calls `drop`, let's implement `drop` with `println!` statements for now. +calls `drop`, let’s implement `drop` with `println!` statements for now. @@ -72,11 +72,11 @@ fn main() { Listing 15-8: A `CustomSmartPointer` struct that implements the `Drop` trait, where we would put our clean up code. -The `Drop` trait is included in the prelude, so we don't need to import it. We +The `Drop` trait is included in the prelude, so we don’t need to import it. We implement the `Drop` trait on `CustomSmartPointer`, and provide an implementation for the `drop` method that calls `println!`. The body of the -`drop` function is where you'd put any logic that you wanted to run when an -instance of your type goes out of scope. We're choosing to print out some text +`drop` function is where you’d put any logic that you wanted to run when an +instance of your type goes out of scope. We’re choosing to print out some text here in order to demonstrate when Rust will call `drop`. In `main`, we create a new instance of `CustomSmartPointer` and then print out `CustomSmartPointer created.`. At the end of `main`, our instance of `CustomSmartPointer` will go out of scope, and Rust will call the code we put -in the `drop` method, printing our final message. Note that we didn't need to +in the `drop` method, printing our final message. Note that we didn’t need to call the `drop` method explicitly. 
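
A minimal sketch of this behavior, assuming a `CustomSmartPointer` with a single `data` field. The `DROPS` counter and the `drops_so_far` function are additions for this illustration only, so that we can observe when `drop` has run without reading the printed output:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustration-only counter of how many times `drop` has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct CustomSmartPointer {
    data: String,
}

impl Drop for CustomSmartPointer {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
        println!("Dropping CustomSmartPointer with data `{}`!", self.data);
    }
}

fn drops_so_far() -> usize {
    DROPS.load(Ordering::SeqCst)
}

fn main() {
    let before = drops_so_far();
    {
        let _c = CustomSmartPointer {
            data: String::from("my stuff"),
        };
        println!("CustomSmartPointer created.");
        // Still in scope, so `drop` has not been called yet.
        assert_eq!(drops_so_far(), before);
    } // `_c` goes out of scope here and Rust calls `drop` for us.
    assert_eq!(drops_so_far(), before + 1);
}
```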
-When we run this program, we'll see the following output: +When we run this program, we’ll see the following output: ```text CustomSmartPointers created. @@ -115,13 +115,13 @@ drop method and why?--> Rust inserts the call to `drop` automatically when a value goes out of scope, -and it's not straightforward to disable this functionality. Disabling `drop` -isn't usually necessary; the whole point of the `Drop` trait is that it's taken +and it’s not straightforward to disable this functionality. Disabling `drop` +isn’t usually necessary; the whole point of the `Drop` trait is that it’s taken care of automatically for us. Occasionally you may find that you want to clean up a value early. One example is when using smart pointers that manage locks; you may want to force the `drop` method that releases the lock to run so that -other code in the same scope can acquire the lock. First, let's see what -happens if we try to call the `Drop` trait's `drop` method ourselves by +other code in the same scope can acquire the lock. First, let’s see what +happens if we try to call the `Drop` trait’s `drop` method ourselves by modifying the `main` function from Listing 15-8 as shown in Listing 15-9: Code specified in a `Drop` trait implementation can be used in many ways to make cleanup convenient and safe: we could use it to create our own memory -allocator, for instance! With the `Drop` trait and Rust's ownership system, you -don't have to remember to clean up after yourself, Rust takes care of it +allocator, for instance! With the `Drop` trait and Rust’s ownership system, you +don’t have to remember to clean up after yourself, Rust takes care of it automatically. 
-We also don't have to worry about accidentally cleaning up values still in use +We also don’t have to worry about accidentally cleaning up values still in use because that would cause a compiler error: the ownership system that makes sure references are always valid will also make sure that `drop` only gets called once when the value is no longer being used. -Now that we've gone over `Box` and some of the characteristics of smart -pointers, let's talk about a few other smart pointers defined in the standard +Now that we’ve gone over `Box` and some of the characteristics of smart +pointers, let’s talk about a few other smart pointers defined in the standard library. diff --git a/second-edition/src/ch15-04-rc.md b/second-edition/src/ch15-04-rc.md index 8f7e928c80..c0ef54ecdd 100644 --- a/second-edition/src/ch15-04-rc.md +++ b/second-edition/src/ch15-04-rc.md @@ -4,7 +4,7 @@ In the majority of cases, ownership is clear: you know exactly which variable owns a given value. However, there are cases when a single value may have multiple owners. For example, in graph data structures, multiple edges may point to the same node, and that node is conceptually owned by all of the edges -that point to it. A node shouldn't be cleaned up unless it doesn't have any +that point to it. A node shouldn’t be cleaned up unless it doesn’t have any edges pointing to it. -In Listing 15-14, we'll change `main` so that it has an inner scope around list +In Listing 15-14, we’ll change `main` so that it has an inner scope around list `c`, so that we can see how the reference count changes when `c` goes out of -scope. At each point in the program where the reference count changes, we'll +scope. At each point in the program where the reference count changes, we’ll print out the reference count, which we can get by calling the -`Rc::strong_count` function. We'll talk about why this function is named +`Rc::strong_count` function. 
We’ll talk about why this function is named `strong_count` rather than `count` in the section later in this chapter about preventing reference cycles. @@ -217,14 +217,14 @@ count after c goes out of scope = 2 -We're able to see that the `Rc` in `a` has an initial reference count of one, +We’re able to see that the `Rc` in `a` has an initial reference count of one, then each time we call `clone`, the count goes up by one. When `c` goes out of -scope, the count goes down by one. We don't have to call a function to decrease +scope, the count goes down by one. We don’t have to call a function to decrease the reference count like we have to call `Rc::clone` to increase the reference count; the implementation of the `Drop` trait decreases the reference count automatically when an `Rc` value goes out of scope. -What we can't see from this example is that when `b` and then `a` go out of +What we can’t see from this example is that when `b` and then `a` go out of scope at the end of `main`, the count is then 0, and the `Rc` is cleaned up completely at that point. Using `Rc` allows a single value to have multiple owners, and the count will ensure that the value remains valid as long as any @@ -232,9 +232,9 @@ of the owners still exist. `Rc` allows us to share data between multiple parts of our program for reading only, via immutable references. If `Rc` allowed us to have multiple -mutable references too, we'd be able to violate one of the the borrowing rules +mutable references too, we’d be able to violate one of the the borrowing rules that we discussed in Chapter 4: multiple mutable borrows to the same place can cause data races and inconsistencies. But being able to mutate data is very -useful! In the next section, we'll discuss the interior mutability pattern and +useful! In the next section, we’ll discuss the interior mutability pattern and the `RefCell` type that we can use in conjunction with an `Rc` to work with this restriction on immutability. 
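
The counting behavior described above can be sketched without the cons list, assuming a plain `String` as the shared value. The `counts` function is a hypothetical helper that records the strong count at each step:

```rust
use std::rc::Rc;

// Records the strong count after each change in ownership.
fn counts() -> Vec<usize> {
    let mut seen = Vec::new();

    let a = Rc::new(String::from("shared"));
    seen.push(Rc::strong_count(&a)); // one owner: `a`

    let _b = Rc::clone(&a);
    seen.push(Rc::strong_count(&a)); // two owners: `a` and `_b`

    {
        let _c = Rc::clone(&a);
        seen.push(Rc::strong_count(&a)); // three owners inside the scope
    } // `_c` is dropped here, which decreases the count automatically

    seen.push(Rc::strong_count(&a));
    seen
}

fn main() {
    println!("{:?}", counts());
}
```

Note that `Rc::clone` only increments the count; it doesn't make a deep copy of the `String`.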
diff --git a/second-edition/src/ch15-05-interior-mutability.md b/second-edition/src/ch15-05-interior-mutability.md index 4c61a73379..6758ae12a1 100644 --- a/second-edition/src/ch15-05-interior-mutability.md +++ b/second-edition/src/ch15-05-interior-mutability.md @@ -17,31 +17,31 @@ pattern. /Carol --> *Interior mutability* is a design pattern in Rust for allowing you to mutate data even when there are immutable references to that data, normally disallowed by the borrowing rules. To do so, the pattern uses `unsafe` code inside a data -structure to bend Rust's usual rules around mutation and borrowing. We haven't +structure to bend Rust’s usual rules around mutation and borrowing. We haven’t yet covered unsafe code; we will in Chapter 19. We can choose to use types that make use of the interior mutability pattern when we can ensure that the -borrowing rules will be followed at runtime, even though the compiler can't +borrowing rules will be followed at runtime, even though the compiler can’t ensure that. The `unsafe` code involved is then wrapped in a safe API, and the outer type is still immutable. -Let's explore this by looking at the `RefCell` type that follows the +Let’s explore this by looking at the `RefCell` type that follows the interior mutability pattern. ### Enforcing Borrowing Rules at Runtime with `RefCell` Unlike `Rc`, the `RefCell` type represents single ownership over the data it holds. So, what makes `RefCell` different than a type like `Box`? -Let's recall the borrowing rules we learned in Chapter 4: +Let’s recall the borrowing rules we learned in Chapter 4: 1. At any given time, you can have *either* but not both of: * One mutable reference. * Any number of immutable references. 2. References must always be valid. -With references and `Box`, the borrowing rules' invariants are enforced at +With references and `Box`, the borrowing rules’ invariants are enforced at compile time. With `RefCell`, these invariants are enforced *at runtime*. 
-With references, if you break these rules, you'll get a compiler error. With -`RefCell`, if you break these rules, you'll get a `panic!`. +With references, if you break these rules, you’ll get a compiler error. With +`RefCell`, if you break these rules, you’ll get a `panic!`. @@ -51,20 +51,20 @@ The advantages to checking the borrowing rules at compile time are that errors will be caught sooner in the development process and there is no impact on runtime performance since all the analysis is completed beforehand. For those reasons, checking the borrowing rules at compile time is the best choice for -the majority of cases, which is why this is Rust's default. +the majority of cases, which is why this is Rust’s default. The advantage to checking the borrowing rules at runtime instead is that certain memory safe scenarios are then allowed, whereas they are disallowed by the compile time checks. Static analysis, like the Rust compiler, is inherently conservative. Some properties of code are impossible to detect by analyzing the code: the most famous example is the Halting Problem, which is out of scope of -this book but an interesting topic to research if you're interested. +this book but an interesting topic to research if you’re interested. -Because some analysis is impossible, if the Rust compiler can't be sure the +Because some analysis is impossible, if the Rust compiler can’t be sure the code complies with the ownership rules, it may reject a correct program; in this way, it is conservative. If Rust were to accept an incorrect program, users would not be able to trust in the guarantees Rust makes. However, if Rust @@ -75,7 +75,7 @@ understand and guarantee that. Similarly to `Rc`, `RefCell` is only for use in single-threaded scenarios and will give you a compile time error if you try in a multithreaded context. 
-We'll talk about how to get the functionality of `RefCell` in a +We’ll talk about how to get the functionality of `RefCell` in a multithreaded program in Chapter 16. -In listing 15-21, we're adding a `main` function that uses the definitions from +In listing 15-21, we’re adding a `main` function that uses the definitions from Listing 15-20. This code creates a list in `a`, a list in `b` that points to the list in `a`, and then modifies the list in `a` to point to `b`, which creates a reference cycle. There are `println!` statements along the way to @@ -129,7 +129,7 @@ a cycle. We do that by using the `tail` method to get a reference to the that holds a `Nil` value to the `Rc` in `b`. If we run this code, keeping the last `println!` commented out for the moment, -we'll get this output: +we’ll get this output: ```text a initial rc count = 1 @@ -157,10 +157,10 @@ you clarify that? --> However, because `a` is still referencing the `Rc` that was in `b`, that `Rc` -has a count of 1 rather than 0, so the memory the `Rc` has on the heap won't be +has a count of 1 rather than 0, so the memory the `Rc` has on the heap won’t be dropped. The memory will just sit there with a count of one, forever. -To visualize this, we've created a reference cycle that looks like Figure 15-22: +To visualize this, we’ve created a reference cycle that looks like Figure 15-22: Reference cycle of lists @@ -181,15 +181,15 @@ out? Which one would make a better first experience when running this code? /Carol --> In this specific case, right after we create the reference cycle, the program -ends. The consequences of this cycle aren't so dire. If a more complex program +ends. The consequences of this cycle aren’t so dire. If a more complex program allocates lots of memory in a cycle and holds onto it for a long time, the program would be using more memory than it needs, and might overwhelm the system and cause it to run out of available memory. 
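
A stripped-down sketch of the same situation, using a hypothetical `Link` struct instead of the chapter's `List` enum. Once the two values point at each other, each has a strong count of 2, so dropping the local variables only brings each count down to 1 and the memory is never freed:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical node type: each link may own the next one.
struct Link {
    next: RefCell<Option<Rc<Link>>>,
}

// Builds a -> b -> a and reports both strong counts.
// This deliberately leaks the two allocations.
fn cycle_counts() -> (usize, usize) {
    let a = Rc::new(Link {
        next: RefCell::new(None),
    });
    let b = Rc::new(Link {
        next: RefCell::new(Some(Rc::clone(&a))),
    });

    // Close the cycle: `a` now owns `b` as well.
    *a.next.borrow_mut() = Some(Rc::clone(&b));

    (Rc::strong_count(&a), Rc::strong_count(&b))
}

fn main() {
    let (a_count, b_count) = cycle_counts();
    println!("a strong = {}, b strong = {}", a_count, b_count);
}
```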
-Creating reference cycles is not easily done, but it's not impossible either. +Creating reference cycles is not easily done, but it’s not impossible either. If you have `RefCell` values that contain `Rc` values or similar nested combinations of types with interior mutability and reference counting, be aware -that you have to ensure you don't create cycles yourself; you can't rely on +that you have to ensure you don’t create cycles yourself; you can’t rely on Rust to catch them. Creating a reference cycle would be a logic bug in your program that you should use automated tests, code reviews, and other software development practices to minimize. @@ -205,17 +205,17 @@ specific or helpful here; I've referenced writing tests and other things that can help mitigate logic bugs. /Carol --> Another solution is reorganizing your data structures so that some references -express ownership and some references don't. In this way, we can have cycles +express ownership and some references don’t. In this way, we can have cycles made up of some ownership relationships and some non-ownership relationships, and only the ownership relationships affect whether a value may be dropped or not. In Listing 15-20, we always want `Cons` variants to own their list, so -reorganizing the data structure isn't possible. Let's look at an example using +reorganizing the data structure isn’t possible. Let’s look at an example using graphs made up of parent nodes and child nodes to see when non-ownership relationships are an appropriate way to prevent reference cycles. ### Preventing Reference Cycles: Turn an `Rc` into a `Weak` -So far, we've shown how calling `Rc::clone` increases the `strong_count` of an +So far, we’ve shown how calling `Rc::clone` increases the `strong_count` of an `Rc` instance, and that an `Rc` instance is only cleaned up if its `strong_count` is 0. 
We can also create a *weak reference* to the value within an `Rc` instance by calling `Rc::downgrade` and passing a reference to the @@ -235,32 +235,32 @@ when is it stored in weak_count? --> clarify the paragraph above to address your questions. /Carol --> Strong references are how we can share ownership of an `Rc` instance. Weak -references don't express an ownership relationship. They won't cause a +references don’t express an ownership relationship. They won’t cause a reference cycle since any cycle involving some weak references will be broken once the strong reference count of values involved is 0. Because the value that `Weak` references might have been dropped, in order to do anything with the value that a `Weak` is pointing to, we have to check to make sure the value is still around. We do this by calling the `upgrade` -method on a `Weak` instance, which will return an `Option<Rc<T>>`. We'll get +method on a `Weak` instance, which will return an `Option<Rc<T>>`. We’ll get a result of `Some` if the `Rc` value has not been dropped yet, and `None` if the `Rc` value has been dropped. Because `upgrade` returns an `Option`, we can be sure that Rust will handle both the `Some` case and the `None` case, and -there won't be an invalid pointer. +there won’t be an invalid pointer. As an example, rather than using a list whose items know only about the next -item, we'll create a tree whose items know about their children items *and* +item, we’ll create a tree whose items know about their children items *and* their parent items. #### Creating a Tree Data Structure: a `Node` with Child Nodes -To start building this tree, we'll create a struct named `Node` that holds its +To start building this tree, we’ll create a struct named `Node` that holds its own `i32` value as well as references to its children `Node` values: Filename: src/main.rs @@ -282,7 +282,7 @@ do this, we define the `Vec` items to be values of type `Rc`. 
We also want to be able to modify which nodes are children of another node, so we have a `RefCell` in `children` around the `Vec`. -Next, let's use our struct definition and create one `Node` instance named +Next, let’s use our struct definition and create one `Node` instance named `leaf` with the value 3 and no children, and another instance named `branch` with the value 5 and `leaf` as one of its children, as shown in Listing 15-23: @@ -316,15 +316,15 @@ and a `branch` node with `leaf` as one of its children We clone the `Rc` in `leaf` and store that in `branch`, meaning the `Node` in `leaf` now has two owners: `leaf` and `branch`. We can get from `branch` to -`leaf` through `branch.children`, but there's no way to get from `leaf` to -`branch`. `leaf` has no reference to `branch` and doesn't know they are -related. We'd like `leaf` to know that `branch` is its parent. +`leaf` through `branch.children`, but there’s no way to get from `leaf` to +`branch`. `leaf` has no reference to `branch` and doesn’t know they are +related. We’d like `leaf` to know that `branch` is its parent. #### Adding a Reference from a Child to its Parent To make the child node aware of its parent, we need to add a `parent` field to our `Node` struct definition. The trouble is in deciding what the type of -`parent` should be. We know it can't contain an `Rc` because that would +`parent` should be. We know it can’t contain an `Rc` because that would create a reference cycle, with `leaf.parent` pointing to `branch` and `branch.children` pointing to `leaf`, which would cause their `strong_count` values to never be zero. @@ -334,11 +334,11 @@ children: if a parent node is dropped, its child nodes should be dropped as well. However, a child should not own its parent: if we drop a child node, the parent should still exist. This is a case for weak references! 
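The ownership rule described here (a parent owns its children, while a child only observes its parent) can be sketched in isolation. The following is a minimal illustration rather than one of the chapter's listings; the `counts` helper and the variable names are hypothetical and exist only for this example.

```rust
use std::rc::{Rc, Weak};

// Hypothetical helper for illustration: reports (strong_count, weak_count).
fn counts<T>(rc: &Rc<T>) -> (usize, usize) {
    (Rc::strong_count(rc), Rc::weak_count(rc))
}

fn main() {
    let parent = Rc::new(String::from("parent"));
    assert_eq!(counts(&parent), (1, 0));

    // Downgrading creates a non-owning handle: weak_count goes up,
    // strong_count does not.
    let observer: Weak<String> = Rc::downgrade(&parent);
    assert_eq!(counts(&parent), (1, 1));

    // While the value is alive, `upgrade` yields `Some`.
    assert_eq!(observer.upgrade().map(|p| p.len()), Some(6));

    // Dropping the last strong reference drops the value, even though a
    // weak reference still exists.
    drop(parent);
    assert!(observer.upgrade().is_none());
}
```

Because only the strong count decides when the value is dropped, a child holding a `Weak` to its parent cannot keep that parent alive.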
-So instead of `Rc`, we'll make the type of `parent` use `Weak`, specifically +So instead of `Rc`, we’ll make the type of `parent` use `Weak`, specifically a `RefCell>`. Now our `Node` struct definition looks like this: - @@ -365,7 +365,7 @@ you think? It seems repetitive to explain this every time. /Carol --> This way, a node will be able to refer to its parent node, but does not own its -parent. In Listing 15-24, let's update `main` to use this new definition so +parent. In Listing 15-24, let’s update `main` to use this new definition so that the `leaf` node will have a way to refer to its parent, `branch`: -When we print out the parent of `leaf` again, this time we'll get a `Some` +When we print out the parent of `leaf` again, this time we’ll get a `Some` variant holding `branch`: `leaf` can now access its parent! When we print out `leaf`, we also avoid the cycle that eventually ended in a stack overflow like we had in Listing 15-21: the `Weak` references are printed as `(Weak)`: @@ -452,13 +452,13 @@ children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) }, children: RefCell { value: [] } }] } }) ``` -The lack of infinite output indicates that this code didn't create a reference +The lack of infinite output indicates that this code didn’t create a reference cycle. We can also tell this by looking at the values we get from calling `Rc::strong_count` and `Rc::weak_count`. #### Visualizing Changes to `strong_count` and `weak_count` -Let's look at how the `strong_count` and `weak_count` values of the `Rc` +Let’s look at how the `strong_count` and `weak_count` values of the `Rc` instances change by creating a new inner scope and moving the creation of `branch` into that scope. This will let us see what happens when `branch` is created and then dropped when it goes out of scope. 
The modifications are shown @@ -522,10 +522,10 @@ have a strong count of 2, because `branch` now has a clone of the `Rc` of When the inner scope ends, `branch` goes out of scope and the strong count of the `Rc` decreases to 0, so its `Node` gets dropped. The weak count of 1 from -`leaf.parent` has no bearing on whether `Node` is dropped or not, so we don't +`leaf.parent` has no bearing on whether `Node` is dropped or not, so we don’t get any memory leaks! -If we try to access the parent of `leaf` after the end of the scope, we'll get +If we try to access the parent of `leaf` after the end of the scope, we’ll get `None` again. At the end of the program, the `Rc` in `leaf` has a strong count of 1 and a weak count of 0, because the variable `leaf` is now the only reference to the `Rc` again. @@ -537,7 +537,7 @@ strong and weak counts. /Carol --> All of the logic that manages the counts and value dropping is built in to `Rc` and `Weak` and their implementations of the `Drop` trait. By specifying that the relationship from a child to its parent should be a `Weak` -reference in the definition of `Node`, we're able to have parent nodes point to +reference in the definition of `Node`, we’re able to have parent nodes point to child nodes and vice versa without creating a reference cycle and memory leaks. + + + + + +In Rust, where we have the concept of ownership and borrowing, an additional +difference between references and smart pointers is that references are a kind +of pointer that only borrow data; by contrast, in many cases, smart pointers +*own* the data that they point to. + +We’ve actually already encountered a few smart pointers in this book, such as +`String` and `Vec` from Chapter 8, though we didn’t call them smart pointers +at the time. Both these types count as smart pointers because they own some +memory and allow you to manipulate it. 
They also have metadata (such as their +capacity) and extra capabilities or guarantees (such as `String` ensuring its +data will always be valid UTF-8). + + + + +Smart pointers are usually implemented using structs. The characteristics that +distinguish a smart pointer from an ordinary struct are that smart pointers +implement the `Deref` and `Drop` traits. The `Deref` trait allows an instance +of the smart pointer struct to behave like a reference so that we can write +code that works with either references or smart pointers. The `Drop` trait +allows us to customize the code that gets run when an instance of the smart +pointer goes out of scope. In this chapter, we’ll be discussing both of those +traits and demonstrating why they’re important to smart pointers. Given that the smart pointer pattern is a general design pattern used frequently in Rust, this chapter won’t cover every smart pointer that exists. -Many libraries have their own and you may write some yourself. The ones we -cover here are the most common ones from the standard library: - -* `Box`, for allocating values on the heap -* `Rc`, a reference counted type so data can have multiple owners -* `RefCell`, which isn’t a smart pointer itself, but manages access to the - smart pointers `Ref` and `RefMut` to enforce the borrowing rules at runtime - instead of compile time - -Along the way, we’ll also cover: - -* The *interior mutability* pattern where an immutable type exposes an API for - mutating an interior value, and the borrowing rules apply at runtime instead - of compile time -* Reference cycles, how they can leak memory, and how to prevent them +Many libraries have their own smart pointers and you can even write some +yourself. 
We’ll just cover the most common smart pointers from the standard +library: + + + + +* `Box` for allocating values on the heap +* `Rc`, a reference counted type that enables multiple ownership +* `Ref` and `RefMut`, accessed through `RefCell`, a type that enforces + the borrowing rules at runtime instead of compile time + + + + +Along the way, we’ll cover the *interior mutability* pattern where an immutable +type exposes an API for mutating an interior value. We’ll also discuss +*reference cycles*, how they can leak memory, and how to prevent them. Let’s dive in! ## `Box` Points to Data on the Heap and Has a Known Size The most straightforward smart pointer is a *box*, whose type is written -`Box`. Boxes allow you to put a single value on the heap (we talked about -the stack vs. the heap in Chapter 4). Listing 15-1 shows how to use a box to -store an `i32` on the heap: +`Box`. Boxes allow you to store data on the heap rather than the stack. What +remains on the stack is the pointer to the heap data. Refer back to Chapter 4 +if you’d like to review the difference between the stack and the heap. + + + + +Boxes don’t have performance overhead other than their data being on the heap +instead of on the stack, but they don’t have a lot of extra abilities either. +They’re most often used in these situations: + +- When you have a type whose size can’t be known at compile time, and you want + to use a value of that type in a context that needs to know an exact size +- When you have a large amount of data and you want to transfer ownership but + ensure the data won’t be copied when you do so +- When you want to own a value and only care that it’s a type that implements a + particular trait rather than knowing the concrete type itself + +We’re going to demonstrate the first case in the rest of this section. 
To +elaborate on the other two situations a bit more: in the second case, +transferring ownership of a large amount of data can take a long time because +the data gets copied around on the stack. To improve performance in this +situation, we can store the large amount of data on the heap in a box. Then, +only the small amount of pointer data is copied around on the stack, and the +data stays in one place on the heap. The third case is known as a *trait +object*, and Chapter 17 has an entire section devoted just to that topic. So +know that what you learn here will be applied again in Chapter 17! + +### Using a `Box` to Store Data on the Heap + +Before we get into a use case for `Box`, let’s get familiar with the syntax +and how to interact with values stored within a `Box`. + +Listing 15-1 shows how to use a box to store an `i32` on the heap: Filename: src/main.rs @@ -61,28 +150,104 @@ fn main() { Listing 15-1: Storing an `i32` value on the heap using a box -This will print `b = 5`. In this case, we can access the data in the box in a -similar way as we would if this data was on the stack. Just like any value that -has ownership of data, when a box goes out of scope like `b` does at the end of -`main`, it will be deallocated. The deallocation happens for both the box -(stored on the stack) and the data it points to (stored on the heap). +We define the variable `b` to have the value of a `Box` that points to the +value `5`, which is allocated on the heap. This program will print `b = 5`; in +this case, we can access the data in the box in a similar way as we would if +this data was on the stack. Just like any value that has ownership of data, +when a box goes out of scope like `b` does at the end of `main`, it will be +deallocated. The deallocation happens for both the box (stored on the stack) +and the data it points to (stored on the heap). 
Putting a single value on the heap isn’t very useful, so you won’t use boxes by -themselves in the way that Listing 15-1 does very often. A time when boxes are -useful is when you want to ensure that your type has a known size. For -example, consider Listing 15-2, which contains an enum definition for a *cons -list*, a type of data structure that comes from functional programming. - -A cons list is a list where each item contains a value and the next item until -the end of the list, which is signified by a value called `Nil`. Note that we -aren’t introducing the idea of “nil” or “null” that we discussed in Chapter 6, -this is just a regular enum variant name we’re using because it’s the canonical -name to use when describing the cons list data structure. Cons lists aren’t -used very often in Rust, `Vec` is a better choice most of the time, but -implementing this data structure is useful as an example. - -Here’s our first try at defining a cons list as an enum; note that this won’t -compile quite yet: +themselves in the way that Listing 15-1 does very often. Having values like a +single `i32` on the stack, where they’re stored by default is more appropriate +in the majority of cases. Let’s get into a case where boxes allow us to define +types that we wouldn’t be allowed to if we didn’t have boxes. + + + + +### Boxes Enable Recursive Types + + + + + + +Rust needs to know at compile time how much space a type takes up. One kind of +type whose size can’t be known at compile time is a *recursive type* where a +value can have as part of itself another value of the same type. This nesting +of values could theoretically continue infinitely, so Rust doesn’t know how +much space a value of a recursive type needs. Boxes have a known size, however, +so by inserting a box in a recursive type definition, we are allowed to have +recursive types. + +Let’s explore the *cons list*, a data type common in functional programming +languages, to illustrate this concept. 
The cons list type we’re going to define +is straightforward except for the recursion, so the concepts in this example +will be useful any time you get into more complex situations involving +recursive types. + + + + +A cons list is a list where each item in the list contains two things: the +value of the current item and the next item. The last item in the list contains +only a value called `Nil` without a next item. + +> #### More Information About the Cons List +> +> A *cons list* is a data structure that comes from the Lisp programming +> language and its dialects. In Lisp, the `cons` function (short for “construct +> function”) constructs a new list from its two arguments, which usually are a +> single value and another list. +> +> The cons function concept has made its way into more general functional +> programming jargon; “to cons x onto y” informally means to construct a new +> container instance by putting the element x at the start of this new +> container, followed by the container y. +> +> A cons list is produced by recursively calling the `cons` function. +> The canonical name to denote the base case of the recursion is `Nil`, which +> announces the end of the list. Note that this is not the same as the “null” +> or “nil” concept from Chapter 6, which is an invalid or absent value. + +Note that while functional programming languages use cons lists frequently, +this isn’t a commonly used data structure in Rust. Most of the time when you +have a list of items in Rust, `Vec` is a better choice. Other, more complex +recursive data types *are* useful in various situations in Rust, but by +starting with the cons list, we can explore how boxes let us define a recursive +data type without much distraction. + + + + +Listing 15-2 contains an enum definition for a cons list. 
Note that this +won’t compile quite yet because this type doesn’t have a known size, which +we’ll demonstrate: + + + Filename: src/main.rs @@ -96,12 +261,21 @@ enum List { Listing 15-2: The first attempt of defining an enum to represent a cons list data structure of `i32` values -We’re choosing to implement a cons list that only holds `i32` values, but we -could have chosen to implement it using generics as we discussed in Chapter 10 -to define a cons list concept independent of the type of value stored in the -cons list. +> Note: We’re choosing to implement a cons list that only holds `i32` values +> for the purposes of this example. We could have implemented it using +> generics, as we discussed in Chapter 10, in order to define a cons list type +> that could store values of any type. + + + -Using a cons list to store the list `1, 2, 3` would look like this: +Using our cons list type to store the list `1, 2, 3` would look like the code +in Listing 15-3: + +Filename: src/main.rs ``` use List::{Cons, Nil}; @@ -111,35 +285,47 @@ fn main() { } ``` +Listing 15-3: Using the `List` enum to store the list `1, 2, 3` + The first `Cons` value holds `1` and another `List` value. This `List` value is another `Cons` value that holds `2` and another `List` value. This is one more `Cons` value that holds `3` and a `List` value, which is finally `Nil`, the non-recursive variant that signals the end of the list. -If we try to compile the above code, we get the error shown in Listing 15-3: +If we try to compile the above code, we get the error shown in Listing 15-4: ``` error[E0072]: recursive type `List` has infinite size --> | -1 | enum List { - | _^ starting here...
-2 | | Cons(i32, List), -3 | | Nil, -4 | | } - | |_^ ...ending here: recursive type has infinite size +1 | enum List { + | ^^^^^^^^^ recursive type has infinite size +2 | Cons(i32, List), + | --------------- recursive without indirection | = help: insert indirection (e.g., a `Box`, `Rc`, or `&`) at some point to make `List` representable ``` -Listing 15-3: The error we get when attempting to define a recursive enum +Listing 15-4: The error we get when attempting to define a recursive enum + + + + +The error says this type ‘has infinite size’. The reason is the way we’ve +defined `List` is with a variant that is recursive: it holds another value of +itself directly. This means Rust can’t figure out how much space it needs in +order to store a `List` value. Let’s break this down a bit: first let’s look at +how Rust decides how much space it needs to store a value of a non-recursive +type. + +### Computing the Size of a Non-Recursive Type -The error says this type ‘has infinite size’. Why is that? It’s because we’ve -defined `List` to have a variant that is recursive: it holds another value of -itself. This means Rust can’t figure out how much space it needs in order to -store a `List` value. Let’s break this down a bit: first let’s look at how Rust -decides how much space it needs to store a value of a non-recursive type. Recall the `Message` enum we defined in Listing 6-2 when we discussed enum definitions in Chapter 6: @@ -152,28 +338,30 @@ enum Message { } ``` -When Rust needs to know how much space to allocate for a `Message` value, it -can go through each of the variants and see that `Message::Quit` does not need -any space, `Message::Move` needs enough space to store two `i32` values, and so -forth. Therefore, the most space a `Message` value will need is the space it -would take to store the largest of its variants. +To determine how much space to allocate for a `Message` value, Rust goes +through each of the variants to see which variant needs the most space. 
Rust +sees that `Message::Quit` doesn’t need any space, `Message::Move` needs enough +space to store two `i32` values, and so forth. Since only one variant will end +up being used, the most space a `Message` value will need is the space it would +take to store the largest of its variants. -Contrast this to what happens when the Rust compiler looks at a recursive type -like `List` in Listing 15-2. The compiler tries to figure out how much memory -is needed to store value of `List`, and starts by looking at the `Cons` +Contrast this to what happens when Rust tries to determine how much space a +recursive type like the `List` enum in Listing 15-2 needs. The compiler starts +by looking at the `Cons` variant, which holds a value of type `i32` and a value +of type `List`. Therefore, `Cons` needs an amount of space equal to the size of +an `i32` plus the size of a `List`. To figure out how much memory the `List` +type needs, the compiler looks at the variants, starting with the `Cons` variant. The `Cons` variant holds a value of type `i32` and a value of type -`List`, so `Cons` needs an amount of space equal to the size of an `i32` plus -the size of a `List`. To figure out how much memory a `List` needs, it looks at -its variants, starting with the `Cons` variant. The `Cons` variant holds a -value of type `i32` and a value of type `List`, and this continues infinitely, -as shown in Figure 15-4. +`List`, and this continues infinitely, as shown in Figure 15-5. An infinite Cons list -Figure 15-4: An infinite `List` consisting of infinite `Cons` variants +Figure 15-5: An infinite `List` consisting of infinite `Cons` variants + +### Using `Box` to Get a Recursive Type with a Known Size Rust can’t figure out how much space to allocate for recursively defined types, -so the compiler gives the error in Listing 15-3. The error did include this +so the compiler gives the error in Listing 15-4. 
The error does include this helpful suggestion: ``` @@ -181,13 +369,23 @@ helpful suggestion: make `List` representable ``` -Because a `Box` is a pointer, we always know how much space it needs: a -pointer takes up a `usize` amount of space. The value of the `usize` will be -the address of the heap data. The heap data can be any size, but the address to -the start of that heap data will always fit in a `usize`. So if we change our -definition from Listing 15-2 to look like the definition here in Listing 15-5, -and change `main` to use `Box::new` for the values inside the `Cons` variants -like so: +In this suggestion, “indirection” means that instead of storing a value +directly, we’re going to store the value indirectly by storing a pointer to +the value instead. + +Because a `Box` is a pointer, Rust always knows how much space a `Box` +needs: a pointer’s size doesn’t change based on the amount of data it’s +pointing to. + +So we can put a `Box` inside the `Cons` variant instead of another `List` value +directly. The `Box` will point to the next `List` value that will be on the +heap, rather than inside the `Cons` variant. Conceptually, we still have a list +created by lists “holding” other lists, but the way this concept is implemented +is now more like the items being next to one another rather than inside one +another. + +We can change the definition of the `List` enum from Listing 15-2 and the usage +of the `List` from Listing 15-3 to the code in Listing 15-6, which will compile: Filename: src/main.rs @@ -207,253 +405,540 @@ fn main() { } ``` -Listing 15-5: Definition of `List` that uses `Box` in order to have a -known size +Listing 15-6: Definition of `List` that uses `Box` in order to have a known +size -The compiler will be able to figure out the size it needs to store a `List` -value. Rust will look at `List`, and again start by looking at the `Cons` -variant. 
The `Cons` variant will need the size of `i32` plus the space to store -a `usize`, since a box always has the size of a `usize`, no matter what it’s -pointing to. Then Rust looks at the `Nil` variant, which does not store a -value, so `Nil` doesn’t need any space. We’ve broken the infinite, recursive -chain by adding in a box. Figure 15-6 shows what the `Cons` variant looks like -now: +The `Cons` variant will need the size of an `i32` plus the space to store the +box’s pointer data. The `Nil` variant stores no values, so it needs less space +than the `Cons` variant. We now know that any `List` value will take up the +size of an `i32` plus the size of a box’s pointer data. By using a box, we’ve +broken the infinite, recursive chain so the compiler is able to figure out the +size it needs to store a `List` value. Figure 15-7 shows what the `Cons` +variant looks like now: A finite Cons list -Figure 15-6: A `List` that is not infinitely sized since `Cons` holds a `Box` +Figure 15-7: A `List` that is not infinitely sized since `Cons` holds a `Box` + + + + +Boxes only provide the indirection and heap allocation; they don’t have any +other special abilities like those we’ll see with the other smart pointer +types. They also don’t have any performance overhead that these special +abilities incur, so they can be useful in cases like the cons list where the +indirection is the only feature we need. We’ll look at more use cases for boxes +in Chapter 17, too. + +The `Box` type is a smart pointer because it implements the `Deref` trait, +which allows `Box` values to be treated like references. When a `Box` +value goes out of scope, the heap data that the box is pointing to is cleaned +up as well because of the `Box` type’s `Drop` trait implementation. Let’s +explore these two traits in more detail; they are going to be even more +important to the functionality provided by the other smart pointer types we’ll +be discussing in the rest of this chapter.
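The size argument above can be checked with `std::mem::size_of`. The sketch below is an illustration and not one of the chapter's listings; the `sum` helper is hypothetical. It asserts only that a box is one pointer wide, because the exact size of `List` depends on the target platform.

```rust
use std::mem::size_of;

// The boxed cons list from the chapter.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

// Hypothetical helper: walks the list by following each box's pointer.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, next) => value + sum(next),
        List::Nil => 0,
    }
}

fn main() {
    // A box is one pointer wide, regardless of how much data it points to.
    assert_eq!(size_of::<Box<List>>(), size_of::<usize>());
    assert_eq!(size_of::<Box<[u8; 4096]>>(), size_of::<usize>());

    // `List` therefore has a finite size that the compiler can compute.
    println!("size_of::<List>() = {} bytes", size_of::<List>());

    let list = List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    );
    assert_eq!(sum(&list), 6);
}
```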
+ + + + +## Treating Smart Pointers like Regular References with the `Deref` Trait + +Implementing the `Deref` trait allows us to customize the behavior of the +*dereference operator* `*` (as opposed to the multiplication or glob operator). +By implementing `Deref` in such a way that a smart pointer can be treated like +a regular reference, we can write code that operates on references and use that +code with smart pointers too. + + + + + + + +Let’s first take a look at how `*` works with regular references, then try to +define our own type like `Box` and see why `*` doesn’t work on it the way it does on a +reference. We’ll explore how implementing the `Deref` trait makes it possible +for smart pointers to work in a similar way to references. Finally, we’ll look +at the *deref coercion* feature of Rust and how that lets us work with either +references or smart pointers. + +### Following the Pointer to the Value with `*` + + + + + + + +A regular reference is a type of pointer, and one way to think of a pointer is +that it’s an arrow to a value stored somewhere else. In Listing 15-8, let’s +create a reference to an `i32` value and then use the dereference operator to +follow the reference to the data: + + + + + + + +Filename: src/main.rs -This is the main area where boxes are useful: breaking up an infinite data -structure so that the compiler can know what size it is. We’ll look at another -case where Rust has data of unknown size in Chapter 17 when we discuss trait -objects. +``` +fn main() { + let x = 5; + let y = &x; -Even though you won’t be using boxes very often, they are a good way to -understand the smart pointer pattern. Two of the aspects of `Box` that are -commonly used with smart pointers are its implementations of the `Deref` trait -and the `Drop` trait. Let’s investigate how these traits work and how smart -pointers use them.
+ assert_eq!(5, x); + assert_eq!(5, *y); +} +``` -## The `Deref` Trait Allows Access to the Data Through a Reference +Listing 15-8: Using the dereference operator to follow a reference to an `i32` +value -The first important smart pointer-related trait is `Deref`, which allows us to -override `*`, the dereference operator (as opposed to the multiplication -operator or the glob operator). Overriding `*` for smart pointers makes -accessing the data behind the smart pointer convenient, and we’ll talk about -what we mean by convenient when we get to deref coercions later in this section. +The variable `x` holds an `i32` value, `5`. We set `y` equal to a reference to +`x`. We can assert that `x` is equal to `5`. However, if we want to make an +assertion about the value in `y`, we have to use `*y` to follow the reference +to the value that the reference is pointing to (hence *de-reference*). Once we +de-reference `y`, we have access to the integer value `y` is pointing to that +we can compare with `5`. -We briefly mentioned the dereference operator in Chapter 8, in the hash map -section titled “Update a Value Based on the Old Value”. We had a mutable -reference, and we wanted to change the value that the reference was pointing -to. In order to do that, first we had to dereference the reference. Here’s -another example using references to `i32` values: +If we try to write `assert_eq!(5, y);` instead, we’ll get this compilation +error: ``` -let mut x = 5; -{ - let y = &mut x; +error[E0277]: the trait bound `{integer}: std::cmp::PartialEq<&{integer}>` is +not satisfied + --> :5:19 + | +5 | if ! ( * left_val == * right_val ) { + | ^^ can't compare `{integer}` with `&{integer}` + | + = help: the trait `std::cmp::PartialEq<&{integer}>` is not implemented for + `{integer}` +``` - *y += 1 -} +Comparing a reference to a number with a number isn’t allowed because they’re +different types. We have to use `*` to follow the reference to the value it’s +pointing to. 
+ +### Using `Box` Like a Reference + +We can rewrite the code in Listing 15-8 to use a `Box` instead of a +reference, and the de-reference operator will work the same way as shown in +Listing 15-9: + +Filename: src/main.rs -assert_eq!(6, x); ``` +fn main() { + let x = 5; + let y = Box::new(x); -We use `*y` to access the data that the mutable reference in `y` refers to, -rather than the mutable reference itself. We can then modify that data, in this -case by adding 1. + assert_eq!(5, x); + assert_eq!(5, *y); +} +``` -With references that aren’t smart pointers, there’s only one value that the -reference is pointing to, so the dereference operation is straightforward. -Smart pointers can also store metadata about the pointer or the data. When -dereferencing a smart pointer, we only want the data, not the metadata, since -dereferencing a regular reference only gives us data and not metadata. We want -to be able to use smart pointers in the same places that we can use regular -references. To enable that, we can override the behavior of the `*` operator by -implementing the `Deref` trait. +Listing 15-9: Using the dereference operator on a `Box` -Listing 15-7 has an example of overriding `*` using `Deref` on a struct we’ve -defined to hold mp3 data and metadata. `Mp3` is, in a sense, a smart pointer: -it owns the `Vec` data containing the audio. In addition, it holds some -optional metadata, in this case the artist and title of the song in the audio -data. We want to be able to conveniently access the audio data, not the -metadata, so we implement the `Deref` trait to return the audio data. -Implementing the `Deref` trait requires implementing one method named `deref` -that borrows `self` and returns the inner data: +The only part of Listing 15-8 that we changed was to set `y` to be an instance +of a box pointing to the value in `x` rather than a reference pointing to the +value of `x`. 
In the last assertion, we can use the dereference operator to +follow the box’s pointer in the same way that we did when `y` was a reference. +Let’s explore what is special about `Box` that enables us to do this by +defining our own box type. + +### Defining Our Own Smart Pointer + +Let’s build a smart pointer similar to the `Box` type that the standard +library has provided for us, in order to see that smart pointers don’t +behave like references by default. Then we’ll learn about how to add the +ability to use the dereference operator. + +`Box` is ultimately defined as a tuple struct with one element, so Listing +15-10 defines a `MyBox` type in the same way. We’ll also define a `new` +function to match the `new` function defined on `Box`: Filename: src/main.rs ``` -use std::ops::Deref; +struct MyBox<T>(T); -struct Mp3 { - audio: Vec, - artist: Option, - title: Option, +impl<T> MyBox<T> { + fn new(x: T) -> MyBox<T> { + MyBox(x) + } } +``` -impl Deref for Mp3 { - type Target = Vec; +Listing 15-10: Defining a `MyBox` type - fn deref(&self) -> &Vec { - &self.audio - } -} +We define a struct named `MyBox` and declare a generic parameter `T`, since we +want our type to be able to hold values of any type. `MyBox` is a tuple struct +with one element of type `T`. The `MyBox::new` function takes one parameter of +type `T` and returns a `MyBox` instance that holds the value passed in. + +Let’s try adding the code from Listing 15-9 to the code in Listing 15-10 and +changing `main` to use the `MyBox` type we’ve defined instead of `Box`.
The code in Listing 15-11 won’t compile because Rust doesn’t know how to +dereference `MyBox`: + +Filename: src/main.rs +``` fn main() { - let my_favorite_song = Mp3 { - // we would read the actual audio data from an mp3 file - audio: vec![1, 2, 3], - artist: Some(String::from("Nirvana")), - title: Some(String::from("Smells Like Teen Spirit")), - }; - - assert_eq!(vec![1, 2, 3], *my_favorite_song); + let x = 5; + let y = MyBox::new(x); + + assert_eq!(5, x); + assert_eq!(5, *y); +} +``` + +Listing 15-11: Attempting to use `MyBox` in the same way we were able to use +references and `Box` + +The compilation error we get is: + +``` +error: type `MyBox<{integer}>` cannot be dereferenced + --> src/main.rs:14:19 + | +14 | assert_eq!(5, *y); + | ^^ +``` + +Our `MyBox` type can’t be dereferenced because we haven’t implemented that +ability on our type. To enable dereferencing with the `*` operator, we can +implement the `Deref` trait. + +### Implementing the `Deref` Trait Defines How To Treat a Type Like a Reference + +As we discussed in Chapter 10, in order to implement a trait, we need to +provide implementations for the trait’s required methods. The `Deref` trait, +provided by the standard library, requires implementing one method named +`deref` that borrows `self` and returns a reference to the inner data. Listing +15-12 contains an implementation of `Deref` to add to the definition of `MyBox`: + +Filename: src/main.rs + +``` +use std::ops::Deref; + +# struct MyBox<T>(T); +impl<T> Deref for MyBox<T> { + type Target = T; + + fn deref(&self) -> &T { + &self.0 + } } ``` -Listing 15-7: An implementation of the `Deref` trait on a struct that holds mp3 -file data and metadata +Listing 15-12: Implementing `Deref` on `MyBox` -Most of this should look familiar: a struct, a trait implementation, and a -main function that creates an instance of the struct.
There is one part we -haven’t explained thoroughly yet: similarly to Chapter 13 when we looked at the -Iterator trait with the `type Item`, the `type Target = T;` syntax is defining -an associated type, which is covered in more detail in Chapter 19. Don’t worry -about that part of the example too much; it is a slightly different way of -declaring a generic parameter. +The `type Target = T;` syntax defines an associated type for this trait to use. +Associated types are a slightly different way of declaring a generic parameter +that you don’t need to worry about too much for now; we’ll cover it in more +detail in Chapter 19. -In the `assert_eq!`, we’re verifying that `vec![1, 2, 3]` is the result we get -when dereferencing the `Mp3` instance with `*my_favorite_song`, which is what -happens since we implemented the `deref` method to return the audio data. If -we hadn’t implemented the `Deref` trait for `Mp3`, Rust wouldn’t compile the -code `*my_favorite_song`: we’d get an error saying type `Mp3` cannot be -dereferenced. + + -The reason this code works is that what the `*` operator is doing behind -the scenes when we call `*my_favorite_song` is: +We filled in the body of the `deref` method with `&self.0` so that `deref` +returns a reference to the value we want to access with the `*` operator. The +`main` function from Listing 15-11 that calls `*` on the `MyBox` value now +compiles and the assertions pass! + +Without the `Deref` trait, the compiler can only dereference `&` references. +The `Deref` trait’s `deref` method gives the compiler the ability to take a +value of any type that implements `Deref` and call the `deref` method in order +to get a `&` reference that it knows how to dereference. 
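To see the pieces from Listings 15-10 through 15-12 working together, here is a minimal self-contained sketch. It assumes the generic forms `MyBox<T>` and `impl<T> Deref for MyBox<T>`; treat it as an illustration of the idea rather than the exact listing text.

```rust
use std::ops::Deref;

// A tuple struct wrapping a single value of any type `T`.
struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    // Hand back a plain `&` reference, which the compiler already
    // knows how to dereference.
    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let x = 5;
    let y = MyBox::new(x);

    assert_eq!(5, x);
    // This line is accepted only because `MyBox<T>` implements `Deref`.
    assert_eq!(5, *y);
}
```

If we delete the `impl<T> Deref for MyBox<T>` block, the last assertion stops compiling, which is the behavior Listing 15-11 describes.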
+ +When we typed `*y` in Listing 15-11, what Rust actually ran behind the scenes +was this code: ``` -*(my_favorite_song.deref()) +*(y.deref()) ``` -This calls the `deref` method on `my_favorite_song`, which borrows -`my_favorite_song` and returns a reference to `my_favorite_song.audio`, since -that’s what we defined `deref` to do in Listing 15-5. `*` on references is -defined to just follow the reference and return the data, so the expansion of -`*` doesn’t recurse for the outer `*`. So we end up with data of type -`Vec`, which matches the `vec![1, 2, 3]` in the `assert_eq!` in Listing -15-5. + + + +Rust substitutes the `*` operator with a call to the `deref` method and then a +plain dereference so that we don’t have to think about when we have to call the +`deref` method or not. This feature of Rust lets us write code that functions +identically whether we have a regular reference or a type that implements +`Deref`. + +The reason the `deref` method returns a reference to a value, and why the plain +dereference outside the parentheses in `*(y.deref())` is still necessary, is +because of ownership. If the `deref` method returned the value directly instead +of a reference to the value, the value would be moved out of `self`. We don’t +want to take ownership of the inner value inside `MyBox` in this case and in +most cases where we use the dereference operator. -The reason that the return type of the `deref` method is still a reference and -why it’s necessary to dereference the result of the method is that if the -`deref` method returned just the value, using `*` would always take ownership. +Note that replacing `*` with a call to the `deref` method and then a call to +`*` happens once, each time we type a `*` in our code. The substitution of `*` +does not recurse infinitely. That’s how we end up with data of type `i32`, +which matches the `5` in the `assert_eq!` in Listing 15-11. 
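The substitution described above can be checked directly. The following sketch (again assuming the generic `MyBox<T>` form) compares `*y` with the explicit `*(y.deref())`, and uses a `String` to show that `deref` only borrows the inner value instead of moving it out of the box.

```rust
use std::ops::Deref;

struct MyBox<T>(T);

impl<T> Deref for MyBox<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

fn main() {
    let y = MyBox(5);

    // What we write in our code:
    let sugar: i32 = *y;
    // Roughly what Rust runs behind the scenes:
    let explicit: i32 = *(y.deref());
    assert_eq!(sugar, explicit);

    // Because `deref` returns a reference, a non-`Copy` inner value
    // stays inside the box and we can keep using `s` afterwards.
    let s = MyBox(String::from("Rust"));
    let len = (*s).len();
    let len_again = s.deref().len();
    assert_eq!(len, len_again);
}
```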
### Implicit Deref Coercions with Functions and Methods -Rust tends to favor explicitness over implicitness, but one case where this -does not hold true is *deref coercions* of arguments to functions and methods. -A deref coercion will automatically convert a reference to a pointer or a smart -pointer into a reference to that pointer’s contents. A deref coercion happens -when a value is passed to a function or method, and only happens if it’s needed -to get the type of the value passed in to match the type of the parameter -defined in the signature. Deref coercion was added to Rust to make calling -functions and methods not need as many explicit references and dereferences -with `&` and `*`. + + + +*Deref coercion* is a convenience that Rust performs on arguments to functions +and methods. Deref coercion converts a reference to a type that implements +`Deref` into a reference to a type that `Deref` can convert the original type +into. Deref coercion happens automatically when we pass a reference to a value +of a particular type as an argument to a function or method that doesn’t match +the type of the parameter in the function or method definition, and there’s a +sequence of calls to the `deref` method that will convert the type we provided +into the type that the parameter needs. + +Deref coercion was added to Rust so that programmers writing function and +method calls don’t need to add as many explicit references and dereferences +with `&` and `*`. This feature also lets us write more code that can work for +either references or smart pointers. + +To illustrate deref coercion in action, let’s use the `MyBox` type we +defined in Listing 15-10 as well as the implementation of `Deref` that we added +in Listing 15-12. 
Listing 15-13 shows the definition of a function that has a +string slice parameter: -Using our `Mp3` struct from Listing 15-5, here’s the signature of a function to -compress mp3 audio data that takes a slice of `u8`: +Filename: src/main.rs ``` -fn compress_mp3(audio: &[u8]) -> Vec { - // the actual implementation would go here +fn hello(name: &str) { + println!("Hello, {}!", name); } ``` -If Rust didn’t have deref coercion, in order to call this function with the -audio data in `my_favorite_song`, we’d have to write: +Listing 15-13: A `hello` function that has the parameter `name` of type `&str` + +We can call the `hello` function with a string slice as an argument, like +`hello("Rust");` for example. Deref coercion makes it possible for us to call +`hello` with a reference to a value of type `MyBox`, as shown in +Listing 15-14: + +Filename: src/main.rs ``` -compress_mp3(my_favorite_song.audio.as_slice()) +# use std::ops::Deref; +# +# struct MyBox(T); +# +# impl MyBox { +# fn new(x: T) -> MyBox { +# MyBox(x) +# } +# } +# +# impl Deref for MyBox { +# type Target = T; +# +# fn deref(&self) -> &T { +# &self.0 +# } +# } +# +# fn hello(name: &str) { +# println!("Hello, {}!", name); +# } +# +fn main() { + let m = MyBox::new(String::from("Rust")); + hello(&m); +} ``` -That is, we’d have to explicitly say that we want the data in the `audio` field -of `my_favorite_song` and that we want a slice referring to the whole -`Vec`. If there were a lot of places where we’d want process the `audio` -data in a similar manner, `.audio.as_slice()` would be wordy and repetitive. +Listing 15-14: Calling `hello` with a reference to a `MyBox`, which +works because of deref coercion + +Here we’re calling the `hello` function with the argument `&m`, which is a +reference to a `MyBox` value. Because we implemented the `Deref` trait +on `MyBox` in Listing 15-12, Rust can turn `&MyBox` into `&String` +by calling `deref`. 
The standard library provides an implementation of `Deref` +on `String` that returns a string slice, which we can see in the API +documentation for `Deref`. Rust calls `deref` again to turn the `&String` into +`&str`, which matches the `hello` function’s definition. -However, because of deref coercion and our implementation of the `Deref` trait -on `Mp3`, we can call this function with the data in `my_favorite_song` by -using this code: +If Rust didn’t implement deref coercion, in order to call `hello` with a value +of type `&MyBox`, we’d have to write the code in Listing 15-15 instead +of the code in Listing 15-14: + +Filename: src/main.rs ``` -let result = compress_mp3(&my_favorite_song); +# use std::ops::Deref; +# +# struct MyBox(T); +# +# impl MyBox { +# fn new(x: T) -> MyBox { +# MyBox(x) +# } +# } +# +# impl Deref for MyBox { +# type Target = T; +# +# fn deref(&self) -> &T { +# &self.0 +# } +# } +# +# fn hello(name: &str) { +# println!("Hello, {}!", name); +# } +# +fn main() { + let m = MyBox::new(String::from("Rust")); + hello(&(*m)[..]); +} ``` -Just an `&` and the instance, nice! We can treat our smart pointer as if it was -a regular reference. Deref coercion means that Rust can use its knowledge of -our `Deref` implementation, namely: Rust knows that `Mp3` implements the -`Deref` trait and returns `&Vec` from the `deref` method. Rust also knows -the standard library implements the `Deref` trait on `Vec` to return `&[T]` -from the `deref` method (and we can find that out too by looking at the API -documentation for `Vec`). So, at compile time, Rust will see that it can use -`Deref::deref` twice to turn `&Mp3` into `&Vec` and then into `&[T]` to -match the signature of `compress_mp3`. That means we get to do less typing! -Rust will analyze types through `Deref::deref` as many times as it needs to in -order to get a reference to match the parameter’s type, when the `Deref` trait -is defined for the types involved. 
The indirection is resolved at compile time, -so there is no run-time penalty for taking advantage of deref coercion. +Listing 15-15: The code we’d have to write if Rust didn’t have deref coercion + +The `(*m)` is dereferencing the `MyBox` into a `String`. Then the `&` +and `[..]` are taking a string slice of the `String` that is equal to the whole +string to match the signature of `hello`. The code without deref coercions is +harder to read, write, and understand with all of these symbols involved. Deref +coercion makes it so that Rust takes care of these conversions for us +automatically. + +When the `Deref` trait is defined for the types involved, Rust will analyze the +types and use `Deref::deref` as many times as it needs in order to get a +reference to match the parameter’s type. This is resolved at compile time, so +there is no run-time penalty for taking advantage of deref coercion! + +### How Deref Coercion Interacts with Mutability -There’s also a `DerefMut` trait for overriding `*` on `&mut T` for use in -assignment in the same fashion that we use `Deref` to override `*` on `&T`s. + + + +Similar to how we use the `Deref` trait to override `*` on immutable +references, Rust provides a `DerefMut` trait for overriding `*` on mutable +references. Rust does deref coercion when it finds types and trait implementations in three cases: + + + * From `&T` to `&U` when `T: Deref`. * From `&mut T` to `&mut U` when `T: DerefMut`. * From `&mut T` to `&U` when `T: Deref`. -The first two are the same, except for mutability: if you have a `&T`, and -`T` implements `Deref` to some type `U`, you can get a `&U` transparently. Same -for mutable references. The last one is more tricky: if you have a mutable -reference, it will also coerce to an immutable one. The other case is _not_ -possible though: immutable references will never coerce to mutable ones. 
- -The reason that the `Deref` trait is important to the smart pointer pattern is -that smart pointers can then be treated like regular references and used in -places that expect regular references. We don’t have to redefine methods and -functions to take smart pointers explicitly, for example. +The first two cases are the same except for mutability. The first case says +that if you have a `&T`, and `T` implements `Deref` to some type `U`, you can +get a `&U` transparently. The second case states that the same deref coercion +happens for mutable references. + +The last case is trickier: Rust will also coerce a mutable reference to an +immutable one. The reverse is *not* possible though: immutable references will +never coerce to mutable ones. Because of the borrowing rules, if you have a +mutable reference, that mutable reference must be the only reference to that +data (otherwise, the program wouldn’t compile). Converting one mutable +reference to one immutable reference will never break the borrowing rules. +Converting an immutable reference to a mutable reference would require that +there was only one immutable reference to that data, and the borrowing rules +don’t guarantee that. Therefore, Rust can’t make the assumption that converting +an immutable reference to a mutable reference is possible. + + + ## The `Drop` Trait Runs Code on Cleanup -The other trait that’s important to the smart pointer pattern is the `Drop` -trait. `Drop` lets us run some code when a value is about to go out of scope. -Smart pointers perform important cleanup when being dropped, like deallocating -memory or decrementing a reference count. More generally, data types can manage -resources beyond memory, like files or network connections, and use `Drop` to -release those resources when our code is done with them. We’re discussing -`Drop` in the context of smart pointers, though, because the functionality of -the `Drop` trait is almost always used when implementing smart pointers. 
- -In some other languages, we have to remember to call code to free the memory or -resource every time we finish using an instance of a smart pointer. If we -forget, the system our code is running on might get overloaded and crash. In -Rust, we can specify that some code should be run when a value goes out of -scope, and the compiler will insert this code automatically. That means we don’t -need to remember to put this code everywhere we’re done with an instance of -these types, but we still won’t leak resources! - -The way we specify code should be run when a value goes out of scope is by -implementing the `Drop` trait. The `Drop` trait requires us to implement one -method named `drop` that takes a mutable reference to `self`. - -Listing 15-8 shows a `CustomSmartPointer` struct that doesn’t actually do -anything, but we’re printing out `CustomSmartPointer created.` right after we -create an instance of the struct and `Dropping CustomSmartPointer!` when the -instance goes out of scope so that we can see when each piece of code gets run. -Instead of a `println!` statement, you’d fill in `drop` with whatever cleanup -code your smart pointer needs to run: +The second trait important to the smart pointer pattern is `Drop`, which lets +us customize what happens when a value is about to go out of scope. We can +provide an implementation for the `Drop` trait on any type, and the code we +specify can be used to release resources like files or network connections. +We’re introducing `Drop` in the context of smart pointers because the +functionality of the `Drop` trait is almost always used when implementing a +smart pointer. For example, `Box` customizes `Drop` in order to deallocate +the space on the heap that the box points to. + +In some languages, the programmer must call code to free memory or resources +every time they finish using an instance of a smart pointer. If they forget, +the system might become overloaded and crash. 
In Rust, we can specify that a +particular bit of code should be run whenever a value goes out of scope, and +the compiler will insert this code automatically. + + + + +This means we don’t need to be careful about placing clean up code everywhere +in a program that an instance of a particular type is finished with, but we +still won’t leak resources! + +We specify the code to run when a value goes out of scope by implementing the +`Drop` trait. The `Drop` trait requires us to implement one method named `drop` +that takes a mutable reference to `self`. In order to be able to see when Rust +calls `drop`, let’s implement `drop` with `println!` statements for now. + + + + +Listing 15-8 shows a `CustomSmartPointer` struct whose only custom +functionality is that it will print out `Dropping CustomSmartPointer!` when the +instance goes out of scope. This will demonstrate when Rust runs the `drop` +function: + + + Filename: src/main.rs @@ -464,153 +949,233 @@ struct CustomSmartPointer { impl Drop for CustomSmartPointer { fn drop(&mut self) { - println!("Dropping CustomSmartPointer!"); + println!("Dropping CustomSmartPointer with data `{}`!", self.data); } } fn main() { - let c = CustomSmartPointer { data: String::from("some data") }; - println!("CustomSmartPointer created."); - println!("Wait for it..."); + let c = CustomSmartPointer { data: String::from("my stuff") }; + let d = CustomSmartPointer { data: String::from("other stuff") }; + println!("CustomSmartPointers created."); } ``` Listing 15-8: A `CustomSmartPointer` struct that implements the `Drop` trait, -where we could put code that would clean up after the `CustomSmartPointer`. +where we would put our clean up code. + +The `Drop` trait is included in the prelude, so we don’t need to import it. We +implement the `Drop` trait on `CustomSmartPointer`, and provide an +implementation for the `drop` method that calls `println!`. 
The body of the +`drop` function is where you’d put any logic that you wanted to run when an +instance of your type goes out of scope. We’re choosing to print out some text +here in order to demonstrate when Rust will call `drop`. + + + -The `Drop` trait is in the prelude, so we don’t need to import it. The `drop` -method implementation calls the `println!`; this is where you’d put the actual -code needed to close the socket. In `main`, we create a new instance of -`CustomSmartPointer` then print out `CustomSmartPointer created.` to be able to -see that our code got to that point at runtime. At the end of `main`, our -instance of `CustomSmartPointer` will go out of scope. Note that we didn’t call -the `drop` method explicitly. +In `main`, we create two instances of `CustomSmartPointer` and then print out +`CustomSmartPointers created.`. At the end of `main`, our instances of +`CustomSmartPointer` will go out of scope, and Rust will call the code we put +in the `drop` method, printing our final messages. Note that we didn’t need to +call the `drop` method explicitly. -When we run this program, we’ll see: +When we run this program, we’ll see the following output: ``` -CustomSmartPointer created. -Wait for it... -Dropping CustomSmartPointer! +CustomSmartPointers created. +Dropping CustomSmartPointer with data `other stuff`! +Dropping CustomSmartPointer with data `my stuff`! +``` + +Rust automatically called `drop` for us when our instances went out of scope, +calling the code we specified. Variables are dropped in the reverse order of +their creation, so `d` was dropped before `c`. This is +just to give you a visual guide to how the `drop` method works, but usually you +would specify the cleanup code that your type needs to run rather than a print +message.
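The reverse drop order can also be observed without reading printed output. In the sketch below, the `log` field and the `drop_order` helper are hypothetical additions made only so the order can be asserted; the `RefCell` is just a convenient way to record into a shared `Vec` from inside `drop`.

```rust
use std::cell::RefCell;

// Hypothetical variant of `CustomSmartPointer` that records its data
// into a shared log when dropped, instead of printing.
struct CustomSmartPointer<'a> {
    data: String,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for CustomSmartPointer<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.data.clone());
    }
}

// Creates two values in an inner scope and returns the order in which
// their `drop` methods ran.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _c = CustomSmartPointer { data: String::from("my stuff"), log: &log };
        let _d = CustomSmartPointer { data: String::from("other stuff"), log: &log };
        // `_d` and then `_c` are dropped at the end of this block.
    }
    log.into_inner()
}

fn main() {
    // The value created last is dropped first.
    assert_eq!(drop_order(), vec!["other stuff", "my stuff"]);
}
```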
+ + + + +#### Dropping a Value Early with `std::mem::drop` + + + + +Rust inserts the call to `drop` automatically when a value goes out of scope, +and it’s not straightforward to disable this functionality. Disabling `drop` +isn’t usually necessary; the whole point of the `Drop` trait is that it’s taken +care of automatically for us. Occasionally you may find that you want to clean +up a value early. One example is when using smart pointers that manage locks; +you may want to force the `drop` method that releases the lock to run so that +other code in the same scope can acquire the lock. First, let’s see what +happens if we try to call the `Drop` trait’s `drop` method ourselves by +modifying the `main` function from Listing 15-8 as shown in Listing 15-9: + + + + +Filename: src/main.rs + +``` +fn main() { + let c = CustomSmartPointer { data: String::from("some data") }; + println!("CustomSmartPointer created."); + c.drop(); + println!("CustomSmartPointer dropped before the end of main."); +} +``` + +Listing 15-9: Attempting to call the `drop` method from the `Drop` trait +manually to clean up early + +If we try to compile this, we’ll get this error: + ``` +error[E0040]: explicit use of destructor method + --> src/main.rs:15:7 + | +15 | c.drop(); + | ^^^^ explicit destructor calls not allowed +``` + +This error message says we’re not allowed to explicitly call `drop`. The error +message uses the term *destructor*, which is the general programming term for a +function that cleans up an instance. A *destructor* is analogous to a +*constructor* that creates an instance. The `drop` function in Rust is one +particular destructor. + +Rust doesn’t let us call `drop` explicitly because Rust would still +automatically call `drop` on the value at the end of `main`, and this would be +a *double free* error since Rust would be trying to clean up the same value +twice. 
-printed to the screen, which shows that Rust automatically called `drop` for us -when our instance went out of scope. +Because we can’t disable the automatic insertion of `drop` when a value goes +out of scope, and we can’t call the `drop` method explicitly, if we need to +force a value to be cleaned up early, we can use the `std::mem::drop` function. -We can use the `std::mem::drop` function to drop a value earlier than when it -goes out of scope. This isn’t usually necessary; the whole point of the `Drop` -trait is that it’s taken care of automatically for us. We’ll see an example of -a case when we’ll need to drop a value earlier than when it goes out of scope -in Chapter 16 when we’re talking about concurrency. For now, let’s just see -that it’s possible, and `std::mem::drop` is in the prelude so we can just call -`drop` as shown in Listing 15-9: +The `std::mem::drop` function is different than the `drop` method in the `Drop` +trait. We call it by passing the value we want to force to be dropped early as +an argument. 
`std::mem::drop` is in the prelude, so we can modify `main` from +Listing 15-8 to call the `drop` function as shown in Listing 15-10: Filename: src/main.rs ``` +# struct CustomSmartPointer { +# data: String, +# } +# +# impl Drop for CustomSmartPointer { +# fn drop(&mut self) { +# println!("Dropping CustomSmartPointer!"); +# } +# } +# fn main() { let c = CustomSmartPointer { data: String::from("some data") }; println!("CustomSmartPointer created."); drop(c); - println!("Wait for it..."); + println!("CustomSmartPointer dropped before the end of main."); } ``` -Listing 15-9: Calling `std::mem::drop` to explicitly drop a value before it +Listing 15-10: Calling `std::mem::drop` to explicitly drop a value before it goes out of scope -Running this code will print the following, showing that the destructor code is -called since `Dropping CustomSmartPointer!` is printed between -`CustomSmartPointer created.` and `Wait for it...`: +Running this code will print the following: ``` CustomSmartPointer created. Dropping CustomSmartPointer! -Wait for it... +CustomSmartPointer dropped before the end of main. ``` -Note that we aren’t allowed to call the `drop` method that we defined directly: -if we replaced `drop(c)` in Listing 15-9 with `c.drop()`, we’ll get a compiler -error that says `explicit destructor calls not allowed`. We’re not allowed to -call `Drop::drop` directly because when Rust inserts its call to `Drop::drop` -automatically when the value goes out of scope, then the value would get -dropped twice. Dropping a value twice could cause an error or corrupt memory, -so Rust doesn’t let us. Instead, we can use `std::mem::drop`, whose definition -is: + + -``` -pub mod std { - pub mod mem { - pub fn drop(x: T) { } - } -} -``` +The `Dropping CustomSmartPointer!` is printed between `CustomSmartPointer +created.` and `CustomSmartPointer dropped before the end of main.`, showing +that the `drop` method code is called to drop `c` at that point. 
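As a rough check of the ordering just described, the sketch below records events into a list rather than printing them. The `events` field and the `early_drop_events` helper are hypothetical names introduced for this illustration; the call to `drop(c)` is the prelude's `std::mem::drop`.

```rust
use std::cell::RefCell;

// Hypothetical variant of `CustomSmartPointer` that records when it is dropped.
struct CustomSmartPointer<'a> {
    events: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for CustomSmartPointer<'a> {
    fn drop(&mut self) {
        self.events.borrow_mut().push("dropped");
    }
}

// Returns the sequence of events when a value is dropped early.
fn early_drop_events() -> Vec<&'static str> {
    let events = RefCell::new(Vec::new());
    let c = CustomSmartPointer { events: &events };
    events.borrow_mut().push("created");
    // `drop` takes ownership of `c`, so `c` is cleaned up right here
    // and can't be used afterwards.
    drop(c);
    events.borrow_mut().push("after drop");
    events.into_inner()
}

fn main() {
    assert_eq!(early_drop_events(), vec!["created", "dropped", "after drop"]);
}
```

Because `drop(c)` moves `c`, Rust does not run the destructor a second time at the end of the function, so "dropped" appears exactly once.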
-This function is generic over any type `T`, so we can pass any value to it. The -function doesn’t actually have anything in its body, so it doesn’t use its -parameter. The reason this empty function is useful is that `drop` takes -ownership of its parameter, which means the value in `x` gets dropped at the -end of this function when `x` goes out of scope. + + -Code specified in a `Drop` trait implementation can be used for many reasons to +Code specified in a `Drop` trait implementation can be used in many ways to make cleanup convenient and safe: we could use it to create our own memory -allocator, for instance! By using the `Drop` trait and Rust’s ownership system, -we don’t have to remember to clean up after ourselves since Rust takes care of -it automatically. We’ll get compiler errors if we write code that would clean -up a value that’s still in use, since the ownership system that makes sure +allocator, for instance! With the `Drop` trait and Rust’s ownership system, you +don’t have to remember to clean up after yourself; Rust takes care of it +automatically. + +We also don’t have to worry about accidentally cleaning up values still in use +because that would cause a compiler error: the ownership system that makes sure references are always valid will also make sure that `drop` only gets called -one time when the value is no longer being used. +once when the value is no longer being used. Now that we’ve gone over `Box<T>` and some of the characteristics of smart pointers, let’s talk about a few other smart pointers defined in the standard -library that add different kinds of useful functionality. +library. ## `Rc<T>`, the Reference Counted Smart Pointer -In the majority of cases, ownership is very clear: you know exactly which -variable owns a given value. However, this isn’t always the case; sometimes, -you may actually need multiple owners. For this, Rust has a type called -`Rc<T>`. Its name is an abbreviation for *reference counting*.
Reference -counting means keeping track of the number of references to a value in order to -know if a value is still in use or not. If there are zero references to a -value, we know we can clean up the value without any references becoming -invalid. - -To think about this in terms of a real-world scenario, it’s like a TV in a -family room. When one person comes in the room to watch TV, they turn it on. -Others can also come in the room and watch the TV. When the last person leaves -the room, they’ll turn the TV off since it’s no longer being used. If someone -turns off the TV while others are still watching it, though, the people -watching the TV would get mad! - -`Rc` is for use when we want to allocate some data on the heap for multiple +In the majority of cases, ownership is clear: you know exactly which variable +owns a given value. However, there are cases when a single value may have +multiple owners. For example, in graph data structures, multiple edges may +point to the same node, and that node is conceptually owned by all of the edges +that point to it. A node shouldn’t be cleaned up unless it doesn’t have any +edges pointing to it. + + + + +In order to enable multiple ownership, Rust has a type called `Rc`. Its name +is an abbreviation for reference counting. *Reference counting* means keeping +track of the number of references to a value in order to know if a value is +still in use or not. If there are zero references to a value, the value can be +cleaned up without any references becoming invalid. + +Imagine it like a TV in a family room. When one person enters to watch TV, they +turn it on. Others can come into the room and watch the TV. When the last +person leaves the room, they turn the TV off because it’s no longer being used. +If someone turns the TV off while others are still watching it, there’d be +uproar from the remaining TV watchers! 
+ +`Rc` is used when we want to allocate some data on the heap for multiple parts of our program to read, and we can’t determine at compile time which part -of our program using this data will finish using it last. If we knew which part -would finish last, we could make that part the owner of the data and the normal -ownership rules enforced at compile time would kick in. +will finish using the data last. If we did know which part would finish last, +we could just make that the owner of the data and the normal ownership rules +enforced at compile time would kick in. -Note that `Rc` is only for use in single-threaded scenarios; the next -chapter on concurrency will cover how to do reference counting in -multithreaded programs. If you try to use `Rc` with multiple threads, -you’ll get a compile-time error. +Note that `Rc` is only for use in single-threaded scenarios; Chapter 16 on +concurrency will cover how to do reference counting in multithreaded programs. ### Using `Rc` to Share Data -Let’s return to our cons list example from Listing 15-5. In Listing 15-11, we’re -going to try to use `List` as we defined it using `Box`. First we’ll create -one list instance that contains 5 and then 10. Next, we want to create two more -lists: one that starts with 3 and continues on to our first list containing 5 -and 10, then another list that starts with 4 and *also* continues on to our -first list containing 5 and 10. In other words, we want two lists that both -share ownership of the third list, which conceptually will be something like -Figure 15-10: +Let’s return to our cons list example from Listing 15-6, as we defined it using +`Box`. 
This time, we want to create two lists that both share ownership of a +third list, which conceptually will look something like Figure 15-11: Two lists that share ownership of a third list -Figure 15-10: Two lists, `b` and `c`, sharing ownership of a third list, `a` +Figure 15-11: Two lists, `b` and `c`, sharing ownership of a third list, `a` + +We’ll create list `a` that contains 5 and then 10, then make two more lists: +`b` that starts with 3 and `c` that starts with 4. Both `b` and `c` lists will +then continue on to the first `a` list containing 5 and 10. In other words, +both lists will try to share the first list containing 5 and 10. Trying to implement this using our definition of `List` with `Box` won’t -work, as shown in Listing 15-11: +work, as shown in Listing 15-12: Filename: src/main.rs @@ -631,8 +1196,8 @@ fn main() { } ``` -Listing 15-11: Having two lists using `Box` that try to share ownership of a -third list won’t work +Listing 15-12: Demonstrating we’re not allowed to have two lists using `Box` +that try to share ownership of a third list If we compile this, we get this error: @@ -649,17 +1214,32 @@ error[E0382]: use of moved value: `a` implement the `Copy` trait ``` -The `Cons` variants own the data they hold, so when we create the `b` list it -moves `a` to be owned by `b`. Then when we try to use `a` again when creating -`c`, we’re not allowed to since `a` has been moved. +The `Cons` variants own the data they hold, so when we create the `b` list, `a` +is moved into `b` and `b` owns `a`. Then, when we try to use `a` again when +creating `c`, we’re not allowed to because `a` has been moved. We could change the definition of `Cons` to hold references instead, but then -we’d have to specify lifetime parameters and we’d have to construct elements of -a list such that every element lives at least as long as the list itself. -Otherwise, the borrow checker won’t even let us compile the code. 
- -Instead, we can change our definition of `List` to use `Rc` instead of -`Box` as shown here in Listing 15-12: +we’d have to specify lifetime parameters. By specifying lifetime parameters, +we’d be specifying that every element in the list will live at least as long as +the list itself. The borrow checker wouldn’t let us compile `let a = Cons(10, +&Nil);` for example, since the temporary `Nil` value would be dropped before +`a` could take a reference to it. + +Instead, we’ll change our definition of `List` to use `Rc` in place of +`Box` as shown here in Listing 15-13. Each `Cons` variant now holds a value +and an `Rc` pointing to a `List`. When we create `b`, instead of taking +ownership of `a`, we clone the `Rc` that `a` is holding, which increases the +number of references from 1 to 2 and lets `a` and `b` share ownership of the +data in that `Rc`. We also clone `a` when creating `c`, which increases the +number of references from 2 to 3. Every time we call `Rc::clone`, the reference +count to the data within the `Rc` is increased, and the data won’t be cleaned +up unless there are zero references to it: + + + Filename: src/main.rs @@ -674,94 +1254,149 @@ use std::rc::Rc; fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); - let b = Cons(3, a.clone()); - let c = Cons(4, a.clone()); + let b = Cons(3, Rc::clone(&a)); + let c = Cons(4, Rc::clone(&a)); } ``` -Listing 15-12: A definition of `List` that uses `Rc` +Listing 15-13: A definition of `List` that uses `Rc` -Note that we need to add a `use` statement for `Rc` because it’s not in the -prelude. In `main`, we create the list holding 5 and 10 and store it in a new -`Rc` in `a`. Then when we create `b` and `c`, we call the `clone` method on `a`. +We need to add a `use` statement to bring `Rc` into scope because it’s not in +the prelude. In `main`, we create the list holding 5 and 10 and store it in a +new `Rc` in `a`. 
Then when we create `b` and `c`, we call the `Rc::clone` +function and pass a reference to the `Rc` in `a` as an argument. + +We could have called `a.clone()` rather than `Rc::clone(&a)`, but Rust +convention is to use `Rc::clone` in this case. The implementation of `clone` +doesn’t make a deep copy of all the data like most types’ implementations of +`clone` do. `Rc::clone` only increments the reference count, which doesn’t take +very much time. Deep copies of data can take a lot of time, so by using +`Rc::clone` for reference counting, we can visually distinguish between the +deep copy kinds of clones that might have a large impact on runtime performance +and memory usage and the types of clones that increase the reference count that +have a comparatively small impact on runtime performance and don’t allocate new +memory. ### Cloning an `Rc` Increases the Reference Count -We’ve seen the `clone` method previously, where we used it for making a -complete copy of some data. With `Rc`, though, it doesn’t make a full copy. -`Rc` holds a *reference count*, that is, a count of how many clones exist. -Let’s change `main` as shown in Listing 15-13 to have an inner scope around -where we create `c`, and to print out the results of the `Rc::strong_count` -associated function at various points. `Rc::strong_count` returns the reference -count of the `Rc` value we pass to it, and we’ll talk about why this function -is named `strong_count` in the section later in this chapter about preventing -reference cycles. +Let’s change our working example from Listing 15-13 so that we can see the +reference counts changing as we create and drop references to the `Rc` in `a`. + + + + +In Listing 15-14, we’ll change `main` so that it has an inner scope around list +`c`, so that we can see how the reference count changes when `c` goes out of +scope. 
At each point in the program where the reference count changes, we’ll +print out the reference count, which we can get by calling the +`Rc::strong_count` function. We’ll talk about why this function is named +`strong_count` rather than `count` in the section later in this chapter about +preventing reference cycles. + + + Filename: src/main.rs ``` +# enum List { +# Cons(i32, Rc<List>), +# Nil, +# } +# +# use List::{Cons, Nil}; +# use std::rc::Rc; +# fn main() { let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); - println!("rc = {}", Rc::strong_count(&a)); - let b = Cons(3, a.clone()); - println!("rc after creating b = {}", Rc::strong_count(&a)); + println!("count after creating a = {}", Rc::strong_count(&a)); + let b = Cons(3, Rc::clone(&a)); + println!("count after creating b = {}", Rc::strong_count(&a)); { - let c = Cons(4, a.clone()); - println!("rc after creating c = {}", Rc::strong_count(&a)); + let c = Cons(4, Rc::clone(&a)); + println!("count after creating c = {}", Rc::strong_count(&a)); } - println!("rc after c goes out of scope = {}", Rc::strong_count(&a)); + println!("count after c goes out of scope = {}", Rc::strong_count(&a)); } ``` -Listing 15-13: Printing out the reference count +Listing 15-14: Printing out the reference count This will print out: ``` -rc = 1 -rc after creating b = 2 -rc after creating c = 3 -rc after c goes out of scope = 2 +count after creating a = 1 +count after creating b = 2 +count after creating c = 3 +count after c goes out of scope = 2 ``` -We’re able to see that `a` has an initial reference count of one. Then each -time we call `clone`, the count goes up by one. When `c` goes out of scope, the -count is decreased by one, which happens in the implementation of the `Drop` -trait for `Rc`. What we can’t see in this example is that when `b` and then -`a` go out of scope at the end of `main`, the count of references to the list -containing 5 and 10 is then 0, and the list is dropped.
This strategy lets us -have multiple owners, as the count will ensure that the value remains valid as -long as any of the owners still exist. + + + +We’re able to see that the `Rc` in `a` has an initial reference count of one; +then each time we call `clone`, the count goes up by one. When `c` goes out of +scope, the count goes down by one. We don’t have to call a function to decrease +the reference count like we have to call `Rc::clone` to increase the reference +count; the implementation of the `Drop` trait decreases the reference count +automatically when an `Rc` value goes out of scope. -In the beginning of this section, we said that `Rc` only allows you to share -data for multiple parts of your program to read through immutable references to -the `T` value the `Rc` contains. If `Rc` let us have a mutable reference, -we’d run into the problem that the borrowing rules disallow that we discussed -in Chapter 4: two mutable borrows to the same place can cause data races and -inconsistencies. But mutating data is very useful! In the next section, we’ll -discuss the interior mutability pattern and the `RefCell` type that we can -use in conjunction with an `Rc` to work with this restriction on -immutability. +What we can’t see from this example is that when `b` and then `a` go out of +scope at the end of `main`, the count is then 0, and the `Rc` is cleaned up +completely at that point. Using `Rc` allows a single value to have multiple +owners, and the count will ensure that the value remains valid as long as any +of the owners still exist. + +`Rc` allows us to share data between multiple parts of our program for +reading only, via immutable references. If `Rc` allowed us to have multiple +mutable references too, we’d be able to violate one of the borrowing rules +that we discussed in Chapter 4: multiple mutable borrows to the same place can +cause data races and inconsistencies. But being able to mutate data is very +useful!
In the next section, we’ll discuss the interior mutability pattern and +the `RefCell` type that we can use in conjunction with an `Rc` to work +with this restriction on immutability. ## `RefCell` and the Interior Mutability Pattern + + + + + + *Interior mutability* is a design pattern in Rust for allowing you to mutate -data even though there are immutable references to that data, which would -normally be disallowed by the borrowing rules. The interior mutability pattern -involves using `unsafe` code inside a data structure to bend Rust’s usual rules -around mutation and borrowing. We haven’t yet covered unsafe code; we will in -Chapter 19. The interior mutability pattern is used when you can ensure that -the borrowing rules will be followed at runtime, even though the compiler can’t +data even when there are immutable references to that data, normally disallowed +by the borrowing rules. To do so, the pattern uses `unsafe` code inside a data +structure to bend Rust’s usual rules around mutation and borrowing. We haven’t +yet covered unsafe code; we will in Chapter 19. We can choose to use types that +make use of the interior mutability pattern when we can ensure that the +borrowing rules will be followed at runtime, even though the compiler can’t ensure that. The `unsafe` code involved is then wrapped in a safe API, and the outer type is still immutable. Let’s explore this by looking at the `RefCell` type that follows the interior mutability pattern. -### `RefCell` has Interior Mutability +### Enforcing Borrowing Rules at Runtime with `RefCell` Unlike `Rc`, the `RefCell` type represents single ownership over the data -that it holds. So, what makes `RefCell` different than a type like `Box`? +it holds. So, what makes `RefCell` different than a type like `Box`? Let’s recall the borrowing rules we learned in Chapter 4: 1. At any given time, you can have *either* but not both of: @@ -774,162 +1409,405 @@ compile time. 
With `RefCell`, these invariants are enforced *at runtime*. With references, if you break these rules, you’ll get a compiler error. With `RefCell`, if you break these rules, you’ll get a `panic!`. -Static analysis, like the Rust compiler performs, is inherently conservative. -There are properties of code that are impossible to detect by analyzing the -code: the most famous is the Halting Problem, which is out of scope of this -book but an interesting topic to research if you’re interested. - -Because some analysis is impossible, the Rust compiler does not try to even -guess if it can’t be sure, so it’s conservative and sometimes rejects correct -programs that would not actually violate Rust’s guarantees. Put another way, if -Rust accepts an incorrect program, people would not be able to trust in the -guarantees Rust makes. If Rust rejects a correct program, the programmer will -be inconvenienced, but nothing catastrophic can occur. `RefCell` is useful -when you know that the borrowing rules are respected, but the compiler can’t -understand that that’s true. - -Similarly to `Rc`, `RefCell` is only for use in single-threaded -scenarios. We’ll talk about how to get the functionality of `RefCell` in a -multithreaded program in the next chapter on concurrency. For now, all you -need to know is that if you try to use `RefCell` in a multithreaded -context, you’ll get a compile time error. - -With references, we use the `&` and `&mut` syntax to create references and -mutable references, respectively. But with `RefCell`, we use the `borrow` -and `borrow_mut` methods, which are part of the safe API that `RefCell` has. -`borrow` returns the smart pointer type `Ref`, and `borrow_mut` returns the -smart pointer type `RefMut`. These two types implement `Deref` so that we can -treat them as if they’re regular references. `Ref` and `RefMut` track the -borrows dynamically, and their implementation of `Drop` releases the borrow -dynamically. 
- -Listing 15-14 shows what it looks like to use `RefCell` with functions that -borrow their parameters immutably and mutably. Note that the `data` variable is -declared as immutable with `let data` rather than `let mut data`, yet -`a_fn_that_mutably_borrows` is allowed to borrow the data mutably and make -changes to the data! - -Filename: src/main.rs + + + +The advantages to checking the borrowing rules at compile time are that errors +will be caught sooner in the development process and there is no impact on +runtime performance since all the analysis is completed beforehand. For those +reasons, checking the borrowing rules at compile time is the best choice for +the majority of cases, which is why this is Rust’s default. + +The advantage to checking the borrowing rules at runtime instead is that +certain memory safe scenarios are then allowed, whereas they are disallowed by +the compile time checks. Static analysis, like the Rust compiler, is inherently +conservative. Some properties of code are impossible to detect by analyzing the +code: the most famous example is the Halting Problem, which is out of scope of +this book but an interesting topic to research if you’re interested. + + + + +Because some analysis is impossible, if the Rust compiler can’t be sure the +code complies with the ownership rules, it may reject a correct program; in +this way, it is conservative. If Rust were to accept an incorrect program, +users would not be able to trust in the guarantees Rust makes. However, if Rust +rejects a correct program, the programmer will be inconvenienced, but nothing +catastrophic can occur. `RefCell` is useful when you yourself are sure that +your code follows the borrowing rules, but the compiler is not able to +understand and guarantee that. + +Similarly to `Rc`, `RefCell` is only for use in single-threaded scenarios +and will give you a compile time error if you try in a multithreaded context. 
+We’ll talk about how to get the functionality of `RefCell` in a +multithreaded program in Chapter 16. + + + + +To recap the reasons to choose `Box`, `Rc`, or `RefCell`: + +- `Rc` enables multiple owners of the same data; `Box` and `RefCell` + have single owners. +- `Box` allows immutable or mutable borrows checked at compile time; `Rc` + only allows immutable borrows checked at compile time; `RefCell` allows + immutable or mutable borrows checked at runtime. +- Because `RefCell` allows mutable borrows checked at runtime, we can mutate + the value inside the `RefCell` even when the `RefCell` is itself + immutable. + +The last reason is the *interior mutability* pattern. Let’s look at a case when +interior mutability is useful and discuss how this is possible. + +### Interior Mutability: A Mutable Borrow to an Immutable Value + +A consequence of the borrowing rules is that when we have an immutable value, +we can’t borrow it mutably. For example, this code won’t compile: ``` -use std::cell::RefCell; - -fn a_fn_that_immutably_borrows(a: &i32) { - println!("a is {}", a); +fn main() { + let x = 5; + let y = &mut x; } +``` + +If we try to compile this, we’ll get this error: -fn a_fn_that_mutably_borrows(b: &mut i32) { - *b += 1; +``` +error[E0596]: cannot borrow immutable local variable `x` as mutable + --> src/main.rs:3:18 + | +2 | let x = 5; + | - consider changing this to `mut x` +3 | let y = &mut x; + | ^ cannot borrow mutably +``` + +However, there are situations where it would be useful for a value to be able +to mutate itself in its methods, but to other code, the value would appear to +be immutable. Code outside the value’s methods would not be able to mutate the +value. `RefCell` is one way to get the ability to have interior mutability. +`RefCell` isn’t getting around the borrowing rules completely, but the +borrow checker in the compiler allows this interior mutability and the +borrowing rules are checked at runtime instead. 
If we violate the rules, we’ll +get a `panic!` instead of a compiler error. + +Let’s work through a practical example where we can use `RefCell` to make it +possible to mutate an immutable value and see why that’s useful. + +#### A Use Case for Interior Mutability: Mock Objects + +A *test double* is the general programming concept for a type that stands in +the place of another type during testing. *Mock objects* are specific types of +test doubles that record what happens during a test so that we can assert that +the correct actions took place. + +While Rust doesn’t have objects in the exact same sense that other languages +have objects, and Rust doesn’t have mock object functionality built into the +standard library like some other languages do, we can definitely create a +struct that will serve the same purposes as a mock object. + +Here’s the scenario we’d like to test: we’re creating a library that tracks a +value against a maximum value, and sends messages based on how close to the +maximum value the current value is. This could be used for keeping track of a +user’s quota for the number of API calls they’re allowed to make, for example. + +Our library is only going to provide the functionality of tracking how close to +the maximum a value is, and what the messages should be at what times. +Applications that use our library will be expected to provide the actual +mechanism for sending the messages: the application could choose to put a +message in the application, send an email, send a text message, or something +else. Our library doesn’t need to know about that detail; all it needs is +something that implements a trait we’ll provide called `Messenger`. 
Listing +15-15 shows our library code: + +Filename: src/lib.rs + +``` +pub trait Messenger { + fn send(&self, msg: &str); } -fn demo(r: &RefCell<i32>) { - a_fn_that_immutably_borrows(&r.borrow()); - a_fn_that_mutably_borrows(&mut r.borrow_mut()); - a_fn_that_immutably_borrows(&r.borrow()); +pub struct LimitTracker<'a, T: 'a + Messenger> { + messenger: &'a T, + value: usize, + max: usize, } -fn main() { - let data = RefCell::new(5); - demo(&data); +impl<'a, T> LimitTracker<'a, T> + where T: Messenger { + pub fn new(messenger: &T, max: usize) -> LimitTracker<T> { + LimitTracker { + messenger, + value: 0, + max, + } + } + + pub fn set_value(&mut self, value: usize) { + self.value = value; + + let percentage_of_max = self.value as f64 / self.max as f64; + + if percentage_of_max >= 0.75 && percentage_of_max < 0.9 { + self.messenger.send("Warning: You've used up over 75% of your quota!"); + } else if percentage_of_max >= 0.9 && percentage_of_max < 1.0 { + self.messenger.send("Urgent warning: You've used up over 90% of your quota!"); + } else if percentage_of_max >= 1.0 { + self.messenger.send("Error: You are over your quota!"); + } + } } ``` -Listing 15-14: Using `RefCell`, `borrow`, and `borrow_mut` +Listing 15-15: A library to keep track of how close to a maximum value a value +is, and warn when the value is at certain levels -This example prints: +One important part of this code is that the `Messenger` trait has one method, +`send`, that takes an immutable reference to `self` and the text of the message. +This is the interface our mock object will need to have. The other important +part is that we want to test the behavior of the `set_value` method on the +`LimitTracker`. We can change what we pass in for the `value` parameter, but +`set_value` doesn’t return anything for us to make assertions on.
What we want +to be able to say is that if we create a `LimitTracker` with something that +implements the `Messenger` trait and a particular value for `max`, when we pass +different numbers for `value`, the messenger gets told to send the appropriate +messages. + +What we need is a mock object that, instead of actually sending an email or +text message when we call `send`, will only keep track of the messages it’s +told to send. We can create a new instance of the mock object, create a +`LimitTracker` that uses the mock object, call the `set_value` method on +`LimitTracker`, then check that the mock object has the messages we expect. +Listing 15-16 shows an attempt at implementing a mock object to do just that, +but one that the borrow checker won’t allow: + +Filename: src/lib.rs ``` -a is 5 -a is 6 -``` +#[cfg(test)] +mod tests { + use super::*; -In `main`, we’ve created a new `RefCell` containing the value 5, and stored -in the variable `data`, declared without the `mut` keyword. We then call the -`demo` function with an immutable reference to `data`: as far as `main` is -concerned, `data` is immutable! + struct MockMessenger { + sent_messages: Vec<String>, + } -In the `demo` function, we get an immutable reference to the value inside the -`RefCell` by calling the `borrow` method, and we call -`a_fn_that_immutably_borrows` with that immutable reference. More -interestingly, we can get a *mutable* reference to the value inside the -`RefCell` with the `borrow_mut` method, and the function -`a_fn_that_mutably_borrows` is allowed to change the value. We can see that the -next time we call `a_fn_that_immutably_borrows` that prints out the value, it’s -6 instead of 5.
+ impl MockMessenger { + fn new() -> MockMessenger { + MockMessenger { sent_messages: vec![] } + } + } -### Borrowing Rules are Checked at Runtime on `RefCell` + impl Messenger for MockMessenger { + fn send(&self, message: &str) { + self.sent_messages.push(String::from(message)); + } + } -Recall from Chapter 4 that because of the borrowing rules, this code using -regular references that tries to create two mutable borrows in the same scope -won’t compile: + #[test] + fn it_sends_an_over_75_percent_warning_message() { + let mock_messenger = MockMessenger::new(); + let mut limit_tracker = LimitTracker::new(&mock_messenger, 100); -``` -let mut s = String::from("hello"); + limit_tracker.set_value(80); -let r1 = &mut s; -let r2 = &mut s; + assert_eq!(mock_messenger.sent_messages.len(), 1); + } +} ``` -We’ll get this compiler error: +Listing 15-16: An attempt to implement a `MockMessenger` that isn’t allowed by +the borrow checker + +This test code defines a `MockMessenger` struct that has a `sent_messages` +field with a `Vec` of `String` values to keep track of the messages it’s told +to send. We also defined an associated function `new` to make it convenient to +create new `MockMessenger` values that start with an empty list of messages. We +then implement the `Messenger` trait for `MockMessenger` so that we can give a +`MockMessenger` to a `LimitTracker`. In the definition of the `send` method, we +take the message passed in as a parameter and store it in the `MockMessenger` +list of `sent_messages`. + +In the test, we’re testing what happens when the `LimitTracker` is told to set +`value` to something that’s over 75% of the `max` value. First, we create a new +`MockMessenger`, which will start with an empty list of messages. Then we +create a new `LimitTracker` and give it a reference to the new `MockMessenger` +and a `max` value of 100. We call the `set_value` method on the `LimitTracker` +with a value of 80, which is more than 75% of 100. 
Then we assert that the list +of messages that the `MockMessenger` is keeping track of should now have one +message in it. + +There’s one problem with this test, however: ``` -error[E0499]: cannot borrow `s` as mutable more than once at a time - --> - | -5 | let r1 = &mut s; - | - first mutable borrow occurs here -6 | let r2 = &mut s; - | ^ second mutable borrow occurs here -7 | } - | - first borrow ends here +error[E0596]: cannot borrow immutable field `self.sent_messages` as mutable + --> src/lib.rs:46:13 + | +45 | fn send(&self, message: &str) { + | ----- use `&mut self` here to make mutable +46 | self.sent_messages.push(String::from(message)); + | ^^^^^^^^^^^^^^^^^^ cannot mutably borrow immutable field ``` -In contrast, using `RefCell` and calling `borrow_mut` twice in the same -scope *will* compile, but it’ll panic at runtime instead. This code: +We can’t modify the `MockMessenger` to keep track of the messages because the +`send` method takes an immutable reference to `self`. We also can’t take the +suggestion from the error text to use `&mut self` instead because then the +signature of `send` wouldn’t match the signature in the `Messenger` trait +definition (feel free to try and see what error message you get). + +This is where interior mutability can help! We’re going to store the +`sent_messages` within a `RefCell`, and then the `send` method will be able to +modify `sent_messages` to store the messages we’ve seen.
Listing 15-17 shows +what that looks like: + +Filename: src/lib.rs ``` -use std::cell::RefCell; +#[cfg(test)] +mod tests { + use super::*; + use std::cell::RefCell; -fn main() { - let s = RefCell::new(String::from("hello")); + struct MockMessenger { + sent_messages: RefCell<Vec<String>>, + } + + impl MockMessenger { + fn new() -> MockMessenger { + MockMessenger { sent_messages: RefCell::new(vec![]) } + } + } + + impl Messenger for MockMessenger { + fn send(&self, message: &str) { + self.sent_messages.borrow_mut().push(String::from(message)); + } + } + + #[test] + fn it_sends_an_over_75_percent_warning_message() { + // ...snip... +# let mock_messenger = MockMessenger::new(); +# let mut limit_tracker = LimitTracker::new(&mock_messenger, 100); +# limit_tracker.set_value(75); + + assert_eq!(mock_messenger.sent_messages.borrow().len(), 1); + } +} +``` + +Listing 15-17: Using `RefCell` to be able to mutate an inner value while the +outer value is considered immutable + +The `sent_messages` field is now of type `RefCell<Vec<String>>` instead of +`Vec<String>`. In the `new` function, we create a new `RefCell` instance around +the empty vector. + +For the implementation of the `send` method, the first parameter is still an +immutable borrow of `self`, which matches the trait definition. We call +`borrow_mut` on the `RefCell` in `self.sent_messages` to get a mutable +reference to the value inside the `RefCell`, which is the vector. Then we can +call `push` on the mutable reference to the vector in order to keep track of +the messages seen during the test. + +The last change we have to make is in the assertion: in order to see how many +items are in the inner vector, we call `borrow` on the `RefCell` to get an +immutable reference to the vector. + +Now that we’ve seen how to use `RefCell`, let’s dig into how it works!
- let r1 = s.borrow_mut(); - let r2 = s.borrow_mut(); +#### `RefCell` Keeps Track of Borrows at Runtime + +When creating immutable and mutable references we use the `&` and `&mut` +syntax, respectively. With `RefCell`, we use the `borrow` and `borrow_mut` +methods, which are part of the safe API that belongs to `RefCell`. The +`borrow` method returns the smart pointer type `Ref`, and `borrow_mut` returns +the smart pointer type `RefMut`. Both types implement `Deref` so we can treat +them like regular references. + + + + +The `RefCell` keeps track of how many `Ref` and `RefMut` smart pointers are +currently active. Every time we call `borrow`, the `RefCell` increases its +count of how many immutable borrows are active. When a `Ref` value goes out of +scope, the count of immutable borrows goes down by one. Just like the compile +time borrowing rules, `RefCell` lets us have many immutable borrows or one +mutable borrow at any point in time. + +If we try to violate these rules, rather than getting a compiler error like we +would with references, the implementation of `RefCell` will `panic!` at +runtime. Listing 15-18 shows a modification to the implementation of `send` +from Listing 15-17 where we’re deliberately trying to create two mutable +borrows active for the same scope in order to illustrate that `RefCell` +prevents us from doing this at runtime: + +Filename: src/lib.rs + +``` +impl Messenger for MockMessenger { + fn send(&self, message: &str) { + let mut one_borrow = self.sent_messages.borrow_mut(); + let mut two_borrow = self.sent_messages.borrow_mut(); + + one_borrow.push(String::from(message)); + two_borrow.push(String::from(message)); + } } ``` -compiles but panics with the following error when we `cargo run`: +Listing 15-18: Creating two mutable references in the same scope to see that +`RefCell` will panic + +We create a variable `one_borrow` for the `RefMut` smart pointer returned from +`borrow_mut`. 
Then we create another mutable borrow in the same way in the +variable `two_borrow`. This makes two mutable references in the same scope, +which isn’t allowed. If we run the tests for our library, this code will +compile without any errors, but the test will fail: ``` - Finished dev [unoptimized + debuginfo] target(s) in 0.83 secs - Running `target/debug/refcell` -thread 'main' panicked at 'already borrowed: BorrowMutError', -/stable-dist-rustc/build/src/libcore/result.rs:868 +---- tests::it_sends_an_over_75_percent_warning_message stdout ---- + thread 'tests::it_sends_an_over_75_percent_warning_message' panicked at + 'already borrowed: BorrowMutError', src/libcore/result.rs:906:4 note: Run with `RUST_BACKTRACE=1` for a backtrace. ``` -This runtime `BorrowMutError` is similar to the compiler error: it says we’ve -already borrowed `s` mutably once, so we’re not allowed to borrow it again. We -aren’t getting around the borrowing rules, we’re just choosing to have Rust -enforce them at runtime instead of compile time. You could choose to use -`RefCell` everywhere all the time, but in addition to having to type -`RefCell` a lot, you’d find out about possible problems later (possibly in -production rather than during development). Also, checking the borrowing rules -while your program is running has a performance penalty. - -### Multiple Owners of Mutable Data by Combining `Rc` and `RefCell` - -So why would we choose to make the tradeoffs that using `RefCell` involves? -Well, remember when we said that `Rc` only lets you have an immutable -reference to `T`? Given that `RefCell` is immutable, but has interior -mutability, we can combine `Rc` and `RefCell` to get a type that’s both -reference counted and mutable. Listing 15-15 shows an example of how to do -that, again going back to our cons list from Listing 15-5. In this example, -instead of storing `i32` values in the cons list, we’ll be storing -`Rc>` values. 
We want to store that type so that we can have an -owner of the value that’s not part of the list (the multiple owners -functionality that `Rc` provides), and so we can mutate the inner `i32` -value (the interior mutability functionality that `RefCell` provides): +We can see that the code panicked with the message `already borrowed: +BorrowMutError`. This is how `RefCell` handles violations of the borrowing +rules at runtime. + +Catching borrowing errors at runtime rather than compile time means that we’d +find out that we made a mistake in our code later in the development process-- +and possibly not even until our code was deployed to production. There’s also a +small runtime performance penalty our code will incur as a result of keeping +track of the borrows at runtime rather than compile time. However, using +`RefCell` made it possible for us to write a mock object that can modify itself +to keep track of the messages it has seen while we’re using it in a context +where only immutable values are allowed. We can choose to use `RefCell` +despite its tradeoffs to get more abilities than regular references give us. + +### Having Multiple Owners of Mutable Data by Combining `Rc` and `RefCell` + +A common way to use `RefCell` is in combination with `Rc`. Recall that +`Rc` lets us have multiple owners of some data, but it only gives us +immutable access to that data. If we have an `Rc` that holds a `RefCell`, +then we can get a value that can have multiple owners *and* that we can mutate! + + + + +For example, recall the cons list example from Listing 15-13 where we used +`Rc` to let us have multiple lists share ownership of another list. Because +`Rc` only holds immutable values, we aren’t able to change any of the values +in the list once we’ve created them. Let’s add in `RefCell` to get the +ability to change the values in the lists. 
Listing 15-19 shows that by using a +`RefCell` in the `Cons` definition, we’re allowed to modify the value stored +in all the lists: Filename: src/main.rs @@ -947,83 +1825,89 @@ use std::cell::RefCell; fn main() { let value = Rc::new(RefCell::new(5)); - let a = Cons(value.clone(), Rc::new(Nil)); - let shared_list = Rc::new(a); + let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); - let b = Cons(Rc::new(RefCell::new(6)), shared_list.clone()); - let c = Cons(Rc::new(RefCell::new(10)), shared_list.clone()); + let b = Cons(Rc::new(RefCell::new(6)), Rc::clone(&a)); + let c = Cons(Rc::new(RefCell::new(10)), Rc::clone(&a)); *value.borrow_mut() += 10; - println!("shared_list after = {:?}", shared_list); + println!("a after = {:?}", a); println!("b after = {:?}", b); println!("c after = {:?}", c); } ``` -Listing 15-15: Using `Rc<RefCell<i32>>` to create a `List` that we can mutate +Listing 15-19: Using `Rc<RefCell<i32>>` to create a `List` that we can mutate + +We create a value that’s an instance of `Rc<RefCell<i32>>` and store it in a +variable named `value` so we can access it directly later. Then we create a +`List` in `a` with a `Cons` variant that holds `value`. We need to clone +`value` so that both `a` and `value` have ownership of the inner `5` value, +rather than transferring ownership from `value` to `a` or having `a` borrow +from `value`. -We’re creating a value, which is an instance of `Rc<RefCell<i32>>`. We’re -storing it in a variable named `value` because we want to be able to access it -directly later. Then we create a `List` in `a` that has a `Cons` variant that -holds `value`, and `value` needs to be cloned since we want `value` to also -have ownership in addition to `a`. Then we wrap `a` in an `Rc` so that we -can create lists `b` and `c` that start differently but both refer to `a`, -similarly to what we did in Listing 15-12. + + -Once we have the lists in `shared_list`, `b`, and `c` created, then we add 10 -to the 5 in `value` by dereferencing the `Rc` and calling `borrow_mut` on -the `RefCell`.
+We wrap the list `a` in an `Rc` so that when we create lists `b` and +`c`, they can both refer to `a`, the same as we did in Listing 15-13. -When we print out `shared_list`, `b`, and `c`, we can see that they all have -the modified value of 15: +Once we have the lists in `a`, `b`, and `c` created, we add 10 to the value in +`value`. We do this by calling `borrow_mut` on `value`, which uses the +automatic dereferencing feature we discussed in Chapter 5 (“Where’s the `->` +Operator?”) to dereference the `Rc` to the inner `RefCell` value. The +`borrow_mut` method returns a `RefMut` smart pointer, and we use the +dereference operator on it and change the inner value. + +When we print out `a`, `b`, and `c`, we can see that they all have the modified +value of 15 rather than 5: ``` -shared_list after = Cons(RefCell { value: 15 }, Nil) +a after = Cons(RefCell { value: 15 }, Nil) b after = Cons(RefCell { value: 6 }, Cons(RefCell { value: 15 }, Nil)) c after = Cons(RefCell { value: 10 }, Cons(RefCell { value: 15 }, Nil)) ``` -This is pretty neat! By using `RefCell`, we can have an outwardly immutable +This is pretty neat! By using `RefCell`, we have an outwardly immutable `List`, but we can use the methods on `RefCell` that provide access to its -interior mutability to be able to modify our data when we need to. The runtime -checks of the borrowing rules that `RefCell` does protect us from data -races, and we’ve decided that we want to trade a bit of speed for the -flexibility in our data structures. - -`RefCell` is not the only standard library type that provides interior -mutability. `Cell` is similar but instead of giving references to the inner -value like `RefCell` does, the value is copied in and out of the `Cell`. -`Mutex` offers interior mutability that is safe to use across threads, and -we’ll be discussing its use in the next chapter on concurrency. Check out the -standard library docs for more details on the differences between these types. 
- -## Creating Reference Cycles and Leaking Memory is Safe - -Rust makes a number of guarantees that we’ve talked about, for example that -we’ll never have a null value, and data races will be disallowed at compile -time. Rust’s memory safety guarantees make it more difficult to create memory -that never gets cleaned up, which is known as a *memory leak*. Rust does not -make memory leaks *impossible*, however, preventing memory leaks is *not* one -of Rust’s guarantees. In other words, memory leaks are memory safe. - -By using `Rc` and `RefCell`, it is possible to create cycles of -references where items refer to each other in a cycle. This is bad because the -reference count of each item in the cycle will never reach 0, and the values -will never be dropped. Let’s take a look at how that might happen and how to -prevent it. - -In Listing 15-16, we’re going to use another variation of the `List` definition -from Listing 15-5. We’re going back to storing an `i32` value as the first -element in the `Cons` variant. The second element in the `Cons` variant is now -`RefCell>`: instead of being able to modify the `i32` value this time, -we want to be able to modify which `List` a `Cons` variant is pointing to. -We’ve also added a `tail` method to make it convenient for us to access the -second item, if we have a `Cons` variant: +interior mutability so we can modify our data when we need to. The runtime +checks of the borrowing rules protect us from data races, and it’s sometimes +worth trading a bit of speed for this flexibility in our data structures. + +The standard library has other types that provide interior mutability, too, +like `Cell`, which is similar except that instead of giving references to +the inner value, the value is copied in and out of the `Cell`. There’s also +`Mutex`, which offers interior mutability that’s safe to use across threads, +and we’ll be discussing its use in the next chapter on concurrency. 
Check out +the standard library docs for more details on the differences between these +types. + +## Reference Cycles Can Leak Memory + +Rust’s memory safety guarantees make it *difficult* to accidentally create +memory that’s never cleaned up, known as a *memory leak*, but not impossible. +Entirely preventing memory leaks is not one of Rust’s guarantees in the same +way that disallowing data races at compile time is, meaning memory leaks are +memory safe in Rust. We can see this with `Rc` and `RefCell`: it’s +possible to create references where items refer to each other in a cycle. This +creates memory leaks because the reference count of each item in the cycle will +never reach 0, and the values will never be dropped. + +### Creating a Reference Cycle + +Let’s take a look at how a reference cycle might happen and how to prevent it, +starting with the definition of the `List` enum and a `tail` method in Listing +15-20: Filename: src/main.rs ``` +use std::rc::Rc; +use std::cell::RefCell; +use List::{Cons, Nil}; + #[derive(Debug)] enum List { Cons(i32, RefCell>), @@ -1040,37 +1924,71 @@ impl List { } ``` -Listing 15-16: A cons list definition that holds a `RefCell` so that we can +Listing 15-20: A cons list definition that holds a `RefCell` so that we can modify what a `Cons` variant is referring to -Next, in Listing 15-17, we’re going to create a `List` value in the variable -`a` that initially is a list of `5, Nil`. Then we’ll create a `List` value in -the variable `b` that is a list of the value 10 and then points to the list in -`a`. Finally, we’ll modify `a` so that it points to `b` instead of `Nil`, which -will then create a cycle: +We’re using another variation of the `List` definition from Listing 15-6. The +second element in the `Cons` variant is now `RefCell>`, meaning that +instead of having the ability to modify the `i32` value like we did in Listing +15-19, we want to be able to modify which `List` a `Cons` variant is pointing +to. 
We’ve also added a `tail` method to make it convenient for us to access the +second item, if we have a `Cons` variant. + + + + +In Listing 15-21, we’re adding a `main` function that uses the definitions from +Listing 15-20. This code creates a list in `a`, a list in `b` that points to +the list in `a`, and then modifies the list in `a` to point to `b`, which +creates a reference cycle. There are `println!` statements along the way to +show what the reference counts are at various points in this process. + + + Filename: src/main.rs ``` -use List::{Cons, Nil}; -use std::rc::Rc; -use std::cell::RefCell; - +# use List::{Cons, Nil}; +# use std::rc::Rc; +# use std::cell::RefCell; +# #[derive(Debug)] +# enum List { +# Cons(i32, RefCell<Rc<List>>), +# Nil, +# } +# +# impl List { +# fn tail(&self) -> Option<&RefCell<Rc<List>>> { +# match *self { +# Cons(_, ref item) => Some(item), +# Nil => None, +# } +# } +# } +# fn main() { - let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil)))); println!("a initial rc count = {}", Rc::strong_count(&a)); println!("a next item = {:?}", a.tail()); - let b = Rc::new(Cons(10, RefCell::new(a.clone()))); + let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a)))); println!("a rc count after b creation = {}", Rc::strong_count(&a)); println!("b initial rc count = {}", Rc::strong_count(&b)); println!("b next item = {:?}", b.tail()); if let Some(ref link) = a.tail() { - *link.borrow_mut() = b.clone(); + *link.borrow_mut() = Rc::clone(&b); } println!("b rc count after changing a = {}", Rc::strong_count(&b)); @@ -1082,72 +2000,153 @@ fn main() { } ``` -Listing 15-17: Creating a reference cycle of two `List` values pointing to -each other +Listing 15-21: Creating a reference cycle of two `List` values pointing to each +other + +We create an `Rc` instance holding a `List` value in the variable `a` with an +initial list of `5, Nil`. We then create an `Rc` instance holding another +`List` value in the variable `b` that contains the value 10, then points to the +list in `a`.
+ +Finally, we modify `a` so that it points to `b` instead of `Nil`, which creates +a cycle. We do that by using the `tail` method to get a reference to the +`RefCell` in `a`, which we put in the variable `link`. Then we use the +`borrow_mut` method on the `RefCell` to change the value inside from an `Rc` +that holds a `Nil` value to the `Rc` in `b`. + +If we run this code, keeping the last `println!` commented out for the moment, +we’ll get this output: + +``` +a initial rc count = 1 +a next item = Some(RefCell { value: Nil }) +a rc count after b creation = 2 +b initial rc count = 1 +b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) }) +b rc count after changing a = 2 +a rc count after changing a = 2 +``` + +We can see that the reference counts of the `Rc` instances in both `a` and `b` +are 2 after we change the list in `a` to point to `b`. At the end of `main`, +Rust will try to drop `b` first, which will decrease the count of the `Rc` +instance in `b` by one. + + + + + + -We use the `tail` method to get a reference to the `RefCell` in `a`, which we -put in the variable `link`. Then we use the `borrow_mut` method on the -`RefCell` to change the value inside from an `Rc` that holds a `Nil` value to -the `Rc` in `b`. We’ve created a reference cycle that looks like Figure 15-18: +However, because `a` is still referencing the `Rc` that was in `b`, that `Rc` +has a count of 1 rather than 0, so the memory the `Rc` has on the heap won’t be +dropped. The memory will just sit there with a count of one, forever. + +To visualize this, we’ve created a reference cycle that looks like Figure 15-22: Reference cycle of lists
- -Looking at the results of the `println!` calls before the last one, we’ll see -that the reference count of both `a` and `b` are 2 after we change `a` to point -to `b`. At the end of `main`, Rust will try and drop `b` first, which will -decrease the count of the `Rc` by one. However, because `a` is still -referencing that `Rc`, its count is 1 rather than 0, so the memory the `Rc` has -on the heap won’t be dropped. It’ll just sit there with a count of one, -forever. In this specific case, the program ends right away, so it’s not a -problem, but in a more complex program that allocates lots of memory in a cycle -and holds onto it for a long time, this would be a problem. The program would -be using more memory than it needs to be, and might overwhelm the system and -cause it to run out of memory available to use. - -Now, as you can see, creating reference cycles is difficult and inconvenient in -Rust. But it’s not impossible: preventing memory leaks in the form of reference -cycles is not one of the guarantees Rust makes. If you have `RefCell` values -that contain `Rc` values or similar nested combinations of types with -interior mutability and reference counting, be aware that you’ll have to ensure -that you don’t create cycles. In the example in Listing 15-14, the solution -would probably be to not write code that could create cycles like this, since -we do want `Cons` variants to own the list they point to. - -With data structures like graphs, it’s sometimes necessary to have references -that create cycles in order to have parent nodes point to their children and -children nodes point back in the opposite direction to their parents, for -example. If one of the directions is expressing ownership and the other isn’t, -one way of being able to model the relationship of the data without creating -reference cycles and memory leaks is using `Weak`. Let’s explore that next! 
- -### Prevent Reference Cycles: Turn an `Rc` into a `Weak` - -The Rust standard library provides `Weak`, a smart pointer type for use in -situations that have cycles of references but only one direction expresses -ownership. We’ve been showing how cloning an `Rc` increases the -`strong_count` of references; `Weak` is a way to reference an `Rc` that -does not increment the `strong_count`: instead it increments the `weak_count` -of references to an `Rc`. When an `Rc` goes out of scope, the inner value will -get dropped if the `strong_count` is 0, even if the `weak_count` is not 0. To -be able to get the value from a `Weak`, we first have to upgrade it to an -`Option>` by using the `upgrade` method. The result of upgrading a -`Weak` will be `Some` if the `Rc` value has not been dropped yet, and `None` -if the `Rc` value has been dropped. Because `upgrade` returns an `Option`, we -know Rust will make sure we handle both the `Some` case and the `None` case and -we won’t be trying to use an invalid pointer. - -Instead of the list in Listing 15-17 where each item knows only about the -next item, let’s say we want a tree where the items know about their children -items *and* their parent items. - -Let’s start just with a struct named `Node` that holds its own `i32` value as -well as references to its children `Node` values: +Figure 15-22: A reference cycle of lists `a` and `b` pointing to each other + +If you uncomment the last `println!` and run the program, Rust will try and +print this cycle out with `a` pointing to `b` pointing to `a` and so forth +until it overflows the stack. + + + + +In this specific case, right after we create the reference cycle, the program +ends. The consequences of this cycle aren’t so dire. If a more complex program +allocates lots of memory in a cycle and holds onto it for a long time, the +program would be using more memory than it needs, and might overwhelm the +system and cause it to run out of available memory. 
+ +Creating reference cycles is not easily done, but it’s not impossible either. +If you have `RefCell` values that contain `Rc` values or similar nested +combinations of types with interior mutability and reference counting, be aware +that you have to ensure you don’t create cycles yourself; you can’t rely on +Rust to catch them. Creating a reference cycle would be a logic bug in your +program that you should use automated tests, code reviews, and other software +development practices to minimize. + + + + +Another solution is reorganizing your data structures so that some references +express ownership and some references don’t. In this way, we can have cycles +made up of some ownership relationships and some non-ownership relationships, +and only the ownership relationships affect whether a value may be dropped or +not. In Listing 15-20, we always want `Cons` variants to own their list, so +reorganizing the data structure isn’t possible. Let’s look at an example using +graphs made up of parent nodes and child nodes to see when non-ownership +relationships are an appropriate way to prevent reference cycles. + +### Preventing Reference Cycles: Turn an `Rc` into a `Weak` + +So far, we’ve shown how calling `Rc::clone` increases the `strong_count` of an +`Rc` instance, and that an `Rc` instance is only cleaned up if its +`strong_count` is 0. We can also create a *weak reference* to the value within +an `Rc` instance by calling `Rc::downgrade` and passing a reference to the +`Rc`. When we call `Rc::downgrade`, we get a smart pointer of type `Weak`. +Instead of increasing the `strong_count` in the `Rc` instance by one, calling +`Rc::downgrade` increases the `weak_count` by one. The `Rc` type uses +`weak_count` to keep track of how many `Weak` references exist, similarly to +`strong_count`. The difference is the `weak_count` does not need to be 0 in +order for the `Rc` instance to be cleaned up. 
+ + + + +Strong references are how we can share ownership of an `Rc` instance. Weak +references don’t express an ownership relationship. They won’t cause a +reference cycle since any cycle involving some weak references will be broken +once the strong reference count of values involved is 0. + + + + +Because the value that `Weak` references might have been dropped, in order +to do anything with the value that a `Weak` is pointing to, we have to check +to make sure the value is still around. We do this by calling the `upgrade` +method on a `Weak` instance, which will return an `Option>`. We’ll get +a result of `Some` if the `Rc` value has not been dropped yet, and `None` if +the `Rc` value has been dropped. Because `upgrade` returns an `Option`, we can +be sure that Rust will handle both the `Some` case and the `None` case, and +there won’t be an invalid pointer. + +As an example, rather than using a list whose items know only about the next +item, we’ll create a tree whose items know about their children items *and* +their parent items. + +#### Creating a Tree Data Structure: a `Node` with Child Nodes + +To start building this tree, we’ll create a struct named `Node` that holds its +own `i32` value as well as references to its children `Node` values: Filename: src/main.rs @@ -1162,17 +2161,28 @@ struct Node { } ``` -We want to be able to have a `Node` own its children, and we also want to be -able to have variables own each node so we can access them directly. That’s why -the items in the `Vec` are `Rc` values. We want to be able to modify what -nodes are another node’s children, so that’s why we have a `RefCell` in -`children` around the `Vec`. 
In Listing 15-19, let’s create one instance of -`Node` named `leaf` with the value 3 and no children, and another instance -named `branch` with the value 5 and `leaf` as one of its children: +We want a `Node` to own its children, and we want to be able to share that +ownership with variables so we can access each `Node` in the tree directly. To +do this, we define the `Vec` items to be values of type `Rc`. We also +want to be able to modify which nodes are children of another node, so we have +a `RefCell` in `children` around the `Vec`. + +Next, let’s use our struct definition and create one `Node` instance named +`leaf` with the value 3 and no children, and another instance named `branch` +with the value 5 and `leaf` as one of its children, as shown in Listing 15-23: Filename: src/main.rs ``` +# use std::rc::Rc; +# use std::cell::RefCell; +# +# #[derive(Debug)] +# struct Node { +# value: i32, +# children: RefCell>>, +# } +# fn main() { let leaf = Rc::new(Node { value: 3, @@ -1181,29 +2191,41 @@ fn main() { let branch = Rc::new(Node { value: 5, - children: RefCell::new(vec![leaf.clone()]), + children: RefCell::new(vec![Rc::clone(&leaf)]), }); } ``` -Listing 15-19: Creating a `leaf` node and a `branch` node where `branch` has -`leaf` as one of its children but `leaf` has no reference to `branch` +Listing 15-23: Creating a `leaf` node with no children and a `branch` node with +`leaf` as one of its children + +We clone the `Rc` in `leaf` and store that in `branch`, meaning the `Node` in +`leaf` now has two owners: `leaf` and `branch`. We can get from `branch` to +`leaf` through `branch.children`, but there’s no way to get from `leaf` to +`branch`. `leaf` has no reference to `branch` and doesn’t know they are +related. We’d like `leaf` to know that `branch` is its parent. + +#### Adding a Reference from a Child to its Parent -The `Node` in `leaf` now has two owners: `leaf` and `branch`, since we clone -the `Rc` in `leaf` and store that in `branch`. 
The `Node` in `branch` knows -it’s related to `leaf` since `branch` has a reference to `leaf` in -`branch.children`. However, `leaf` doesn’t know that it’s related to `branch`, -and we’d like `leaf` to know that `branch` is its parent. +To make the child node aware of its parent, we need to add a `parent` field to +our `Node` struct definition. The trouble is in deciding what the type of +`parent` should be. We know it can’t contain an `Rc` because that would +create a reference cycle, with `leaf.parent` pointing to `branch` and +`branch.children` pointing to `leaf`, which would cause their `strong_count` +values to never be zero. -To do that, we’re going to add a `parent` field to our `Node` struct -definition, but what should the type of `parent` be? We know it can’t contain -an `Rc`, since `leaf.parent` would point to `branch` and `branch.children` -contains a pointer to `leaf`, which makes a reference cycle. Neither `leaf` nor -`branch` would get dropped since they would always refer to each other and -their reference counts would never be zero. +Thinking about the relationships another way, a parent node should own its +children: if a parent node is dropped, its child nodes should be dropped as +well. However, a child should not own its parent: if we drop a child node, the +parent should still exist. This is a case for weak references! -So instead of `Rc`, we’re going to make the type of `parent` use `Weak`, -specifically a `RefCell>`: +So instead of `Rc`, we’ll make the type of `parent` use `Weak`, specifically +a `RefCell>`. Now our `Node` struct definition looks like this: + + + Filename: src/main.rs @@ -1219,14 +2241,35 @@ struct Node { } ``` -This way, a node will be able to refer to its parent node if it has one, -but it does not own its parent. A parent node will be dropped even if -it has child nodes referring to it, as long as it doesn’t have a parent -node as well. 
Now let’s update `main` to look like Listing 15-20: + + + +This way, a node will be able to refer to its parent node, but does not own its +parent. In Listing 15-24, let’s update `main` to use this new definition so +that the `leaf` node will have a way to refer to its parent, `branch`: + + + Filename: src/main.rs ``` +# use std::rc::{Rc, Weak}; +# use std::cell::RefCell; +# +# #[derive(Debug)] +# struct Node { +# value: i32, +# parent: RefCell>, +# children: RefCell>>, +# } +# fn main() { let leaf = Rc::new(Node { value: 3, @@ -1239,7 +2282,7 @@ fn main() { let branch = Rc::new(Node { value: 5, parent: RefCell::new(Weak::new()), - children: RefCell::new(vec![leaf.clone()]), + children: RefCell::new(vec![Rc::clone(&leaf)]), }); *leaf.parent.borrow_mut() = Rc::downgrade(&branch); @@ -1248,30 +2291,45 @@ fn main() { } ``` -Listing 15-20: A `leaf` node and a `branch` node where `leaf` has a `Weak` -reference to its parent, `branch` +Listing 15-24: A `leaf` node with a `Weak` reference to its parent node, +`branch` -Creating the `leaf` node looks similar; since it starts out without a parent, -we create a new `Weak` reference instance. When we try to get a reference to -the parent of `leaf` by using the `upgrade` method, we’ll get a `None` value, -as shown by the first `println!` that outputs: + + +Creating the `leaf` node looks similar to how creating the `leaf` node looked +in Listing 15-23, with the exception of the `parent` field: `leaf` starts out +without a parent, so we create a new, empty `Weak` reference instance. + +At this point, when we try to get a reference to the parent of `leaf` by using +the `upgrade` method, we get a `None` value. We see this in the output from the +first `println!`: ``` leaf parent = None ``` -Similarly, `branch` will also have a new `Weak` reference, since `branch` does -not have a parent node. We still make `leaf` be one of the children of -`branch`. 
Once we have a new `Node` instance in `branch`, we can modify `leaf` -to have a `Weak` reference to `branch` for its parent. We use the `borrow_mut` -method on the `RefCell` in the `parent` field of `leaf`, then we use the -`Rc::downgrade` function to create a `Weak` reference to `branch` from the `Rc` -in `branch.` + + + +When we create the `branch` node, it will also have a new `Weak` reference, +since `branch` does not have a parent node. We still have `leaf` as one of the +children of `branch`. Once we have the `Node` instance in `branch`, we can +modify `leaf` to give it a `Weak` reference to its parent. We use the +`borrow_mut` method on the `RefCell` in the `parent` field of `leaf`, then we +use the `Rc::downgrade` function to create a `Weak` reference to `branch` from +the `Rc` in `branch`. + + + When we print out the parent of `leaf` again, this time we’ll get a `Some` -variant holding `branch`. Also notice we don’t get a cycle printed out that -eventually ends in a stack overflow like we did in Listing 15-14: the `Weak` -references are just printed as `(Weak)`: +variant holding `branch`: `leaf` can now access its parent! When we print out +`leaf`, we also avoid the cycle that eventually ended in a stack overflow like +we had in Listing 15-21: the `Weak` references are printed as `(Weak)`: ``` leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) }, @@ -1279,12 +2337,17 @@ children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) }, children: RefCell { value: [] } }] } }) ``` -The fact that we don’t get infinite output (or at least until the stack -overflows) is one way we can see that we don’t have a reference cycle in this -case. Another way we can tell is by looking at the values we get from calling -`Rc::strong_count` and `Rc::weak_count`.
In Listing 15-21, let’s create a new -inner scope and move the creation of `branch` in there, so that we can see what -happens when `branch` is created and then dropped when it goes out of scope: +The lack of infinite output indicates that this code didn’t create a reference +cycle. We can also tell this by looking at the values we get from calling +`Rc::strong_count` and `Rc::weak_count`. + +#### Visualizing Changes to `strong_count` and `weak_count` + +Let’s look at how the `strong_count` and `weak_count` values of the `Rc` +instances change by creating a new inner scope and moving the creation of +`branch` into that scope. This will let us see what happens when `branch` is +created and then dropped when it goes out of scope. The modifications are shown +in Listing 15-25: Filename: src/main.rs @@ -1306,7 +2369,7 @@ fn main() { let branch = Rc::new(Node { value: 5, parent: RefCell::new(Weak::new()), - children: RefCell::new(vec![leaf.clone()]), + children: RefCell::new(vec![Rc::clone(&leaf)]), }); *leaf.parent.borrow_mut() = Rc::downgrade(&branch); @@ -1332,53 +2395,60 @@ fn main() { } ``` -Listing 15-21: Creating `branch` in an inner scope and examining strong and -weak reference counts of `leaf` and `branch` +Listing 15-25: Creating `branch` in an inner scope and examining strong and +weak reference counts -Right after creating `leaf`, its strong count is 1 (for `leaf` itself) and its -weak count is 0. In the inner scope, after we create `branch` and associate -`leaf` and `branch`, `branch` will have a strong count of 1 (for `branch` -itself) and a weak count of 1 (for `leaf.parent` pointing to `branch` with a -`Weak`). `leaf` will have a strong count of 2, since `branch` now has a -clone the `Rc` of `leaf` stored in `branch.children`. `leaf` still has a weak -count of 0. +Once `leaf` is created, its `Rc` has a strong count of 1 and a weak count of 0. 
+In the inner scope we create `branch` and associate it with `leaf`, at which +point the `Rc` in `branch` will have a strong count of 1 and a weak count of 1 +(for `leaf.parent` pointing to `branch` with a `Weak`). Here `leaf` will +have a strong count of 2, because `branch` now has a clone of the `Rc` of +`leaf` stored in `branch.children`, but will still have a weak count of 0. -When the inner scope ends, `branch` goes out of scope, and its strong count -decreases to 0, so its `Node` gets dropped. The weak count of 1 from -`leaf.parent` has no bearing on whether `Node` gets dropped or not, so we don’t -have a memory leak! +When the inner scope ends, `branch` goes out of scope and the strong count of +the `Rc` decreases to 0, so its `Node` gets dropped. The weak count of 1 from +`leaf.parent` has no bearing on whether `Node` is dropped or not, so we don’t +get any memory leaks! If we try to access the parent of `leaf` after the end of the scope, we’ll get -`None` again like we did before `leaf` had a parent. At the end of the program, -`leaf` has a strong count of 1 and a weak count of 0, since `leaf` is now the -only thing pointing to it again. - -All of the logic managing the counts and whether a value should be dropped or -not was managed by `Rc` and `Weak` and their implementations of the `Drop` -trait. By specifying that the relationship from a child to its parent should be -a `Weak` reference in the definition of `Node`, we’re able to have parent -nodes point to child nodes and vice versa without creating a reference cycle -and memory leaks. +`None` again. At the end of the program, the `Rc` in `leaf` has a strong count +of 1 and a weak count of 0, because the variable `leaf` is now the only +reference to the `Rc` again. + + + + +All of the logic that manages the counts and value dropping is built in to +`Rc` and `Weak` and their implementations of the `Drop` trait. 
By specifying +that the relationship from a child to its parent should be a `Weak<T>` +reference in the definition of `Node`, we’re able to have parent nodes point to +child nodes and vice versa without creating a reference cycle and memory leaks. + + + ## Summary -We’ve now covered how you can use different kinds of smart pointers to choose -different guarantees and tradeoffs than those Rust makes with regular +This chapter covered how you can use smart pointers to make different +guarantees and tradeoffs than those Rust makes by default with regular references. `Box<T>` has a known size and points to data allocated on the heap. `Rc<T>` keeps track of the number of references to data on the heap so that data can have multiple owners. `RefCell<T>` with its interior mutability gives -us a type that can be used where we need an immutable type, and enforces the -borrowing rules at runtime instead of at compile time. +us a type that can be used when we need an immutable type but need the ability +to change an inner value of that type, and enforces the borrowing rules at +runtime instead of at compile time. -We’ve also discussed the `Deref` and `Drop` traits that enable a lot of smart -pointers’ functionality. We explored how it’s possible to create a reference -cycle that would cause a memory leak, and how to prevent reference cycles by -using `Weak<T>`. +We also discussed the `Deref` and `Drop` traits that enable a lot of the +functionality of smart pointers. We explored reference cycles that can cause +memory leaks, and how to prevent them using `Weak<T>`. -If this chapter has piqued your interest and you now want to implement your own -smart pointers, check out The Nomicon at -*https://doc.rust-lang.org/stable/nomicon/vec.html* for even more useful -information. +If this chapter has piqued your interest and you want to implement your own +smart pointers, check out “The Nomicon” at +*https://doc.rust-lang.org/stable/nomicon/* for even more useful information.
Next, let’s talk about concurrency in Rust. We’ll even learn about a few new -smart pointers that can help us with it. +smart pointers.
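As a compact recap of the weak-reference discussion in this chapter, the following is a minimal, self-contained sketch of the parent/child tree, with the generic parameters written out in full. The `Node` struct has the same shape as the one in the chapter's listings; the `new_node` and `parent_value` helpers are hypothetical names introduced here for illustration and are not part of the chapter's code. The assertions encode the strong and weak counts the chapter describes.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Same shape as the chapter's `Node`: a parent is referenced weakly,
// children are owned through `Rc`.
#[derive(Debug)]
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

// Hypothetical helper (an assumption, not from the chapter): builds a node
// that starts out with an empty `Weak` as its parent.
fn new_node(value: i32, children: Vec<Rc<Node>>) -> Rc<Node> {
    Rc::new(Node {
        value,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(children),
    })
}

// Hypothetical helper: returns the parent's value, or `None` when the node
// has no parent or the parent has already been dropped.
fn parent_value(node: &Node) -> Option<i32> {
    node.parent.borrow().upgrade().map(|parent| parent.value)
}

fn main() {
    let leaf = new_node(3, vec![]);
    // No parent yet, so upgrading the empty `Weak` yields `None`.
    assert_eq!(parent_value(&leaf), None);

    {
        let branch = new_node(5, vec![Rc::clone(&leaf)]);
        // Point `leaf` at its parent without taking ownership of it.
        *leaf.parent.borrow_mut() = Rc::downgrade(&branch);

        assert_eq!(parent_value(&leaf), Some(5));
        assert_eq!(branch.children.borrow().len(), 1);
        // `branch`: one strong owner (the variable), one weak (leaf.parent).
        assert_eq!((Rc::strong_count(&branch), Rc::weak_count(&branch)), (1, 1));
        // `leaf`: two strong owners (the variable and branch.children).
        assert_eq!((Rc::strong_count(&leaf), Rc::weak_count(&leaf)), (2, 0));
    } // `branch` is dropped here even though `leaf.parent` still refers to it.

    // The weak reference can no longer be upgraded, and `leaf` is back to
    // a single strong owner.
    assert_eq!(parent_value(&leaf), None);
    assert_eq!((Rc::strong_count(&leaf), Rc::weak_count(&leaf)), (1, 0));
    println!("leaf = {:?}", leaf);
}
```

Wrapping the `upgrade` call in a small function like `parent_value` is one possible design choice: it keeps the `Option` handling in a single place, so callers never touch a possibly dangling parent directly.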