Merged
Changes from 21 commits
Commits
28 commits
47db6d9
Update types-and-traits.rst
rcseacord Dec 11, 2025
95e8f54
Apply suggestion from @rcseacord
rcseacord Dec 11, 2025
be9cb93
Update types-and-traits.rst
rcseacord Dec 11, 2025
edc14a2
Update types-and-traits.rst
rcseacord Dec 12, 2025
8d5c834
fix: add bibliography entry;
manhatsu Dec 12, 2025
205fc0f
chore: put citations in table
manhatsu Dec 12, 2025
c6bd716
fix: implement "Copy" to use enum in union
manhatsu Dec 12, 2025
f4a49c9
fix: add main function
manhatsu Dec 12, 2025
0872f3c
Update types-and-traits.rst
rcseacord Dec 12, 2025
a935fe0
Update types-and-traits.rst
rcseacord Dec 12, 2025
7e2875f
Update types-and-traits.rst
rcseacord Dec 12, 2025
8b7306a
Update types-and-traits.rst
rcseacord Dec 12, 2025
a7e8ff0
Update types-and-traits.rst
rcseacord Dec 12, 2025
fc9d450
Update types-and-traits.rst
rcseacord Dec 12, 2025
6d67edd
chore: enable line breaks
manhatsu Dec 16, 2025
4da5850
Update gui_UnionFieldValidity.rst.inc
rcseacord Dec 16, 2025
03d3c63
Update src/coding-guidelines/types-and-traits/gui_UnionFieldValidity.…
rcseacord Dec 16, 2025
a07d1aa
Remove git cruft
PLeVasseur Dec 17, 2025
507bd94
Update src/coding-guidelines/types-and-traits/gui_UnionFieldValidity.…
rcseacord Dec 17, 2025
428f952
chore: rename with unique ID
Dec 19, 2025
99551c2
fix: bibliography style
Dec 19, 2025
9438de5
chore: align with new format
manhatsu Dec 24, 2025
d70df8d
Refactor IntOrBool union and its methods
rcseacord Dec 24, 2025
3829fc0
Refactor compliant example for clarity and safety
rcseacord Dec 25, 2025
a5fe947
Fix typos and improve example functions in RST
rcseacord Dec 25, 2025
98a225f
Refactor comments in gui_0cuTYG8RVYjg.rst.inc
rcseacord Dec 25, 2025
1a36bfe
Update src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
rcseacord Dec 25, 2025
945d8f0
Fix formatting in get_bytes and get_u32 functions
rcseacord Dec 25, 2025
290 changes: 290 additions & 0 deletions src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
@@ -0,0 +1,290 @@
.. SPDX-License-Identifier: MIT OR Apache-2.0
SPDX-FileCopyrightText: The Coding Guidelines Subcommittee Contributors

.. default-domain:: coding-guidelines

.. guideline:: Ensure reads of union fields produce valid values for the field's type
:id: gui_0cuTYG8RVYjg
:category: required
:status: draft
:release: unknown
:fls: fls_oFIRXBPXu6Zv
:decidability: undecidable
:scope: system
:tags: defect, safety, undefined-behavior

When reading from a union field, ensure that the underlying bytes constitute a valid value for that field's type.
Reading a union field whose bytes do not represent a valid value for the field's type is undefined behavior.

Before accessing a union field, verify that the union was either:

* last written through that field, or
* written through a field whose bytes are valid when reinterpreted as the target field's type

If the active field is uncertain, use explicit validity checks.

.. rationale::
:id: rat_UnionFieldValidityReason
:status: draft

As in C, Rust unions allow multiple fields to occupy the same memory.
Unlike enumeration types, unions do not track which field is currently active.
When a field is read, you must ensure that
the underlying bytes are valid for that field's type :cite:`gui_0cuTYG8RVYjg:RUST-REF-UNION`.

Every type has a *validity invariant* — a set of constraints that all values of
that type must satisfy :cite:`gui_0cuTYG8RVYjg:UCG-VALIDITY`.
Reading a union field performs a *typed read*,
which asserts that the bytes are valid for the target type.

Examples of validity requirements for common types:

* **bool**: Must be ``0`` (false) or ``1`` (true). Any other value (e.g., ``3``) is invalid.
* **char**: Must be a valid Unicode scalar value (``0x0`` to ``0xD7FF`` or ``0xE000`` to ``0x10FFFF``).
* **References**: Must be non-null, properly aligned, and not dangling.
* **Enums**: Must hold a valid discriminant value.
* **Floating point**: All bit patterns are valid for the ``f32`` or ``f64`` types.
* **Integers**: All bit patterns are valid for integer types.

Reading an invalid value is undefined behavior.
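
As a hedged illustration of the validity requirements listed above, the following sketch checks candidate bit patterns with safe standard-library conversions instead of a typed union read.
The helper ``bool_from_byte`` is a hypothetical name introduced here for illustration; ``char::from_u32``, ``f32::from_bits``, and ``u32::from_ne_bytes`` are standard-library functions.

```rust
/// Hypothetical helper (not part of the guideline): returns `Some` only for
/// the two bit patterns that are valid for `bool`.
fn bool_from_byte(raw: u8) -> Option<bool> {
    match raw {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    // bool: only 0 and 1 are valid.
    assert_eq!(bool_from_byte(1), Some(true));
    assert_eq!(bool_from_byte(3), None);

    // char: `char::from_u32` rejects surrogates and values above 0x10FFFF.
    assert_eq!(char::from_u32(0x41), Some('A'));
    assert_eq!(char::from_u32(0xD800), None);
    assert_eq!(char::from_u32(0x11_0000), None);

    // Floating point: every bit pattern is a valid `f32`, including NaNs.
    assert!(f32::from_bits(0x7FC0_0000).is_nan());

    // Integers: every bit pattern is a valid `u32`.
    assert_eq!(u32::from_ne_bytes([0xFF; 4]), u32::MAX);
}
```

Each conversion either accepts every bit pattern or reports invalid input through ``Option``, so no invalid value of the target type is ever produced.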

.. non_compliant_example::
:id: non_compl_ex_UnionBool
:status: draft

This noncompliant example reads an invalid bit pattern from a Boolean union field.
The value ``3`` is not a valid value of type ``bool`` (only ``0`` and ``1`` are valid).

.. code-block:: rust

union IntOrBool {
    i: u8,
    b: bool,
}

fn main() {
    let u = IntOrBool { i: 3 };

    // Undefined behavior reading an invalid value from a union field of type 'bool'
    unsafe { u.b }; // Noncompliant
}

.. non_compliant_example::
:id: non_compl_ex_UnionChar
:status: draft

This noncompliant example reads an invalid Unicode value from a ``union`` field of type ``char``.

.. code-block:: rust

union IntOrChar {
    i: u32,
    c: char,
}

fn main() {
    // '0xD800' is a surrogate and not a valid Unicode scalar value
    let u = IntOrChar { i: 0xD800 };

    // Reading an invalid Unicode value from a union field of type 'char'
    unsafe { u.c }; // Noncompliant
}
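
For contrast, a possible compliant counterpart is sketched below: it reads the ``u32`` field, for which every bit pattern is valid, and converts it with the standard-library function ``char::from_u32``.
The helper name ``try_read_char`` is hypothetical, and the ``#[repr(C)]`` attribute is an assumption added here so that both fields are guaranteed to start at offset 0.

```rust
#[repr(C)] // Assumption for this sketch: guarantees both fields start at offset 0.
union IntOrChar {
    i: u32,
    c: char,
}

/// Hypothetical helper: reads the `u32` field and validates it before
/// producing a `char`.
fn try_read_char(u: &IntOrChar) -> Option<char> {
    // SAFETY: every bit pattern is valid for `u32`, and `char` has the same
    // size as `u32`, so all four bytes are initialized whichever field was
    // written.
    let raw = unsafe { u.i };

    // Returns `None` for surrogates and for values above 0x10FFFF.
    char::from_u32(raw)
}

fn main() {
    let surrogate = IntOrChar { i: 0xD800 };
    let letter = IntOrChar { c: 'x' };

    assert_eq!(try_read_char(&surrogate), None);
    assert_eq!(try_read_char(&letter), Some('x'));
}
```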

.. non_compliant_example::
:id: non_compl_ex_UnionEnum
:status: draft

This noncompliant example reads an invalid discriminant from a union field of the ``Color`` enumeration type.

.. code-block:: rust

#[repr(u8)]
#[derive(Copy, Clone)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

union IntOrColor {
    i: u8,
    c: Color,
}

fn main() {
    let u = IntOrColor { i: 42 };

    // Undefined behavior reading an invalid discriminant from the 'Color' enumeration type
    unsafe { u.c }; // Noncompliant
}
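
One way to avoid the invalid read, sketched here under stated assumptions, is to read the ``u8`` field and map only known discriminants to ``Color``.
The helper name ``try_read_color`` is hypothetical; the ``Debug`` and ``PartialEq`` derives and the ``#[repr(C)]`` attribute are additions made for this sketch only.

```rust
#[repr(u8)]
#[derive(Copy, Clone, Debug, PartialEq)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

#[repr(C)] // Assumption for this sketch: guarantees both fields start at offset 0.
union IntOrColor {
    i: u8,
    c: Color,
}

/// Hypothetical helper: validates the raw discriminant before producing a
/// `Color`.
fn try_read_color(u: &IntOrColor) -> Option<Color> {
    // SAFETY: every bit pattern is valid for `u8`, and `Color` is one byte
    // wide because of `#[repr(u8)]`.
    match unsafe { u.i } {
        0 => Some(Color::Red),
        1 => Some(Color::Green),
        2 => Some(Color::Blue),
        _ => None, // Not a valid discriminant
    }
}

fn main() {
    let invalid = IntOrColor { i: 42 };
    let blue = IntOrColor { c: Color::Blue };

    assert_eq!(try_read_color(&invalid), None);
    assert_eq!(try_read_color(&blue), Some(Color::Blue));
}
```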

.. non_compliant_example::
:id: non_compl_ex_UnionRef
:status: draft

This noncompliant example reads a reference from a union containing a null pointer.
A similar problem occurs when reading a misaligned pointer.

.. code-block:: rust

union PtrOrRef {
    p: *const i32,
    r: &'static i32,
}

fn main() {
    let u = PtrOrRef { p: std::ptr::null() };

    // Undefined behavior reading a null value from a reference field of a union
    unsafe { u.r }; // Noncompliant
}
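
A possible compliant counterpart, offered as a sketch rather than a definitive pattern, reads the raw-pointer field and converts it with the standard-library method ``as_ref``, which returns ``None`` for a null pointer.
The helper ``try_read_ref``, the static ``VALUE``, and the ``#[repr(C)]`` attribute are assumptions introduced for illustration. The null check alone does not prove that a pointer is aligned or live, so the helper is marked ``unsafe`` and states that obligation in its safety contract.

```rust
#[repr(C)] // Assumption for this sketch: guarantees both fields start at offset 0.
union PtrOrRef {
    p: *const i32,
    r: &'static i32,
}

// Hypothetical static used only to obtain a valid `&'static i32`.
static VALUE: i32 = 7;

/// Hypothetical helper: reads the raw-pointer field and rejects null.
///
/// # Safety
///
/// If the stored pointer is non-null, it must be properly aligned and point
/// to an `i32` that lives for `'static`.
unsafe fn try_read_ref(u: &PtrOrRef) -> Option<&'static i32> {
    // SAFETY: both fields are pointer-sized, and any initialized pointer
    // value is valid for `*const i32`.
    let raw = unsafe { u.p };

    // SAFETY: the caller guarantees that a non-null pointer is aligned and
    // points to a live `i32`; `as_ref` returns `None` for null.
    unsafe { raw.as_ref() }
}

fn main() {
    let null = PtrOrRef { p: std::ptr::null() };
    let valid = PtrOrRef { r: &VALUE };

    // SAFETY: `null` holds a null pointer, and `valid` holds a reference to
    // the static `VALUE`.
    assert_eq!(unsafe { try_read_ref(&null) }, None);
    assert_eq!(unsafe { try_read_ref(&valid) }, Some(&7));
}
```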

.. compliant_example::
:id: compl_ex_UnionTrackField
:status: draft

This compliant example tracks the active field explicitly to ensure valid reads.

.. code-block:: rust

union IntOrBool {
Collaborator

How about this? The goal is to show the union in a setting that mimics code we might see in practice.

  1. Make the union `#[repr(C)]`, since most of the time we would use a union to interact over FFI.
  2. Put `// SAFETY:` comments above each use of `unsafe`, while still noting that the read is compliant.
  3. Use `assert_eq!()` instead of `println!()` so that we can test for equivalence.
#[repr(C)]
#[derive(Copy, Clone)]
union IntOrBoolData {
    i: u8,
    b: bool,
}

/// Tracks which field of the union is currently active.
#[derive(Clone, Copy, PartialEq, Eq)]
enum ActiveField {
    Int,
    Bool,
}

/// A union wrapper that tracks the active field at runtime.
pub struct IntOrBool {
    data: IntOrBoolData,
    active: ActiveField,
}

impl IntOrBool {
    pub fn from_int(value: u8) -> Self {
        Self {
            data: IntOrBoolData { i: value },
            active: ActiveField::Int,
        }
    }

    pub fn from_bool(value: bool) -> Self {
        Self {
            data: IntOrBoolData { b: value },
            active: ActiveField::Bool,
        }
    }

    pub fn set_int(&mut self, value: u8) {
        self.data.i = value;
        self.active = ActiveField::Int;
    }

    pub fn set_bool(&mut self, value: bool) {
        self.data.b = value;
        self.active = ActiveField::Bool;
    }

    /// Returns the integer value if that field is active.
    pub fn as_int(&self) -> Option<u8> {
        match self.active {
            // SAFETY: We only read `i` when we know it was last written as `i`, thus compliant
            ActiveField::Int => Some(unsafe { self.data.i }),
            ActiveField::Bool => None,
        }
    }

    /// Returns the boolean value if that field is active.
    pub fn as_bool(&self) -> Option<bool> {
        match self.active {
            // SAFETY: We only read `b` when we know it was last written as `b`, thus compliant
            ActiveField::Bool => Some(unsafe { self.data.b }),
            ActiveField::Int => None,
        }
    }
}

fn main() {
    let mut value = IntOrBool::from_bool(true);
    assert_eq!(value.as_bool(), Some(true));
    assert_eq!(value.as_int(), None);

    value.set_int(42);
    assert_eq!(value.as_bool(), None);
    assert_eq!(value.as_int(), Some(42));
}

Collaborator

Does this program have to do something so it's not all optimized away? Maybe return value?

Collaborator

I guess it doesn't matter. It's hard to make this code do something meaningful. A good optimizer is just going to return a constant value.

Collaborator

resolved by d70df8d

Collaborator

> Does this program have to do something so it's not all optimized away?

Assertions are never optimized away, by design. So whatever this program asserts, it will always be tested.

Collaborator

OK. There are also debug assertions (`debug_assert!`), which are disabled by default in release builds.

    i: u8,
    b: bool,
}

enum ActiveField {
    Int,
    Bool,
}

struct SafeUnion {
    data: IntOrBool,
    active: ActiveField,
}

impl SafeUnion {
    fn new_int(value: u8) -> Self {
        Self {
            data: IntOrBool { i: value },
            active: ActiveField::Int,
        }
    }

    fn new_bool(value: bool) -> Self {
        Self {
            data: IntOrBool { b: value },
            active: ActiveField::Bool,
        }
    }

    fn get_bool(&self) -> Option<bool> {
        match self.active {
            // Compliant: only read bool when we know it was written as bool
            ActiveField::Bool => Some(unsafe { self.data.b }),
            ActiveField::Int => None,
        }
    }
}

fn main() {
    let union_bool = SafeUnion::new_bool(true);
    let union_int = SafeUnion::new_int(42);

    println!("Bool union as bool: {:?}", union_bool.get_bool()); // Some(true)
    println!("Int union as bool: {:?}", union_int.get_bool()); // None
}

.. compliant_example::
:id: compl_ex_UnionSameField
:status: draft

This compliant example reads from the same field that was written.

.. code-block:: rust

union IntOrBool {
Collaborator

How about this?

  1. Make the union `#[repr(C)]`, since most of the time we would use a union to interact over FFI.
  2. Put `// SAFETY:` comments above each use of `unsafe`, while still noting that the read is compliant.
  3. Use `assert_eq!()` instead of `println!()` so that we can test for equivalence.
#[repr(C)]
union IntOrBool {
    i: u8,
    b: bool,
}
fn main() {
    let u = IntOrBool { b: true };
    
    // SAFETY: Read the same field that was written, thus compliant
    assert_eq!(unsafe { u.b }, true); // compliant
}

    i: u8,
    b: bool,
}

fn main() {
    let u = IntOrBool { b: true };

    // Read the same field that was written
    println!("bool value: {}", unsafe { u.b }); // compliant
}

.. compliant_example::
:id: compl_ex_UnionValidReinterpret
:status: draft

This compliant example reinterprets the value as a different type for which all bit patterns are valid.

.. code-block:: rust

union IntBytes {
    i: u32,
    bytes: [u8; 4],
}

fn main() {
    let u = IntBytes { i: 0x12345678 };

    // All bit patterns are valid for [u8; 4]
    println!("bytes: {:?}", unsafe { u.bytes }); // compliant

    let u2 = IntBytes { bytes: [0x11, 0x22, 0x33, 0x44] };

    // All bit patterns are valid for 'u32'
    println!("integer: 0x{:08X}", unsafe { u2.i }); // compliant
}
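
Where a union is not otherwise required, the same reinterpretation can be sketched without ``unsafe`` by using the standard-library byte conversions on ``u32``.
The helper name ``u32_to_bytes`` is hypothetical. ``to_ne_bytes`` uses the platform's native byte order, as the union reinterpretation does; the ``le`` and ``be`` variants fix the order explicitly.

```rust
/// Hypothetical helper: returns the native-endian bytes of `value`,
/// the same bytes a `u32`/`[u8; 4]` union reinterpretation would expose.
fn u32_to_bytes(value: u32) -> [u8; 4] {
    value.to_ne_bytes()
}

fn main() {
    let value: u32 = 0x12345678;

    // Round trip through native-endian bytes without any `unsafe`.
    let bytes = u32_to_bytes(value);
    assert_eq!(u32::from_ne_bytes(bytes), value);

    // Explicit byte orders give platform-independent results.
    assert_eq!(value.to_le_bytes(), [0x78, 0x56, 0x34, 0x12]);
    assert_eq!(u32::from_be_bytes([0x11, 0x22, 0x33, 0x44]), 0x11223344);
}
```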

.. compliant_example::
:id: compl_ex_UnionValidateBool
:status: draft

This compliant example validates bytes before reading as a constrained type.

.. code-block:: rust

union IntOrBool {
    i: u8,
    b: bool,
}

fn try_read_bool(u: &IntOrBool) -> Option<bool> {
    // Read as integer (always valid for 'u8')
    let raw = unsafe { u.i };

    // Validate before interpreting as a value of type 'bool'
    match raw {
        0 => Some(false),
        1 => Some(true),
        _ => None, // Invalid Boolean value
    }
}

fn main() {
    let u1 = IntOrBool { i: 1 };
    let u2 = IntOrBool { i: 3 };

    // Validates before reading as value of type 'bool'
    println!("u1 as bool: {:?}", try_read_bool(&u1)); // Some(true)
    println!("u2 as bool: {:?}", try_read_bool(&u2)); // None
}

.. bibliography::
:id: bib_WNCi5njUWLuZ
:status: draft

.. list-table::
:header-rows: 0
:widths: auto
:class: bibliography-table

* - :bibentry:`gui_0cuTYG8RVYjg:RUST-REF-UNION`
- The Rust Reference. "Unions." https://doc.rust-lang.org/reference/items/unions.html.

* - :bibentry:`gui_0cuTYG8RVYjg:UCG-VALIDITY`
- Rust Unsafe Code Guidelines. "Validity and Safety Invariant." https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#validity-and-safety-invariant.

1 change: 1 addition & 0 deletions src/coding-guidelines/types-and-traits/index.rst
@@ -7,3 +7,4 @@ Types and Traits
================

.. include:: gui_xztNdXA2oFNC.rst.inc
.. include:: gui_0cuTYG8RVYjg.rst.inc