Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .github/workflows/bsd.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@ jobs:
# gmake -C regression test-parallel JOBS=2
# gmake -C regression/cbmc test-paths-lifo
# env PATH=$PATH:`pwd`/src/solvers gmake -C regression/cbmc test-cprover-smt2
# gmake -C regression/cbmc test-wide-pointer-encoding
# # gmake -C jbmc/regression test-parallel JOBS=2

- name: Save ccache
Expand Down Expand Up @@ -137,6 +138,7 @@ jobs:
# gmake -C regression test-parallel JOBS=2
# gmake -C regression/cbmc test-paths-lifo
# env PATH=$PATH:`pwd`/src/solvers gmake -C regression/cbmc test-cprover-smt2
# gmake -C regression/cbmc test-wide-pointer-encoding
# # gmake -C jbmc/regression test-parallel JOBS=2

- name: Save ccache
Expand Down Expand Up @@ -208,6 +210,7 @@ jobs:
# gmake -C regression test-parallel JOBS=2
# gmake -C regression/cbmc test-paths-lifo
# env PATH=$PATH:`pwd`/src/solvers gmake -C regression/cbmc test-cprover-smt2
# gmake -C regression/cbmc test-wide-pointer-encoding
# # gmake -C jbmc/regression test-parallel JOBS=2

- name: Save ccache
Expand Down
3 changes: 3 additions & 0 deletions .github/workflows/pull-request-checks.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,7 @@ jobs:
make -C regression test-parallel JOBS=${{env.linux-vcpus}} LIBS="$PWD/riss.git/release/lib/libriss-coprocessor.a -lpthread" IPASIR=$PWD/riss.git/riss
make -C regression/cbmc test-paths-lifo
make -C regression/cbmc test-cprover-smt2
make -C regression/cbmc test-wide-pointer-encoding
make -C jbmc/regression test-parallel JOBS=${{env.linux-vcpus}}
- name: Check cleanup
run: |
Expand Down Expand Up @@ -161,6 +162,7 @@ jobs:
make -C regression test-parallel JOBS=${{env.linux-vcpus}}
make -C regression/cbmc test-paths-lifo
make -C regression/cbmc test-cprover-smt2
make -C regression/cbmc test-wide-pointer-encoding
make -C jbmc/regression test-parallel JOBS=${{env.linux-vcpus}}

- name: Save ccache
Expand Down Expand Up @@ -369,6 +371,7 @@ jobs:
make -C regression test-parallel JOBS=${{env.linux-vcpus}}
make -C regression/cbmc test-paths-lifo
make -C regression/cbmc test-cprover-smt2
make -C regression/cbmc test-wide-pointer-encoding
make -C jbmc/regression test-parallel JOBS=${{env.linux-vcpus}}

- name: Save ccache
Expand Down
149 changes: 149 additions & 0 deletions doc/architectural/wide-pointer-encoding.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,149 @@
# Wide Pointer Encoding

## Overview

The `--wide-pointer-encoding` option extends CBMC's pointer bitvector
representation to include a flat integer address alongside the standard
object/offset encoding. This enables precise modeling of pointer-to-integer
casts, integer-to-pointer casts, pointer comparison, and address reuse
after free.

## Pointer Layout

Standard encoding (64-bit platform):
```
[offset(56) | object(8)] = 64 bits
```

Wide encoding (64-bit platform):
```
[object(64) | offset(64) | address(64)] = 192 bits
```

Each component has the full platform pointer width, removing the
`--object-bits` limitation entirely.

## What It Fixes

### Issue #881: Object-bits limitation

The standard encoding splits the pointer width between object bits and
offset bits. With the default 8 object bits, only 256 objects can be
tracked, and offsets are limited to 56 bits. The wide encoding gives
each component the full 64 bits.

### Issue #8200: Cast-to-pointer misses error

```c
char *x = "";
char *ptr = (char *)0x55a8a2e6b007;
assert(ptr != x); // Standard: SUCCESS (wrong). Wide: FAILURE (correct).
```

The wide encoding compares pointers by their flat address, not by
object identity. Two pointers from different objects can have the
same address.

### Issue #8161: Inconsistent pointer comparison

```c
int *n = &m;
if((unsigned long)n >= (unsigned long)(-4095))
assert(...); // Standard: inconsistent. Wide: consistent.
```

Pointer-to-integer casts return the flat address, making arithmetic
on pointer-derived integers consistent with plain integer arithmetic.

### Issue #8103: Integer-to-pointer casting

```c
uint64_t addr = constrained_to_be_within(buffer);
uint8_t *p = (void *)addr;
assert(*p == expected); // Standard: FAILURE (wrong). Wide: SUCCESS (correct).
```

The wide encoding adds address-based dereference dispatch: when a
pointer comes from an integer-to-pointer cast, the dereference reads
from the object whose address range contains the pointer's flat address.

### Issue #2117: Address reuse after free

```c
int *x = malloc(sizeof(int));
free(x);
int *y = malloc(sizeof(int));
if(x == y) assert(0); // Standard: unreachable. Wide: reachable.
```

The wide encoding allows `malloc` to return the same address after
`free` by relaxing the non-overlapping constraint for dynamic objects
and preventing the simplifier from assuming different dynamic objects
have different addresses.

## Additional Features

### Stack Layout Modeling (`--model-stack-layout`)

When combined with `--wide-pointer-encoding`, this option places stack
variables adjacently with architecture-appropriate growth direction
(downward for x86/ARM, upward for HPPA):

```c
int a, b, c;
assert(&a > &b); // Stack growth direction
assert(&a - &b == sizeof(b)); // Adjacent placement
char *p = (char *)&a;
p[sizeof(a)] = 42;
assert(*(char *)&b == 42); // Buffer overflow modeling
```

## Technical Details

### Pointer Equality and Comparison

With the wide encoding, `p == q` compares flat addresses (not object
identity). Relational comparisons (`<`, `>`, `<=`, `>=`) also use
flat addresses, making cross-object comparisons well-defined.

### Pointer-to-Integer Cast (P2I)

Returns the flat address component of the pointer. This is the
byte-level address that the program would see on a real machine.

### Integer-to-Pointer Cast (I2P)

Creates a pointer with fresh object/offset variables. Forward
constraints link the object to the address: `obj == i => base[i] + off == addr`.
Backward constraints are added lazily via a refinement loop in
`dec_solve`: if the SAT model assigns an object inconsistent with
the address range, backward constraints are added and the formula
is re-solved.

### Byte Operations

Byte-extract and byte-update on pointer types operate on the 64-bit
address component only. The object/offset bits are internal encoding
and not byte-visible. Pointer arrays store 64-bit addresses per
element, matching symex's byte-level element size.

### Non-Overlapping Constraints

Objects are ordered by index with a chain constraint:
`base[0] + size[0] <= base[1] <= ...` (O(n) instead of O(n²)).
For consecutive dynamic objects of the same size, address reuse is
allowed: `base[next] == base[prev] OR base[next] >= end[prev]`.

### Performance

The wide encoding creates 3× wider pointer bitvectors, increasing
the SAT formula size. Typical overhead is 1.5× on the regression
suite. Programs with many integer-to-pointer casts may be slower
due to the backward constraint refinement.

## Command-Line Options

- `--wide-pointer-encoding`: Enable the wide pointer encoding.
- `--model-stack-layout`: Place stack variables adjacently with
architecture-appropriate growth direction (requires
`--wide-pointer-encoding`).
7 changes: 7 additions & 0 deletions doc/man/cbmc.1
Original file line number Diff line number Diff line change
Expand Up @@ -374,6 +374,13 @@ diagnostic information
show points\-to sets for
pointer dereference. Requires \fB\-\-json\-ui\fR.
.TP
\fB\-\-wide\-pointer\-encoding\fR
use flat addresses for pointer comparison and integer\-to\-pointer casts
.TP
\fB\-\-model\-stack\-layout\fR
place stack variables adjacently with architecture\-appropriate growth
direction (requires \fB\-\-wide\-pointer\-encoding\fR)
.TP
\fB\-\-program\-only\fR
only show program expression
.TP
Expand Down
7 changes: 7 additions & 0 deletions doc/man/jbmc.1
Original file line number Diff line number Diff line change
Expand Up @@ -289,6 +289,13 @@ diagnostic information
show points\-to sets for
pointer dereference. Requires \fB\-\-json\-ui\fR.
.TP
\fB\-\-wide\-pointer\-encoding\fR
use flat addresses for pointer comparison and integer\-to\-pointer casts
.TP
\fB\-\-model\-stack\-layout\fR
place stack variables adjacently with architecture\-appropriate growth
direction (requires \fB\-\-wide\-pointer\-encoding\fR)
.TP
\fB\-\-program\-only\fR
only show program expression
.TP
Expand Down
11 changes: 11 additions & 0 deletions regression/cbmc/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,17 @@ add_test_pl_profile(
"CORE"
)

# Run the whole suite with the 192-bit flat-address pointer encoding. Tests
# whose output legitimately differs (or that are intractable) under this
# encoding are tagged `no-wide-pointer-encoding`; thorough tests are excluded
# as for the other solver variants.
add_test_pl_profile(
"cbmc-wide-pointer-encoding"
"$<TARGET_FILE:cbmc> --wide-pointer-encoding"
"-C;-X;smt-backend;-X;no-wide-pointer-encoding;-X;thorough-paths;-X;thorough-smt-backend;-X;thorough-cprover-smt-backend;-X;thorough-z3-smt-backend;${gcc_only_string}-s;wide-pointer-encoding"
"CORE"
)

# solver appears on the PATH in Windows already
if(NOT "${CMAKE_SYSTEM_NAME}" STREQUAL "Windows")
set_property(
Expand Down
7 changes: 7 additions & 0 deletions regression/cbmc/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,13 @@ test-new-smt-backend:
-X no-new-smt \
-s new-smt-backend $(GCC_ONLY)

test-wide-pointer-encoding:
@../test.pl -e -p -c "../../../src/cbmc/cbmc --wide-pointer-encoding" \
-X smt-backend -X no-wide-pointer-encoding \
-X thorough-paths -X thorough-smt-backend \
-X thorough-cprover-smt-backend -X thorough-z3-smt-backend \
-s wide-pointer-encoding $(GCC_ONLY)

tests.log: ../test.pl test

clean:
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
CORE wide-pointer-encoding
main.i
--wide-pointer-encoding --32 --little-endian
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
^warning: ignoring
15 changes: 15 additions & 0 deletions regression/cbmc/Pointer_array4/test-wide-pointer-encoding.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
CORE no-new-smt broken-smt-backend wide-pointer-encoding
main.i
--wide-pointer-encoding --32
^EXIT=10$
^SIGNAL=0$
^\[main.overflow.1\] line 8 arithmetic overflow on signed - in iThird - iFirst: FAILURE$
^\*\* 1 of 2 failed
^VERIFICATION FAILED$
--
^warning: ignoring
--
Wide pointer encoding: addresses can be in the upper half of the
32-bit address space, so (int)pointer can be negative and the
signed subtraction iThird - iFirst can overflow. This is correct
behavior — casting pointers to signed int is implementation-defined.
2 changes: 1 addition & 1 deletion regression/cbmc/Pointer_array4/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE
CORE no-wide-pointer-encoding
main.i
--32
^EXIT=0$
Expand Down
11 changes: 11 additions & 0 deletions regression/cbmc/Pointer_array6/test-wide-pointer-encoding.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
CORE wide-pointer-encoding
main.c
--wide-pointer-encoding --no-malloc-may-fail
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
--
Wide pointer encoding: byte-level operations on pointer arrays
see the 192-bit encoding instead of the 64-bit flat address.
Requires endianness map fix for pointer byte representation.
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
CORE no-new-smt broken-smt-backend paths-lifo-expected-failure wide-pointer-encoding
main.c
--wide-pointer-encoding --no-malloc-may-fail
^\[main.assertion.3\] line 21 always false for different objects: SUCCESS$
^\[main.assertion.5\] line 28 assertion __CPROVER_POINTER_OBJECT\(p\) == __CPROVER_POINTER_OBJECT\(p1\): FAILURE$
^\*\* 11 of 59 failed
^VERIFICATION FAILED$
^EXIT=10$
^SIGNAL=0$
--
^warning: ignoring
--
Wide pointer encoding with address-based relational comparison: two
distinct live allocations are strictly ordered, so the cross-object
comparison p1 < p2 is provable (assertion.3 SUCCESS, vs FAILURE with the
default encoding). Relational comparison uses flat addresses, so
assuming p < p1+1 does not imply same_object(p, p1) (assertion.5
FAILURE).
2 changes: 1 addition & 1 deletion regression/cbmc/Pointer_comparison3/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE no-new-smt
CORE no-new-smt no-wide-pointer-encoding
main.c
--no-malloc-may-fail
^\[main.assertion.3\] line 21 always false for different objects: FAILURE$
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
CORE broken-smt-backend no-new-smt wide-pointer-encoding
main.c
--wide-pointer-encoding
^EXIT=10$
^SIGNAL=0$
^\[main.assertion.1\] line 6 correct: SUCCESS
^\[main.pointer.1\] line 8 same object violation in array - other_array: FAILURE$
^\[main.assertion.3\] line 13 undefined behavior: FAILURE$
^\[main.assertion.8\] line 28 end plus 2 is nondet: (UNKNOWN|FAILURE)$
^\[main.pointer_arithmetic.\d+\] line 28 pointer relation: pointer outside object bounds in p: FAILURE$
^\*\* [56] of \d+ failed
^VERIFICATION FAILED$
--
^warning: ignoring
^CONVERSION ERROR$
--
Wide pointer encoding variant: fewer overflow failures because
pointer subtraction overflow is checked on the 56-bit offset
component, not the full 64-bit packed representation.
2 changes: 1 addition & 1 deletion regression/cbmc/Pointer_difference2/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE broken-smt-backend no-new-smt
CORE broken-smt-backend no-new-smt no-wide-pointer-encoding
main.c

^\[main.assertion.1\] line 6 correct: SUCCESS
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
KNOWNBUG thorough-paths no-new-smt wide-pointer-encoding
test.c
--wide-pointer-encoding --no-simplify --unwind 300 --object-bits 8
^EXIT=6$
^SIGNAL=0$
--
^warning: ignoring
--
Wide pointer encoding: the object-bits limitation is removed
(full 64-bit object width), so the "too many addressed objects"
error is never triggered. This test is not applicable.
2 changes: 1 addition & 1 deletion regression/cbmc/address_space_size_limit1/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE thorough-paths no-new-smt
CORE thorough-paths no-new-smt no-wide-pointer-encoding
test.c
--no-simplify --unwind 300 --object-bits 8
too many addressed objects
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
CORE wide-pointer-encoding broken-cprover-smt-backend no-new-smt
test_struct_raw_bytes.c
--wide-pointer-encoding
^EXIT=10$
^SIGNAL=0$
^VERIFICATION FAILED$
^\*\* 11 of
--
^.*expecting SUCCESS: FAILURE$
^.*dereference .*: FAILURE$
--
Wide pointer encoding: same results as standard encoding for
struct byte operations.
3 changes: 2 additions & 1 deletion regression/cbmc/memory_allocation1/test.desc
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
CORE broken-smt-backend
CORE broken-smt-backend no-wide-pointer-encoding
main.c

^EXIT=10$
Expand All @@ -12,3 +12,4 @@ main.c
^warning: ignoring
--
This test only fails when using SMT solvers as back-end on Windows.
Broken under --wide-pointer-encoding: __CPROVER_allocated_memory has no body. Needs further work.
Loading
Loading