assert failed: cx_core.is_none() #5239

@koivunej

Version

tokio 1.21.1, with grep results prettified:

We use Rust 1.62.1.

Platform

Initially detected from a CI run on:

  • amazon ec2
  • Linux hostname 5.10.144-127.601.amzn2.x86_64 #1 SMP Thu Sep 29 01:11:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Reproduced on:

  • ubuntu 22.04
  • Linux hostname 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
  • AMD Ryzen 9 3900XT 12-Core Processor

Description

Assertion failure message with release build:

thread 'mgmt request worker' panicked at 'assertion failed: cx_core.is_none()', /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/scheduler/multi_thread/worker.rs:263:21

In our codebase this is reproducible for me locally under load after a few tries. By load I mean, for example, while true; do cargo clean && cargo build; done running in the background. I can try out some patches if needed.

I haven't been able to find an MVCE.

Full steps to reproduce in our codebase
# install all dependencies from repository README.md:
# https://github.com/koivunej/neon/tree/tokio_assertion_failure#running-local-installation

git clone --recursive --branch tokio_assertion_failure https://github.com/koivunej/neon.git

# release build is needed to reproduce
BUILD_TYPE=release CARGO_BUILD_FLAGS="--features=testing,profiling" make -s -j4

# install more dependencies
PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring ./scripts/pysync

# add external load in another terminal to this
while true; do NEON_BIN=target/release ./scripts/pytest test_runner/regress/test_gc_aggressive.py::test_gc_aggressive; done

Expect to see:

FAILED test_runner/regress/test_gc_aggressive.py::test_gc_aggressive - requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Then you will find the assertion failure in test_output/test_gc_aggressive/repo/pageserver.log. I have also copied the full stack trace below.
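The manual loop plus log inspection above could be automated roughly as follows. This is only a sketch: the test invocation, the log path, and the panic message are copied from this report, while the function names `has_tokio_assertion` and `repro_until_panic` and the default of 50 tries are hypothetical. It also assumes each test run rewrites the pageserver log, so a stale log from an earlier run would give a false positive.

```shell
# Panic message to look for, as it appears in the report.
PANIC_MSG='assertion failed: cx_core.is_none()'

# Succeeds (exit status 0) if the given log file exists and contains the panic message.
has_tokio_assertion() {
    # -F matches a fixed string; the message contains regex metacharacters.
    [ -f "$1" ] && grep -qF -- "$PANIC_MSG" "$1"
}

# Reruns the test up to $1 times (default 50, an arbitrary choice) and stops
# at the first run after which the log ($2) contains the assertion failure.
repro_until_panic() {
    max_tries="${1:-50}"
    log_file="${2:-test_output/test_gc_aggressive/repo/pageserver.log}"
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # The test is expected to fail when the panic happens, so ignore its status.
        NEON_BIN=target/release ./scripts/pytest \
            test_runner/regress/test_gc_aggressive.py::test_gc_aggressive || true
        if has_tokio_assertion "$log_file"; then
            echo "assertion failure reproduced on try $try"
            return 0
        fi
        try=$((try + 1))
    done
    echo "no assertion failure after $max_tries tries"
    return 1
}
```

With the external load still running in another terminal, calling `repro_until_panic` from the repository root would stop at the first run that hits the assertion, leaving the log in place for inspection.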

RUST_BACKTRACE=full of the assertion failure
thread 'mgmt request worker' panicked at 'assertion failed: cx_core.is_none()', /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/scheduler/multi_thread/worker.rs:263:21
stack backtrace:
   0:     0x56374720a37d - std::backtrace_rs::backtrace::libunwind::trace::h8e036432725b1c57
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:     0x56374720a37d - std::backtrace_rs::backtrace::trace_unsynchronized::h4f83092254c85869
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x56374720a37d - std::sys_common::backtrace::_print_fmt::h9728b5e056a3ece3
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/sys_common/backtrace.rs:66:5
   3:     0x56374720a37d - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h48bb4bd2928827d2
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/sys_common/backtrace.rs:45:22
   4:     0x563747232e9c - core::fmt::write::h909e69a2c24f44cc
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/fmt/mod.rs:1196:17
   5:     0x563747202061 - std::io::Write::write_fmt::h7f4b8ab8af89e9ef
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/io/mod.rs:1654:15
   6:     0x56374720bcf5 - std::sys_common::backtrace::_print::hff4838ebf14a2171
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/sys_common/backtrace.rs:48:5
   7:     0x56374720bcf5 - std::sys_common::backtrace::print::h2499280374189ad9
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/sys_common/backtrace.rs:35:9
   8:     0x56374720bcf5 - std::panicking::default_hook::{{closure}}::h8b270fc55eeb284e
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:295:22
   9:     0x56374720b969 - std::panicking::default_hook::h3217e229d6e9d13c
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:314:9
  10:     0x56374720c3d8 - std::panicking::rust_panic_with_hook::h9acb8048b738d2e0
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:698:17
  11:     0x56374720c249 - std::panicking::begin_panic_handler::{{closure}}::h70f3b839526af6dc
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:586:13
  12:     0x56374720a834 - std::sys_common::backtrace::__rust_end_short_backtrace::h1ecf2cee857fbe0a
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/sys_common/backtrace.rs:138:18
  13:     0x56374720bfb9 - rust_begin_unwind
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:584:5
  14:     0x563747230b63 - core::panicking::panic_fmt::h9f8393e7fd56d655
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:142:14
  15:     0x5637472309ad - core::panicking::panic::h021666fc6a0f7b6b
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:48:5
  16:     0x56374710c22b - <tokio::runtime::scheduler::multi_thread::worker::block_in_place::Reset as core::ops::drop::Drop>::drop::{{closure}}::hd65847a1090ca025
  17:     0x5637471062c5 - <tokio::runtime::scheduler::multi_thread::worker::block_in_place::Reset as core::ops::drop::Drop>::drop::h42ae149038909fb7
  18:     0x56374697512e - core::ptr::drop_in_place<tokio::runtime::scheduler::multi_thread::worker::block_in_place::Reset>::h1e6f731fa79d34ba
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/ptr/mod.rs:486:1
  19:     0x56374697512e - tokio::runtime::scheduler::multi_thread::worker::block_in_place::hda495eb5ef5a1acd
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/scheduler/multi_thread/worker.rs:340:5
  20:     0x563746b08340 - tokio::task::blocking::block_in_place::ha97b73b75ce70862
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/task/blocking.rs:77:9
  21:     0x563746b08340 - pageserver::tenant::Tenant::gc_iteration::{{closure}}::hcc45b24d96148799
                               at /home/joonas/src/neon/neon/pageserver/src/tenant.rs:530:9
  22:     0x563746b08340 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h308288025478c0c0
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
  23:     0x56374687c68c - <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll::hda287b8f128780d0
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tracing-0.1.37/src/instrument.rs:272:9
  24:     0x563746b0fcfd - pageserver::http::routes::timeline_gc_handler::{{closure}}::h0e56b6cccdfe75f6
                               at /home/joonas/src/neon/neon/pageserver/src/http/routes.rs:849:91
  25:     0x563746b0fcfd - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h4dee783785ea8184
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
  26:     0x56374678f3b7 - <core::pin::Pin<P> as core::future::future::Future>::poll::h5dbc8583f5dbf765
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/future.rs:124:9
  27:     0x56374678f3b7 - routerify::route::Route<B,E>::process::{{closure}}::h7fffd52673600116
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/routerify-3.0.0/src/route/mod.rs:105:32
  28:     0x56374678f3b7 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::ha39752ecfad407be
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
  29:     0x56374678f3b7 - routerify::router::Router<B,E>::process::{{closure}}::hc3d490240cd467ff
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/routerify-3.0.0/src/router/mod.rs:308:89
  30:     0x56374678f3b7 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h88afc17f6a7162c2
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
  31:     0x56374678f3b7 - <routerify::service::request_service::RequestService<B,E> as tower_service::Service<http::request::Request<hyper::body::body::Body>>>::call::{{closure}}::hf419aede28588ee7
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/routerify-3.0.0/src/service/request_service.rs:56:72
  32:     0x5637467b93a5 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h2de32919bd847725
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
  33:     0x5637467d596e - <core::pin::Pin<P> as core::future::future::Future>::poll::h3faa950168332df5
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/future.rs:124:9
  34:     0x5637467d596e - <hyper::proto::h1::dispatch::Server<S,hyper::body::body::Body> as hyper::proto::h1::dispatch::Dispatch>::poll_msg::hd5117f65306c4294
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/proto/h1/dispatch.rs:491:35
  35:     0x5637467d596e - hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_write::hc55c2ea65eaff573
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/proto/h1/dispatch.rs:297:43
  36:     0x5637467d596e - hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_loop::h214e07f7181a2707
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/proto/h1/dispatch.rs:161:21
  37:     0x5637467d30fd - hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_inner::h2b3d24b8f8211935
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/proto/h1/dispatch.rs:137:16
  38:     0x5637467d30fd - hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T>::poll_catch::hfead020b3bd85cd6
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/proto/h1/dispatch.rs:120:28
  39:     0x5637466f9f52 - <hyper::proto::h1::dispatch::Dispatcher<D,Bs,I,T> as core::future::future::Future>::poll::hb9d39bd98e716b09
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/proto/h1/dispatch.rs:424:9
  40:     0x5637466f9f52 - <hyper::server::conn::ProtoServer<T,B,S,E> as core::future::future::Future>::poll::h7665d21f4b883402
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/server/conn.rs:952:47
  41:     0x5637466f9f52 - <hyper::server::conn::upgrades::UpgradeableConnection<I,S,E> as core::future::future::Future>::poll::hb96f5473d0574cb8
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/server/conn.rs:1012:30
  42:     0x56374671e6bf - <hyper::common::drain::Watching<F,FN> as core::future::future::Future>::poll::hf0c8ec2a7a8ed8b0
  43:     0x56374671e6bf - <hyper::server::server::new_svc::NewSvcTask<I,N,S,E,W> as core::future::future::Future>::poll::h846866b9a0929fda
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/hyper-0.14.20/src/server/server.rs:728:36
  44:     0x5637467f65e7 - tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}::h9a58eefb1d854ebe
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/task/core.rs:184:17
  45:     0x5637467f65e7 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::hbd0e5f206f1f3f6f
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/loom/std/unsafe_cell.rs:14:9
  46:     0x5637467f65e7 - tokio::runtime::task::core::CoreStage<T>::poll::hbee48de80c4fcccd
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/task/core.rs:174:13
  47:     0x56374685c61a - tokio::runtime::task::harness::poll_future::{{closure}}::h7ca64421cdeddcb2
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/task/harness.rs:480:19
  48:     0x56374685c61a - <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h2de3a15ff26ba160
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panic/unwind_safe.rs:271:9
  49:     0x56374685c61a - std::panicking::try::do_call::hf6d7a880e62abda6
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:492:40
  50:     0x56374685c61a - std::panicking::try::h531c1d3ec5cbe2b2
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:456:19
  51:     0x56374685c61a - std::panic::catch_unwind::h4f0af80b22a9de64
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panic.rs:137:14
  52:     0x56374685c61a - tokio::runtime::task::harness::poll_future::h57ec7dda84531f03
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/task/harness.rs:468:18
  53:     0x56374685c61a - tokio::runtime::task::harness::Harness<T,S>::poll_inner::heca3dd74238bdd7e
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/task/harness.rs:104:27
  54:     0x56374685c61a - tokio::runtime::task::harness::Harness<T,S>::poll::he0f319957dba656d
                               at /home/joonas/.cargo/registry/src/github.zerozr99.workers.dev-1ecc6299db9ec823/tokio-1.21.1/src/runtime/task/harness.rs:57:15
  55:     0x5637470e35c5 - std::thread::local::LocalKey<T>::with::h38aaa913b8a48d65
  56:     0x563747107563 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::hf28064e32e379826
  57:     0x563747106a80 - tokio::runtime::scheduler::multi_thread::worker::Context::run::hec211607b213b37b
  58:     0x56374710a7b7 - tokio::macros::scoped_tls::ScopedKey<T>::set::hd7166d6799738ff0
  59:     0x5637471064a9 - tokio::runtime::scheduler::multi_thread::worker::run::h958f4678849dd1fe
  60:     0x5637470f575c - <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll::h0ab71826e7387519
  61:     0x5637470da7e9 - tokio::runtime::task::harness::Harness<T,S>::poll::h091e55b483c30575
  62:     0x5637470f4f5a - tokio::runtime::blocking::pool::Inner::run::h3a91a3d2536a1c92
  63:     0x5637470e6df2 - std::sys_common::backtrace::__rust_begin_short_backtrace::h6a13e50bb80c5a9b
  64:     0x5637470e751f - core::ops::function::FnOnce::call_once{{vtable.shim}}::h81568063c1016e71
  65:     0x563747212053 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h191d5c5ea3edb31d
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872:9
  66:     0x563747212053 - <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once::h42ef7cb2ae640a31
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/alloc/src/boxed.rs:1872:9
  67:     0x563747212053 - std::sys::unix::thread::Thread::new::thread_start::he47f7169665dab60
                               at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/sys/unix/thread.rs:108:17
  68:     0x7f3b56aacb43 - start_thread
                               at ./nptl/./nptl/pthread_create.c:442:8
  69:     0x7f3b56b3ea00 - clone3
                               at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
  70:                0x0 - <unknown>

In this branch, I've sprinkled a lot of block_in_place calls around the blocking parts after I ran into a deadlock caused by the unstealable LIFO slot, because there was blocking within the runtime threads. It is unlikely that I've caught every place that blocks within an async context.

If I ended up misusing block_in_place and block_on, then I wish the assertion had a clear message about the misuse. However, since it only triggers under external load (and while being nice -20), I suspect it is a real tokio issue.

Labels: A-tokio (Area: The main tokio crate), C-bug (Category: This is a bug.)
