(RHEL-93425) resolved: fix use-after-free with queries hitting the cache #462

Merged
jamacku merged 4 commits into redhat-plumbers:rhel-8.10.0 from
mrc0mmand:RHEL-93425-resolved-segfault
Jan 30, 2026

mrc0mmand commented Jan 16, 2026

No description provided.

(cherry picked from commit 7877e5c)

Related: RHEL-93425
When dns_transaction_complete() manages to resolve a query, it invalidates the
query candidate object. It shall not be accessed afterwards.

We have the following chain of calls:
dns_query_candidate_go → dns_transaction_go → dns_transaction_prepare → dns_cache_lookup (success: 1)
                                                                      → dns_transaction_complete
After returning back to dns_query_candidate_go(), we'd attempt to continue
iteration over the list of transactions attached to the query candidate,
accessing already freed (and overwritten) memory:

(gdb) bt
0  0x00007f637297cf47 in hashmap_iterate_entry (i=0x7ffe7e15cc90, h=0x706f746b73656465) at ../src/basic/hashmap.c:703
1  _hashmap_iterate (h=0x706f746b73656465, i=i@entry=0x7ffe7e15cc90, value=value@entry=0x7ffe7e15cc88,
    key=key@entry=0x0) at ../src/basic/hashmap.c:712
2  0x00007f637297d01b in set_iterate (s=<optimized out>, i=i@entry=0x7ffe7e15cc90, value=value@entry=0x7ffe7e15cc88)
    at ../src/basic/hashmap.c:733
hence we crash

3  0x0000557bc99eb80f in dns_query_candidate_go (c=c@entry=0x557bcaf86890) at ../src/resolve/resolved-dns-query.c:139
...but c is not valid here in the second iteration of the loop

4  0x0000557bc99eb720 in dns_query_candidate_notify (c=0x557bcaf86890) at ../src/resolve/resolved-dns-query.c:271
c was valid here at entry...

5  0x0000557bc99efe28 in dns_transaction_complete (t=0x557bcac072f0, state=<optimized out>)
    at ../src/resolve/resolved-dns-transaction.c:350
t is a valid transaction (11481 in the backtrace below)

6  0x0000557bc99f1efb in dns_transaction_process_reply (t=0x557bcac072f0, p=<optimized out>)
    at ../src/resolve/resolved-dns-transaction.c:1171
7  0x0000557bc99f2d41 in on_dns_packet (s=<optimized out>, fd=<optimized out>, revents=<optimized out>,
    userdata=0x557bcac072f0) at ../src/resolve/resolved-dns-transaction.c:1223
8  0x00007f6372a25217 in source_dispatch (s=s@entry=0x557bcb162c50) at ../src/libsystemd/sd-event/sd-event.c:3181
9  0x00007f6372a254fd in sd_event_dispatch (e=0x557bcb15b050) at ../src/libsystemd/sd-event/sd-event.c:3620
10 0x00007f6372a267c8 in sd_event_run (e=e@entry=0x557bcb15b050, timeout=timeout@entry=18446744073709551615)
    at ../src/libsystemd/sd-event/sd-event.c:3678
11 0x00007f6372a269ef in sd_event_loop (e=0x557bcb15b050) at ../src/libsystemd/sd-event/sd-event.c:3700
12 0x0000557bc99ddc14 in run (argc=<optimized out>, argv=<optimized out>) at ../src/resolve/resolved.c:92
13 0x0000557bc99d260a in main (argc=<optimized out>, argv=<optimized out>) at ../src/resolve/resolved.c:99

xxx.name.net systemd-resolved[31705]: Got message type=method_call sender=:1.3644 destination=org.freedesktop.resolve1 path=/org/freedesktop/resolve1 interface=org.freedesktop.resolve1.Manager member=ResolveHostname cookie=2 reply_cookie=0 signature=isit error-name=n/a error-message=n/a
xxx.name.net systemd-resolved[31705]: idn2_lookup_u8: xxx → xxx
xxx.name.net systemd-resolved[31705]: Looking up RR for xxx IN A.
xxx.name.net systemd-resolved[31705]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=AddMatch cookie=1102 reply_cookie=0 signature=s error-name=n/a error-message=n/a
xxx.name.net systemd-resolved[31705]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=GetNameOwner cookie=1103 reply_cookie=0 signature=s error-name=n/a error-message=n/a
xxx.name.net systemd-resolved[31705]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.3324 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=1103 signature=s error-name=n/a error-message=n/a
xxx.name.net systemd-resolved[31705]: Cache miss for xxx.name.net IN A
xxx.name.net systemd-resolved[31705]: Transaction 11481 for <xxx.name.net IN A> scope dns on enp42s0/*.
xxx.name.net systemd-resolved[31705]: Using feature level UDP for transaction 11481.
xxx.name.net systemd-resolved[31705]: Using DNS server 192.168.1.1 for transaction 11481.
xxx.name.net systemd-resolved[31705]: Sending query packet with id 11481 of size 35.
xxx.name.net systemd-resolved[31705]: Got message type=method_return sender=org.freedesktop.DBus destination=:1.3324 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=1102 signature= error-name=n/a error-message=n/a
xxx.name.net systemd-resolved[31705]: Match type='signal',sender='org.freedesktop.DBus',path='/org/freedesktop/DBus',interface='org.freedesktop.DBus',member='NameOwnerChanged',arg0=':1.3644' successfully installed.
xxx.name.net systemd-resolved[31705]: Processing incoming packet on transaction 11481 (rcode=NXDOMAIN).
xxx.name.net systemd-resolved[31705]: Not caching negative entry without a SOA record: xxx.name.net IN A
xxx.name.net systemd-resolved[31705]: Transaction 11481 for <xxx.name.net IN A> on scope dns on enp42s0/* now complete with <rcode-failure> from network (unsigned).
xxx.name.net systemd-resolved[31705]: Positive cache hit for xxx.lan IN A
xxx.name.net systemd-resolved[31705]: Transaction 64364 for <xxx.lan IN A> on scope dns on enp42s0/* now complete with <success> from cache (unsigned).
xxx.name.net systemd-resolved[31705]: Sent message type=method_return sender=n/a destination=:1.3644 path=n/a interface=n/a member=n/a cookie=1104 reply_cookie=2 signature=a(iiay)st error-name=n/a error-message=n/a
xxx.name.net systemd-resolved[31705]: Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=RemoveMatch cookie=1105 reply_cookie=0 signature=s error-name=n/a error-message=n/a
xxx.name.net systemd-resolved[31705]: Freeing transaction 64364.
xxx.name.net systemd[1]: systemd-resolved.service: Main process exited, code=dumped, status=11/SEGV
xxx.name.net systemd[1]: systemd-resolved.service: Failed with result 'core-dump'.

Fixes #16168, https://bugzilla.redhat.com/show_bug.cgi?id=1895937.

(cherry picked from commit 4ea8b44)

Resolves: RHEL-93425
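
The failure mode described in the commit message above (a loop over a candidate's transactions in which one iteration synchronously tears the candidate down) can be modelled in a few lines. The following is a simplified, hypothetical sketch and not the systemd code or the backported patch: `Candidate`, `Owner`, `candidate_go()` and the other names are invented for illustration, and pinning the object with a reference count for the duration of the loop is just one common way to guard against this class of bug. The thread below discusses both ref-counting and non-ref-counting approaches, and this sketch does not claim to reproduce either.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not systemd identifiers. */

typedef struct Transaction {
    bool cached;   /* true: the answer comes synchronously from the cache */
    bool started;
} Transaction;

typedef struct Candidate {
    unsigned n_ref;
    size_t n_transactions;
    Transaction *transactions;
} Candidate;

typedef struct Owner {
    Candidate *candidate;   /* the owner holds one reference */
    bool answered;
} Owner;

static Candidate *candidate_new(size_t n) {
    assert(n > 0);
    Candidate *c = calloc(1, sizeof(Candidate));
    if (!c)
        return NULL;
    c->transactions = calloc(n, sizeof(Transaction));
    if (!c->transactions) {
        free(c);
        return NULL;
    }
    c->n_ref = 1;
    c->n_transactions = n;
    return c;
}

static Candidate *candidate_ref(Candidate *c) {
    assert(c && c->n_ref > 0);
    c->n_ref++;
    return c;
}

/* Drops one reference and frees on the last one; always returns NULL. */
static Candidate *candidate_unref(Candidate *c) {
    if (!c)
        return NULL;
    assert(c->n_ref > 0);
    if (--c->n_ref == 0) {
        free(c->transactions);
        free(c);
    }
    return NULL;
}

/* Models the completion path: the owner has its answer and drops the candidate. */
static void owner_complete(Owner *o) {
    o->answered = true;
    o->candidate = candidate_unref(o->candidate);
}

/* Starts a transaction; a cache hit completes inside this very call. */
static void transaction_go(Owner *o, Transaction *t) {
    t->started = true;
    if (t->cached)
        owner_complete(o);
}

/* Starts all transactions and returns how many were started. */
static size_t candidate_go(Owner *o) {
    /* Pin the candidate. Without this extra reference, the loop condition
     * below would read freed memory right after a cache hit. */
    Candidate *c = candidate_ref(o->candidate);
    size_t started = 0;

    for (size_t i = 0; i < c->n_transactions; i++) {
        transaction_go(o, &c->transactions[i]);
        started++;
        if (o->answered)
            break;   /* the owner is done; nothing left to start */
    }

    candidate_unref(c);   /* the last reference may be dropped here, after the loop */
    return started;
}
```

With three transactions of which the second is a cache hit, `candidate_go()` starts two of them, the owner's pointer is reset to NULL by the completion path, and the memory is only released by the final `candidate_unref()` after the loop has finished.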
@mrc0mmand mrc0mmand force-pushed the RHEL-93425-resolved-segfault branch from 73ac4f8 to 0e5af2b Compare January 16, 2026 12:41
@github-actions github-actions bot added the tracker/missing (formerly needs-bz), pr/needs-ci (formerly needs-ci), and pr/needs-review (formerly needs-review) labels Jan 16, 2026

github-actions bot commented Jan 16, 2026

Commit validation

Tracker - RHEL-93425

The following commits meet all requirements

| commit | upstream |
| --- | --- |
| 4d56ec8 - resolved: add dns_query_candidate_freep() | systemd/systemd@7877e5c |
| eec31ef - resolved: fix use-after-free with queries hitting the cache | systemd/systemd@4ea8b44 |
| 0243672 - resolve: exit from loop for transactions when transactions has been re… | systemd/systemd@5814acc |
| 19086f7 - locale-util: do not call setlocale() when multi-threaded | systemd/systemd@ca13432 |

Tracker validation

Success

🟢 Tracker RHEL-93425 has set desired product: rhel-8.10.z
🟢 Tracker RHEL-93425 has set desired component: systemd
🟢 Tracker RHEL-93425 has been approved
🟢 Tracker RHEL-93425 has set severity


Pull Request validation

Success

🟢 CI - All checks have passed
🟢 Review - Reviewed by a member
🟢 Approval - Changes were approved


Auto Merge

Failed

🔴 Pull Request has unsupported target branch rhel-8.10.0, expected branches are: 'main,master'

Success

🟢 Pull Request is not marked as draft and it's not blocked by dont-merge label
🟢 Pull Request meet requirements, title has correct form
🟢 Pull Request meet requirements, mergeable is true
🟢 Pull Request meet requirements, mergeable_state is clean

@github-actions github-actions bot removed the tracker/missing (formerly needs-bz) label Jan 16, 2026
@github-actions github-actions bot changed the title from "resolved: fix use-after-free with queries hitting the cache" to "(RHEL-93425) resolved: fix use-after-free with queries hitting the cache" Jan 16, 2026
@mrc0mmand mrc0mmand marked this pull request as ready for review January 16, 2026 16:07

dtardon commented Jan 20, 2026

I'd do a minimal change and scrap the last 2 commits... IMHO they just complicate the situation, as they introduce a second mechanism to DnsQuery to keep it from being destroyed (the block_ready counter is still there).

mrc0mmand commented

> I'd do a minimal change and scrap the last 2 commits... IMHO they just complicate the situation, as they introduce a second mechanism to DnsQuery to keep it from being destroyed (the block_ready counter is still there).

I backported the other two patches because, at least according to systemd/systemd#18290 (comment), there is another issue that the non-ref-counting approach did not solve. However, I have not been able to reproduce it, at least not yet; the first two patches seem to be enough to make the customer issue go away.

@mrc0mmand mrc0mmand force-pushed the RHEL-93425-resolved-segfault branch from 0e5af2b to eec31ef Compare January 20, 2026 18:52
@github-actions github-actions bot added and removed the pr/needs-ci (formerly needs-ci) label Jan 20, 2026
@jamacku jamacku requested a review from dtardon January 21, 2026 07:49

dtardon commented Jan 21, 2026

> I'd do a minimal change and scrap the last 2 commits... IMHO they just complicate the situation, as they introduce a second mechanism to DnsQuery to keep it from being destroyed (the block_ready counter is still there).
>
> I backported the other two patches because, at least according to systemd/systemd#18290 (comment), there is another issue that the non-ref-counting approach did not solve. However, I have not been able to reproduce it, at least not yet; the first two patches seem to be enough to make the customer issue go away.

Well, Zbyszek very likely understands the code much better than I do... However, I suspect what's really missing is systemd/systemd#37462.

…egenerated

Fixes #37458.

(cherry picked from commit 5814acca9aa4354d121de4bf174851f092a6b643)

Related: RHEL-93425
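
The commit title above (truncated in this page) describes exiting the transaction loop once the set of transactions has been regenerated. How the upstream commit detects that is not shown in this thread. As a generic illustration only, one way to notice that a container was replaced while a callback ran is a generation counter, sketched below with hypothetical names (`TransactionSet`, `set_regenerate()`, `set_visit()`); this is an assumption for illustration, not the mechanism used in systemd.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container: an array of transaction ids plus a counter that is
 * bumped every time the array is replaced. */
typedef struct TransactionSet {
    int *ids;
    size_t n;
    unsigned generation;
} TransactionSet;

/* Replaces the contents with n consecutive ids starting at first_id.
 * Returns 0 on success, -1 on allocation failure. */
static int set_regenerate(TransactionSet *s, size_t n, int first_id) {
    assert(s && n > 0);
    int *ids = calloc(n, sizeof(int));
    if (!ids)
        return -1;
    for (size_t i = 0; i < n; i++)
        ids[i] = first_id + (int) i;
    free(s->ids);   /* the old array is gone from here on */
    s->ids = ids;
    s->n = n;
    s->generation++;
    return 0;
}

/* Calls visit() for each id and returns the number of calls made. Stops as
 * soon as a callback has regenerated the set, because the index and the old
 * array no longer describe the current contents. */
static size_t set_visit(TransactionSet *s, void (*visit)(TransactionSet *, int)) {
    unsigned generation = s->generation;
    size_t visited = 0;

    for (size_t i = 0; i < s->n; i++) {
        visit(s, s->ids[i]);
        visited++;
        if (s->generation != generation)
            break;
    }
    return visited;
}

/* Example callback: regenerates the set when it sees id 2. */
static void regenerate_on_second(TransactionSet *s, int id) {
    if (id == 2)
        (void) set_regenerate(s, 5, 100);
}
```

Starting from ids 1, 2, 3 and visiting with `regenerate_on_second`, the loop stops after the second callback instead of indexing into the freshly allocated array with a stale position.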
@github-actions github-actions bot added the pr/needs-ci (formerly needs-ci) label Jan 26, 2026
mrc0mmand commented

> I'd do a minimal change and scrap the last 2 commits... IMHO they just complicate the situation, as they introduce a second mechanism to DnsQuery to keep it from being destroyed (the block_ready counter is still there).
>
> I backported the other two patches because, at least according to systemd/systemd#18290 (comment), there is another issue that the non-ref-counting approach did not solve. However, I have not been able to reproduce it, at least not yet; the first two patches seem to be enough to make the customer issue go away.
>
> Well, Zbyszek very likely understands the code much better than I do... However, I suspect what's really missing is systemd/systemd#37462.

Let's go with that. I can't reproduce the original issue with the current patch set, and everything still seems to work.

And, as is customary, the CI managed to trigger a completely unrelated bug:

>>> PATH=/build/build:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin SYSTEMD_KBD_MODEL_MAP=/build/src/locale/kbd-model-map MALLOC_PERTURB_=25 SYSTEMD_LANGUAGE_FALLBACK_MAP=/build/src/locale/language-fallback-map /build/build/test-bus-watch-bind
 ✀  
stderr:
Initializing server
Initializing client1
Initializing client2
Bus client2: changing state UNSET → OPENING
Added inotify watch for /dev on bus client2: 2
Added inotify watch for /dev/shm on bus client2: 3
Added inotify watch for /dev/shm/systemd-watch-bind-tBWLuu on bus client2: 4
Added inotify watch for /dev/shm/systemd-watch-bind-tBWLuu/this on bus client2: -1
Bus client2: changing state OPENING → WATCH_BIND
=================================================================
==7942==ERROR: AddressSanitizer: heap-use-after-free on address 0x60f000000040 at pc 0x7f145ad30615 bp 0x7f144e7eb870 sp 0x7f144e7eb018
WRITE of size 2 at 0x60f000000040 thread T2
    #0 0x7f145ad30614  (/lib64/libasan.so.5+0x43614)
    #1 0x7f1459ca3f53 in is_locale_utf8 ../src/basic/locale-util.c:237
    #2 0x7f1459ca4c92 in special_glyph ../src/basic/locale-util.c:445
    #3 0x7f1459e38878 in bus_set_state ../src/libsystemd/sd-bus/sd-bus.c:522
    #4 0x7f1459e404fa in sd_bus_start ../src/libsystemd/sd-bus/sd-bus.c:1170
    #5 0x5561e1c0a312 in thread_client1 ../src/libsystemd/sd-bus/test-bus-watch-bind.c:141
    #6 0x7f14584c61c9 in start_thread (/lib64/libpthread.so.0+0x81c9)
    #7 0x7f14581218d2 in __GI___clone (/lib64/libc.so.6+0x398d2)

0x60f000000040 is located 0 bytes inside of 167-byte region [0x60f000000040,0x60f0000000e7)
freed by thread T3 here:
    #0 0x7f145addc7f0 in __interceptor_free (/lib64/libasan.so.5+0xef7f0)
    #1 0x7f145812c58a in __GI_setlocale (/lib64/libc.so.6+0x4458a)
    #2 0x7f14582803c4  (/lib64/libc.so.6+0x1983c4)

previously allocated by thread T2 here:
    #0 0x7f145addcbb8 in __interceptor_malloc (/lib64/libasan.so.5+0xefbb8)
    #1 0x7f145812bf27 in new_composite_name (/lib64/libc.so.6+0x43f27)

Thread T2 created by T0 here:
    #0 0x7f145ad3feb3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52eb3)
    #1 0x5561e1c0c427 in main ../src/libsystemd/sd-bus/test-bus-watch-bind.c:212
    #2 0x7f14581227e4 in __libc_start_main (/lib64/libc.so.6+0x3a7e4)

Thread T3 created by T0 here:
    #0 0x7f145ad3feb3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52eb3)
    #1 0x5561e1c0c47f in main ../src/libsystemd/sd-bus/test-bus-watch-bind.c:213
    #2 0x7f14581227e4 in __libc_start_main (/lib64/libc.so.6+0x3a7e4)

SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x43614) 
Shadow bytes around the buggy address:
  0x0c1e7fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c1e7fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c1e7fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c1e7fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c1e7fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c1e7fff8000: fa fa fa fa fa fa fa fa[fd]fd fd fd fd fd fd fd
  0x0c1e7fff8010: fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa
  0x0c1e7fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c1e7fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c1e7fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c1e7fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==7942==ABORTING

Which seems to be the same issue as in systemd/systemd#30141. I'll backport the fix (systemd/systemd@ca13432) on top of this branch, as it's pretty simple & straightforward.

Fixes #30141.

(cherry picked from commit ca13432d600593b8eda76721118763b63746eb33)

Related: RHEL-93425
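
The ASan report above shows one thread reading locale data inside is_locale_utf8() while another thread's setlocale() call frees it. The backported commit addresses this by not calling setlocale() when the process is multi-threaded; its exact implementation is not shown in this thread. Purely as an illustration of querying the character set without touching the process-wide locale, the sketch below uses a different technique, the POSIX.1-2008 per-object locale API (`newlocale()`, `nl_langinfo_l()`, `freelocale()`). The helper name `locale_name_is_utf8()` is hypothetical, and the code assumes a glibc-like libc where these functions are declared in `<locale.h>` and `<langinfo.h>`.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: reports whether the named locale uses the UTF-8
 * codeset. It builds a private locale object instead of calling setlocale(),
 * so the global locale that other threads may be using is left alone. */
static bool locale_name_is_utf8(const char *name) {
    assert(name);

    locale_t loc = newlocale(LC_CTYPE_MASK, name, (locale_t) 0);
    if (loc == (locale_t) 0)
        return false;   /* locale could not be created: answer conservatively */

    const char *codeset = nl_langinfo_l(CODESET, loc);
    bool utf8 = codeset && strcmp(codeset, "UTF-8") == 0;

    freelocale(loc);
    return utf8;
}
```

Passing an empty string as the name makes `newlocale()` pick the locale from the environment (LC_ALL, LC_CTYPE, LANG), which is the case a caller such as a glyph-selection routine would typically care about.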
@github-actions github-actions bot removed the pr/needs-ci (formerly needs-ci) label Jan 26, 2026
dtardon left a comment

LGTM

@github-actions github-actions bot removed the pr/needs-review (formerly needs-review) label Jan 28, 2026
@jamacku jamacku merged commit e4279ca into redhat-plumbers:rhel-8.10.0 Jan 30, 2026
9 checks passed