Skip to content
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 19 additions & 3 deletions lld/ELF/InputSection.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,8 +20,6 @@
#include "llvm/Support/Compression.h"
#include "llvm/Support/Endian.h"
#include "llvm/Support/xxhash.h"
#include <algorithm>
#include <mutex>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep the headers. mutex is used by std::mutex in this file.

#include <vector>

using namespace llvm;
Expand Down Expand Up @@ -888,6 +886,7 @@ void InputSection::relocateNonAlloc(uint8_t *buf, ArrayRef<RelTy> rels) {
const bool isDebug = isDebugSection(*this);
const bool isDebugLocOrRanges =
isDebug && (name == ".debug_loc" || name == ".debug_ranges");
const bool isDebugNames = isDebug && name == ".debug_names";
const bool isDebugLine = isDebug && name == ".debug_line";
std::optional<uint64_t> tombstone;
for (const auto &patAndValue : llvm::reverse(config->deadRelocInNonAlloc))
Expand Down Expand Up @@ -918,7 +917,8 @@ void InputSection::relocateNonAlloc(uint8_t *buf, ArrayRef<RelTy> rels) {
continue;

if (tombstone ||
(isDebug && (type == target.symbolicRel || expr == R_DTPREL))) {
((isDebug && (type == target.symbolicRel || expr == R_DTPREL))) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to know from @MaskRay (or anyone else) why this type == target.symbolicRel is here, and if there's some way we can generalize it to cover the .debug_names case we're interested in (a R_X86_64_32 relocation) without needing to special case isDebugNames in this condition

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. :D As I mentioned, for DWARF64 isDebugNames check is not necessary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're waiting for an expert opinion, if youre bored, you could try removing the type == target.symbolicRel check and running the lld tests, about 5 tests fail - investigating one fo those failures to see what goes wrong in the absence of that condition might help point the way to understanding how to generalize the condition enough to cover the .debug_names situation, without generalizing too far and breaking those tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well basically that condition checks if relocation is 64bit: R_X86_64_64 (type == 1).
Primarily used in .debug_addr. Incidentally why DWARF64 tests works without adding isDebugNames to that condition. since for .debug_names relocation type is controlled by DWARF32 vs DWARF64.
Slightly more interesting case is expr == R_DTPREL. Since pretty much all relocs in debug sections are ABS. Looks like that relocation is used to reference TLS symbols. Learned something new today.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, mechanically I understand what the check does.

What I mean is: Why is this check necessary? What bad things happen (some tests fail, but I haven't looked closely at what they're testing or what the bad behavior looks like) when this check fails/isn't checked? And why is it OK to ignore this check in the .debug_names case? And could we look at that reason and generalize the check so it's not specifically about .debug_names, but about whatever general property is true in the .debug_names case and would be true in other cases?

Basically, it seems the type == target.symbolicRel check is overly restrictive (we have at least one counter example - the .debug_names R_X86_64_32 situation) and I'd like to know ideally how to generically relax this check to account for our situation, but also others that may come up in the future/just not to have a weird .debug_names special case that we don't really understand/can't explain why it's acceptable here, but not there - and layering special cases makes the code harder to understand and maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what you mean. Yeah was going to dig in further today and see how to generalize. You are right it is over restrictive. Ideally it probably should be something like if (tombstone || debug), and then some logic depending on relocation type and probably section. I am hesitant to change current defaults due to downstream tools (cough lldb), breaking or something.

Copy link
Contributor Author

@ayermolo ayermolo Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I think I understand the history here....
My resources:
https://sourceware.org/pipermail/binutils/2020-May/111357.html
https://lists.llvm.org/pipermail/llvm-dev/2020-May/141885.html
https://reviews.llvm.org/D81784
https://reviews.llvm.org/D84825

TLDR on this whole thing. This code primarily deals with an issue of what to do with symbols in debug sections when text sections they reference get GC/ICF. Those addresses need to be tombsoned. The 0x0 doesn't work in .debug_ranges/.debug_loc because before DWARF5 0x0,0x0 is how those lists ended. The -1 doesn't work because it has special meaning. So first -2 was implemented, but that had issues also, so then it was changed to 1 for loc/ranges and 0 for "everything else".

Except since relocation type allowed (type == target.symbolicRel) is 64 bit "everything else" is .debug_addr section and .debug_info with form DW_FORM_addr. Since in 32 bit dwarf only thing that is 64bit are addresses in .text section. At some point expr == R_DTPREL was added for TLS stuff.

The 32bit relocs were left alone to follow the same path as others and get assigned 0x0 + addend. Seems like mostly for "if it's not broken" reasons.
Although for non-debug non-alloc can have pointer subtraction:
https://lists.llvm.org/pipermail/llvm-dev/2020-May/141918.html

Anyway. IF my understanding is correct I don't see why we can't generalize for all debug sections to

  1. Tombstone to 1 for ranges/loc
  2. MAXINT for .debug_names
  3. 0 for everything else.

This will leave current behavior for cases that really matter, GC/ICF, and for relative relocations (32 bit outside of .debug_names), I don't think it should matter?

Hopefully I didn't miss anything. :)

@MaskRay WDYT?

isDebugNames) {
// Resolve relocations in .debug_* referencing (discarded symbols or ICF
// folded section symbols) to a tombstone value. Resolving to addend is
// unsatisfactory because the result address range may collide with a
Expand Down Expand Up @@ -947,6 +947,22 @@ void InputSection::relocateNonAlloc(uint8_t *buf, ArrayRef<RelTy> rels) {
//
// TODO To reduce disruption, we use 0 instead of -1 as the tombstone
// value. Enable -1 in a future release.

if (isDebugNames && dyn_cast<Undefined>(&sym)) {
uint64_t maxVal = 0;
switch (type) {
case R_X86_64_64:
maxVal = llvm::maxUIntN(64);
break;
case R_X86_64_32:
maxVal = llvm::maxUIntN(32);
break;
default:
llvm_unreachable("Unsupported relocation type in .debug_names.");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be nice to hear from @MaskRay or otherwise about whether we can generalize this in some way.

@ayermolo perhaps you can check the commit history to see how the -1 as tombstone worked - it was committed a few years back, so there might be some hint about how to do that that doesn't require special casing each relocation

target.relocateNoSym(bufLoc, type, SignExtend64<bits>(maxVal));
continue;
}
auto *ds = dyn_cast<Defined>(&sym);
if (!sym.getOutputSection() || (ds && ds->folded && !isDebugLine)) {
// If -z dead-reloc-in-nonalloc= is specified, respect it.
Expand Down
397 changes: 397 additions & 0 deletions lld/test/ELF/Inputs/dwarf5-64-debug-names-type-comdat-helper.s

Large diffs are not rendered by default.

384 changes: 384 additions & 0 deletions lld/test/ELF/Inputs/dwarf5-64-debug-names-type-comdat-main.s

Large diffs are not rendered by default.

410 changes: 410 additions & 0 deletions lld/test/ELF/Inputs/dwarf5-debug-names-type-comdat-helper.s

Large diffs are not rendered by default.

397 changes: 397 additions & 0 deletions lld/test/ELF/Inputs/dwarf5-debug-names-type-comdat-main.s

Large diffs are not rendered by default.

20 changes: 20 additions & 0 deletions lld/test/ELF/x86-64-dwarf5-64-debug-names-type-comdat.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
// REQUIRES: x86
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/dwarf5-64-debug-names-type-comdat-main.s -o %t.o
// RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/dwarf5-64-debug-names-type-comdat-helper.s -o %t1.o
// RUN: ld.lld %t.o %t1.o -o %t1
// RUN: llvm-readelf --relocs %t.o | FileCheck --check-prefix=RELOCMAIN %s
// RUN: llvm-dwarfdump --debug-names %t1 | FileCheck %s

// Test checks that LLD tombstones TU section that was de-duplicated using COMDAT to the maxium value.
// For some reason llvm-dwarfdump doesn't print out full 64bitt offset at the moment.
// Working around it by checking that relocation is 64 bit.

// RELOCMAIN: rela.debug_names
// RELOCMAIN-NEXT: Offset
// RELOCMAIN-NEXT: R_X86_64_64
// RELOCMAIN-SAME: .debug_info + 0
// RELOCMAIN-NEXT: R_X86_64_64
// RELOCMAIN-SAME: .debug_info + 0

// CHECK: LocalTU[0]: 0x00000000
// CHECK: LocalTU[0]: 0xffffffffffffffff
10 changes: 10 additions & 0 deletions lld/test/ELF/x86-64-dwarf5-debug-names-type-comdat.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
// REQUIRES: x86
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In newer tests we make the comment (non-RUN non-CHECK) marker outstanding. We typically use # RUN: # CHECK: and ## Regular comments.

// RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/dwarf5-debug-names-type-comdat-main.s -o %t.o
// RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %p/Inputs/dwarf5-debug-names-type-comdat-helper.s -o %t1.o
// RUN: ld.lld %t.o %t1.o -o %t1
// RUN: llvm-dwarfdump --debug-names %t1 | FileCheck %s

// Test checks that LLD tombstones TU section that was de-duplicated using COMDAT to the maxium value.

// CHECK: LocalTU[0]: 0x00000000
// CHECK: LocalTU[0]: 0xffffffff