Implement src:*=sanitize for UBSan#139772
Conversation
|
@llvm/pr-subscribers-clang Author: Qinkun Bao (qinkunbao) ChangesIt is a draft implementation for "src:*=sanitize". It should be applied to all sanitizers. Any srcs assigned to the sanitize category will have their sanitizer instrumentation remained ignored by "src:". For example,
However
Full diff: https://github.com/llvm/llvm-project/pull/139772.diff 2 Files Affected:
diff --git a/clang/lib/Basic/NoSanitizeList.cpp b/clang/lib/Basic/NoSanitizeList.cpp
index e7e63c1f419e6..c4db5a5a211d2 100644
--- a/clang/lib/Basic/NoSanitizeList.cpp
+++ b/clang/lib/Basic/NoSanitizeList.cpp
@@ -44,7 +44,8 @@ bool NoSanitizeList::containsFunction(SanitizerMask Mask,
bool NoSanitizeList::containsFile(SanitizerMask Mask, StringRef FileName,
StringRef Category) const {
- return SSCL->inSection(Mask, "src", FileName, Category);
+ bool allowList = SSCL->inSection(Mask, "src", FileName, "sanitize");
+ return SSCL->inSection(Mask, "src", FileName, Category) && !allowList;
}
bool NoSanitizeList::containsMainFile(SanitizerMask Mask, StringRef FileName,
diff --git a/clang/test/CodeGen/ubsan-src-ignorelist-category.test b/clang/test/CodeGen/ubsan-src-ignorelist-category.test
new file mode 100644
index 0000000000000..5242c10bdeec7
--- /dev/null
+++ b/clang/test/CodeGen/ubsan-src-ignorelist-category.test
@@ -0,0 +1,23 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test1.c -o - | FileCheck %s -check-prefix=CHECK-ALLOWLIST
+// RUN: %clang_cc1 -triple x86_64-linux-gnu -fsanitize=signed-integer-overflow -fsanitize-ignorelist=%t/src.ignorelist -emit-llvm %t/test2.c -o - | FileCheck %s -check-prefix=CHECK-IGNORELIST
+
+
+// Verify ubsan only emits checks for files in the allowlist
+
+//--- src.ignorelist
+src:*
+src:*/test1.c=sanitize
+
+//--- test1.c
+int add1(int a, int b) {
+// CHECK-ALLOWLIST: llvm.sadd.with.overflow.i32
+ return a+b;
+}
+
+//--- test2.c
+int add2(int a, int b) {
+// CHECK-IGNORELIST-NOT: llvm.sadd.with.overflow.i32
+ return a+b;
+}
|
|
Reminder to @thurstond as well :) Please figure out https://llvm.org/docs/GitHub.html#stacked-pull-requests |
| // Verify ubsan only emits checks for files in the allowlist | ||
|
|
||
| //--- src.ignorelist | ||
| src:* |
There was a problem hiding this comment.
Looks like we have a problem with:
src:*
src:*/mylib/*=sanitize
src:*/mylib/test.cc
There was a problem hiding this comment.
So I have some ideas how to fix this:
- Right now order does not matter, if we say that it's matter now, we don't break existing stuff
- Let's make =sanitize reset all above and =
- So inSection need to return linenum in file which will compare with linenum of =sanitize. There is a version of function for that
Problem: SanitizerSpecialCaseList tracks linenum, but if you check parsing code, it can represent multiple files!
I suggest to start with updating SpecialCaseList to track (fileindex, linenum).
But, SpecialCaseList::Sections for whatever reasons is StringMap,
so it's ordered by name, and it's not used anyway.
Let's first PR to make SpecialCaseList::Sections -> vector
There was a problem hiding this comment.
Looks like we have a problem with:
src:* src:*/mylib/*=sanitize src:*/mylib/test.cc
For this example, I think /mylib/test.cc will be instrumented because the file /mylib/test.cc will fall into two categories (= and =sanitize) with this PR. And it should follow the =sanitizer category. I think it should match the current doc. (I have updated the test to cover the situation. )
I was a little hesitated about the change during implementing the feature. For example, if we have a ignore list file
type:int
src:test.cc=sanitize
If a file test.cc has the type int, shall we still instrument the type int variable in the file test.cc?
On the other hand,
src:test.cc
type:int=sanitize
Shall we still instrument the type int variable in the file test.cc?
I think the both answers are yes. Let me know if it makes sense to you.
I will create a new PR with using user branches in llvm/llvm-project to update SpecialCaseList::Sections. Thank you for the feedback.
There was a problem hiding this comment.
But, SpecialCaseList::Sections for whatever reasons is StringMap,
so it's ordered by name, and it's not used anyway.
Oh, I find the key of the StringMap Sections is actually used. Here is the Link. Here are some options. Let me know which option is better.
- How about creating a vector<Section*> to track the order?
StringMap<Section> Sections;
vector<Section*> SectionsVec;
- We can use the key from
Globs
508f81a to
99dfb74
Compare
99dfb74 to
fcae81c
Compare
…tringMap to vector. As discussed in llvm/llvm-project#139772, SpecialCaseList::Sections can keep the order of Sections when parsing the case list. Reviewers: thurstond, vitalybuka Reviewed By: vitalybuka Pull Request: llvm/llvm-project#140127
… vector. (#140964) As discussed in llvm/llvm-project#139772 and llvm/llvm-project#140529, Matcher::Globs can keep the order when parsing the case list.
|
Close this PR as the implementation goes to other PRs. |
It is a draft implementation for "src:*=sanitize". It should be applied to all sanitizers.
Any srcs assigned to the sanitize category will have their sanitizer instrumentation remained ignored by "src:". For example,
test1.cwill still have the UBSan instrumented.However
test1.cdoes not have the int type check.