Commit 4501bdb
Normalize unicode internally using NFD
Previously, the path reservation system, which defends against unicode
path name collisions (the subject of a handful of past CVE issues), was
using NFKD normalization internally to determine whether two paths would
be likely to reference the same file on disk.
NFKD has the weird effect of expanding compatibility characters like `℀`
into plain character sequences, for example `a/c`. These can contain
slashes and double-dot sections, which means that the path reservations
may end up reserving more (or different) paths than intended.
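
As a quick illustration of the decomposition described above, the sketch
below uses only the standard `String.prototype.normalize` method. The
second character, `‥` (U+2025), is not mentioned in this commit; it is
an added example of a character whose compatibility form is a
double-dot. The variable names are for illustration only.

```javascript
// Two characters whose *compatibility* decompositions contain
// path-significant ASCII characters.
const accountOf = '\u2100'    // ℀ ACCOUNT OF
const twoDotLeader = '\u2025' // ‥ TWO DOT LEADER (added example, not from the commit)

// NFKD applies compatibility decompositions.
const nfkdAccountOf = accountOf.normalize('NFKD')
const nfkdTwoDot = twoDotLeader.normalize('NFKD')

// NFD applies only canonical decompositions, so both stay as they are.
const nfdAccountOf = accountOf.normalize('NFD')
const nfdTwoDot = twoDotLeader.normalize('NFD')

console.log(JSON.stringify(nfkdAccountOf)) // "a/c"
console.log(JSON.stringify(nfkdTwoDot))    // ".."
console.log(nfdAccountOf === accountOf)    // true
console.log(nfdTwoDot === twoDotLeader)    // true
```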
Thankfully, tar was already *extracting* properly, even if the path
reservations collided, and these collisions resulted in tar being *more*
aggressive than it should be in restricting parallel extraction, rather
than less. That's a good direction to err in, for security, but also,
made tar less efficient than it could be in some edge cases.
With NFD normalization, unicode characters are not given their
compatibility decompositions, but canonically equivalent paths still
produce matching path reservation keys, as intended.
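
A minimal sketch of that difference, assuming a hypothetical
`reservationKey` helper that only normalizes the path. This is not tar's
actual reservation code, which may do more than this; the paths are
made-up examples.

```javascript
// Hypothetical helper for illustration only: derive a comparison key
// for a path by unicode-normalizing it with the given form.
const reservationKey = (p, form = 'NFD') => p.normalize(form)

// Precomposed é (U+00E9) versus e + combining acute accent (U+0301).
// These are canonically equivalent, so they should share a key.
const composed = 'caf\u00e9.txt'
const decomposed = 'cafe\u0301.txt'
const sameFile = reservationKey(composed) === reservationKey(decomposed)

// ℀ (U+2100) versus the literal nested path a/c.
const symbol = 'dir/\u2100'
const literal = 'dir/a/c'
const collidesNFD = reservationKey(symbol) === reservationKey(literal)
const collidesNFKD =
  reservationKey(symbol, 'NFKD') === reservationKey(literal, 'NFKD')

console.log(sameFile)     // true: equivalent names still match
console.log(collidesNFD)  // false: no spurious collision under NFD
console.log(collidesNFKD) // true: the over-reservation NFKD produced
```

Under these assumptions, the two unrelated paths get distinct keys with
NFD, which is what lets them be extracted in parallel.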
This does not cause any change in observed behavior, other than allowing
some files to be extracted in parallel where it is provably safe to do
so.
Credit: discovered by @Sim4n6. This did not result in a juicy security
vulnerability, but it sure looked like one at first. They were extremely
patient, thorough, and persistent in trying to pin this down to a POC
and CVE. There is very little reward or visibility when a security
researcher finds a bug that doesn't result in a security disclosure, but
the attempt often results in improvements to the project.
1 parent 24efc74 commit 4501bdb
5 files changed (+60, -3) under lib, tap-snapshots/test, and test