Commit 1db5136
improve compression ratio of small alphabets
fix #3328
In situations where the alphabet size is very small,
the Optimal Parser's initial evaluation of literal costs is incorrect.
The estimate takes some time to converge, during which compression is less efficient.
This is especially important for small files,
because there is not enough data for the estimate to converge,
so most of the parsing is selected based on incorrect metrics.
After this patch, the scenario in #3328 is fixed,
delivering the expected 29-byte compressed size (the smallest known compressed size).

1 parent 0694f14 commit 1db5136
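
The effect described in the commit message can be illustrated with a small sketch. It is not zstd's code: it assumes a literal's cost is modeled as log2(total) - log2(freq[sym]) in fixed-point units, and every name in it (`approx_log2_fp`, `literal_cost_fp`, `init_flat`, `init_from_sample`, `COST_SCALE`) is hypothetical. It compares a flat starting prior over all 256 byte values with a prior built only from the symbols seen in the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALPHABET_SIZE 256
#define COST_SCALE 256u  /* fixed point: 256 units == 1 bit (assumption) */

/* Index of the highest set bit; v must be non-zero. */
static unsigned highbit32(uint32_t v)
{
    unsigned r = 0;
    assert(v != 0);
    while (v >>= 1) r++;
    return r;
}

/* Approximate log2(v) in 1/256-bit units, interpolating linearly
 * between consecutive powers of two. Exact when v is a power of two. */
static uint32_t approx_log2_fp(uint32_t v)
{
    unsigned hb = highbit32(v);
    uint32_t frac = (uint32_t)((((uint64_t)v << 8) >> hb) - COST_SCALE);
    return hb * COST_SCALE + frac;
}

/* Estimated cost of one literal: log2(total) - log2(freq[sym]).
 * A symbol with no recorded occurrence is treated as having count 1. */
static uint32_t literal_cost_fp(const uint32_t freq[ALPHABET_SIZE],
                                uint32_t total, uint8_t sym)
{
    uint32_t f = freq[sym] ? freq[sym] : 1;
    return approx_log2_fp(total) - approx_log2_fp(f);
}

/* Prior A: every possible byte value starts with the same count. */
static uint32_t init_flat(uint32_t freq[ALPHABET_SIZE])
{
    for (size_t i = 0; i < ALPHABET_SIZE; i++) freq[i] = 1;
    return ALPHABET_SIZE;
}

/* Prior B: count only the symbols actually present in the sample. */
static uint32_t init_from_sample(uint32_t freq[ALPHABET_SIZE],
                                 const uint8_t* src, size_t n)
{
    uint32_t total = 0;
    memset(freq, 0, ALPHABET_SIZE * sizeof(freq[0]));
    for (size_t i = 0; i < n; i++) { freq[src[i]]++; total++; }
    return total;
}
```

With the 8-byte input `ABABABAB`, the flat prior prices the literal `A` at 2048 units (8 bits), while the sample-based prior prices it at 256 units (1 bit). Under this toy model, a parser starting from the flat prior overestimates literal cost by a wide margin, and a short input ends before the statistics can correct themselves.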
File tree
- lib/compress

5 files changed, +76 -40 lines changed