Commit bcf2e5e
Increase limit for finding PE files embedded in other PE files
I am seeing missed detections since we began prohibiting embedded
file type identification when already inside an embedded file.
In particular, I'm seeing this issue with PE files that contain multiple
other MSEXE files, as well as a variety of false positives for PE file
headers.
For example, imagine a PE with four concatenated DLLs, like so:
```
[ EXE file | DLL #1 | DLL #2 | DLL #3 | DLL #4 ]
```
Note that false positives for embedded MSEXE files are fairly common,
so there may be a few mixed in.
Before limiting embedded file identification, we might interpret the
file structure something like this:
```
MSEXE: {
  embedded MSEXE #1: false positive,
  embedded MSEXE #2: false positive,
  embedded MSEXE #3: false positive,
  embedded MSEXE #4: DLL #1: {
    embedded MSEXE #1: false positive,
    embedded MSEXE #2: DLL #2: {
      embedded MSEXE #1: DLL #3: {
        embedded MSEXE #1: false positive,
        embedded MSEXE #2: false positive,
        embedded MSEXE #3: false positive,
        embedded MSEXE #4: false positive,
        embedded MSEXE #5: DLL #4
      }
      embedded MSEXE #2: false positive,
      embedded MSEXE #3: false positive,
      embedded MSEXE #4: false positive,
      embedded MSEXE #5: false positive,
      embedded MSEXE #6: DLL #4
    }
    embedded MSEXE #3: DLL #3,
    embedded MSEXE #4: false positive,
    embedded MSEXE #5: false positive,
    embedded MSEXE #6: false positive,
    embedded MSEXE #7: false positive,
    embedded MSEXE #8: DLL #4
  }
}
```
This is obviously terrible, which is why we don't allow detecting
embedded files within other embedded files.
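As a rough illustration of that rule, here is a minimal sketch. The
`scan_ctx` structure and its `is_embedded` flag are hypothetical names
invented for this example, not the scanner's real data structures. The
idea is only that embedded type identification runs for a top-level
file and is skipped for anything that was itself found embedded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scan context; not the real scanner structure. */
typedef struct {
    bool is_embedded; /* true if this file was itself found inside another file */
} scan_ctx;

/* Returns true if we should look for embedded file types inside this file. */
static bool may_identify_embedded(const scan_ctx *ctx)
{
    assert(ctx != NULL);
    return !ctx->is_embedded;
}
```

With a check like this, DLL #1 is still found inside the outer EXE, but
nothing is searched for inside DLL #1, which flattens the tree above.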
So after we enforce that limit, the same file may be interpreted like
this instead:
```
MSEXE: {
  embedded MSEXE #1: false positive,
  embedded MSEXE #2: false positive,
  embedded MSEXE #3: false positive,
  embedded MSEXE #4: DLL #1,
  embedded MSEXE #5: false positive,
  embedded MSEXE #6: DLL #2,
  embedded MSEXE #7: DLL #3,
  embedded MSEXE #8: false positive,
  embedded MSEXE #9: false positive,
  embedded MSEXE #10: false positive,
  embedded MSEXE #11: false positive,
  embedded MSEXE #12: DLL #4
}
```
That's great! Except that we now exceed the "MAX_EMBEDDED_OBJ" limit
for embedded type matches (limit 10, but 12 found). That means we won't
see or extract the 4th DLL anymore.
My solution is to lift the limit when adding a matched MSEXE type.
We already do this for matched ZIPSFX types.
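The effect on the 12-match example can be sketched as follows. This is
an illustration under assumptions, not the code in this commit: the
`file_type` enum and the helpers `type_is_exempt()`, `may_add_match()`
and `count_recorded()` are hypothetical, and only the limit value of 10
and the MSEXE/ZIPSFX exemption come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EMBEDDED_OBJ 10 /* limit value quoted in the commit message */

/* Hypothetical type tags; the real scanner has its own file type enum. */
typedef enum { TYPE_OTHER, TYPE_MSEXE, TYPE_ZIPSFX } file_type;

/* Types that may still be recorded once the limit has been reached. */
static bool type_is_exempt(file_type t)
{
    return t == TYPE_MSEXE || t == TYPE_ZIPSFX;
}

/* Decide whether one more match may be recorded, given how many are stored. */
static bool may_add_match(size_t stored, file_type t)
{
    return stored < MAX_EMBEDDED_OBJ || type_is_exempt(t);
}

/* Count how many of n candidate matches of a single type get recorded. */
static size_t count_recorded(size_t n, file_type t)
{
    size_t stored = 0;
    for (size_t i = 0; i < n; i++) {
        if (may_add_match(stored, t))
            stored++;
    }
    assert(stored <= n);
    return stored;
}
```

Under these assumptions, `count_recorded(12, TYPE_MSEXE)` records all 12
matches, so the match for DLL #4 is kept, while 12 matches of a
non-exempt type would still be capped at 10.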
While doing this, I've significantly tidied up the limit checks to
make them more readable, and removed duplicate checks from within the
`ac_addtype()` function.
CLAM-28971 parent 06bf061 commit bcf2e5e
1 file changed, +143 -30 lines changed.
Changed hunks begin at original lines 836, 1628, 2002, and 2069 (diff content not captured).