@val-ms val-ms commented Jun 16, 2021

We would like to switch from Bugzilla to GitHub Issues. This will make
issue reporting more accessible (more folks have a GitHub account than a
bugzilla.clamav.net account), in addition to the benefits of a more
modern issue tracker.

However, GitHub Issues reports are always public. Vulnerability reports
will have to go somewhere else. The preferred option is to report them
to Cisco PSIRT, after which PSIRT will coordinate with the ClamAV team
and the reporter to resolve the issue.

@val-ms val-ms merged commit ce744f1 into dev/0.104 Jun 24, 2021
val-ms pushed a commit that referenced this pull request Jun 24, 2021
@val-ms val-ms deleted the proposed-security-policy-for-enabling-github-issues branch February 24, 2023 05:27
val-ms added a commit that referenced this pull request Dec 23, 2024
…objstm-1.0.7

Fix possible out of bounds read in PDF parser (1.0.7)
val-ms added a commit that referenced this pull request Oct 14, 2025
I am seeing missed detections since we started prohibiting embedded
file type identification inside an embedded file.
In particular, I'm seeing this issue with PE files that contain multiple
other MSEXE files, along with a variety of false positives for PE file headers.

For example, imagine a PE with four concatenated DLLs, like so:
```
  [ EXE file   | DLL #1  | DLL #2  | DLL #3  | DLL #4 ]
```

And note that false positives for embedded MSEXE files are fairly common.
So there may be a few mixed in there.

Before limiting embedded file identification we might interpret the file
structure something like this:
```
MSEXE: {
  embedded MSEXE #1: false positive,
  embedded MSEXE #2: false positive,
  embedded MSEXE #3: false positive,
  embedded MSEXE #4: DLL #1: {
    embedded MSEXE #1: false positive,
    embedded MSEXE #2: DLL #2: {
      embedded MSEXE #1: DLL #3: {
        embedded MSEXE #1: false positive,
        embedded MSEXE #2: false positive,
        embedded MSEXE #3: false positive,
        embedded MSEXE #4: false positive,
        embedded MSEXE #5: DLL #4
      }
      embedded MSEXE #2: false positive,
      embedded MSEXE #3: false positive,
      embedded MSEXE #4: false positive,
      embedded MSEXE #5: false positive,
      embedded MSEXE #6: DLL #4
    }
    embedded MSEXE #3: DLL #3,
    embedded MSEXE #4: false positive,
    embedded MSEXE #5: false positive,
    embedded MSEXE #6: false positive,
    embedded MSEXE #7: false positive,
    embedded MSEXE #8: DLL #4
  }
}
```

This is obviously terrible, which is why we don't allow detecting
embedded files within other embedded files.
So after we enforce that limit, the same file may be interpreted like
this instead:
```
MSEXE: {
  embedded MSEXE #1:  false positive,
  embedded MSEXE #2:  false positive,
  embedded MSEXE #3:  false positive,
  embedded MSEXE #4:  DLL #1,
  embedded MSEXE #5:  false positive,
  embedded MSEXE #6:  DLL #2,
  embedded MSEXE #7:  DLL #3,
  embedded MSEXE #8:  false positive,
  embedded MSEXE #9:  false positive,
  embedded MSEXE #10: false positive,
  embedded MSEXE #11: false positive,
  embedded MSEXE #12: DLL #4
}
```

That's great! Except that we now exceed the "MAX_EMBEDDED_OBJ" limit
for embedded type matches (limit 10, but 12 found). That means we won't
see or extract the 4th DLL anymore.

My solution is to lift the limit when adding a matched MSEXE type.
We already do this for matched ZIPSFX types.
While doing this, I've significantly tidied up the limit checks to
make them more readable, and removed duplicate checks from within the
`ac_addtype()` function.
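
To make the effect of the limit concrete, here is a minimal Python sketch of the counting behavior described above. It is a simplified model, not ClamAV's actual C code: `TypeMatch`, `collect_matches`, `EXEMPT_TYPES`, and the offsets are hypothetical names and values invented for illustration. Only `MAX_EMBEDDED_OBJ`, its value of 10, and the MSEXE and ZIPSFX type names come from the message.

```python
from dataclasses import dataclass

MAX_EMBEDDED_OBJ = 10  # limit quoted in the commit message

# Hypothetical set of types that bypass the limit (MSEXE is the new exemption).
EXEMPT_TYPES = frozenset({"MSEXE", "ZIPSFX"})


@dataclass
class TypeMatch:
    """Hypothetical record of one embedded file type match."""
    ftype: str
    offset: int


def collect_matches(candidates, exempt=frozenset()):
    """Keep matches in order; once the limit is reached, drop non-exempt ones."""
    kept = []
    for match in candidates:
        if len(kept) >= MAX_EMBEDDED_OBJ and match.ftype not in exempt:
            continue  # over the limit and not exempt: this match is lost
        kept.append(match)
    return kept


# Twelve MSEXE header matches, as in the flat listing above.
# The twelfth one stands in for DLL #4; the offsets are made up.
candidates = [TypeMatch("MSEXE", 0x1000 * i) for i in range(1, 13)]

capped = collect_matches(candidates)                       # limit applies to MSEXE
lifted = collect_matches(candidates, exempt=EXEMPT_TYPES)  # limit lifted for MSEXE

print(len(capped), candidates[-1] in capped)  # 10 False
print(len(lifted), candidates[-1] in lifted)  # 12 True
```

With the limit applied, matches 11 and 12 are dropped, so the fourth DLL is never seen. With MSEXE exempt, all twelve matches survive. How exempt matches should count toward the limit for other types is a design choice this sketch does not try to settle.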

CLAM-2897
val-ms added a commit that referenced this pull request Oct 14, 2025
val-ms added a commit that referenced this pull request Oct 16, 2025