Skip to content

Conversation

@G-Rath
Copy link
Collaborator

@G-Rath G-Rath commented Aug 25, 2025

This is an attempt to make our CI easier to work with day-to-day by introducing go-vcr as a proxy to record and replay certain HTTP requests, with our "update snapshots" workflow being updated to also update these records.

The idea is this will let us continue effectively testing against real external APIs but let us decide when those external APIs change by way of updating the HTTP records (terms cassettes, in keeping with the VCR metaphor).

For now I've just focused on the osv.dev client, setting up go-vcr to record interactions with any endpoint aside from /v1/vulns/<id> since those requests have huge responses making the cassettes easily balloon beyond 5mb per file - while these do somewhat frequently change, I'm hoping that their day-to-day changes won't be as noisy and especially considering its the response of the bulk query endpoint that leads to the specific vuln lookups so since they're being recorded things should be a lot more predictable.

I've structured things so that a recording client is created per top-level test so that a single cassette is used for any subtests, and a X-Test-Name header is included in each request via a second http.Client with a custom RoundTripper that gets created in each subtest wrapping the recording client, which allows us to sort the interactions afterwards giving us a stable diff when regenerating.

I have also included a custom hook to remove unneeded headers to both further reduce the noise in diffs and the size of the cassettes overall; the only header I'm on the fence about is the Date header since that relates to caching and its technically possible for code to check that for determining if data should be re-fetched, but I think it's best to omit it for now given its present in every single request (ideally I'd like to have it be updated only when something else changes, but that doesn't seem possible without a lot more work).

Finally, one notable downside for us to keep an eye on is with keeping the cassettes up to date: the nature of this functionality means its inherently hard for such a library to determine if a particular record is obsolete or needs to be generated as (among other things) that would most likely require making the request in the first place e.g. if we change what headers are excluded but don't explicitly regenerate everything, I don't think CI will complain - thankfully, the worse case is that this should get caught by our "update snapshots" workflow

@G-Rath
Copy link
Collaborator Author

G-Rath commented Aug 25, 2025

hmm so (unsurprisingly, in hindsight) this really moves the problem rather than solve it outright as now our CI can fail due to other external conditions, and it doesn't present itself super obviously.

We have these failing tests:

  • TestCommand_OCIImage_JSONFormat/Scanning_python_image_with_some_packages
  • TestCommand_OCIImage_JSONFormat/Scanning_python_image_with_some_packages
  • TestCommand_LockfileWithExplicitParseAs/"apk-installed"_is_supported
  • TestCommand_LockfileWithExplicitParseAs/"dpkg-status"_is_supported
  • TestCommand/requirements.txt_can_have_all_kinds_of_names

The first two are really the same failure because its the same underlying image just with a different output format, and the issue is that a new version of certifi has been published

image

This is addressed by running scripts/build_test_images.sh --force --no-cache, and something we already knew about.

The next two I'm guessing are a result of the underlying OS extractors pulling information from the local environment which for me is Ubuntu 22.04 but CI is e.g. Ubuntu 24.04

image

And then the last one I'm guessing is the result of typing-extensions having a new version, and that that is a transient dependency of one of our fixtures since its not mentioned anywhere in our codebase

image

@G-Rath G-Rath force-pushed the test/vcr branch 2 times, most recently from b304d09 to 8a43656 Compare September 2, 2025 03:40
@G-Rath G-Rath force-pushed the test/vcr branch 2 times, most recently from 869d0fd to 6b3c1df Compare October 5, 2025 20:31
@codecov-commenter
Copy link

codecov-commenter commented Oct 5, 2025

Codecov Report

❌ Patch coverage is 64.00000% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.87%. Comparing base (b518856) to head (b5351a6).

Files with missing lines Patch % Lines
cmd/osv-scanner/internal/testcmd/vcr.go 49.25% 30 Missing and 4 partials ⚠️
cmd/osv-scanner/main.go 0.00% 1 Missing ⚠️
cmd/osv-scanner/mcp/command.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2200      +/-   ##
==========================================
- Coverage   67.95%   67.87%   -0.08%     
==========================================
  Files         172      173       +1     
  Lines       12821    12895      +74     
==========================================
+ Hits         8712     8752      +40     
- Misses       3437     3467      +30     
- Partials      672      676       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@G-Rath G-Rath force-pushed the test/vcr branch 4 times, most recently from b08f834 to 8777f4f Compare October 7, 2025 03:01
@G-Rath G-Rath marked this pull request as ready for review October 7, 2025 22:55
@G-Rath G-Rath requested review from another-rex and cuixq October 7, 2025 22:55
@G-Rath G-Rath force-pushed the test/vcr branch 7 times, most recently from 1d5beb7 to 41df179 Compare October 28, 2025 18:31
another-rex added a commit to google/osv.dev that referenced this pull request Nov 2, 2025
Hopefully the readme explains most of this, but at a high-level this
introduces a new tool based off `osv-scanner` to make it easy to test
API changes.

Tests are provided as "cassettes" which we use `go-vcr` to load since
that has the bonus benefit of being [compatible with cassettes we'll be
using in
`osv-scanner`](google/osv-scanner#2200), though
we do format the `request.body` to be multi-line so it's easier to
understand what the query under test is.

---------

Co-authored-by: Rex P <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants