Support downloading standalone installation archives from CI builds #643

Merged: tsibley merged 2 commits into master from trs/cli-download-ci-builds on Jan 9, 2023
Contributor

@tsibley tsibley commented Jan 7, 2023

This new endpoint allows the standalone installer to install not just
released versions but also the builds produced by arbitrary CI runs.
That's very helpful for development and testing of PRs. With this new
endpoint, for example, we can run:

curl -fsSL --proto '=https' https://nextstrain.org/cli/installer/linux \
    | DESTINATION=/tmp/cli bash -s ci-build/3859193828

to install /tmp/cli/nextstrain from:

https://github.com/nextstrain/cli/actions/runs/3859193828#artifacts

Artifacts from GitHub Actions workflow runs require a bit more ceremony
than release assets: every artifact comes wrapped in a ZIP file, which
we need to unwrap server-side for our installer. Doing this server-side
also sidesteps the fact that artifacts require authentication to
download (even though our artifacts are publicly visible). Keeping the
extra API requests, authentication, and decompression out of the
installer itself keeps the installer simpler and thus more robust for
end users.
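
The server-side unwrapping described above can be sketched roughly as follows. This is a minimal Python illustration, not the code in this PR; the function name `unwrap_single_file_zip` is hypothetical, and the sketch assumes each artifact ZIP wraps exactly one file (for example, a standalone `.tar.gz`).

```python
import io
import zipfile
from typing import Tuple


def unwrap_single_file_zip(zip_bytes: bytes) -> Tuple[str, bytes]:
    """Return (name, contents) of the only file inside a ZIP archive.

    Assumption for this sketch: the artifact ZIP wraps exactly one file;
    any other layout is treated as an error.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Ignore directory entries; only real files count.
        members = [info for info in archive.infolist() if not info.is_dir()]
        if len(members) != 1:
            raise ValueError(
                f"expected exactly 1 file in ZIP, found {len(members)}"
            )
        return members[0].filename, archive.read(members[0])
```

A server handler could then send the returned bytes to the installer as the archive, so the installer never has to deal with the ZIP layer itself.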

Testing

  • Manual testing with both curl and the standalone installer pointed at my local server
  • Checks pass

…in preparation for re-use in a new handler in this module.
…rom CI builds

@nextstrain-bot nextstrain-bot temporarily deployed to nextstrain-s-trs-cli-do-f9qv2k January 7, 2023 01:13 Inactive
Contributor Author

tsibley commented Jan 7, 2023

With this endpoint in place, we can ~easily extend it to also support pr/X "versions", which would look up the latest successful CI run for the given PR and download that.
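
One way to pick "the latest successful CI run" from a list of workflow runs might look like the sketch below. It is only an illustration in Python and says nothing about how the follow-up is actually implemented. The function name `latest_successful_run` is hypothetical; the `conclusion` and `created_at` fields are those found on GitHub workflow run objects, and the sample data in the usage note is made up.

```python
from typing import List, Optional


def latest_successful_run(runs: List[dict]) -> Optional[dict]:
    """Return the most recently created run whose conclusion is "success".

    Assumes each run dict carries "conclusion" and "created_at" keys, with
    "created_at" as an ISO 8601 UTC timestamp (such strings sort
    chronologically when compared as text).
    """
    successful = [run for run in runs if run.get("conclusion") == "success"]
    if not successful:
        return None
    return max(successful, key=lambda run: run["created_at"])
```

For example, given runs created on 2023-01-06 (success), 2023-01-07 (success), and 2023-01-08 (failure), the function returns the 2023-01-07 run.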

Contributor Author

tsibley commented Jan 7, 2023

I've been thinking about adding this functionality for a while, but nextstrain/cli#248 today motivated me to do it.

Contributor Author

tsibley commented Jan 9, 2023

I'm going to merge and deploy this before review, as it seems low stakes and is primarily an internal endpoint to aid our development, so the audience is small.

@tsibley tsibley merged commit 3ac95dc into master Jan 9, 2023
@tsibley tsibley deleted the trs/cli-download-ci-builds branch January 9, 2023 18:38
Contributor Author

tsibley commented Jan 9, 2023

Deployed to canary. Tested it with:

curl -fsSL --proto '=https' https://nextstrain.org/cli/installer/linux \
  | DESTINATION=/tmp/cli \
    NEXTSTRAIN_DOT_ORG=https://next.nextstrain.org \
    bash -s ci-build/3859193828

and got a 500 when downloading the tarball via next.nextstrain.org because the artifact download got a 403 (even though we're providing authorization):

2023-01-09T19:16:12.473547+00:00 app[web.1]: [verbose]  [fetch] GET https://api.github.com/repos/nextstrain/cli/actions/runs/3859193828/artifacts (cache: undefined)
2023-01-09T19:16:12.575884+00:00 app[web.1]: [verbose]  [fetch] 200 OK https://api.github.com/repos/nextstrain/cli/actions/runs/3859193828/artifacts (cache miss, timestamp 2023-01-09T19:16:12.575Z)
2023-01-09T19:16:12.583163+00:00 app[web.1]: [verbose]  [fetch] GET https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip (cache: undefined)
2023-01-09T19:16:12.639463+00:00 app[web.1]: [verbose]  [fetch] 403 Forbidden https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip (cache skip, timestamp null)
2023-01-09T19:16:12.640138+00:00 app[web.1]: [verbose]  Sending InternalServerError: upstream said: 403 Forbidden error as JSON
2023-01-09T19:16:12.641472+00:00 heroku[router]: at=info method=GET path="/cli/download/ci-build/3859193828/standalone-x86_64-unknown-linux-gnu.tar.gz" host=next.nextstrain.org request_id=… fwd="…" dyno=web.1 connect=0ms service=173ms status=500 bytes=297 protocol=https

This could be a scope issue with the token we're using for next.nextstrain.org?
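
The request flow visible in the log above (list the run's artifacts, then fetch one artifact's ZIP) can be sketched as two small helpers. This is a Python illustration under stated assumptions, not the server's code: the function names are hypothetical, and the download path pattern is inferred only from the single router log line shown above.

```python
from typing import Optional


def download_path(version: str, target: str) -> str:
    """Build the nextstrain.org download path for a version and target.

    Patterned on the one request seen in the router log; treating it as a
    general rule is an assumption of this sketch.
    """
    return f"/cli/download/{version}/standalone-{target}.tar.gz"


def find_archive_download_url(listing: dict, artifact_name: str) -> Optional[str]:
    """Pick an artifact's ZIP URL out of a run's artifact listing.

    `listing` is the parsed JSON from
    GET /repos/{owner}/{repo}/actions/runs/{run_id}/artifacts,
    which holds an "artifacts" list of objects with "name", "expired",
    and "archive_download_url" fields.
    """
    for artifact in listing.get("artifacts", []):
        if artifact.get("name") == artifact_name and not artifact.get("expired", False):
            return artifact["archive_download_url"]
    return None
```

With the values from this thread, `download_path("ci-build/3859193828", "x86_64-unknown-linux-gnu")` yields the path that returned the 500 above.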

Contributor Author

tsibley commented Jan 9, 2023

Ok, I think I've come to an understanding here.

During development, I tested locally with my standard "various and sundry" personal access token (classic) that's granted limited scope: just public_repo. Downloading artifacts from the public nextstrain/cli repo worked fine.

The token we use for nextstrain.org has no scopes (because even public_repo includes write access). This means it has a read-only view of only public resources. I thought this would be sufficient to download artifacts from a public repo, but it turns out not to be. This isn't documented anywhere as far as I can tell.

Both tokens are "classic" personal access tokens.

I tested using a new "fine-grained" token without any permissions granted, which I believe is supposed to be roughly equivalent to a "classic" token without any scopes granted. But there are clearly some differences, because this fine-grained token works for artifact downloading when the classic token doesn't.

[Details: two screenshots, not reproduced here]

So I think we want to replace the classic token with a fine-grained token (which is what GitHub generally recommends now anyhow). This would let us still use a single GITHUB_TOKEN for nextstrain.org, while not granting it permissions/scopes we don't want for security reasons.

Contributor Author

tsibley commented Jan 9, 2023

> I thought this would be sufficient to download artifacts from a public repo, but it turns out not to be. This isn't documented anywhere as far as I can tell.

Note that the classic token we use can view information about an artifact:

GET https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693 HTTP/1.1

HTTP/1.1 200
content-type: application/json; charset=utf-8
content-length: 695
x-oauth-scopes:
x-accepted-oauth-scopes:

{
  "id": 501513693,
  "node_id": "MDg6QXJ0aWZhY3Q1MDE1MTM2OTM=",
  "name": "standalone-x86_64-unknown-linux-gnu",
  "size_in_bytes": 51091874,
  "url": "https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693",
  "archive_download_url": "https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip",
  "expired": false,
  "created_at": "2023-01-06T23:45:43Z",
  "updated_at": "2023-01-06T23:45:45Z",
  "expires_at": "2023-04-06T23:17:36Z",
  "workflow_run": {
    "id": 3859193828,
    "repository_id": 139047738,
    "head_repository_id": 139047738,
    "head_branch": "trs/singularity-runtime",
    "head_sha": "d435db68160b6a45277b1ee72006a5e16090259c"
  }
}

just not download it:

GET https://api.github.com/repos/nextstrain/cli/actions/artifacts/501513693/zip HTTP/1.1

HTTP/1.1 403
content-type: application/json; charset=utf-8
content-length: 168
x-oauth-scopes: 
x-accepted-oauth-scopes: 

{
  "message": "You must have the actions scope to download artifacts.",
  "documentation_url": "https://docs.github.com/rest/reference/actions#download-an-artifact"
}
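
To compare tokens, the two requests above (artifact metadata vs. artifact download) can be built with the same credentials and sent one after the other. The sketch below only constructs the requests, using Python's standard library. The function name `artifact_requests` is hypothetical, and passing the token as `Authorization: Bearer …` is one of the forms the GitHub API accepts.

```python
import urllib.request
from typing import Tuple

API = "https://api.github.com"


def artifact_requests(
    repo: str, artifact_id: int, token: str
) -> Tuple[urllib.request.Request, urllib.request.Request]:
    """Build (metadata, download) requests for one artifact.

    Both requests carry the same token, so any difference in the
    responses comes from the endpoint rather than the credentials.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
    base = f"{API}/repos/{repo}/actions/artifacts/{artifact_id}"
    metadata = urllib.request.Request(base, headers=headers)
    download = urllib.request.Request(f"{base}/zip", headers=headers)
    return metadata, download
```

With the scopeless classic token described in this thread, the first request returned 200 and the second returned 403.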

Contributor Author

tsibley commented Jan 9, 2023

Despite the error response saying the actions scope is required, that is not a documented scope for personal access tokens (which are OAuth tokens).

The download endpoint documentation says:

> Anyone with read access to the repository can use this endpoint. If the repository is private you must use an access token with the repo scope. GitHub Apps must have the actions:read permission to use this endpoint.

Our classic token has read access to the repository, so it should have access per this doc. The repo is not private. The classic token is a personal access token, not a GitHub App token, so it should not require the actions:read permission.

Contributor Author

tsibley commented Jan 9, 2023

I replaced the GITHUB_TOKEN used by next.nextstrain.org with a new fine-grained token as described above:

[screenshot not reproduced here]

and all seems to be working there. I'll make the same change to nextstrain.org soon, and eventually revoke the classic token.

One thing to note is that fine-grained tokens must have expiration dates ≤1y in the future, so this token expires 9 Jan 2024, and we'll have to manually rotate it before then. Not sure the best way to track this task…
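
One low-tech way to track the rotation would be a small scheduled check that warns when the expiry date gets close. The sketch below is only an illustration in Python; the function names and the 30-day warning window are assumptions, not an existing part of nextstrain.org.

```python
from datetime import date


def days_until_expiry(expires_on: date, today: date) -> int:
    """Number of whole days from `today` until the token expires."""
    return (expires_on - today).days


def needs_rotation(expires_on: date, today: date, warn_days: int = 30) -> bool:
    """True once the token is within `warn_days` of expiring (or already expired)."""
    return days_until_expiry(expires_on, today) <= warn_days
```

For the token mentioned above, `needs_rotation(date(2024, 1, 9), date.today())` could be run from a scheduled job that opens an issue or sends a reminder when it returns `True`.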

Contributor Author

tsibley commented Jan 9, 2023

> With this endpoint in place, we can ~easily extend it to also support pr/X "versions", which would look up the latest successful CI run for the given PR and download that.

Implemented as #645.
