Skip to content

feat: automatically adds acces to the vault when deploying dev#36116

Open
BrageHK wants to merge 3 commits intomasterfrom
bragehk/do-not-require-failed-deployment-for-secrets
Open

feat: automatically adds acces to the vault when deploying dev#36116
BrageHK wants to merge 3 commits intomasterfrom
bragehk/do-not-require-failed-deployment-for-secrets

Conversation

@BrageHK
Copy link
Contributor

@BrageHK BrageHK commented Mar 6, 2026

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

Should contain the same logic as this for pyvespa. Have tested it on cloud.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds automatic setup of Vespa Cloud vault access rules during vespa deploy (intended for dev deployments) by extracting referenced vault names from services.xml and calling a new helper to ensure SANDBOX access rules exist.

Changes:

  • Add Services.VaultNames() to extract/deduplicate/sort vault names from <secrets> in services.xml.
  • Introduce EnsureVaultAccessForDev (and supporting cloudTarget helpers) to GET/PUT vault access rules and fetch CSRF token.
  • Wire vault access setup into the cloud deploy command flow; add unit tests for both vault parsing and vault access behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
client/go/internal/vespa/xml/config.go Adds VaultNames() to extract referenced vaults from raw services.xml.
client/go/internal/vespa/xml/config_test.go Adds coverage for vault name extraction (none/single/multiple).
client/go/internal/vespa/vault.go New implementation to ensure SANDBOX rule exists for referenced vaults (CSRF + GET/PUT).
client/go/internal/vespa/vault_test.go Adds tests for non-cloud no-op, empty vault list, already-present rule, adding rule, and GET error.
client/go/internal/cli/cmd/deploy.go Calls vault access setup during cloud deploy when vaults are referenced.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}
defer putResp.Body.Close()
putRawBody, _ := io.ReadAll(putResp.Body)
var putVaultResp vaultResponse
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PUT response body is always printed to stdout via fmt.Printf. This will leak potentially sensitive information into user output and logs, and can also break tooling that expects clean CLI output. Please remove this, or route it through the existing CLI logging infrastructure behind an explicit debug flag.

Suggested change
var putVaultResp vaultResponse

Copilot uses AI. Check for mistakes.
"time"
)

const SECRET_STORE_DEV_ALIAS = "SANDBOX"
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SECRET_STORE_DEV_ALIAS is exported (all-caps identifier) but only used within this file. Consider making it unexported to avoid expanding the package’s public surface area unnecessarily (e.g., secretStoreDevAlias or similar).

Suggested change
const SECRET_STORE_DEV_ALIAS = "SANDBOX"
const secretStoreDevAlias = "SANDBOX"

Copilot uses AI. Check for mistakes.
Comment on lines +116 to +126
putResp, err := deployService2.Do(putReq, 10*time.Second)
if err != nil {
return fmt.Errorf("could not set vault access rule for %q: %w", vaultName, err)
}
defer putResp.Body.Close()
putRawBody, _ := io.ReadAll(putResp.Body)
var putVaultResp vaultResponse
if err := json.Unmarshal(putRawBody, &putVaultResp); err != nil {
return fmt.Errorf("could not parse vault PUT response for %q: %w", vaultName, err)
}

Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PUT response status code is not checked before attempting to read/unmarshal the body. If the API returns a non-2xx status, this may turn into a JSON parsing error and hide the actual server error. Check putResp.StatusCode and return a clear error (include body) when it’s not successful.

Copilot uses AI. Check for mistakes.
Comment on lines +29 to +49
func (t *cloudTarget) csrfToken() (string, error) {
req, err := http.NewRequest("GET", fmt.Sprintf("%s/csrf/v1", t.apiOptions.System.URL), nil)
if err != nil {
return "", err
}
deployService, err := t.DeployService()
if err != nil {
return "", err
}
resp, err := deployService.Do(req, 10*time.Second)
if err != nil {
return "", err
}
defer resp.Body.Close()
var result struct {
Token string `json:"token"`
}
if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
return "", err
}
return result.Token, nil
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

csrfToken() decodes the response body without checking resp.StatusCode first. If the CSRF endpoint returns an error status (e.g. 401/403/500), this will likely surface as a JSON decode error or return an empty token. Check resp.StatusCode and return a helpful error (including body) for non-success responses.

Copilot uses AI. Check for mistakes.
Comment on lines +128 to +138
for _, rule := range putVaultResp.Rules {
if rule.Application == appID {
for _, ctx := range rule.Contexts {
if ctx == SECRET_STORE_DEV_ALIAS {
return nil
}
}
}
}
return fmt.Errorf("vault access rule for %q was not confirmed in response", vaultName)
}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the PUT, the confirmation check iterates over vaultResp.Rules (the pre-PUT response) instead of the PUT response. If the rule wasn’t present before, this will always fall through and return an error even when the PUT succeeded. Iterate over putVaultResp.Rules (or re-GET) when verifying the rule was applied.

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +111
csrfToken, _ := t.csrfToken()

// PUT updated rules
putReq, err := http.NewRequest("PUT", vaultURL, bytes.NewReader(body))
if err != nil {
return err
}
putReq.Header.Set("Content-Type", "application/json")
if csrfToken != "" {
putReq.Header.Set("vespa-csrf-token", csrfToken)
}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors from csrfToken() are ignored (csrfToken, _ := t.csrfToken()). If the token endpoint fails and the API requires a token, the subsequent PUT will fail in a harder-to-diagnose way. Handle and return the csrfToken() error (or explicitly document and implement a fallback behavior).

Copilot uses AI. Check for mistakes.
Comment on lines +67 to +76
resp, err := deployService.Do(req, 10*time.Second)
if err != nil {
return fmt.Errorf("could not get vault access rules for %q: %w", vaultName, err)
}
defer resp.Body.Close()
getRawBody, _ := io.ReadAll(resp.Body)
var vaultResp vaultResponse
if err := json.Unmarshal(getRawBody, &vaultResp); err != nil {
return fmt.Errorf("could not parse vault access rules for %q: %w", vaultName, err)
}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP status codes are not checked for either the GET or PUT requests. As written, a non-200 response (e.g. 403/404/500) will still be read and unmarshaled, producing confusing JSON errors or silently proceeding with an empty rule set. Check resp.StatusCode/putResp.StatusCode and return a helpful error (including response body) when the status is not successful.

Copilot uses AI. Check for mistakes.
Comment on lines +151 to +155
// Errors are non-fatal warnings for the caller.
func EnsureVaultAccessForDev(target Target, vaultNames []string) error {
ct, ok := target.(*cloudTarget)
if !ok || len(vaultNames) == 0 {
return nil
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnsureVaultAccessForDev is documented/named as applying to dev deployments, but it currently runs for any cloud deployment (including prod) as long as vaultNames is non-empty. Add a guard on target.Deployment().Zone.Environment (or equivalent) so this only mutates SANDBOX/dev access when deploying to dev.

Suggested change
// Errors are non-fatal warnings for the caller.
func EnsureVaultAccessForDev(target Target, vaultNames []string) error {
ct, ok := target.(*cloudTarget)
if !ok || len(vaultNames) == 0 {
return nil
// Returns nil for non-cloud targets, non-dev environments, or when no vaults are referenced.
// Errors are non-fatal warnings for the caller.
func EnsureVaultAccessForDev(target Target, vaultNames []string) error {
if len(vaultNames) == 0 {
return nil
}
deployment := target.Deployment()
if deployment == nil || deployment.Zone.Environment != "dev" {
// Only modify SANDBOX/dev access rules when deploying to dev.
return nil
}
ct, ok := target.(*cloudTarget)
if !ok {

Copilot uses AI. Check for mistakes.
Comment on lines +89 to +94
// Build new rule with no context restriction (grants access to all environments)
newRule := vaultAccessRule{
Application: appID,
Contexts: []string{SECRET_STORE_DEV_ALIAS},
ID: len(vaultResp.Rules),
}
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says the new rule has “no context restriction (grants access to all environments)”, but the code sets Contexts to []string{SECRET_STORE_DEV_ALIAS} (SANDBOX), which is a restriction. Update the comment to match behavior (or adjust the Contexts logic if the intent really is all environments).

Copilot uses AI. Check for mistakes.
@BrageHK
Copy link
Contributor Author

BrageHK commented Mar 6, 2026

@thomasht86 Everything works as expected now. The only thing I am wondering about is why i got an empty list in the return body of the put request? In the pyvespa implementation the retuned body is checked if it contains the correct access rights, but I cannot check this as the body is empty. Do you know why it's like this?

@thomasht86
Copy link
Contributor

awaiting changes as discussed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants