Skip to content

Conversation

@wking
Copy link
Member

@wking wking commented Apr 3, 2019

Builds on #1423 (spun off from here); I'll rebase after that lands.

Uncomment all of the regions in pkg/types/aws/validation which had been commented out in (#1407). Instead, perform the "can we find an AMI for that region?" check directly in pkg/asset/installconfig/aws when we're building a list of regions for the wizard prompt. With this change, bumping the rhcos.json asset (via hack/update-rhcos-bootimage.py) will automatically keep the wizard prompt's choices in sync with the published AMIs.

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 3, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 3, 2019
@wking wking force-pushed the regions-from-rhcos-metadata branch from 2e69f6b to 96c8b39 Compare April 4, 2019 04:15
@abhinavdahiya
Copy link
Contributor

wouldn't the other way ie switching validation to use AMI regions from internal meta.json more sense... so that validation of install config catch the invalid region?

@wking wking force-pushed the regions-from-rhcos-metadata branch from 96c8b39 to fac8514 Compare April 4, 2019 19:00
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 4, 2019
@wking
Copy link
Member Author

wking commented Apr 4, 2019

wouldn't the other way ie switching validation to use AMI regions from internal meta.json more sense... so that validation of install config catch the invalid region?

The previous install-config validation was overly-strict. For example, you can have a working install-config in a region that does not include an official RHCOS AMI as long as you specify your own AMI. For example, see my suggestion to QE here, and while we ended up closing that bug, I suspect other folks in cn-north-1 or us-gov-west-1 or similar places where we can't put official AMIs to need to do similar things. The issue with regions that have RH-published RHCOS AMIs is just with picking a default AMI. If you want to detect that at install-time, I can drop rhcos.Image and make that lookup part of install-config defaulting. Thoughts?

Rebased onto master with 96c8b39 -> fac85144a.

@abhinavdahiya
Copy link
Contributor

wouldn't the other way ie switching validation to use AMI regions from internal meta.json more sense... so that validation of install config catch the invalid region?

The previous install-config validation was overly-strict. For example, you can have a working install-config in a region that does not include an official RHCOS AMI as long as you specify your own AMI. For example, see my suggestion to QE try here, and while we ended up closing that bug, I suspect other folks in cn-north-1 or us-gov-west-1 or similar places where we can't put official AMIs to need to do similar things. The issue with regions that have RH-published RHCOS AMIs is just with picking a default AMI.

trying to use master openshift-install

$ cat >dev/install-config.yaml <<EOF
apiVersion: v1beta4
baseDomain: devcluster.openshift.com
controlPlane:
  name: master
  replicas: 3
compute:
- name: worker
  replicas: 3
  platform:
metadata:
  name: adahiya-1
networking:
  clusterNetworks:
  - cidr: 10.128.0.0/14
    hostSubnetLength: 9
  machineCIDR: 10.0.0.0/16
  serviceCIDR: 172.30.0.0/16
  type: OpenShiftSDN
platform:
  aws:
    region: cn-north-1
pullSecret: '{"auths":{.....}}}'
sshKey: |
  ssh-rsa REDACTED
EOF
$ ./bin/openshift-install version
./bin/openshift-install unreleased-master-700-gf04777c0164fc5822bd68b24301ea995f559e812
built from commit f04777c0164fc5822bd68b24301ea995f559e812
release image registry.svc.ci.openshift.org/openshift/origin-release:v4.0
$ ./bin/openshift-install --dir dev create manifests
FATAL failed to fetch Master Machines: failed to load asset "Install Config": invalid "install-config.yaml" file: platform.aws.region: Unsupported value: "cn-north-1": supported values: "ap-northeast-1", "ap-northeast-2", "ap-south-1", "ap-southeast-1", "ap-southeast-2", "ca-central-1", "eu-central-1", "eu-west-1", "eu-west-2", "eu-west-3", "sa-east-1", "us-east-1", "us-east-2", "us-west-1", "us-west-2"

So i'm not sure even copying will help the https://bugzilla.redhat.com/show_bug.cgi?id=1669396#c18
and OS_IMAGE override is not meant for everybody (dev purpose only) and therefore it is not documented.

This PR changes the current behavior as it would pass install-config validation but then later on fail when trying to fetch image from meta.json.

wking added a commit to wking/openshift-installer that referenced this pull request Sep 3, 2019
RHCOS grew support for this in 905db73 (data/data/rhcos.json:
update the bootimage to 420.8.20190708.2 for CRI-O 1.14, 2019-07-03, openshift#1941).
And until we get something like [1], we need this not-very-DRY bump to
keep up.

[1]: openshift#1528
@abhinavdahiya
Copy link
Contributor

/close

closing without prejudice due to age. Please consider opening an issue or enhancement to discuss and bring consensus on the fix, or if you think this is still important in current state feel free to reopen.

@openshift-ci-robot
Copy link
Contributor

@abhinavdahiya: Closed this PR.

In response to this:

/close

closing without prejudice due to age. Please consider opening an issue or enhancement to discuss and bring consensus on the fix, or if you think this is still important in current state feel free to reopen.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

wking added 3 commits October 11, 2019 15:09
Uncomment all of the regions in pkg/types/aws/validation (unwinding
that part of bd88157, hack/build: Pin to RHCOS 400.7.20190306.0,
2019-03-12, openshift#1407).  Instead, perform the "can we find an AMI for that
region?" check directly in pkg/asset/installconfig/aws when we're
building a list of regions for the wizard prompt.  With this change,
bumping the rhcos.json asset (via hack/update-rhcos-bootimage.py) will
automatically keep the wizard prompt's choices in sync with the
published AMIs.
So we can ensure we get the error we expect, instead of some other
error.
So users can easily see what the alternatives are.  Also soften the
expected regexp so we don't have to bump it when we get a new RHCOS
JSON file with a new region.
@wking wking reopened this Oct 11, 2019
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 11, 2019
@wking wking force-pushed the regions-from-rhcos-metadata branch from fac8514 to b8a02dc Compare October 11, 2019 22:09
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 11, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 11, 2019
@wking
Copy link
Member Author

wking commented Oct 11, 2019

Rebased on master with fac85144a -> b8a02dc, which also distinguishes between "is a known AWS region" and "has a default RHCOS AMI".

...OS_IMAGE override is not meant for everybody...

Since #2002, user-specified AMI IDs are no longer a special case.

openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/installer that referenced this pull request Oct 24, 2019
RHCOS grew support for this in 905db73 (data/data/rhcos.json:
update the bootimage to 420.8.20190708.2 for CRI-O 1.14, 2019-07-03, openshift#1941).
And until we get something like [1], we need this not-very-DRY bump to
keep up.

[1]: openshift#1528
@crawford
Copy link
Contributor

This makes me nervous. The RHCOS build pipeline should not be the source of truth for the list of supported regions as it is just one of many pieces that have to be in place in order for the installation to succeed.

@wking
Copy link
Member Author

wking commented Oct 28, 2019

What are the other pieces?

jhixson74 pushed a commit to jhixson74/installer that referenced this pull request Dec 6, 2019
RHCOS grew support for this in 905db73 (data/data/rhcos.json:
update the bootimage to 420.8.20190708.2 for CRI-O 1.14, 2019-07-03, openshift#1941).
And until we get something like [1], we need this not-very-DRY bump to
keep up.

[1]: openshift#1528
@openshift-ci-robot
Copy link
Contributor

@wking: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-aws-disruptive fac85144afcde5a63465d4f7d120e0e8f3ba088e link /test e2e-aws-disruptive
ci/prow/unit b8a02dc link /test unit
ci/prow/e2e-aws-scaleup-rhel7 b8a02dc link /test e2e-aws-scaleup-rhel7
ci/prow/e2e-aws b8a02dc link /test e2e-aws
ci/prow/tf-lint b8a02dc link /test tf-lint
ci/prow/yaml-lint b8a02dc link /test yaml-lint
ci/prow/shellcheck b8a02dc link /test shellcheck

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci-robot
Copy link
Contributor

@wking: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

1 similar comment
@openshift-ci-robot
Copy link
Contributor

@wking: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 31, 2020
@abhinavdahiya
Copy link
Contributor

Closing due to this being open for a long time, Please feel free to reopen

/close

@openshift-ci-robot
Copy link
Contributor

@abhinavdahiya: Closed this PR.

In response to this:

Closing due to this being open for a long time, Please feel free to reopen

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants