
Conversation

@sadasu
Contributor

@sadasu sadasu commented Oct 24, 2019

BareMetalPlatformStatus CR is being augmented to support
configuration required for successful Baremetal IPI deployments.
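
For orientation, the additions under discussion amount to a handful of provisioning-related fields on the BareMetalPlatformStatus type in openshift/api. A minimal Go sketch follows, with field names collected from the proposal excerpts quoted in the review threads below; the json tags, types, and comments are illustrative assumptions, not the merged API:

```go
package v1

// Sketch only: field names are taken from this proposal's excerpts; the
// tags, types, and optionality are illustrative assumptions.
type BareMetalPlatformStatus struct {
	// Existing fields (APIServerInternalIP, IngressIP, NodeDNSIP, ...)
	// are omitted for brevity.

	// ProvisioningInterface is the NIC name on the underlying baremetal
	// servers that is physically connected to the provisioning network,
	// e.g. "eno1" or "ens8".
	ProvisioningInterface string `json:"provisioningInterface,omitempty"`

	// ProvisioningIP is the IP address brought up on that NIC on whichever
	// host runs the metal3 / baremetal-operator pod.
	ProvisioningIP string `json:"provisioningIP,omitempty"`

	// ProvisioningNetworkCIDR is the provisioning subnet, e.g. "10.1.0.0/24".
	ProvisioningNetworkCIDR string `json:"provisioningNetworkCIDR,omitempty"`

	// ProvisioningDHCPRange is a contiguous range used to DHCP/PXE boot
	// hosts, e.g. "10.1.0.10, 10.1.0.100".
	ProvisioningDHCPRange string `json:"provisioningDHCPRange,omitempty"`

	// CacheURL is an optional, previously populated bootimage cache,
	// e.g. "http://111.111.111.1/images".
	CacheURL string `json:"cacheURL,omitempty"`

	// RHCOSImageURL is the location of the RHCOS bootimage used to
	// provision workers (its removal is discussed later in this thread).
	RHCOSImageURL string `json:"rhcosImageURL,omitempty"`
}
```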

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 24, 2019
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sadasu
To complete the pull request process, please assign kbsingh
You can assign the PR to them by writing /assign @kbsingh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sadasu added a commit to sadasu/openshift-api that referenced this pull request Oct 24, 2019
Adding config items required for a Baremetal IPI deployment to
the BareMetalPlatformStatus CR. Based on enhancement request:
openshift/enhancements#90
@sadasu
Contributor Author

sadasu commented Oct 25, 2019

/cc @deads2k
/cc @abhinavdahiya

@deads2k
Contributor

deads2k commented Oct 25, 2019

please add a discussion about how the information is stored today and how you'll migrate that data. If you don't support upgrade, please indicate that, link to the docs, and ideally where in the code you actually stop that upgrade.

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 28, 2019
sadasu added a commit to sadasu/openshift-api that referenced this pull request Oct 28, 2019
Adding config items required for a Baremetal IPI deployment to
the BareMetalPlatformStatus CR. Based on enhancement request:
openshift/enhancements#90
### Upgrade / Downgrade Strategy

This enhancement is applicable only to the Baremetal platform type. Since
this platform type is not officially supported in 4.2, we feel that an
Contributor

Please link to where in the docs we say this isn't supported and to where in the code you're preventing upgrades from happening.

Contributor Author

As far as I know, we are not preventing upgrades from happening in the code. The current upgrade strategy comes less from the code and more from business logic (for lack of a better word in my vocabulary). @russellb should be able to clarify this further.

Contributor

@deads2k this is a net-new integration for a currently unsupported/experimental platform so we don't expect there to be any users to upgrade, we'll only support new deployments in the first instance.

Perhaps you can expand a bit on the sort of strategy/docs you would expect in this case, clearly when this becomes fully supported we'll need to ensure we have a strategy to deal with any changes in this schema, but right now I don't think it's a concern, we're still trying to bootstrap this new baremetal IPI platform integration (and this API change is one of the final pieces to enable that)

Contributor

@deads2k this is a net-new integration for a currently unsupported/experimental platform so we don't expect there to be any users to upgrade, we'll only support new deployments in the first instance.

Perhaps you can expand a bit on the sort of strategy/docs you would expect in this case, clearly when this becomes fully supported we'll need to ensure we have a strategy to deal with any changes in this schema, but right now I don't think it's a concern, we're still trying to bootstrap this new baremetal IPI platform integration (and this API change is one of the final pieces to enable that)

I just want to be sure that customers cannot possibly attempt to upgrade from 4.2 to 4.3. If they can, the chances that they will go from a working cluster to a completely broken one are fairly high. Simply saying, "this isn't supported" doesn't really satisfy, but Derek told me that you can't actually get this to install in 4.2, so I'm no longer worried about it.

Contributor

Yeah, currently if anyone tries to deploy using the baremetal platform, the baremetal operator pod will simply crashloop on start because of the missing configmap. We've worked around it for developer testing by manually adding it, but I think for end-users it's extremely unlikely they would do such an upgrade, and also this platform is marked experimental in the installer.

FWIW that's also why I think this is a bug and not really an enhancement - the current MAO integration is broken without passing this data from the installer.

Contributor Author

please add a discussion about how the information is stored today and how you'll migrate that data. If you don't support upgrade, please indicate that, link to the docs, and ideally where in the code you actually stop that upgrade.

This information is not currently stored anywhere. For development testing purposes, we have scripts that create a ConfigMap that contains this config. MAO code in 4.2 refers to this ConfigMap, which has to be provided either by these scripts (or manually?) when the platform type is "Baremetal".
So, during the 4.3 cycle, we first attempted to have the installer add the ConfigMap, but that effort was abandoned in favor of adding these config items to the BaremetalPlatformStatus CR. Hopefully, the enhancement request and this discussion provide the necessary context for this effort.

@sadasu
Contributor Author

sadasu commented Oct 28, 2019

Updated (hopefully improved) section on upgrade strategy.


No config value other than RHCOSImageURL is expected to change after a
metal3 cluster is up. Handling changes to that value after install time has
not been explored and is a non-goal at this time.
Contributor

We could mention that the update lifecycle for bootimage is an unsolved problem for all platforms ref openshift/os#381

Contributor Author

Done.

Contributor

@imcleod assembled some notes on this issue from a discussion we had a couple months ago as well: https://docs.google.com/document/d/1W_N3pcOuo6E_DcM5GQ731cTTN2VJ1KUa2A49_Zi3IRU/edit

Contributor Author

@ericavonb & @imcleod is it OK for me to add a link to this doc in my enhancement request?

### User Stories [optional]

This enhancement will allow users to customize their Baremetal IPI deployments
to their specific hardware and network requirements.
Contributor

It also allows IPI baremetal deployments without manual workarounds such as creating the required ConfigMap object manually (which is what we're currently doing for testing).

Contributor

The user story doesn't actually name the components that will be using these fields.

Since this proposal is adding these new fields to the global platform configs, I would like to make sure this is actually infrastructure-level information, rather than something that belongs to a specific operator.

Contributor Author

Augmented the User Stories section to address the above two comments.


Baremetal Platform type will be available for customers to use for the first
time in Openshift 4.3. And, when it is installed, it will always start as
a fresh baremetal installation atleast in 4.3. There is no use case where a 4.2
Contributor

typo s/atleast/at least/

installation would be upgraded to a 4.3 installation with Baremetal Platform
support enabled.

For the above reason, in this particulat case, we would not have to make
Contributor

typo s/particulat/particular/g

Contributor

@hardys hardys left a comment

Looks good, couple of small comments/typos

@sadasu sadasu force-pushed the baremetal-config branch 2 times, most recently from 015f169 to 3f1f92e on October 29, 2019 03:09
BareMetalPlatformStatus CR is being augmented to support
configuration required for successful Baremetal IPI deployments.

This proposal aims to add the following to BareMetalPlatformStatus:

1. ProvisioningInterface : This is the interface name on the underlying
Contributor

Is this something like talk to this interface if you want to talk to the API, like QEMU URL/amazonaws.com

Also this seems like BareMetalHost Custom Resource should have this information, because i would assume users might want to have different interface on each baremetal-server that is used to talk to provisioning network??

Contributor

This is the physical NIC used for the dedicated network on the nodes where the baremetal operator and Ironic run, e.g. eno1, eth3 or whatever.

This shouldn't be part of the BareMetalHost CRD because that describes the nodes to be provisioned, this is an implementation detail of how the provisioning service runs on the cluster.

Contributor

@abhinavdahiya perhaps a bit more context would help here:

AFAICS the changes proposed by @sadasu are aligned with the guidance previously provided by @wking, but perhaps you can provide more detailed guidance as to how we can refine this approach to make progress.

We've been trying to add this small amount of additional configuration data for nearly 3 months now, and it seems like conflicting feedback on the various PRs is part of the reason we're making such slow progress - perhaps you and @wking could chat and provide some very specific guidance on how to proceed? Thanks!

Contributor Author

Is this something like talk to this interface if you want to talk to the API, like QEMU URL/amazonaws.com

Also this seems like BareMetalHost Custom Resource should have this information, because i would assume users might want to have different interface on each baremetal-server that is used to talk to provisioning network??

@abhinavdahiya That is a good point. But, currently we are operating under the assumption that in a given installation, all baremetal servers will use an interface of the same name to be physically connected to the provisioning network.
Secondly, the BareMetalHost Custom Resource currently contains information about baremetal hosts known to the BareMetal Operator. We also have an additional set of configs that are used by Ironic, which is a service that provisions baremetal servers and gets them ready for use as hosts and then as nodes. This config cannot be added to the BareMetalHost Custom Resource.
So, instead of creating a brand new Custom Resource, we are augmenting the BareMetalPlatformStatus Custom Resource. Agreed that adding this makes the BareMetalPlatformStatus CR look a lot different from similar CRs used by other clouds. But, in the Baremetal case, we are also responsible for provisioning the underlying physical servers.

Contributor

An example value for this might be "eth1" or "ens8" or the name of a bridge (we use "provisioning" in the VM-based development environment for this work).

Contributor

So the provision network information is for the baremetal-controller so that it knows how to reach the provisioning network on the host it's running to do the provisioning of a new node?

Member

So the provision network information is for the baremetal-controller so that it knows how to reach the provisioning network on the host it's running to do the provisioning of a new node?

Wait, why can't this just be a routing config on the controller node? Controller: "I'm trying to reach the provisioning API at https://whatever.example.com", DNS: "whatever.example.com is 1.2.3.4", kernel: "take eth1 to reach 1.2.3.4". Doesn't seem bare-metal-specific to me.

Contributor Author

A config similar to that might be the IngressIP and it is present in the BareMetalPlatformStatus CR. https://github.com/openshift/api/blob/master/config/v1/types_infrastructure.go#L176.

Contributor

So the provision network information is for the baremetal-controller so that it knows how to reach the provisioning network on the host it's running to do the provisioning of a new node?

Not quite, it's so that the baremetal-operator pod can configure a dedicated interface on the node where that pod runs; this happens via https://github.com/openshift/ironic-static-ip-manager which is a bit hacky but works in the absence of an external DHCP server or any other external dependencies.

So to reply to @wking: routing doesn't help us here, we have to bring up a NIC on the node so we can serve DHCP/TFTP etc. for PXE booting the nodes.

In future this may become optional, when we support use of external DHCP servers on the provisioning network, but right now it's required to make pxe boot work.
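
To make the mechanism above concrete, here is a rough sketch of what keeping a provisioning address alive on the selected NIC could look like; it is a simplification of what openshift/ironic-static-ip-manager is described as doing in this thread, and the interface name, address, lifetime, and refresh interval are all made-up example values:

```go
package main

import (
	"log"
	"os/exec"
	"time"
)

// Keep the provisioning IP configured on the provisioning interface of the
// host running the metal3 pod, using a short address lifetime so the IP
// ages out if the pod is rescheduled elsewhere. Example values only; this
// is not the actual ironic-static-ip-manager implementation.
func main() {
	const iface = "eno1"       // ProvisioningInterface (example)
	const addr = "10.1.0.3/24" // ProvisioningIP within ProvisioningNetworkCIDR (example)
	for {
		cmd := exec.Command("ip", "addr", "replace", addr, "dev", iface,
			"valid_lft", "60", "preferred_lft", "60")
		if out, err := cmd.CombinedOutput(); err != nil {
			log.Printf("refreshing %s on %s failed: %v: %s", addr, iface, err, out)
		}
		time.Sleep(30 * time.Second)
	}
}
```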

Contributor

@hardys hardys Nov 13, 2019

Controller: "I'm trying to reach the provisioning API at https://whatever.example.com"

FWIW I think this is a fundamental misunderstanding here - there is no existing provisioning API we have to reach - the metal3 pod deployed by the MAO runs that API, and most of the data under discussion here is just the configuration needed to make that self-hosted API work.

This is very different to the typical "cloud provider" API interaction, which I think is why some of this discussion is difficult without prior context :)

cluster. This is an optional parameter and is expected to be provided
in the format : http://111.111.111.1/images

6. RHCOSImageURL : This config is expected to be the location of the
Contributor

This most definitely is not a cluster global configuration.

Contributor

How would you suggest the MAO gets this data instead?

Contributor

To clarify why this is needed, https://github.com/openshift/ironic-rhcos-downloader is one of the containers we start as part of the ironic deployment.

This downloads the required RHCOS image, and the URL for that is set in the installer via the data/data/rhcos.json - openshift/os#381 discusses updating bootimages in the future, but for now we need some way to download the installer-specified version of the image to enable deployment of worker nodes via the BMO.

The installer has that data, so how would you prefer that we pass the URL to the downloader container, which is started via the MAO?

Contributor

One option I guess would be to use the URL from the machine providerspec for this, but currently that points at the cached URL:

openshift/installer@ac89074

One problem is that at the point where the MAO starts the BMO pod, we don't necessarily have any machines defined.

I guess we could block metal3 deployment on the machine existence, with logic that checks if the cache already contains a specific image, but then somewhere we'd have to rewrite the image used by the deployment to point at the cache (to avoid non-local download for potentially $many machine deployments).

Contributor

As Steve points out, we do have an image URL in the machine provider spec but we don't want to have to download that image from the internet for every deployment. This setting would allow us to populate a cache, and then have the machine provider spec refer to the cached location.

Contributor

I think the machine objects should have the link to the RHCOS image as known in rhcos.json
The provisioning controllers can see that the image already exists in their cache and then configure the server to use the cache URL (hopefully the controller knows that because it created it?) to download the image.

The controller can do the caching when it sees the first machine object.

And as for allowing caching across deployments, users should be free to override the RHCOS URL chosen by the installer and set in the machine object, such that the controller gets the image from a link hosted on the local network.

Contributor

Thanks @abhinavdahiya I think we're in agreement re the RHCOSImageURL as per my reply to @sadasu below.

One question/clarification though:

The provisioning controllers can see that the image already exists in their cache and then configure the server to use the cache URL

So should we add to the MAO the logic to check the machineconfig URL and compare it to images available in the cache (for example by doing a HEAD on the image, or downloading the md5sum we expose there), or do we need to bake that logic into the baremetal-operator?

The latter may be cleaner from an interface perspective, but I'm not sure if it aligns with the upstream vision for the BMO interfaces - @dhellmann do you think it would be reasonable to add logic to the BMO that checks the cache for an image, and rewrites the deployment image if it's found?
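
As a concrete illustration of the cache check being debated here — asking the cache webserver whether it already serves the bootimage named in the machine providerSpec — a minimal sketch using a plain HTTP HEAD; the helper name, the cache base URL, and the assumption that images are mirrored under their original file name are all hypothetical:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"path"
)

// imageCached asks the cache webserver (via HEAD) whether the image referenced
// by the machine providerSpec has already been mirrored. Hypothetical helper
// for illustration; not MAO or baremetal-operator code.
func imageCached(cacheBaseURL, providerSpecImageURL string) (bool, error) {
	src, err := url.Parse(providerSpecImageURL)
	if err != nil {
		return false, err
	}
	// Assume the cache mirrors images under their original file name.
	resp, err := http.Head(fmt.Sprintf("%s/%s", cacheBaseURL, path.Base(src.Path)))
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

func main() {
	ok, err := imageCached("http://111.111.111.1/images", "https://example.com/rhcos-43.qcow2")
	fmt.Println(ok, err)
}
```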

a contiguous range of IPs that are available to be allocated. It is expected
to be provided in the format : 10.1.0.10, 10.1.0.100.

5. CacheURL : This is a location where image required to bring up baremetal
Contributor

So the machine object should have this set; how is this a global configuration?

Contributor

This is unrelated to the machine object, it's an implementation detail of the baremetal provisioning service.

Member

might be good to expand on "image" (maybe "bootimage"? Something else), so folks don't read this and assume "container image" like I did ;)

Contributor

Ah yeah that's a good point, it is where we'll cache the qcow2 bootimage indeed

workarounds like creating a ConfigMap for the config (which is the
current approach we are using to test baremetal deployments).

The new config items would be set by the installer and will be used by
Contributor

Since the only use-case is some machine-api-operator component, I am inclined towards making these fields available to just that component and not add these to the global configuration.

Contributor

The additions are specific to the baremetal platform in openshift/api#480 - and this data is required for all deployments when the baremetal platform is selected.

Can you suggest how we'd make this available only to a component? I'm not entirely clear what you are proposing, and how it differs from using the BareMetalPlatformStatus?

Member

Can you suggest how we'd make this available only to a component?

E.g. ingress-specific configuration or the registry-specific config or other operator.openshift.io stuff.

... how it differs from using the BareMetalPlatformStatus?

Mostly in scoping. It's easier to reason about the impact of config changes if you can scope read access down more tightly. If the MAO is the only component with creds to read a metal config object, you don't have to worry about other operators or cluster components sneaking in to borrow your settings and give you later trouble when you need to migrate to a different schema. But if multiple components need access, a central location like Infrastructure is DRYer for cluster admins.

be within this network. The value for this config is expected to be
provided in the format : 10.1.0.0/24.

4. ProvisioningDHCPRange : This is a range of IP addresses from which
Contributor

Is there a reason we shouldn't just use the MachineCIDR?

Contributor

This is a completely different network to the MachineCIDR, it's used for baremetal deployment to pxe boot the servers and access their BMCs via IPMI.

In the installer there is a validation to ensure this network does not overlap with MachineCIDR ref openshift/installer#2449
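
For readers unfamiliar with that installer validation, the overlap check it describes boils down to something like the following sketch (not the installer's actual code):

```go
package main

import (
	"fmt"
	"net"
)

// cidrsOverlap reports whether two CIDR blocks share any addresses. Because
// CIDR blocks can only overlap by one containing the other, checking each
// network address against the other block is sufficient.
func cidrsOverlap(a, b string) (bool, error) {
	_, na, err := net.ParseCIDR(a)
	if err != nil {
		return false, err
	}
	_, nb, err := net.ParseCIDR(b)
	if err != nil {
		return false, err
	}
	return na.Contains(nb.IP) || nb.Contains(na.IP), nil
}

func main() {
	// Example values: a provisioning CIDR versus a machine network CIDR.
	overlap, err := cidrsOverlap("10.1.0.0/24", "192.168.111.0/24")
	fmt.Println(overlap, err) // false <nil> — the networks are disjoint
}
```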

Contributor

Right, for baremetal we assume at least 2 networks. The MachineCIDR describes the usual network where control plane and application traffic go. This new ProvisioningCIDR describes the network where provisioning traffic goes. That traffic needs to be isolated from the control plane and applications because it involves protocols like DHCP and PXE, and you don't want to mix those with publicly facing traffic, so they are typically on different L2 network segments.

Contributor

Gotcha, Fun.


@hardys
Contributor

hardys commented Nov 1, 2019

@abhinavdahiya and @deads2k Hi please can you take another look at this when you get a moment ref my comments above - this is a blocker to openshift/api#480 which blocks openshift/machine-api-operator#402 and openshift/installer#2449

Those PRs are also the first step to fixing openshift/installer#2091 which is a prerequisite for ipv6 support, and that's something we're being asked to prioritize so any help unblocking this series would be much appreciated, thanks!

type is Baremetal.

MAO reads this configuration and uses these to derive additional configuration
required to bring up a metal3 cluster. E2e testing should make sure that MAO


What type of negative testing do you plan on doing? Malformed strings, IP addresses/ranges overlapping, not in the expected ranges, poorly formed IP addresses, images not where they are specified, slow or interrupted downloads. Is this going to be tested and within scope of this enhancement testing?

Contributor

What type of negative testing do you plan on doing? Malformed strings, IP addresses/ranges overlapping, not in the expected ranges, poorly formed IP addresses, images not where they are specified, slow or interrupted downloads. Is this going to be tested and within scope of this enhancement testing?

All of these values will be provided by the installer ref openshift/installer#2449 - that does have unit tests for the installer level validation of those values, and I'd assume we'll have similar validation and unit tests in the MAO @sadasu ?

For e2e testing we've got a 3rd party metal3 CI job which can be used to validate these changes, but it is not yet fully integrated into openshift e2e CI for this new platform - @derekhiggins is working on that.
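
To give a flavour of the kind of field-level negative testing asked about above, here is a small sketch of validating the DHCP range format used in this proposal ("start, end", both inside the provisioning CIDR); this is illustrative only and not the installer's or MAO's validation code:

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// validateDHCPRange checks the "start, end" format used in this proposal
// (e.g. "10.1.0.10, 10.1.0.100") and that both ends fall inside the
// provisioning CIDR. Illustrative sketch only.
func validateDHCPRange(dhcpRange, provisioningCIDR string) error {
	_, network, err := net.ParseCIDR(provisioningCIDR)
	if err != nil {
		return fmt.Errorf("invalid provisioning CIDR %q: %v", provisioningCIDR, err)
	}
	parts := strings.Split(dhcpRange, ",")
	if len(parts) != 2 {
		return fmt.Errorf("expected %q format, got %q", "start, end", dhcpRange)
	}
	for _, p := range parts {
		ip := net.ParseIP(strings.TrimSpace(p))
		if ip == nil {
			return fmt.Errorf("%q is not a valid IP address", strings.TrimSpace(p))
		}
		if !network.Contains(ip) {
			return fmt.Errorf("%s is outside %s", ip, provisioningCIDR)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateDHCPRange("10.1.0.10, 10.1.0.100", "10.1.0.0/24")) // <nil>
	fmt.Println(validateDHCPRange("10.2.0.10, 10.1.0.100", "10.1.0.0/24")) // error: outside the CIDR
}
```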

Contributor

Also note that the only way to use this platform at present is to manually create a ConfigMap resource, e.g a completely unconstrained interface, so IMO having the validation at the installer layer will be a big improvement.

that is specific to a Baremetal deployment.

This proposal aims to add the following to BareMetalPlatformStatus:

Contributor

Right. The host CR is the inventory that the installer uses to fulfill a Machine. These settings tell the provisioning system how to do that work, by telling it which network it can use and what IPs it should use on that network.

baremetal server which would be physically connected to the provisioning
network.

2. ProvisioningIP : This is the IP address used to bring up a NIC on
Contributor

This is the IP address that the interface above is configured with - we've got a mechanism that statically configures ProvisioningInterface on whatever master node has the metal3 baremetal-operator pod running, but this is a short lease such that if/when the pod gets rescheduled the IP goes down on the old host, and is then brought back up on the new host.

@sadasu this would be good info to include in the body of the doc. Maybe link to any metal3 docs about it, too?



a contiguous range of IPs that are available to be allocated. It is expected
to be provided in the format : 10.1.0.10, 10.1.0.100.

5. CacheURL : This is a location where image required to bring up baremetal
Contributor

Suggested change
5. CacheURL : This is a location where image required to bring up baremetal
5. CacheURL : This is a location where the image required to bring up baremetal

to be provided in the format : 10.1.0.10, 10.1.0.100.

5. CacheURL : This is a location where image required to bring up baremetal
hosts have been previously downloaded for faster image downloads within the
Contributor

Suggested change
hosts have been previously downloaded for faster image downloads within the
hosts has been previously downloaded for faster image downloads within the


@imain

imain commented Nov 5, 2019

All of the configuration options we are proposing control how the Bare Metal Operator works.

In a baremetal deployment we have multiple network interfaces, one of which is connected to the provisioning network, which is used only for provisioning new hosts. The baremetal pod contains various services and will run DHCP and PXE to provision new nodes on this second network. So, while some of these configuration options sound similar to ones that already exist in OpenShift, they are actually a new set of options with values that are expected to be different.

The provisioning interface value specifies the network interface on the host (which could be any host in the cluster) to be used for provisioning. This interface will be assigned a static IP as specified in the configuration. It will then be used to PXE boot new hosts using this network and assign an IP to them via DHCP for the provisioning step.

The RHCOS image URL is used to download the image used in the provisioning step. This will be downloaded during pod initialization and then served to the host being provisioned using a local URL. This is done to avoid downloading it from the internet every time a host is provisioned.

We could certainly move this to its own CRD and out of platformStatus. Is this the reasonable next step here?

@hardys
Contributor

hardys commented Nov 6, 2019

@imain IMHO the only interface here that we can reasonably remove is the RHCOSImageURL - instead of another CRD I think we could have the MAO look at the machinesets and retrieve the image URL from the providerSpec, then we'd restart the downloader container if the target bootimage ever changes.

If that is the main objection here, my proposal is to remove RHCOSImageURL and merge this so we can make progress? @abhinavdahiya and/or @deads2k what are your thoughts?

@deads2k
Contributor

deads2k commented Nov 6, 2019

If that is the main objection here, my proposal is to remove RHCOSImageURL and merge this so we can make progress? @abhinavdahiya and/or @deads2k what are your thoughts?

I will defer to @abhinavdahiya. The discussion with @abhinavdahiya and @squeed is what I wanted out of this. Once they're convinced, make the API doc match (more tersely if necessary) and we can get it in.

@sadasu
Contributor Author

sadasu commented Nov 13, 2019

1. I have created an "enhancement" PR to build context around the BareMetal IPI deployment solution: Providing background for Baremetal IPI based enhancements #102

2. I see 2 main themes evolving here:

   a. The BareMetalPlatformStatus CR is not a good place for baremetal provisioning config since it is a global CR. Since the machine-api-operator is the only component that is currently consuming this config, we should put it in a different CR that only the Installer and MAO have access to. Since no existing CR in the baremetal case fits this criterion, I would have to create a new CR.

   b. RHCOSImageURL should be in the Machine CR and not in the BareMetalPlatformStatus CR.

We are currently blocked on 2a.

@hardys
Contributor

hardys commented Nov 13, 2019

We are currently blocked on 2a.

I think we could remove the RHCOSImageURL from whatever CR is created by the installer, and instead have the MAO look at the machinesets that exist, then restart the ironic-rhcos-downloader container if the URL found is not accessible (e.g via HEAD) from the cache webserver.

Personally I don't see this as a blocker, since the information we need can be made available via the installer, the main blocker IMHO is agreeing where the remainder of the data should reside.

@abhinavdahiya
Contributor

a. The BareMetalPlatformStatus CR is not a good place for baremetal provisioning config since it is a global CR. Since the machine-api-operator is the only component that is currently consuming this

So I think the best way to move forward on this will be to create a new Custom Resource for the baremetal-operator that includes the Provisioning* details.

similar to other operator level configuration objects we have here https://github.com/openshift/api/tree/master/operator/v1
(doesn't have to be in that directory/group but would be nice to have it there)

and then the installer can configure the operator in baremetal cases to include the correct information.
we already do it for ingress-operator https://github.com/openshift/installer/blob/9f62449aecce77dff5eff8f8f8dda98386fb2df9/pkg/asset/manifests/ingress.go#L97 and I don't think doing it for baremetal platform would be problematic.
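
For comparison with the BareMetalPlatformStatus sketch near the top of this thread, an operator-scoped singleton along the lines suggested here might look roughly like the following; the package, kind, and field names are illustrative assumptions, not the CRD that was ultimately pursued in #119:

```go
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Provisioning is a sketch of a cluster-scoped, operator-level config object
// that the installer could render as a manifest (as it already does for the
// ingress config) and that only the machine-api components would consume.
type Provisioning struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec ProvisioningSpec `json:"spec,omitempty"`
}

// ProvisioningSpec carries the Provisioning* details discussed in this PR.
type ProvisioningSpec struct {
	ProvisioningInterface   string `json:"provisioningInterface,omitempty"`
	ProvisioningIP          string `json:"provisioningIP,omitempty"`
	ProvisioningNetworkCIDR string `json:"provisioningNetworkCIDR,omitempty"`
	ProvisioningDHCPRange   string `json:"provisioningDHCPRange,omitempty"`
}
```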

@dhellmann
Contributor

a. The BareMetalPlatformStatus CR is not a good place for baremetal provisioning config since it is a global CR. Since the machine-api-operator is the only component that is currently consuming this

So I think the best way to move forward on this will be to create a new Custom Resource for the baremetal-operator that includes the Provisioning* details.

similar to other operator level configuration objects we have here https://github.com/openshift/api/tree/master/operator/v1
(doesn't have to be in that directory/group but would be nice to have it there)

and then the installer can configure the operator in baremetal cases to include the correct information.
we already do it for ingress-operator https://github.com/openshift/installer/blob/9f62449aecce77dff5eff8f8f8dda98386fb2df9/pkg/asset/manifests/ingress.go#L97 and I don't think doing it for baremetal platform would be problematic.

The operator that configures the bare metal components is the machine-api-operator. Other platform-specific config for the machine-api-operator is in the infrastructure resource in https://github.com/openshift/api/blob/master/config/v1/types_infrastructure.go#L102. Can you help me understand why you see the bare metal configuration as needing to be handled differently from all of the other platforms?

@enxebre
Member

enxebre commented Nov 29, 2019

@dhellmann

Other platform-specific config for the machine-api-operator is in the infrastructure resource in

The MAO does not use any platform-specific config from the infra resource. It only fetches the platform type to choose the provider controller image to run, because the CVO does not support manifest discrimination per platform (or per anything else).

@abhinavdahiya

and then the installer can configure the operator in baremetal cases to include the correct information.
we already do it for ingress-operator

Wouldn’t this be different in the sense that https://github.com/openshift/installer/blob/9f62449aecce77dff5eff8f8f8dda98386fb2df9/pkg/asset/manifests/ingress.go#L97 is instantiating a resource object that is orthogonal to the bootstrapping process (an ingress controller instance), whereas here we are discussing instantiating the config that blocks the operator? That means the creation of the CRD definition would block the instantiation of the object from the installer, which would block the operator. Unless the creation of the CRD definition is done by the installer/CVO (which seems overcomplicated)? So wouldn’t https://github.com/openshift/cluster-version-operator/blob/master/vendor/github.com/openshift/api/config/v1/0000_10_config-operator_01_ingress.crd.yaml be a better analogy?

This is the config that enables bareMetal as an underlying "platform" for consumers so having it in BareMetalPlatformStatus does not sound that bad to me.

@enxebre
Member

enxebre commented Dec 5, 2019

I actually think this fits well for the infrastructure object and BareMetalPlatformStatus. See the description here, "Configuration details allowing the cluster to interact with its cloud provider": https://github.com/openshift/openshift-docs/blob/c5b7bdc145d8f79a60062934375117f5a22fa5ca/installing/install_config/customizations.adoc#informational-resources. This should be cluster-wide config which never changes.

I'm good with any of the items below and with whatever @abhinavdahiya prefers here but I think my order of preference would be:
instance name: cluster, resource name: infrastructure.config.openshift.io
instance name: cluster, resource name: baremetal.config.openshift.io
instance name: cluster, resource name: baremetal.operator.openshift.io / baremetalconfig.metal3.io

@abhinavdahiya who is installing *.config.openshift.io CRDs today?
If the MAO would need to install baremetal.operator.openshift.io / baremetalconfig.metal3.io, that'd temporarily block instantiation of the object from the installer, and that would temporarily block the deployment of the controllers, which I guess is fine.

Update:
This config might be read by multiple operators, e.g. the MCO #119 (comment), which makes me feel more strongly about going with:
infrastructure.config.openshift.io or baremetal.config.openshift.io

@hardys
Contributor

hardys commented Dec 5, 2019

that'd temporarily block instantiation of the object from the installer, and that would temporarily block the deployment of the controllers, which I guess is fine.

That is true, but we do already have that problem since the metal3 BareMetalHost CRD is installed via the MAO, so the installer has to wait for that to instantiate the BareMetalHost objects - to deploy workers we need both those objects and this configuration data (either via this interface or the one in #119)

https://github.com/openshift/machine-api-operator/blob/master/install/0000_30_machine-api-operator_08_baremetalhost.crd.yaml

@hardys
Contributor

hardys commented Dec 5, 2019

@abhinavdahiya @enxebre

A good question is raised by @enxebre on #119 which is what will install the new CRD - if we want to move the ironic pod[1] from the installer to the MCO, what will install that CRD such that we can use the data for templating a staticpod definition like in [2] ?

My initial assumption was that the MAO would install it, but that won't work for the bootstrap case?

The other option is probably to pass all this data into the installer ignition config templating [3] such that we can fix openshift/installer#2091 but then the only way we can move from the podman/systemd stuff to a static pod ref openshift/installer#2251 would be rendering the staticpod definition in the installer, not the MCO?

[1] https://github.com/openshift/installer/blob/master/data/data/bootstrap/baremetal/files/usr/local/bin/startironic.sh.template#L12
[2] https://github.com/openshift/machine-config-operator/blob/master/manifests/baremetal/coredns.yaml#L32
[3] https://github.com/openshift/installer/blob/master/pkg/asset/ignition/bootstrap/bootstrap.go#L51

@abhinavdahiya
Contributor

abhinavdahiya commented Dec 5, 2019

@abhinavdahiya @enxebre

I actually think this fits well for the infrastructure object and BareMetalPlatformStatus. See the description here, "Configuration details allowing the cluster to interact with its cloud provider": https://github.com/openshift/openshift-docs/blob/c5b7bdc145d8f79a60062934375117f5a22fa5ca/installing/install_config/customizations.adoc#informational-resources. This should be cluster-wide config which never changes.

Looking at the comments, it doesn't look like it's equivalent to the cloud-provider configuration that is universally used by operators. This information is only used for one service, i.e. the provisioning network for PXEing machines.

And therefore I think it best belongs in a specific operator config.

I'm good with any of the items below and with whatever @abhinavdahiya prefers here but I think my order of preference would be:
instance name: cluster, resource name: infrastructure.config.openshift.io
instance name: cluster, resource name: baremetal.config.openshift.io
instance name: cluster, resource name: baremetal.operator.openshift.io / baremetalconfig.metal3.io

@abhinavdahiya who is installing *.config.openshift.io CRDs today?
If the MAO would need to install baremetal.operator.openshift.io / baremetalconfig.metal3.io, that'd temporarily block instantiation of the object from the installer, and that would temporarily block the deployment of the controllers, which I guess is fine.

Update:
This config might be read by multiple operators, e.g. the MCO #119 (comment), which makes me feel more strongly about going with:
infrastructure.config.openshift.io or baremetal.config.openshift.io

The object will be read by the MAO, and perhaps in future the MCO, to influence the configuration of the provisioning containers in the metal3 pod (started by the MAO), and perhaps the keepalived configuration created by the MCO cc @celebdor

The current use is to set up the provisioning network, and if the team tries to expand it to ANY baremetal configuration, that would be ill-advised.

if we want to move the ironic pod[1] from the installer to the MCO, what will install that CRD such that we can use the data for templating a staticpod definition like in [2] ?

if the ironic is running in baremetalcontroller in machine-api it shouldn't be run as static pod using machine-config-operator on bootstrap-host. MCO is not a generic static pod generator :(

The baremetalcontroller will have to take responsibility to do it on bootstrap-host. and the configuration will be available to the controller/operator from disk.. it doesn't need the CRD installed..

@hardys
Contributor

hardys commented Dec 5, 2019

if the ironic is running in baremetalcontroller in machine-api it shouldn't be run as static pod using machine-config-operator on bootstrap-host. MCO is not a generic static pod generator :(
The baremetalcontroller will have to take responsibility to do it on bootstrap-host. and the configuration will be available to the controller/operator from disk.. it doesn't need the CRD installed..

The MAO and other machine-api components don't run on the bootstrap VM yet? So I don't think we can use the same approach to launch ironic on the bootstrap VM (that's the reason there are two implementations atm).

It sounds like you'd prefer to just render the config directly in the installer (as we do now), then at some future point we can switch to the MAO starting it, when the full machine API stack is used to deploy masters via the bootstrap VM. We could still switch to a staticpod rendered by the installer I guess.

@hardys
Contributor

hardys commented Dec 5, 2019

/cc @stbenjam it seems from feedback above that @abhinavdahiya would prefer to keep rendering of the bootstrap Ironic things inside the installer, not move to the MCO, so we may want to go ahead and break out the install-config changes from openshift/installer#2449 such that we can unblock rendering the config appropriately for ipv6 and external-DHCP use-cases.

@abhinavdahiya
Contributor

It sounds like you'd prefer to just render the config directly in the installer (as we do now), then at some future point we can switch to the MAO starting it, when the full machine API stack is used to deploy masters via the bootstrap VM. We could still switch to a staticpod rendered by the installer I guess.

it seems from feedback above that @abhinavdahiya would prefer to keep rendering of the bootstrap Ironic things inside the installer, not move to the MCO, so we may want to go ahead and break out the install-config changes from openshift/installer#2449 such that we can unblock rendering the config appropriately for ipv6 and external-DHCP use-cases.

My preference is to have the controller/operator that manages the operand in the cluster to manage the service on the bootstrap-host.

@hardys
Contributor

hardys commented Dec 5, 2019

My preference is to have the controller/operator that manages the operand in the cluster to manage the service on the bootstrap-host.

That is the machine-api-operator, it starts the metal3 pod which contains the ironic components used for deploying workers. IIUC the machine-api-operator doesn't run on the bootstrap VM currently, so how would you envisage this working?

@hardys
Contributor

hardys commented Dec 5, 2019

Ok I discussed this with @abhinavdahiya and explained that it's currently the machine-api-operator which starts the cluster-hosted provisioning services, including the baremetal-operator controller.

There was a misunderstanding that the BMO started the ironic services, which currently is not the case, so I don't think that will work for us in the bootstrap case.

To make progress I think we should focus on #119 to enable configuration of the metal3 pod started by the MAO, and the bootstrap VM provisioning services can remain something that's handled inside the installer.

If @sadasu and @abhinavdahiya are in agreement I'd suggest we abandon this PR, focus on landing #119, then rework or replace the API/Installer/MAO PRs already posted. This will fix configuration of the cluster-hosted provisioning services.

The gaps around configuration of the bootstrap-hosted provisioning services will be tracked via the existing installer bug openshift/installer#2091

@sadasu
Contributor Author

sadasu commented Dec 9, 2019

@enxebre Thank you so much for your comments regarding why the BareMetalPlatformStatus would be a good place to add the config. We had pursued this direction during the 4.3 time frame and gave up in favor of creating a new CR.
@abhinavdahiya I am ready to close this PR.

@sdodson
Member

sdodson commented Dec 9, 2019

/close
Any additional feedback should be taken to #119

@openshift-ci-robot

@sdodson: Closed this PR.

In response to this:

/close
Any additional feedback should be taken to #119

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

