-
Notifications
You must be signed in to change notification settings - Fork 525
Providing background for Baremetal IPI based enhancements #102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| @@ -0,0 +1,220 @@ | |||
| --- | |||
| title: Neat-Enhancement-Idea | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a neat idea, but I think you want the actual doc title here? ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you say so, @dhellmann :-). Needed to check if it had to be a hyphenated string and didn't get back to it.
|
|
||
| ## Alternatives | ||
|
|
||
| Similar to the `Drawbacks` section the `Alternatives` section is used to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One alternative would be to deliver metal3 on top of the installed cluster, except that doesn't help us install the cluster in the first place. That's why the integration with the installer is there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The installer tries to address this sort of thing today with its bootstrap node:
OpenShift breaks this dependency loop using a temporary bootstrap machine. This bootstrap machine is booted with a concrete Ignition Config which describes how to create the cluster. This machine acts as a temporary control plane whose sole purpose is launching the rest of the cluster.
The full bootstrap model hasn't been fully actualized: he installer still creates the master node resources rather than bootstrapping through the cluster-api like workers are. It might be worth looking into helping that effort along if it would solve some of your use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Baremetal IPI is already integrated with the installer and the bootstrap node/terraform:
https://github.com/openshift/installer/blob/master/docs/user/metal/install_ipi.md
But yes it would be interesting if we could help move away from terraform and towards using the metal3 operator on the bootstrap node.
|
This looks like a cool idea. Are we talking about setting up RHCOS nodes only? RHEL only? Both? Would existing items such as |
The current implementation relies on RHCOS. There's no particular reason we couldn't make it work with RHEL, but that hasn't been added to any roadmap. The OS image is written to disk using the Ironic provisioning tool, rather than running an installer. |
imain
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be a good reference, thanks Sandhya.
| run on all baremetal servers to monitor the hardware inventory on each | ||
| of these servers. | ||
|
|
||
| ### User Stories [optional] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder id we shouldn't describe how a new node gets provisioned from start to finish in here somewhere to really explain the idea in full?
a11002f to
cf6cc6c
Compare
| a functioning cluster starting with a set of baremetal servers. As | ||
| mentioned earlier, these enhancements rely on the Baremetal Operator (BMO) | ||
| [1] running within the "metal3" pod to manage baremetal hosts. The BMO in | ||
| turn relies on the Ironic service [3] to manage and provision baremetal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to format the user stories as "As a role I want to feature so I... "
|
this is a very nice write up! thanks @sadasu |
cf6cc6c to
34f0055
Compare
|
|
||
| Control Plane Deployment | ||
|
|
||
| 1. A minimin Baremetal IPI deployment consists of 4 hosts, one to be used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc needs to take into account or reference the future path for installer - removing bootstrap nodes, configuring a full master on the first instance, replacing the hardcoded load balancers with service load balancers for the apiserver, etc. we don’t want to have a doc that embeds point in time context, or fails to move in the direction of the platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Will reference other enhancement docs from this one. The plan is to keep this doc alive with updates whenever new features are being worked on. Also, as new enhancement requests are written for new features affecting the baremetal platform, the idea is that these new enhancements will point back to this for context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sadasu I see clayton's request as outstanding. The load balancers in particular. How are they being created? By which actor? When?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The load balancer handling is documented here: https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md
This would be a good reference to link from this doc
| to both the provisioning and external networks. | ||
|
|
||
| 2. Installation can be kicked off by downloading and running | ||
| "openshift-baremetal-install". This image differs from the "openshift-install" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a deficiency - how are you planning to fix this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason we're not part of the main openshift-install binary is our dependency on libvirt libraries. One proposal we discussed previously is to use https://github.com/digitalocean/go-libvirt which just hits the RPC directly, but we'd have to spend the time and rewrite the libvirt terraform provider.
This isn't on our radar to fix in the near term. There's a spike on the installer team to investigate removing the bootstrap host which would make this easier.
Do you want this documented here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smarterclayton, as mentioned by @stbenjam removing the bootstrap node would remove our dependency on libvirt. That way baremetal IPI installs can be part of the Openshift installer. This is part of our road map. I have updated the document to reflect that.
I think this "enhancement" is essentially provides a background that reviewers of baremetal IPI related code patches need to get context for what they are reviewing. We will make sure to keep this document alive. For these reasons, I think it would be helpful to merge this document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where and how is this separate installer maintained?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a separate installer binary, it is part of the same repo.
It is distributed as part of the baremetal-installer image (like libvirt-installer is distributed as part of libvirt-installer). It's extractable using oc by extracting the openshift-baremetal-install command.
@sadasu You may want to include this information here.
This enhancement request is intented to provide context for all the work that is in progress for BareMetal IPI deployments and a backdrop for all the future enhancement requests in this area.
7066df8 to
63167f7
Compare
|
|
||
| ### Goals | ||
|
|
||
| The goal of this enhancement request is to provide context for all the changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this doc should answer
- comparison of metal3 to the AWS cloud provider.
- components outside the cluster that have to be provisioned outside of a kube control plane, that will be used from inside the kube control plane. (High level with links is fine).
- new components inside the cluster that will speak to the outside metal3 components. (High level with links is fine)
- how internal apiserver LB, external apiserver LB, and ingresscontroller LBs will be managed an provisioned. This needs to be pretty explicit
- the network section from https://github.com/openshift/installer/blob/master/docs/user/metal/install_ipi.md#network-requirements with clear labeling about what, if anything, is managed on the cluster
- top level item for the bootstrap host that the installer will use.
- what information does a cluster-admin provide to the installer about metal3
- explanation of bring-up when all machines in the cluster are stopped and masters are started back up.
I may edit this comment to add more items as I have questions.
| configuring baremetal servers to be part of the cluster. The traffic on the | ||
| provisioning network needs to be isolated from the traffic on the external | ||
| network (hence 2 seperate networks.). The external network is used to carry | ||
| cluster traffic which which includes cluster control plane traffic, application |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lift the network section from the install doc. It's nice and crisp. Make a section for it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| and then baremetal IPI installs can be part of the normal Openshift installer. | ||
| This is in the roadmap for this work and being investigated. | ||
|
|
||
| 3. The installer starts a bootstrap VM on the provisioning host. With other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when you describe what a provisioning host is (that's missing at the moment), indicate what needs to be installed to make this VM startable.
| the network interface on the provisioning host that is connected to the | ||
| provisioning network needs to be provided to the installer. | ||
|
|
||
| 5. The bootstrap VM must be configured with a special well-known IP within the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well-known like "always the same" or well-known like the cluster-admin informs the installer?
| 5. The bootstrap VM must be configured with a special well-known IP within the | ||
| provisioning network that needs to provided as input to the installer. | ||
|
|
||
| 6. The installer user Ironic in the bootstrap VM to provision each host that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when you describe components, list ironic. Is this running a static pod, is it a daemon on the bootstrap VM, something else? Does this mean the bootstrap VM has to live past installation?
| that configures each host to boot over the provisioning network using DHCP | ||
| and PXE. | ||
|
|
||
| 7. The bootstrap VM runs a DHCP server and responds with network infomation and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dhcp server to provide IPs for which network. How long does this DHCP server need to exist? is it user configurable (IP allocation is often dictated).
russellb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see some suggestions for things to be improved, and some content that is now out of date, but I think we should merge this and continue to improve it with follow up PRs. It's best to get this in so we can start updating it since this support has already merged throughout OpenShift.
|
|
||
| Control Plane Deployment | ||
|
|
||
| 1. A minimin Baremetal IPI deployment consists of 4 hosts, one to be used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The load balancer handling is documented here: https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md
This would be a good reference to link from this doc
|
|
||
| ### Graduation Criteria | ||
|
|
||
| Metal3 integration is in tech preview in 4.2 and is targetted for GA in 4.4. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should update the release target to GA in 4.6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also fix the "targetted" typo
| Metal3 integration is in tech preview in 4.2 and is targetted for GA in 4.4. | |
| Metal3 integration is in tech preview in 4.2 and is targeted for GA in 4.6. |
|
|
||
| ### Upgrade / Downgrade Strategy | ||
|
|
||
| Metal3 integration is in tech preview n 4.2 and missing key pieces that allows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Metal3 integration is in tech preview n 4.2 and missing key pieces that allows | |
| Metal3 integration is in tech preview in 4.2 and missing key pieces that allows |
|
|
||
| Baremetal IPI deployments enable OpenShift to enroll baremetal servers to become | ||
| Nodes that can run K8s workloads. | ||
| The Baremetal Operator [1] along with other provisioning services (Ironic and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this syntax works for links. I've suggested an alternative that should work.
| The Baremetal Operator [1] along with other provisioning services (Ironic and | |
| The [Baremetal Operator](1) along with other provisioning services (Ironic and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there are several instances of this in the doc
|
Here's what I'm going to do to move forward with this enhancement:
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: russellb, sadasu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Add some text throughout the enhancement to clarifying things in response to some questions in the review of openshift#102
baremetal: Address comments on #102
This enhancement request is intented to provide context for
all the work that is in progress for BareMetal IPI deployments
and a backdrop for all the future enhancement requests in this
area.