@darkmuggle
Contributor

@darkmuggle darkmuggle commented Mar 24, 2020

Proposal to change RHCOS to support a "fail to a live root", to drop UPI installed platforms into an interactive recovery console.

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: darkmuggle
To complete the pull request process, please assign joelanford
You can assign the PR to them by writing /assign @joelanford in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@darkmuggle
Contributor Author

darkmuggle commented Mar 24, 2020


## Proposal

Rather than failing to the `emergency.target` upon an Ignition Failure.
Member

Remove .

## Proposal

Rather than failing to the `emergency.target` upon an Ignition Failure.
Specific platforms (Qemu and Metal) will have platform-specific configurations
Member

I think this is continuing from previous fragment.

@miabbott
Member

Notes from an internal discussion around the proposal - https://hackmd.io/JUzhpTJaRtCD0LdJNaOsHQ

@cgwalters
Member

> Notes from an internal discussion around the proposal - https://hackmd.io/JUzhpTJaRtCD0LdJNaOsHQ

Awesome, thanks a ton for taking these notes and posting them. I think it's critically important that we continually fight the tendency to discuss and make decisions behind the RHT firewall and instead behave as much as possible as part of a collaborative FOSS project.

A good perspective on this is this blog post about Rust and the "core team".

In particular:

> No New Rationale: decisions must be made only on the basis of rationale already debated in public (to a steady state)

So if we didn't post about it publicly, we shouldn't be making any decisions based on it.

@darkmuggle
Contributor Author

We don't have consensus yet. Another model was proposed that may supersede this idea.
/hold

@openshift-ci-robot added the do-not-merge/hold label Mar 24, 2020

*Using the CoreOS Installer to drive network information* was considered and rejected
since it is specific to "metal" images. With this solution, it's conceivable to use
UPI installations on unsupported platforms such as Azure-like installations.
Member

s/Azure/Hyper-V/

@cgwalters
Member

cgwalters commented Mar 24, 2020

OK, so Dusty's suggestion is basically "Just use the Live ISO", even for e.g. the VMware case. That's cool because we're already on track to ship the Live ISO, so there'd be nothing new at all to ship for "give me something to boot to a console where I can generate a network and Ignition config".

This also resolves concerns raised in the meeting about making Ignition failures explicit etc. because we aren't changing how the non-Live-ISO boots work at all.

The only detail though is we need to handle networking in the no-config case.

@darkmuggle
Contributor Author

I'm not familiar enough with the installer ISO to flesh out @dustymabe's idea. And I'm not sure the notes captured the nuance of it.

@miabbott
Member

I think we can provide this "fail to live" functionality as well as providing the Live ISO to do similar kinds of configuration. One does not appear to preclude the other.

I'd recommend we continue to pursue this proposal, as we have received early feedback from folks in the field that it is desirable.

@cgwalters
Member

It's really simple, for vmware or bare metal:

  • Download pristine `-live.iso`
  • Boot from it (image to USB, add as device in vmware, add via IPMI on metal, etc.)
  • You get auto-logged into a live CoreOS system without network
  • Generate an Ignition config however you like, plus e.g. kernel cmdline arguments for network
  • Run `coreos-installer install -i /path/to/config.ign --firstboot-args <network kargs> /dev/whatever`

All of this exists today and works except for the last issue mentioned in #256 (comment)
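
The last step above can be sketched as a small Python helper that assembles the installer command line. This is a minimal sketch, not part of the proposal: the function name `build_install_command` and the sample values (`ip=dhcp`, `/dev/sda`) are assumptions for illustration, and only the `-i` and `--firstboot-args` flags quoted in the steps are used. It prints the command rather than running it.

```python
import shlex


def build_install_command(ignition_path, firstboot_args, target_device):
    """Assemble the coreos-installer invocation quoted in the steps above.

    Only flags that appear in the thread are used: -i for the Ignition
    config and --firstboot-args for the first-boot kernel arguments.
    """
    return [
        "coreos-installer", "install",
        "-i", ignition_path,
        "--firstboot-args", firstboot_args,
        target_device,
    ]


# Hypothetical values, for illustration only.
cmd = build_install_command("/path/to/config.ign", "ip=dhcp", "/dev/sda")

# Print a copy-pasteable shell line instead of running the installer.
print(shlex.join(cmd))
```

Keeping the arguments as a list and quoting only at print time means a kargs string that contains spaces stays a single argument to `--firstboot-args`.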

What feedback from the field isn't covered by this?

@cgwalters
Member

That said, linking this to the "nmtui" thing - we should ship that as part of the live ISO, and then have support for synthesizing the network config into kernel args - or special support for dropping network config into the initramfs.
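
To make "synthesizing the network config into kernel args" concrete, here is a rough sketch of what such a step might produce for a static IPv4 setup. It assumes the dracut-style `ip=<client>:<peer>:<gateway>:<netmask>:<hostname>:<interface>:none` and `nameserver=` arguments; the helper name `static_network_kargs`, the interface name, and the documentation-range addresses are all hypothetical, and nothing here describes how the Live ISO actually does it.

```python
def static_network_kargs(address, gateway, netmask, hostname, interface,
                         nameservers=()):
    """Render a static IPv4 configuration as dracut-style kernel arguments.

    The peer field (second position in ip=) is left empty, and the trailing
    "none" asks for no autoconfiguration (i.e. a static address).
    """
    ip_arg = f"ip={address}::{gateway}:{netmask}:{hostname}:{interface}:none"
    dns_args = [f"nameserver={ns}" for ns in nameservers]
    return " ".join([ip_arg, *dns_args])


# Hypothetical values from the 192.0.2.0/24 documentation range.
kargs = static_network_kargs(
    address="192.0.2.10",
    gateway="192.0.2.1",
    netmask="255.255.255.0",
    hostname="node0",
    interface="ens3",
    nameservers=["192.0.2.53"],
)
print(kargs)
```

The resulting string is the kind of value that could be passed as the `<network kargs>` placeholder in the installer command discussed earlier in the thread.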

@darkmuggle
Contributor Author

In follow-up discussions, this idea has been superseded by an installer-based path that would handle injecting the network configuration.

@darkmuggle closed this Mar 25, 2020

@miabbott
Member

> It's really simple, for vmware or bare metal:
>
> * Download pristine `-live.iso`
> * Boot from it (image to USB, add as device in vmware, add via IPMI on metal, etc.)
> * You get auto-logged into a live CoreOS system without network
> * Generate an Ignition config however you like, plus e.g. kernel cmdline arguments for network
> * Run `coreos-installer install -i /path/to/config.ign --firstboot-args <network kargs> /dev/whatever`
>
> All of this exists today and works except for the last issue mentioned in #256 (comment)

Do we need a separate enhancement to cover the behavior change of dropping to a live system?

@cgwalters
Member

> Do we need a separate enhancement to cover the behavior change of dropping to a live system?

That's just it - there isn't a behavior change, it's what the Live ISO you can download from e.g. https://getfedora.org/coreos/ does today (and will for RHCOS too).

@miabbott
Member

> That's just it - there isn't a behavior change, it's what the Live ISO you can download from e.g. https://getfedora.org/coreos/ does today (and will for RHCOS too).

So any enhancement required would really be #210

@darkmuggle deleted the pr/rhcos-fail-live branch March 25, 2020 18:18

@darkmuggle
Contributor Author

@cgwalters
Member

> So any enhancement required would really be #210

Well... the basics are that, yes. It avoids the "catch the grub prompt" step required in this blog.

We could call it done there. But I think there's a lot more we could do beyond solving the grub prompt problem, including more easily generating those kargs, and dealing with fetching the base Ignition configs.

And all that stuff could be a separate enhancement or we could append it to the existing one.

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command.

5 participants