Stale Ansible SSH control master failures in securedrop-admin install #4364

@rmol

Description

When running securedrop-admin install, tasks that run immediately after the servers reboot can fail with errors like Timeout (62s) waiting for privilege escalating prompt, because the SSH control master connections that Ansible opened before the reboot are now stale.

This might be related to #4358, though it's not clear whether all of the SSH failures happen after reboots.

Steps to Reproduce

I was able to reliably induce this problem with a clean install of the 0.12.2 RC. It fails at the task Set sysctl flags for grsecurity, which runs immediately after a reboot.

Expected Behavior

The Ansible playbook should still be able to connect to the servers after they have been rebooted.

Actual Behavior

Instead, the playbook fails because it tries to reuse the stale SSH control masters.

Comments

An effective fix is already used in restart-tor-carefully.yml: the Ansible control path directory is simply removed, preventing communication with the old master processes.
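
The idea of removing the control path directory can be sketched in Python as below. This is only an illustration, not the contents of restart-tor-carefully.yml: the function name remove_control_path_dir is hypothetical, and the default location ~/.ansible/cp is an assumption based on Ansible's default control_path_dir setting, which an ansible.cfg may override.

```python
import shutil
from pathlib import Path

# Assumption: Ansible's default control path directory. If ansible.cfg sets
# control_path_dir to something else, pass that path in explicitly.
DEFAULT_CONTROL_PATH_DIR = Path.home() / ".ansible" / "cp"


def remove_control_path_dir(control_path_dir: Path = DEFAULT_CONTROL_PATH_DIR) -> bool:
    """Delete the directory that holds the SSH control master sockets.

    Returns True if a directory was removed, False if there was nothing to remove.
    With the sockets gone, later SSH connections cannot attach to the old
    master processes and have to open fresh connections instead.
    """
    if not control_path_dir.exists():
        return False
    shutil.rmtree(control_path_dir)
    return True
```

Calling remove_control_path_dir() right after a reboot step, before the next task connects, would have the same effect as the existing workaround: the old master processes are not killed, but nothing can reach them through their sockets any more.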
