@Squiddim Squiddim commented Jan 27, 2026

This PR aims to fix #6219 by swapping the restart policy of the sssd systemd unit back to on-failure.

The change to Restart=on-abnormal was introduced in commit a049ac7 to fix an infinite restart loop on system boot caused by misconfiguration (see #5753 / https://bugzilla.suse.com/show_bug.cgi?id=1188999).

While this fixed their issue, I'd argue that the StartLimit settings introduced alongside it should accomplish the goal of avoiding an infinite loop in most scenarios:

     StartLimitIntervalSec=50s
     StartLimitBurst=5

I believe that the issue fixed by setting the restart policy back to on-failure (#6219) takes precedence over the edge cases where sssd could inhibit system boot when sssd or its subsystems are too slow to trigger the StartLimit.
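To make the "too slow to trigger the StartLimit" edge case concrete, here is a rough sketch of how a start limit with the values above behaves. It is a simplified fixed-window model written for illustration, not systemd's actual implementation, and `starts_until_limit` and `time_per_attempt` (seconds from one start attempt to the next, i.e. time to fail plus any restart delay) are hypothetical names.

```python
def starts_until_limit(time_per_attempt, interval=50.0, burst=5, max_attempts=1000):
    """Simplified model of StartLimitIntervalSec= / StartLimitBurst=.

    Returns how many starts are permitted before the limit refuses a start,
    or None if the limit is never hit within max_attempts.
    Assumption: a fixed window that resets once `interval` seconds have
    passed since the window began.
    """
    window_begin = None
    count = 0
    now = 0.0
    for attempt in range(1, max_attempts + 1):
        # Open a new window on the first start or once the old one expired.
        if window_begin is None or now - window_begin > interval:
            window_begin = now
            count = 0
        # The window already holds `burst` starts: this start is refused.
        if count >= burst:
            return attempt - 1
        count += 1
        # The service fails and is restarted `time_per_attempt` seconds later.
        now += time_per_attempt
    return None


if __name__ == "__main__":
    print(starts_until_limit(1.0))   # fast failures: the limit stops the loop
    print(starts_until_limit(15.0))  # slow failures: the limit is never reached
```

Under this model, a unit that fails and restarts every second is stopped after five starts, while one that needs 15 seconds per attempt never accumulates five starts inside a 50 second window and keeps restarting. That is the edge case referred to above.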

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request changes the systemd Restart policy for sssd from on-abnormal to on-failure. This aims to address issue #6219 by allowing sssd to restart in more failure scenarios, including watchdog kills. The author acknowledges that on-abnormal was previously introduced to prevent infinite restart loops due to misconfiguration, and argues that existing StartLimit directives should mitigate this risk. However, there's a potential for the original issue to re-emerge in certain edge cases if the StartLimit configuration is not fully comprehensive.
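For reference, the difference between the two policies can be sketched as a small lookup. The exit-cause sets below follow the Restart= table in the systemd.service documentation as I understand it; the names `RESTART_TABLE` and `restarts` are made up for this illustration. Note that the "watchdog" row refers to systemd's own WatchdogSec= mechanism, and this thread does not state how the watchdog kill from #6219 surfaces to systemd.

```python
# Exit causes that lead to a restart under each policy (only the two
# policies discussed in this PR are listed).
RESTART_TABLE = {
    "on-failure": {"unclean-exit-code", "unclean-signal", "timeout", "watchdog"},
    "on-abnormal": {"unclean-signal", "timeout", "watchdog"},
}


def restarts(policy, exit_cause):
    """Return True if `policy` restarts the service after `exit_cause`."""
    return exit_cause in RESTART_TABLE[policy]


if __name__ == "__main__":
    # Causes that restart under on-failure but not under on-abnormal.
    print(RESTART_TABLE["on-failure"] - RESTART_TABLE["on-abnormal"])
```

In this model the only case where the policies differ is a non-zero (unclean) exit code: on-failure restarts the service, on-abnormal does not. That covers both the misconfiguration exit from #5753 and any failure path that ends in a plain non-zero exit.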

 Type=notify
 NotifyAccess=main
-Restart=on-abnormal
+Restart=on-failure

high

The change from Restart=on-abnormal to Restart=on-failure reintroduces the risk of an infinite restart loop on system boot due to misconfiguration, which on-abnormal was specifically intended to prevent (as noted in the PR description regarding #5753). While StartLimitIntervalSec and StartLimitBurst are in place, the description mentions "most scenarios" and "edge cases where sssd could inhibit system boot". This suggests that these limits might not fully mitigate the original problem in all situations. Re-evaluating the StartLimit values or considering a more robust mechanism to prevent boot inhibition under misconfiguration is crucial for system stability.

Development

Successfully merging this pull request may close these issues.

sssd don't restart properly after being killed by watchdog
