Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
PUP 6.11.1
-
None
-
CentOS 7
-
Night's Watch
-
2
-
NW - 2020-02-05, NW - 2020-02-19
-
Needs Assessment
-
Bug Fix
-
Fixed pidfile lock removal for when Puppet Agent is started as a LightWeight Process and is incorrectly terminated on POSIX operating systems.
-
Needs Assessment
Description
Puppet Version: 6.11.1
Puppet Server Version: 6.11.1
OS Name/Version: CentOS 7
When the Puppet agent is terminated abnormally (e.g. killed with the KILL signal), it may fail to detect a stale PID file. The code in question is:
puppet/lib/ruby/vendor_ruby/puppet/util/pidlock.rb
  def clear_if_stale
    begin
      Process.kill(0, lock_pid)
    rescue *errors
      return @lockfile.unlock
    end

    if Puppet.features.posix?
      procname = Puppet::Util::Execution.execute(["ps", "-p", lock_pid, "-o", "comm="]).strip
      args = Puppet::Util::Execution.execute(["ps", "-p", lock_pid, "-o", "args="]).strip
      @lockfile.unlock unless procname =~ /ruby/ && args =~ /puppet/ || procname =~ /puppet(-.*)?$/
    elsif Puppet.features.microsoft_windows?
      # On Windows, we're checking if the filesystem path name of the running
      # process is our vendored ruby:
      exe_path = Puppet::Util::Windows::Process::get_process_image_name_by_pid(lock_pid)
      @lockfile.unlock unless exe_path =~ /\\bin\\ruby.exe$/
    end
  end
Process.kill(0, lock_pid) checks whether a process with the given PID exists. The problem is that Process.kill matches lightweight processes (LWPs, i.e. threads) as well as regular processes, so the check succeeds if either kind currently holds that ID. When it succeeds, Puppet then asks ps for the command name. Unfortunately, ps -p only considers processes, not LWPs. If the stale lock file contains an ID that now belongs to an LWP, ps exits with status 1 and the agent run fails. Because LWPs are usually spawned by long-running daemons, the ID stays in use and Puppet never recovers unless the stale lock file is removed by hand.
Please find attached a patch (tested on CentOS 7) that addresses this issue: it makes the ps command consider LWPs as well, which avoids the error shown below.
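For illustration only, here is a minimal Ruby sketch of the staleness decision described above. It is not the attached patch and not Puppet's actual code. The method name stale_lock? and the probe/lookup parameters are hypothetical, introduced so the logic can be exercised without a real lock file. It assumes that a failed ps -p lookup for an ID that still answers signal 0 means the ID belongs to an LWP, as observed on CentOS 7 in this ticket, and therefore treats the lock as stale.

```ruby
require 'open3'

# Default existence check: raises Errno::ESRCH when neither a process nor an
# LWP currently has this ID.
DEFAULT_PROBE = ->(id) { Process.kill(0, id) }

# Default lookup: asks ps for one output field and reports whether ps succeeded.
DEFAULT_LOOKUP = lambda do |id, field|
  out, _err, status = Open3.capture3('ps', '-p', id.to_s, '-o', "#{field}=")
  [out.strip, status.success?]
end

# Hypothetical helper (not part of Puppet): returns true when a lock file
# recording `pid` should be considered stale.
def stale_lock?(pid, probe: DEFAULT_PROBE, lookup: DEFAULT_LOOKUP)
  begin
    probe.call(pid)
  rescue Errno::ESRCH
    return true # nothing with this ID exists at all
  end

  procname, found = lookup.call(pid, 'comm')
  # Assumption: the ID exists but ps -p cannot find it, so it is an LWP and
  # cannot be the Puppet agent process that wrote the lock.
  return true unless found

  args, = lookup.call(pid, 'args')
  # Same name test as clear_if_stale: keep the lock only for a Puppet process.
  !((procname =~ /ruby/ && args =~ /puppet/) || procname =~ /puppet(-.*)?$/)
end
```

Calling stale_lock?(2181) with the defaults runs the real ps command. Passing custom probe and lookup callables lets the three cases (ID gone, ID held by an LWP, ID held by a live agent) be checked in isolation.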
Desired Behavior:
Puppet agent starts up and runs correctly.
Actual Behavior:
# puppet agent --test
Error: Could not run Puppet configuration client: Execution of 'ps -p 2181 -o comm=' returned 1: