Puppet / PUP-5422

Daemonized agent's pidfile never removed if stopped while waiting for a certificate

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Normal
    • Resolution: Fixed
    • Affects Version/s: PUP 4.2.2
    • Fix Version/s: PUP 4.3.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      Solaris

    • Story Points:
      2
    • Sprint:
      Client 2015-11-11
    • Release Notes:
      Bug Fix
    • Release Notes Summary:
      If the daemonized agent was waiting for a certificate to be issued and the process received a termination signal such as SIGTERM or SIGINT, the agent exited ungracefully and left its pidfile behind. The agent now exits gracefully and deletes its pidfile.

      Description

      On Solaris, starting the puppet service with puppet resource service puppet ensure=running creates /var/run/puppetlabs/agent.pid containing the PID of the daemonized process. Stopping the service kills the process as expected but leaves the pidfile in place with the stale PID.

      -bash-3.2# puppet resource service puppet ensure=running
      Notice: /Service[puppet]/ensure: ensure changed 'stopped' to 'running'
      service { 'puppet':
        ensure => 'running',
      }
       
      -bash-3.2# puppet resource service puppet
      service { 'puppet':
        ensure => 'running',
        enable => 'true',
      }
       
      -bash-3.2# cat /var/run/puppetlabs/agent.pid
      1343
       
      -bash-3.2# ps -elf | grep 1343
       0 S     root  1343     1   0  40 20        ?   9011        ? 11:07:40 ?           0:00 /opt/puppetlabs/puppet/bin/ruby /op
       
      # Stopping the service
      -bash-3.2# puppet resource service puppet ensure=stopped
      Notice: /Service[puppet]/ensure: ensure changed 'running' to 'stopped'
      service { 'puppet':
        ensure => 'stopped',
      }
       
      -bash-3.2# puppet resource service puppet
      service { 'puppet':
        ensure => 'stopped',
        enable => 'false',
      }
       
      -bash-3.2# cat /var/run/puppetlabs/agent.pid
      1343
       
      -bash-3.2# ps -elf | grep puppet
       
      -bash-3.2# svcs puppet
      STATE          STIME    FMRI
      disabled       11:08:31 svc:/network/puppet:default
      

      Note: this affects only agents that don't have a signed certificate, and only on Solaris. The Solaris service start script (/lib/svc/method/puppet) runs just exec /opt/puppetlabs/bin/puppet agent, whereas other platforms pass the --no-daemonize option. Other platforms therefore have no pidfile to begin with, since the agent is never daemonized there.

      The certificate aspect arises because, when we can't find a cert, we loop waiting for it in Puppet::SSL::Host.wait_for_cert and eventually receive the stop signal. Something prevents Puppet::Daemon.stop from ever being called, so the pidfile is left behind.
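
To illustrate the failure mode described above, here is a minimal Ruby sketch of a process that polls for a certificate while holding a pidfile. It is not Puppet's code: WaitingDaemon, request_stop, and install_signal_handlers are hypothetical names invented for this example, and the actual fix in Puppet may look quite different. The point is only that cleanup placed in an ensure clause runs on every exit path from the wait loop, including a stop requested by a signal.

```ruby
require "tmpdir"

# Hypothetical stand-in for a daemon that polls for a certificate.
# None of these names come from Puppet itself.
class WaitingDaemon
  def initialize(pidfile, poll_interval: 0.01)
    @pidfile = pidfile
    @poll_interval = poll_interval
    @stop_requested = false
  end

  # Signal handlers only set a flag; cleanup happens in the main flow.
  def install_signal_handlers
    %w[TERM INT].each do |sig|
      Signal.trap(sig) { @stop_requested = true }
    end
  end

  def request_stop
    @stop_requested = true
  end

  # Polls until the block reports that a certificate is available or a stop
  # has been requested. The ensure clause removes the pidfile on every exit
  # path, including the early return and any exception raised by the block.
  def run
    File.write(@pidfile, Process.pid.to_s)
    until @stop_requested
      return :got_cert if yield
      sleep @poll_interval
    end
    :stopped
  ensure
    File.delete(@pidfile) if File.exist?(@pidfile)
  end
end
```

A caller would construct the object with a pidfile path, call install_signal_handlers, and then call run with a block that returns true once a certificate is available. If the loop is interrupted before that, the pidfile is still removed.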

      This fix will affect all daemonization code paths (master [--daemonize], master --no-daemonize, agent --onetime, agent --no-daemonize, agent [--daemonize]); we should verify all those modes still work, and when running as a service that shutdown cleans up correctly. See reproduction steps by William Hopper below.

              People

              Assignee:
              Unassigned
              Reporter:
              whopper William Hopper
              Votes:
              0
              Watchers:
              3
