PUP-10946: Recursive file resources generate dangerous numbers of resource events

    Details

    • Template:
      PUP Bug Template
    • Team:
      Night's Watch
    • Story Points:
      5
    • Sprint:
      NW - 2021-03-31, NW - 2021-04-14, NW-2021-04-28, NW-2021-05-19
    • Method Found:
      Customer Feedback
    • CS Priority:
      Major
    • Zendesk Ticket IDs:
      35285,35946,36746,36888,37704,39044,40977,41896,41998,42904
    • Zendesk Ticket Count:
      10
    • CS Rank:
      7,580
    • Release Notes:
      Enhancement
    • Release Notes Summary:
      By default, the file and tidy resource types log a warning on the console and in the report if Puppet tries to manage more than 1000 files when the "recurse" parameter is true. In that situation, it is better to manage large numbers of files with a package or an exec command, respectively.

      In addition, the file and tidy resource types support a new "max_files" parameter that enforces a hard limit. If the number of files found by recursion exceeds the limit, the agent run fails.
    • QA Risk Assessment:
      Needs Assessment
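As an illustration of the "max_files" parameter described in the release notes summary above, the following sketch writes a small manifest that sets a hard limit on a recursive file resource. The manifest path and the limit of 5000 are arbitrary example values, not values taken from this ticket. The syntax check runs only if the puppet CLI happens to be installed.

```shell
# Hypothetical scratch location for the example manifest.
manifest="${TMPDIR:-/tmp}/max_files_example.pp"

cat <<'EOF' >"$manifest"
# Fail the agent run if recursion finds more files than the limit.
# 5000 is an arbitrary example value.
file { '/tmp/recursion_test':
  ensure    => directory,
  recurse   => true,
  purge     => true,
  max_files => 5000,
}
EOF

# Syntax-check the manifest when the puppet CLI is available.
if command -v puppet >/dev/null 2>&1; then
  puppet parser validate "$manifest"
fi
```

The same parameter is described as being available on the tidy resource type.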

      Description

      The file and tidy resource types have recursion options that allow the management of entire directory trees. This is accomplished by walking the tree and generating a new file resource that manages the state of each entry separately. This approach works well for small numbers of files (hundreds) but generates an excessive number of events when applied to directory trees that contain large numbers of files. At high file counts, the behavior degrades Puppet Agent performance and can cause Puppet Server outages by generating very large reports. The behavior also combines poorly with noop => true, which causes the large number of events to be reported on every Puppet run instead of once when the state is enforced.
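Because each entry under a recursed directory becomes its own file resource, counting the entries in a tree gives a rough idea of how many resources a run could generate. The sketch below is one way to do that before pointing a recursive resource at a directory. The function names are hypothetical, the default threshold of 1000 mirrors the warning limit mentioned in the release notes, and the count may not match exactly what Puppet itself counts. It assumes a find that supports -mindepth (GNU or BSD) and file names without embedded newlines.

```shell
# count_tree_entries DIR
# Print the number of files and directories below DIR (DIR itself excluded).
count_tree_entries() {
  find "$1" -mindepth 1 | wc -l | tr -d ' '
}

# check_recursion_size DIR [LIMIT]
# Return non-zero and warn on stderr if DIR holds more than LIMIT entries.
check_recursion_size() {
  local dir=$1
  local limit=${2:-1000}
  local count
  count=$(count_tree_entries "$dir")
  if [ "$count" -gt "$limit" ]; then
    echo "WARNING: $dir contains $count entries (limit $limit)" >&2
    return 1
  fi
  echo "$dir contains $count entries"
}
```

For example, `check_recursion_size /tmp/recursion_test` could be run against the test tree built in the reproduction case below.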

      Reproduction Case

      • Install Puppet Server 7 on a CentOS 7 node:

      yum install -y http://yum.puppetlabs.com/puppet7-release-el-7.noarch.rpm
      yum install -y puppetserver
      source /etc/profile.d/puppet-agent.sh
       
      puppet config set server $(hostname -f)
      puppetserver ca setup
       
      systemctl start puppetserver
      

      • Install PuppetDB 7 and configure it as a report processor:

      puppet module install puppetlabs-puppetdb
       
      puppet apply <<'EOF'
      class { 'puppetdb':
        postgres_version => '11',
      }
       
      class { 'puppetdb::master::config':
        enable_reports          => true,
        manage_report_processor => true,
      }
      EOF
      

      • Next, configure Puppet Server to use one JRuby instance with the default allocation of 512 MB of RAM:

      puppet module install puppetlabs-hocon
       
      puppet apply <<'EOF'
      service { 'puppetserver':
        ensure => running,
      }
       
      ini_subsetting {
        default:
          ensure            => present,
          path              => '/etc/sysconfig/puppetserver',
          section           => '',
          key_val_separator => '=',
          setting           => 'JAVA_ARGS',
          notify            => Service['puppetserver'],
        ;
        'puppetserver min ram':
          subsetting => '-Xms',
          value      => '512m',
        ;
        'puppetserver max ram':
          subsetting => '-Xmx',
          value      => '512m',
        ;
      }
       
      file_line { 'fix EZ-142':
        ensure => present,
        path   => '/opt/puppetlabs/server/apps/puppetserver/cli/apps/start',
        line   => "out_of_memory_flag='-XX:OnOutOfMemoryError=kill -9 %p'",
        match  => '^out_of_memory_flag=',
        notify => Service['puppetserver'],
      }
       
      hocon_setting { 'puppetserver jruby instances':
        ensure  => present,
        path    => '/etc/puppetlabs/puppetserver/conf.d/puppetserver.conf',
        setting => 'jruby-puppet.max-active-instances',
        value   => 1,
        notify  => Service['puppetserver'],
      }
      EOF
      

      • Generate certificates for a test node and configure it to recursively purge a deep directory tree:

      curl -L https://raw.githubusercontent.com/LLNL/fdtree/master/fdtree.bash -o /usr/local/bin/fdtree
      mkdir -p /tmp/recursion_test
       
      # Create, 1 level, 19 directories per level, 999 files per directory, 0 bytes per file
      bash /usr/local/bin/fdtree -C -l 1 -d 19 -f 999 -s 0 -o /tmp/recursion_test
       
      puppetserver ca generate --certname recursion.test
       
      cat <<'EOF' >>/etc/puppetlabs/code/environments/production/manifests/site.pp
      node 'recursion.test' {
        file { '/tmp/recursion_test':
          ensure  => directory,
          recurse => true,
          purge   => true,
          noop    => true,
        }
      }
      EOF
      

      • Run `puppet agent` to enforce the resource and submit a report (this will take about 5 minutes):

      # Direct output to /dev/null to avoid spamming the console
      puppet agent -t --certname recursion.test &>/dev/null
      

      • Check the puppetserver service to see if it survived.

      Outcome

      When faced with the prospect of processing such a large quantity of events, puppetserver chose to reincarnate:

      # journalctl SYSLOG_IDENTIFIER=puppetserver + UNIT=puppetserver.service
      -- Logs begin at Tue 2021-03-02 23:07:14 UTC, end at Tue 2021-03-02 23:31:35 UTC. --
      Mar 02 23:11:23 wine-appointee systemd[1]: Starting puppetserver Service...
      Mar 02 23:11:41 wine-appointee systemd[1]: Started puppetserver Service.
      Mar 02 23:12:43 wine-appointee systemd[1]: Stopping puppetserver Service...
      Mar 02 23:12:44 wine-appointee systemd[1]: Starting puppetserver Service...
      Mar 02 23:13:02 wine-appointee systemd[1]: Started puppetserver Service.
      Mar 02 23:26:54 wine-appointee systemd[1]: Stopping puppetserver Service...
      Mar 02 23:26:55 wine-appointee systemd[1]: Starting puppetserver Service...
      Mar 02 23:27:13 wine-appointee systemd[1]: Started puppetserver Service.
      Mar 02 23:31:35 wine-appointee puppetserver[2930]: #
      Mar 02 23:31:35 wine-appointee puppetserver[2930]: # java.lang.OutOfMemoryError: GC overhead limit exceeded
      Mar 02 23:31:35 wine-appointee puppetserver[2930]: # -XX:OnOutOfMemoryError="kill -9 %p"
      Mar 02 23:31:35 wine-appointee puppetserver[2930]: #   Executing /bin/sh -c "kill -9 2955"...
      Mar 02 23:31:35 wine-appointee systemd[1]: puppetserver.service: main process exited, code=killed, status=9/KILL
      Mar 02 23:31:35 wine-appointee systemd[1]: Unit puppetserver.service entered failed state.
      Mar 02 23:31:35 wine-appointee systemd[1]: puppetserver.service failed.
      Mar 02 23:31:35 wine-appointee systemd[1]: puppetserver.service holdoff time over, scheduling restart.
      Mar 02 23:31:35 wine-appointee systemd[1]: Starting puppetserver Service...
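A log like the one above can also be scanned mechanically rather than by eye. The following sketch defines two small helpers that read journal text on standard input and count out-of-memory reports and service start attempts. The function names are hypothetical, and the match strings are taken from the log lines shown in this ticket, so they may need adjusting for other systemd or Puppet Server versions.

```shell
# count_oom_events
# Read journal text on stdin; print how many lines report a Java OOM error.
count_oom_events() {
  grep -cF 'java.lang.OutOfMemoryError' || true
}

# count_service_starts
# Read journal text on stdin; print how many times systemd began starting
# the puppetserver service.
count_service_starts() {
  grep -cF 'Starting puppetserver Service' || true
}
```

On a host with systemd, the helpers could be fed with `journalctl -u puppetserver --no-pager | count_oom_events`.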
      

      Expected Outcome

      File recursion should not be able to cause a denial of service (DoS) of the puppetserver service.

              People

              Assignee:
              ciprian.badescu Ciprian Badescu
              Reporter:
              chuck Charlie Sharpsteen