Puppet / PUP-8663

http_keepalive_timeout setting doesn't work correctly under Ruby 2.x

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Normal
    • Resolution: Fixed
    • Affects Version/s: PUP 4.10.10, PUP 5.5.0
    • Fix Version/s: PUP 4.10.12, PUP 5.3.7, PUP 5.5.2
    • Component/s: None
    • Labels:
      None
    • Template:
      PUP Bug Template
    • Epic Link:
    • Team:
      Coremunity
    • Sprint:
      Platform Core KANBAN
    • Method Found:
      Inspection
    • CS Priority:
      Major
    • CS Frequency:
      5 - >90% of Customers
    • CS Severity:
      2 - Annoyance
    • CS Business Value:
      4 - $$$$$
    • CS Impact:
      Connection reuse was one of the gains from introducing puppetserver. This bug would seem to indicate that in some situations connections are not being reused because the Ruby agent times out too soon.

      Support has been seeing mysteriously high numbers of connections, which are believed to be related to this issue. We're unsure whether this also impacts intermediary network devices in situations where a large number of agents are no longer reusing connections.
    • Release Notes:
      Bug Fix
    • Release Notes Summary:
      Puppet on Ruby 2.0 or greater would close and reopen HTTP connections that were idle for more than 2 seconds, causing increased load on puppetservers. This change ensures the agent always uses our `http_keepalive_timeout` setting when determining when to close idle connections.
    • QA Risk Assessment:
      Needs Assessment

      Description

      Back when the http_keepalive_timeout setting was introduced, Puppet ran under Ruby 1.9.3 and life was good:

      https://github.com/puppetlabs/puppet/blob/5.5.0/lib/puppet/network/http/pool.rb

      Then Ruby 2.x descended from the mountains and brought with it a Net::HTTP#keep_alive_timeout:

      https://github.com/ruby/ruby/blob/v2_4_3/lib/net/http.rb#L664

      Ruby's timeout defaults to 2 seconds and always overrides the timeout used by the Puppet agent. As a result, HTTP connections are closed early instead of being re-used, which increases the TLS handshake load on Puppet Server as well as the amount of connection state that network devices between the agent and server have to contend with.
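
      The Ruby-side default described above can be checked without opening a connection. The following is a minimal sketch, not the actual Puppet patch: the host name, the helper method names, and the 15-second value are assumptions chosen for illustration, and only `Net::HTTP.new` and the `keep_alive_timeout` accessor come from Ruby's standard library.

```ruby
require 'net/http'

# Hypothetical stand-in for Puppet's http_keepalive_timeout setting, in seconds.
PUPPET_KEEPALIVE_TIMEOUT = 15

# Returns the keep-alive timeout Ruby assigns to a fresh Net::HTTP object.
# Net::HTTP.new only builds the object; no connection is opened.
def ruby_default_keep_alive_timeout
  Net::HTTP.new('puppet.example.com', 8140).keep_alive_timeout
end

# Builds a Net::HTTP object whose idle timeout follows the caller's value
# instead of Ruby's built-in default.
def build_connection(host, port, keepalive_seconds)
  http = Net::HTTP.new(host, port)
  http.keep_alive_timeout = keepalive_seconds
  http
end

puts "Ruby default keep_alive_timeout: #{ruby_default_keep_alive_timeout}s"

http = build_connection('puppet.example.com', 8140, PUPPET_KEEPALIVE_TIMEOUT)
puts "After override: #{http.keep_alive_timeout}s"
```

      Unless the caller sets `keep_alive_timeout` explicitly, Net::HTTP drops any connection that has been idle longer than its own default, whatever timeout the caller tracks elsewhere.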

      Reproduction Case

      • Install puppet-agent 5.5.0 and puppetserver 5.3.0 on a CentOS 7
        node:

      rpm -Uvh http://yum.puppetlabs.com/puppet5/puppet5-release-el-7.noarch.rpm
      yum install -y puppet-agent-5.5.0 puppetserver-5.3.0
      

      • Configure the puppet agent to connect to the local host and start
        puppetserver:

      /opt/puppetlabs/bin/puppet config set --section main server $(hostname -f)
      systemctl start puppetserver
      

      • Run puppet agent -t to verify things work.
      • Add a custom fact that will sleep for 4 seconds, which delays the
        HTTP POST request for a catalog:

      mkdir -p /etc/puppetlabs/code/modules/testmod/lib/facter
      cat <<-'EOF' > /etc/puppetlabs/code/modules/testmod/lib/facter/sleep_fact.rb
      Facter.add(:sleep_fact) do
        setcode do
          $stdout.puts("Sleeping for 4 seconds...")
          Kernel.sleep(4)
          nil
        end
      end
      EOF
      

      • Set Puppet's http_keepalive_timeout to 15 seconds:

      /opt/puppetlabs/bin/puppet config set --section main http_keepalive_timeout 15s
      

      • Run puppet agent -t and track HTTP connections:

      /opt/puppetlabs/bin/puppet agent -t --http_debug 2>&1|grep -v '^->\|<-'
      

      Outcome

      The Puppet agent opens a new TCP connection after loading facts, despite
      http_keepalive_timeout being set to 15 seconds:

      # /opt/puppetlabs/bin/puppet agent -t --http_debug 2>&1|grep -v '^->\|<-'
      opening connection to m4t83ynafhampi9.delivery.puppetlabs.net:8140...
      opened
      starting SSL for m4t83ynafhampi9.delivery.puppetlabs.net:8140...
      SSL established
      reading 3339 bytes...
      read 3339 bytes
      Conn keep-alive
      Info: Using configured environment 'production'
      Info: Retrieving pluginfacts
      reading 204 bytes...
      read 204 bytes
      Conn keep-alive
      Info: Retrieving plugin
      reading 259 bytes...
      read 259 bytes
      Conn keep-alive
      Info: Retrieving locales
      reading 204 bytes...
      read 204 bytes
      Conn keep-alive
       
      Info: Loading facts
      Sleeping for 4 seconds...
      Conn close because of keep_alive_timeout
      opening connection to m4t83ynafhampi9.delivery.puppetlabs.net:8140...
      opened
      starting SSL for m4t83ynafhampi9.delivery.puppetlabs.net:8140...
      SSL established
       
      reading 310 bytes...
      read 310 bytes
      Conn keep-alive
      Info: Caching catalog for m4t83ynafhampi9.delivery.puppetlabs.net
      Info: Applying configuration version '1523668333'
      Notice: Applied catalog in 0.11 seconds
      reading 9 bytes...
      read 9 bytes
      Conn keep-alive
      

      Expected Outcome

      The Puppet agent should re-use the connection opened at the beginning of the run.
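
      One way to check a run against this expectation is to count the "opening connection" lines in the --http_debug output. The sketch below is a hypothetical helper, not part of Puppet: the method names are made up for illustration, and the sample text is a shortened excerpt of the log shown under Outcome.

```ruby
# Shortened excerpt of the --http_debug output from the Outcome section.
SAMPLE_LOG = <<~LOG
  opening connection to m4t83ynafhampi9.delivery.puppetlabs.net:8140...
  opened
  Conn keep-alive
  Info: Loading facts
  Sleeping for 4 seconds...
  Conn close because of keep_alive_timeout
  opening connection to m4t83ynafhampi9.delivery.puppetlabs.net:8140...
  opened
LOG

# Counts how many times Net::HTTP opened a connection during the run.
def count_connection_opens(debug_output)
  debug_output.each_line.count do |line|
    line.lstrip.start_with?('opening connection to ')
  end
end

# True when Net::HTTP closed an idle connection because of its own timeout.
def closed_by_keep_alive_timeout?(debug_output)
  debug_output.each_line.any? do |line|
    line.include?('Conn close because of keep_alive_timeout')
  end
end

puts "connections opened: #{count_connection_opens(SAMPLE_LOG)}"
puts "closed by keep_alive_timeout: #{closed_by_keep_alive_timeout?(SAMPLE_LOG)}"
```

      For the excerpt above this reports two opened connections and a keep-alive close. A run that re-uses its connection as expected should report a single open and no keep-alive close.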

      People

      • Assignee: Jacob Helwig (jacob.helwig)
      • Reporter: Charlie Sharpsteen (chuck)