PuppetDB / PDB-4020

Puppetserver should handle 503s from PuppetDB


Details

    • PuppetDB
    • Customer Feedback
    • Major
    • 2 - 5-25% of Customers
    • 3 - Serious
    • 3 - $$$$
    • Customers using HA will expect that all data is preserved and that catalog compilation also functions.
    • 32647,32663,32694,49292
    • 4
    • Bug Fix
      Prior to this fix, HTTP submission with command_broadcast enabled
      always returned the last response. As a result, a failure was reported if
      the last connection produced a 503 response, even when an earlier
      PuppetDB response had succeeded and the minimum number of successful
      submissions had been met. The issue does not occur when a response
      raises an exception. Because the Puppet http_pool does not raise a 503
      as an exception, the issue appears when a PuppetDB instance is in
      maintenance mode.

      This fix changes the behavior to return the last successful response
      once the minimum number of successful submissions has been met.
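The fixed behavior described in the release note can be sketched roughly as follows. This is a simplified model, not the actual terminus code: `StubResponse`, `broadcast_submit`, and the use of lambdas to stand in for per-server HTTP submissions are all hypothetical names chosen for illustration.

```ruby
# Hypothetical stand-in for an HTTP response; not the real Puppet class.
StubResponse = Struct.new(:code, :body) do
  def success?
    code.between?(200, 299)
  end
end

# Broadcast a command to every server and pick the response to report.
# `servers` is a list of callables, each standing in for one HTTP submission.
def broadcast_submit(servers, min_successful_submissions: 1)
  last_response = nil
  last_success = nil
  successes = 0

  servers.each do |submit|
    begin
      response = submit.call
    rescue StandardError
      next # a raised error is skipped and the next server is tried
    end

    last_response = response
    if response.success?
      last_success = response
      successes += 1
    end
  end

  # Report the last successful response once the minimum has been met;
  # otherwise fall back to whatever the last server returned.
  if last_success && successes >= min_successful_submissions
    last_success
  else
    last_response
  end
end
```

With one healthy server and one server answering 503, this sketch returns the 200 response when `min_successful_submissions` is 1, and the 503 when it is 2.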

    Description

      As a Puppet user I expect the puppetserver to gracefully handle a transient PuppetDB maintenance mode state without reporting agent failures when there are multiple PuppetDB instances configured. 

      When one of the PuppetDB instances is in maintenance mode, it returns a 503 to puppetserver. With command_broadcast = true and min_successful_submissions = 1, puppetserver should tolerate this: the other PuppetDB instance is available, so the 503 should be ignored and the command submitted to the other PuppetDB should count as a success. Instead, puppetserver treats the 503 as a failure and sends a 500 error to the agent, so every agent run fails for as long as any PuppetDB node is in maintenance mode.
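The failure mode can be illustrated with a small model of "return whatever the last server said". This is only a sketch under assumptions: `FakeResponse` and `submit_returning_last` are hypothetical names, and the lambdas stand in for real HTTP calls to each PuppetDB instance.

```ruby
# Hypothetical stand-in for an HTTP response; not the real Puppet class.
FakeResponse = Struct.new(:code, :body)

# Model of the reported behavior: every server is contacted in turn and the
# response from the last one that answered is returned, success or not.
def submit_returning_last(servers)
  last_response = nil
  servers.each do |submit|
    begin
      last_response = submit.call
    rescue StandardError
      # A raised error never overwrites an earlier response, which is why
      # the problem only shows up for non-raising failures such as a 503.
    end
  end
  last_response
end

available   = -> { FakeResponse.new(200, 'ok') }
maintenance = -> { FakeResponse.new(503, 'PuppetDB is currently down. Try again later.') }
unreachable = -> { raise IOError, 'connection refused' }

puts submit_returning_last([available, maintenance]).code
puts submit_returning_last([available, unreachable]).code
```

In this model the first call reports the 503 even though one submission succeeded, while the second call still reports the 200 because the unreachable server raised instead of returning a response.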

      Steps to reproduce:

      1. Configure PuppetDB replication
      2. Configure command_broadcast = true in the puppetdb.conf on the master
      3. Run a puppet agent while one of the PuppetDBs is in maintenance mode and the other one is available.
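
For step 2, a puppetdb.conf along these lines is one way to set up the scenario. Treat it as an illustrative fragment: the replica hostname and port are taken from the log in this ticket, the first hostname is a hypothetical placeholder, and min_successful_submissions = 1 is the value mentioned in the description.

```ini
[main]
# The first URL is a placeholder; the second host appears in the puppetserver.log below.
server_urls = https://pe-primary.example.vlan:8081,https://pe-201810-agent-replica.puppetdebug.vlan:8081
command_broadcast = true
min_successful_submissions = 1
```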

      Logs:

       

      From the puppetserver.log

       

      2018-06-08T10:35:23.831-07:00 WARN [qtp2042713953-1209] [puppetserver] Puppet Error connecting to pe-201810-agent-replica.puppetdebug.vlan on 8081 at route /pdb/cmd/v1?checksum=c36ef6428032c548e53a17e5c22e5b4447748574&version=9&certname=pe-201810-agent.puppetdebug.vlan&command=replace_catalog&producer-timestamp=1528479323, error message received was ''. Failing over to the next PuppetDB server_url in the 'server_urls' list
      2018-06-08T10:35:23.835-07:00 ERROR [qtp2042713953-1209] [puppetserver] Puppet [503 ] PuppetDB is currently down. Try again later.
      2018-06-08T10:35:23.835-07:00 ERROR [qtp2042713953-1209] [puppetserver] Puppet /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb/command.rb:82:in `submit'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:62:in `block in submit_command'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler/around_profiler.rb:58:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler.rb:51:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:99:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:59:in `submit_command'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirector/catalog/puppetdb.rb:14:in `block in save'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler/around_profiler.rb:58:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler.rb:51:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/puppetdb.rb:99:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirector/catalog/puppetdb.rb:11:in `save'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirector/store_configs.rb:24:in `save'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/indirector/indirection.rb:204:in `find'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/api/indirected_routes.rb:121:in `do_find'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/api/indirected_routes.rb:48:in `block in call'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/context.rb:65:in `override'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet.rb:260:in `override'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/api/indirected_routes.rb:47:in `call'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/route.rb:82:in `block in process'
      org/jruby/RubyArray.java:1735:in `each'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/route.rb:81:in `process'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/route.rb:87:in `process'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/route.rb:87:in `process'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/handler.rb:64:in `block in process'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler/around_profiler.rb:58:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/util/profiler.rb:51:in `profile'
      /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/puppet/network/http/handler.rb:62:in `process'
      uri:classloader:/puppetserver-lib/puppet/server/master.rb:42:in `handleRequest'

       

      From the agent, which gets a 500 instead of a 503.

       

      -> "HTTP/1.1 500 Server Error\r\n"
      -> "Date: Fri, 08 Jun 2018 17:35:33 GMT\r\n"
      -> "Content-Type: application/json;charset=utf-8\r\n"
      -> "X-Puppet-Version: 5.5.1\r\n"
      -> "Content-Length: 108\r\n"
      -> "Server: Jetty(9.4.z-SNAPSHOT)\r\n"
      -> "\r\n"
      reading 108 bytes...
      -> "{\"message\":\"Server Error: [503 ] PuppetDB is currently down. Try again later.\",\"issue_kind\":\"RUNTIME_ERROR\"}"
      read 108 bytes
      Conn keep-alive
      Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: [503 ] PuppetDB is currently down. Try again later.
      Warning: Not using cache on failed catalog
      Error: Could not retrieve catalog; skipping run

       

      I suspect this is coming from an unhandled or unsent exception in https://github.com/puppetlabs/puppetdb/blob/master/puppet/lib/puppet/util/puppetdb/http.rb#L162-L192

          People

            Assignee: Unassigned
            Reporter: Jarret Lavallee (jarret.lavallee)