  1. Puppet
  2. PUP-9570

Catalog failure on first run due to pluginsync and environment switch

    Details

    • Template:
      PUP Bug Template
    • Acceptance Criteria:
      1. The agent sends the pluginsync_environment fact when making a catalog request.

      2. It's possible to access this fact in a manifest to determine that the agent pluginsync'ed in a different environment than the one the compiler is currently using to compile a catalog. For example, the manifest can compare the agent's pluginsync environment against the server's environment, as set by the node classifier:

      node default {
        $server_env = $server_facts['environment']
        $pluginsync_env = $facts['pluginsync_environment']
       
        if $server_env != $pluginsync_env {
          warning("Node's environment has not converged yet")
        } else {
          include foo
        }
      }
      

      3. The new fact is added to the "agent facts" section of the docs: https://puppet.com/docs/puppet/6.4/lang_facts_and_builtin_vars.html#puppet-agent-facts

    • Team:
      Coremunity
    • Sprint:
      Platform Core KANBAN, Coremunity Kanban
    • Method Found:
      Needs Assessment
    • Zendesk Ticket IDs:
      34157,40072,41507
    • Zendesk Ticket Count:
      3
    • Release Notes:
      Bug Fix
    • Release Notes Summary:
      Previously, an agent would fail its run if it switched to a new environment whose manifests relied on a fact that only existed in the new environment. Now the agent will be redirected to the server-specified environment and the run will continue using that environment.
    • QA Risk Assessment:
      Needs Assessment

      Description

      On its first run, a Puppet agent can abort with a catalog compilation failure if the classifier assigns its environment based on facts.

      The sequence of events is:

      1. The new agent queries the node terminus to determine which environment to pluginsync against
      2. The master queries the classifier with no facts to determine the environment
      3. The classifier returns some default environment value, because the fact that matches the node to the correct environment node group has not yet been uploaded and saved to PuppetDB
      4. The agent pluginsyncs against this default environment
      5. The agent uploads its facts and requests a catalog
      6. The master queries the classifier with the facts
      7. The classifier returns the correct environment
      8. The master starts compiling a catalog using the correct environment
      9. The correct environment's Puppet code expects one or more facts to be available, but they are not. The absence of these facts causes the catalog compilation to fail.
      10. The agent aborts the run due to the catalog compilation failure.
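
The ten steps above can be modeled with a short Python sketch. This is a toy model, not Puppet's implementation: `classify`, `compile_catalog`, `first_run`, the `PLUGIN_FACTS` table, and the `master.example.com` value are all hypothetical names chosen for illustration. It assumes the required fact is only delivered by pluginsync in the correct environment.

```python
# Toy model of the first-run sequence; every name here is hypothetical.
DEFAULT_ENV = "default"

# Facts that pluginsync makes available in each environment (assumed layout).
PLUGIN_FACTS = {
    "default": {},
    "production": {"puppet_server": "master.example.com"},
}


def classify(facts):
    """Return the environment for a node; fall back when the fact is unknown."""
    if facts.get("puppet_environment") == "production":
        return "production"
    return DEFAULT_ENV


def compile_catalog(environment, facts):
    """Mimic a site.pp that fails when a required fact is missing."""
    if environment == "production" and "puppet_server" not in facts:
        raise RuntimeError("No value provided for facts.puppet_server!")
    return {"environment": environment}


def first_run(local_facts):
    # Steps 1-3: no facts are stored yet, so the classifier returns the default.
    pluginsync_env = classify({})
    # Step 4: the agent pluginsyncs against that environment.
    facts = {**local_facts, **PLUGIN_FACTS[pluginsync_env]}
    # Steps 5-7: facts are uploaded and the classifier picks the real environment.
    server_env = classify(facts)
    # Steps 8-10: compilation runs in the real environment without its plugin facts.
    return compile_catalog(server_env, facts)


if __name__ == "__main__":
    try:
        first_run({"puppet_environment": "production"})
    except RuntimeError as err:
        print(f"catalog failed: {err}")
```

In this model the failure comes from the mismatch between the environment used for pluginsync and the environment used for compilation. Nothing is wrong with the manifest itself.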

      For example:

      Set up an environment called "default" with no modules. Assign this environment as the default.

      Set up an environment called "production". Install the puppetlabs-stdlib module, and give it a site.pp that looks like:

      if ($facts['puppet_server'] == undef) {
        fail('No value provided for facts.puppet_server!')
      }
      

      Set up the classifier so that nodes are assigned to the "Production environment" environment node group if they report the fact puppet_environment = production.

      Then run Puppet for the first time on a new agent. The run fails with output like this:

      [root@pe-xl-core-2 ~]# puppet agent -t
      Notice: Local environment: 'production' doesn't match server specified node environment 'default', switching agent to 'default'.
      Info: Retrieving pluginfacts
      Info: Retrieving plugin
      Notice: /File[/opt/puppetlabs/puppet/cache/lib/facter]/ensure: created
      Notice: /File[/opt/puppetlabs/puppet/cache/lib/facter/aio_agent_build.rb]/ensure: defined content as '{md5}cdcc1ff07bc245c66cc1d46be56b3af5'
      Notice: /File[/opt/puppetlabs/puppet/cache/lib/facter/aio_agent_version.rb]/ensure: defined content as '{md5}d05c8cbf788f47d33efd46a935dda61e'
      [...]
      Notice: /File[/opt/puppetlabs/puppet/cache/lib/shared/pe_build.rb]/ensure: defined content as '{md5}4f4652af20c4f0391b9ca2976940a710'
      Notice: /File[/opt/puppetlabs/puppet/cache/lib/shared/pe_server_version.rb]/ensure: defined content as '{md5}f3d3fc8776512ae73d3293c97b8f3dfe'
      Info: Retrieving locales
      Info: Loading facts
      Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, No value provided for facts.puppet_server! (file: /etc/puppetlabs/code/environments/production/manifests/site.pp, line: 4, column: 3) on node pe-xl-core-2.dev33.puppet.vm
      Warning: Not using cache on failed catalog
      Error: Could not retrieve catalog; skipping run
      [root@pe-xl-core-2 ~]#
      

      What would be better:

      Rather than trying to compile a catalog if the agent pluginsync'd the wrong environment, fail fast. Don't compile a catalog the agent is going to throw away anyway. Tell the agent to try again. The agent already has logic to retry, and wouldn't use a catalog with the "wrong" environment anyway.
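
The fail-fast idea can be sketched in Python as follows. This is only an illustration of the proposal, not how Puppet implements it: `EnvironmentMismatch`, `handle_catalog_request`, `agent_run`, and the `max_attempts` parameter are hypothetical names. The sketch assumes the server can see which environment the agent pluginsynced against (for example via the pluginsync_environment fact).

```python
class EnvironmentMismatch(Exception):
    """Signals that the agent should pluginsync again in another environment."""

    def __init__(self, expected):
        super().__init__(f"retry in environment '{expected}'")
        self.expected = expected


def handle_catalog_request(pluginsync_env, server_env, compile_fn):
    # Fail fast: skip compilation when the agent synced plugins elsewhere.
    if pluginsync_env != server_env:
        raise EnvironmentMismatch(server_env)
    return compile_fn(server_env)


def agent_run(initial_env, server_env, compile_fn, max_attempts=3):
    env = initial_env
    for _ in range(max_attempts):
        try:
            return handle_catalog_request(env, server_env, compile_fn)
        except EnvironmentMismatch as mismatch:
            # Pluginsync against the environment the server named, then retry.
            env = mismatch.expected
    raise RuntimeError("environment did not converge")
```

With this shape, the compile function is invoked only once the agent's pluginsync environment matches the server's, so no catalog is compiled just to be discarded.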

              People

              Assignee:
              Josh Cooper
              Reporter:
              Reid Vandewiele