[PUP-8962] Type and provider info should be logged when prefetch fails Created: 2018/06/26  Updated: 2019/12/12  Resolved: 2019/01/22

Status: Closed
Project: Puppet
Component/s: None
Affects Version/s: PUP 4.10.12, PUP 5.5.2, PUP 6.0.0
Fix Version/s: PUP 5.5.12, PUP 6.0.7, PUP 6.2.0

Type: Bug Priority: Normal
Reporter: Charlie Sharpsteen Assignee: Kris Bosland
Resolution: Fixed Votes: 0
Labels: resolved-issue-added
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to PUP-7630 "Successful" puppet agent reports hid... Closed
Template: PUP Bug Template
Epic Link: Agent Logging
Team: Coremunity
Sprint: Platform Core KANBAN
Method Found: Customer Feedback
CS Priority: Reviewed
Zendesk Ticket IDs: 33723
Zendesk Ticket Count: 1
Release Notes: Bug Fix
Release Notes Summary: Puppet will now log exceptions that happen during prefetch. Also exception handling in prefetch was fixed to the correct set of exceptions.
QA Risk Assessment: Needs Assessment

 Description   

While the puppet agent is applying a catalog, it will pre-fetch resource instances that share a common source. Such as packages from the same provider or cron jobs from the same crontab. Catalog application will abort with an error message if this prefetch operation fails. Often, these errors are terse, cryptic, and give no insight into which sort of resource type failed to fetch (cron job? package? SSH key?). In the case of types based on the ParsedFile provider, the prefetch code is shared among all types which means that the stack trace produced by --trace doesn't even help distinguish what type encountered the error.

Error messages should be clear about which type and provider failed so that users can proceed directly to debugging instead of puzzling out what happened.

Reproduction Case

  • Install PE 2018.1.2 on CentOS 7.
  • Configure site.pp in the production environment with a cron resource:

cat <<EOF > /etc/puppetlabs/code/environments/production/manifests/master_node.pp
node '$(hostname -f)' {
  cron {'cron job to trigger prefetch':
    command => '/bin/true',
    user => 'root',
    hour => 1,
  }
}
EOF

  • Write garbage to root's crontab to trigger a parse failure:

printf '\0\0\0\0odelay' > /var/spool/cron/root

  • Run puppet agent -t

Outcome

The catalog fails to apply with a very terse, cryptic error message:

# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for pe-201812-master.puppetdebug.vlan
Info: Applying configuration version '1530028114'
Error: Failed to apply catalog: Could not parse line "\u0000\u0000\u0000\u0000odelay" (file: root, line: 2)

Expected Outcome

Error message should include the type and provider name as is done when prefetch fails due to a LoadError or Puppet::MissingCommand error:

# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for pe-201812-master.puppetdebug.vlan
Info: Applying configuration version '1530028653'
Error: Failed to apply catalog: Could not prefetch cron provider 'crontab': Could not parse line "\u0000\u0000\u0000\u0000odelay" (file: root, line: 2)



 Comments   
Comment by Jacob Helwig [ 2018/10/02 ]

Charlie Sharpsteen, it looks like the error message should get translated into something a bit more useful if you turn on future_features in the puppet.conf (prefetch in transaction.rb). I'm not sure what the plan was for when to move the new (actually old) prefetch behavior from PUP-7630 to being the default behavior, though. Might have been something that was missed for 6.0?

Comment by Jorie Tappa [ 2018/10/19 ]

Grooming session:

Looks like we forgot to un-gate some things attached to the future_features flag that should've been on by default in pup 6.

 

ping Charlie Sharpsteen

 

Comment by Charlie Sharpsteen [ 2018/10/22 ]

The crontab example still fails mysteriously with Puppet 6.0.3 in the latest PE 2019.0 nightly:

[root@pe-20190nightly-master ~]# puppet --version
6.0.3
 
[root@pe-20190nightly-master ~]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for pe-20190nightly-master.puppetdebug.vlan
Info: Applying configuration version '1540246020'
Error: Failed to apply catalog: Could not parse line "\u0000\u0000\u0000\u0000odelay" (file: root, line: 2)

The error message is much better if future_features is enabled and allows for a full catalog application to occur:

[root@pe-20190nightly-master ~]# puppet config set future_features true
 
[root@pe-20190nightly-master ~]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for pe-20190nightly-master.puppetdebug.vlan
Info: Applying configuration version '1540246148'
Error: Could not prefetch cron provider 'crontab': Could not parse line "\u0000\u0000\u0000\u0000odelay" (file: root, line: 2)
Warning: /Stage[main]/Main/Node[pe-20190nightly-master.puppetdebug.vlan]/Cron[cron job to trigger prefetch]: Skipping because provider prefetch failed
Warning: /Stage[main]/Pe_repo::Bulk_pluginsync/Cron[create tar.gz of pluginsync cache]: Skipping because provider prefetch failed
Notice: Applied catalog in 15.21 seconds

Comment by Josh Cooper [ 2018/10/22 ]

Yeah, looks like we missed it for 6.0. That said we should be able to log the exception before raising, which was the behavior for load errors and missing commands before PUP-7630 was implemented:

rescue => detail
  #TRANSLATORS `prefetch` is a function name and should not be translated
  Puppet.log_exception(detail, _("Could not prefetch %{type_name} provider '%{name}': %{detail}") % { type_name: type_name, name: provider_class.name, detail: detail })
  message = _("Could not prefetch %{type_name} provider '%{name}': %{detail}") % { type_name: type_name, name: provider_class.name, detail: detail }
  Puppet.log_exception(detail, message)
 
  if !detail.is_a?(LoadError) && !detail.is_a?(Puppet::MissingCommand)
    raise unless Puppet.settings[:future_features]
    @prefetch_failed_providers[type_name][provider_class.name] = true
  end
end

Secondary issue is that we're currently doing rescue => Exception which isn't desirable, because it will catch all exceptions like SystemExit, which we shouldn't.

{strike}We should do rescue LoadError, StandardError instead.{strike}

Update: ignore the rescue comment. The code is structured that way so that we can mark all resources for that provider as failed.

Comment by Josh Cooper [ 2019/01/10 ]

Merged to 5.5.x in https://github.com/puppetlabs/puppet/commit/049430128a583f56e55b643420956934a2a2ce55

Comment by Josh Cooper [ 2019/01/11 ]

Follow up merged to 5.5.x in https://github.com/puppetlabs/puppet/commit/0afbf4d44b557224a74a47e484138e274460d20a

Generated at Fri Aug 07 09:39:19 PDT 2020 using Jira 8.5.2#805002-sha1:a66f9354b9e12ac788984e5d84669c903a370049.