Affects Version/s: PUP 6.8.1
Fix Version/s: None
Template: PUP Bug Template
Agent OS: CentOS 7
Master OS: CentOS 7
Sprint: Platform Core KANBAN
Method Found: Needs Assessment
QA Risk Assessment: Needs Assessment
Puppet Version: 6.8.1
Puppet Server Version: 6.5.0
OS Name/Version: CentOS 7
Exec resources fail unnecessarily when the working directory from which the 'agent' or 'apply' face is launched does not exist or is inaccessible to puppet. Additionally, in the event that the working directory exists but is inaccessible to the process, the diagnostic emitted incorrectly reports that the directory does not exist.
I typically observe this on machines where my home directory is NFS-mounted, such that it is inaccessible to the local root account. With that as my working directory, I sudo a privileged shell in which to run the Puppet agent, but catalog application fails on account of errors from multiple Exec resources. Changing directory to a root-accessible one allows catalog application to succeed. In at least some cases, the command being executed is insensitive to the accessibility of the working directory, so there's no particular reason that the Exec should fail.
Exec resources should not automatically fail on account of a missing or inaccessible working directory, unless a specific working directory is requested via the cwd parameter.
Puppet, generally, should not rely on the working directory being accessible except where it actually needs to access it for some configuration-, face-, or catalog-specific reason.
Exec resources with no cwd specified fail spuriously when Puppet's working directory cannot be accessed, even when execution of the command does not depend on accessing that directory. The specific failure mode seems to depend on details of the situation, but the one I normally run into can be reproduced with puppet apply, too, like so:
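1. Log in to a machine on which my home directory is on an automounted NFS file system with root squashing in effect.
2. Run sudo bash to obtain a root shell.
3. Use puppet apply to apply a manifest containing an Exec resource with no cwd parameter. The output resembles the following: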
Notice: Compiled catalog for xxx.yyy.zzz in environment production in 0.04 seconds
Error: Working directory /mnt/auto/home/a_group/jbolling does not exist!
Error: /Stage[main]/Main/Exec[/bin/echo]/returns: change from 'notrun' to ['0'] failed: Working directory /mnt/auto/home/miller2grp/jbolling does not exist!
Notice: Applied catalog in 0.12 seconds
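The exact manifest behind the output above is not shown in this report. A minimal sketch that should exercise the same code path, assuming a bare Exec titled /bin/echo (inferred from the Exec[/bin/echo] reference in the error message, not confirmed), might look like this:

```shell
# Assumed manifest: a single Exec with no cwd parameter.
# The resource title is inferred from the error output, not taken from the report.
manifest='exec { "/bin/echo": }'

# Only attempt the run when puppet is actually on PATH.
if command -v puppet >/dev/null 2>&1; then
    puppet apply -e "$manifest" || echo "puppet apply reported a failure"
else
    echo "puppet not found; manifest would be: $manifest"
fi
```

Run from a root shell whose working directory is the root-squashed NFS home directory, this would be expected to produce errors of the form shown above.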
Errors of the same form are also reported by the agent under these circumstances if the catalog being applied contains `Exec` resources without cwd parameters. On the other hand, these particular failures do not occur if the Exec in question has a cwd parameter designating an existing, accessible directory, notwithstanding that the Puppet process's own working directory is inaccessible to it.
Oddly enough, a different failure occurs if Puppet's working directory actually doesn't exist. That situation can be reproduced like so:
1. Create a fresh directory for the experiment: mkdir doomed.
2. Make that directory the working directory: cd doomed.
3. In a separate shell, remove the directory: rmdir doomed.
4. In the first shell, run any Puppet command at all. It aborts with output such as the following:
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
terminate called after throwing an instance of 'boost::filesystem::filesystem_error'
what(): boost::filesystem::current_path: No such file or directory
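The steps above can also be condensed into a single shell, since a process is allowed to remove its own working directory by path. The following is only a sketch; the temporary directory and the cwd_state variable are illustrative names, not part of the original reproduction:

```shell
# Create a scratch area and a directory that is about to disappear.
workdir=$(mktemp -d)
mkdir "$workdir/doomed"
cd "$workdir/doomed"

# Remove the directory the shell is still sitting in.
rmdir "$workdir/doomed"

# Record whether the path still exists; any Puppet command run at this
# point would be expected to hit the getcwd failure shown above.
if [ -d "$workdir/doomed" ]; then
    cwd_state=present
else
    cwd_state=missing
fi
echo "working directory is now: $cwd_state"

# Clean up: move somewhere valid, then remove the scratch area.
cd /
rmdir "$workdir"
```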
As described for the first scenario (an inaccessible working directory), however, Puppet does not actually need to access the current working directory to compile or apply a catalog, at least not as a general rule, so this failure is spurious.
Of course, these issues can be worked around by ensuring that the working directory is an existing, accessible one before running Puppet, and I imagine that both failure scenarios presented are unlikely for most people. However, I personally hit the first (inaccessible working directory) case often enough to inspire me to file this report. I note also that it's distinctly more annoying when it occurs during a puppet agent --test run on a catalog that takes a minute to retrieve and a few minutes to apply, and there is some risk that such a failure leaves the system in an unexpected state requiring manual intervention.
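For anyone else who runs into this, the workaround can be wrapped in a small shell function. This is a sketch only: ensure_usable_cwd is a hypothetical helper name, not anything Puppet provides, and the access checks are an assumption about what "accessible" needs to mean here.

```shell
# Hypothetical helper: fall back to / when the current working directory
# is missing or cannot be read and searched by the current user.
ensure_usable_cwd() {
    # Resolve the physical working directory; treat failure as "unusable".
    dir=$(pwd -P 2>/dev/null) || dir=""
    if [ -z "$dir" ] || [ ! -d "$dir" ] || [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
        cd / || return 1
    fi
}

# Intended use (not executed here):
#   ensure_usable_cwd && puppet agent --test
```

This only sidesteps the problem for interactive runs; it does not address the underlying behavior of Exec resources that have no cwd parameter.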