Affects Version/s: None
Fix Version/s: None
Affects all platforms
Method Found:Needs Assessment
Release Notes Summary: If a client cert, CA bundle, or CRL bundle is invalid, it is discarded and not saved to disk, so the agent won't be stuck the next time it runs.
QA Risk Assessment:Needs Assessment
In a running environment, if a machine is replaced by a new one with the same hostname and the old node certificate is not removed from the master before the agent first runs on the new node, the agent generates a new SSL key, pulls the old certificate from the puppet master, detects the key and certificate mismatch, and reports the following error:
Info: Creating a new SSL key for agenthostname-domain.com
Info: Caching certificate for ca
Info: Caching certificate for agenthostname-domain.com
Error: Could not request certificate: The certificate retrieved from the master does not match the agent's private key.
Certificate fingerprint: XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX:XX
To fix this, remove the certificate from both the master and the agent and then start a puppet run, which will automatically regenerate a certificate.
On the master:
puppet cert clean agenthostname-domain.com
On the agent:
1a. On most platforms: find C:/ProgramData/PuppetLabs/puppet/etc/ssl -name agenthostname-domain.com.pem -delete
1b. On Windows: del "C:\ProgramData\PuppetLabs\puppet\etc\ssl\certs\agenthostname-domain.com.pem" /f
2. puppet agent -t
Exiting; failed to retrieve certificate and waitforcert is disabled
The problem is that the mismatched certificate is cached on the host, so the agent prints this error on every run and never attempts to contact the puppet master again. The result is a "stalemate": an admin must remove the cached certificate file before the agent can contact the master again.
I don't believe this is intended behaviour. It forces admin intervention that could be avoided if the node certificate were not kept on local disk when it does not match the SSL key. An admin could then simply remove the certificate from the master and let the agent go through the enrollment process again. This matters in situations where access to the agent is limited or restricted for various reasons.
In summary: if an agent retrieves a certificate from a puppet master and it does not match the local private key, the certificate should be discarded rather than kept on local disk, so that it does not have to be removed manually.
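The requested behaviour could look roughly like the following. This is a minimal sketch using Ruby's standard OpenSSL bindings, not Puppet's actual code: `cache_cert_if_matching` and `self_signed_cert` are hypothetical helpers made up for illustration, and the self-signed certificate stands in for the one the master would return.

```ruby
require 'openssl'
require 'tmpdir'

# Hypothetical helper (not part of Puppet): write the certificate to `path`
# only if its public key corresponds to the agent's private key.
# Returns true if the certificate was cached, false if it was discarded.
def cache_cert_if_matching(cert, key, path)
  # Certificate#check_private_key is true when the key pair matches.
  return false unless cert.check_private_key(key)

  File.write(path, cert.to_pem)
  true
end

# Hypothetical helper: build a throwaway self-signed certificate for `key`,
# standing in for the certificate retrieved from the master.
def self_signed_cert(key, common_name)
  cert = OpenSSL::X509::Certificate.new
  cert.version    = 2
  cert.serial     = 1
  cert.subject    = OpenSSL::X509::Name.parse("/CN=#{common_name}")
  cert.issuer     = cert.subject
  cert.public_key = key.public_key
  cert.not_before = Time.now
  cert.not_after  = Time.now + 3600
  cert.sign(key, OpenSSL::Digest.new('SHA256'))
  cert
end

# Simulate the replaced machine: the certificate belongs to the old key,
# while the new node has just generated a fresh key.
old_key  = OpenSSL::PKey::RSA.new(2048)
new_key  = OpenSSL::PKey::RSA.new(2048)
old_cert = self_signed_cert(old_key, 'agenthostname-domain.com')

Dir.mktmpdir do |dir|
  path = File.join(dir, 'agenthostname-domain.com.pem')
  cached = cache_cert_if_matching(old_cert, new_key, path)
  puts "cached: #{cached}, on disk: #{File.exist?(path)}"
end
```

Because the mismatched certificate never reaches disk in this sketch, the next run would ask the master again instead of failing on the cached file.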