Details
Bug
Status: Closed
Normal
Resolution: Fixed
PUP 6.15.0
None
Night's Watch
5
NW - 2021-01-20, NW - 2021-02-03, NW - 2021-04-28
Customer Feedback
40535
1
Bug Fix
Needs Assessment
Description
Puppet Version: 6.15.0+
Puppet Server Version: 6.x
OS Name/Version: Any
Since the changes in 6.15.0, the server_list setting behaves differently. Previously, when server_list was configured and the first puppetserver in the list failed, the agent continued its run by connecting to the next puppetserver in the list. In 6.15.0, if the primary puppetserver fails while an agent is running, the agent run fails.
Desired Behavior:
When the first puppetserver in the server_list goes offline, agents should automatically try to connect to the second puppetserver in the server_list, even in the middle of an agent run.
Actual Behavior:
The agent run fails if the first puppetserver in the server_list goes offline while the agent is in the middle of a run.
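The desired behavior amounts to per-request failover across the server list. The following is a minimal Python sketch of that idea only; it is not Puppet's implementation, and the names `request_with_failover`, `send`, and `secondary.example.com` are hypothetical, introduced for illustration.

```python
from typing import Callable, List, Sequence, Tuple


class AllServersFailed(Exception):
    """Raised when no server in the list could serve the request."""


def request_with_failover(
    servers: Sequence[str],
    send: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each server in order and return (server, response) from the
    first one that answers. `send` is a hypothetical callable that
    performs one request against one server and raises ConnectionError
    (or a subclass) when the server is unreachable."""
    errors: List[str] = []
    for server in servers:
        try:
            return server, send(server)
        except ConnectionError as exc:
            # Remember the failure and move on to the next server
            # instead of failing the whole run.
            errors.append(f"{server}: {exc}")
    raise AllServersFailed("; ".join(errors))


def fake_send(server: str) -> str:
    """Stand-in for a real request: the primary refuses connections."""
    if server.startswith("primary."):
        raise ConnectionRefusedError("Connection refused")
    return "catalog"


server, body = request_with_failover(
    ["primary.example.com:8140", "secondary.example.com:8140"],
    fake_send,
)
```

In this sketch the failover decision is made for every request, so a server that goes offline partway through a run only costs one failed connection attempt rather than the whole run.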
Example failures from affected agents:
Could not evaluate: Could not retrieve file metadata for puppet:///pe_packages/2019.8.1/windows-x86_64/puppet-agent-x64.msi: Request to https://primary.example.com:8140/puppet/v3/file_metadata/pe_packages/2019.8.1/windows-x86_64/puppet-agent-x64.msi?links=manage&checksum_type=sha256lite&source_permissions=ignore&environment=windows_testing failed after 21.011 seconds: Failed to open TCP connection to primary.example.com:8140 (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. - connect(2) for "primary.example.com" port 8140)
puppet-agent[7053]: Could not retrieve catalog from remote server: Request to https://primary.example.com:8140/puppet/v3/catalog/agent.example.com?environment=development failed after 0.004 seconds: Failed to open TCP connection to primary.example.com:8140 (Connection refused - connect(2) for "primary.example.com" port 8140)
Reproduction
1. Configure the server_list with two puppetservers.
2. Configure 10 agents with that server_list and a run interval of one minute.
3. Shut down the puppetserver service on the first server in the server_list.
At least one of the agents is likely to hit the failure. It seems to be easier to reproduce when the catalog contains file resources.
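For steps 1 and 2, the agent settings might look like the output of the sketch below, which renders a puppet.conf fragment with Python's standard configparser module. This is an assumption about the setup, not the reporter's actual configuration: the hostname `secondary.example.com`, the helper name `render_puppet_conf`, and the choice of the `[agent]` section are illustrative.

```python
import configparser
import io
from typing import Sequence


def render_puppet_conf(servers: Sequence[str], runinterval: str = "1m") -> str:
    """Render a puppet.conf fragment with server_list and runinterval.

    server_list is a comma-separated list of host:port entries;
    runinterval is a duration such as "1m" (one minute)."""
    conf = configparser.ConfigParser()
    conf["agent"] = {
        "server_list": ",".join(servers),
        "runinterval": runinterval,
    }
    buf = io.StringIO()
    conf.write(buf)
    return buf.getvalue()


fragment = render_puppet_conf(
    ["primary.example.com:8140", "secondary.example.com:8140"]
)
print(fragment)
```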
We believe this is related to the changes in PUP-10363.