Resolution: Won't Fix
Affects Version/s: PUP 3.7.2, PUP 3.7.3
Fix Version/s: None
Environment: Ubuntu 12.04.5 / 14.04.1
There may be a race condition when retrieving cached connections in network/http/pool.rb, which causes sporadic Puppet failures when KeepAlive is in use. The client keepalive is left at its default of 4 seconds, and we have increased the server keepalive from 5 to 10 seconds with no improvement.
The race condition appears to be between selecting and returning a cached connection, and using that connection. The following debug output is typical of the failure (hostnames removed):
Dec 17 21:17:34 * puppet-agent: Using cached connection for https://*:8140
Dec 17 21:17:34 * puppet-agent: Closing connection for https://*:8140
Dec 17 21:17:34 * puppet-agent: Could not retrieve catalog from remote server: end of file reached
Dec 17 21:17:35 * puppet-agent: Using cached catalog for *
Dec 17 21:17:35 * puppet-agent: Using cached catalog
This log output appears to show that Puppet selects the cached connection but then closes it before retrieving the catalogue. This behaviour occurs randomly on clients with KeepAlive enabled. On further inspection, the connection close appears to occur around 4 seconds after the connection was started, rather than 4 seconds after it was released.
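To illustrate the suspicion above, here is a toy Ruby model of the two ways a keepalive deadline could be anchored. It is not taken from network/http/pool.rb; the struct, method names, and timings are all hypothetical and only the 4-second keepalive comes from this ticket. It shows that a connection borrowed shortly before a start-anchored deadline can expire in the middle of a request, while the same connection would survive if the deadline were counted from its release.

```ruby
# Toy model only: NOT Puppet's pool implementation. All names are hypothetical.
KEEPALIVE = 4.0 # seconds, the client default mentioned in this ticket

# Timestamps are plain floats (seconds) so the example is deterministic.
CachedConn = Struct.new(:started_at, :released_at)

# Deadline counted from when the connection was returned to the pool.
def expiry_from_release(conn)
  conn.released_at + KEEPALIVE
end

# Deadline counted from when the connection was first opened.
def expiry_from_start(conn)
  conn.started_at + KEEPALIVE
end

# True if a request borrowed at borrow_at and finished at finish_at
# fits entirely before the deadline.
def survives?(expiry, borrow_at, finish_at)
  borrow_at < expiry && finish_at <= expiry
end

# Assumed timeline: opened at t=0, released at t=3.5,
# borrowed again at t=3.9 for a request that takes 0.3 s.
conn      = CachedConn.new(0.0, 3.5)
borrow_at = 3.9
finish_at = borrow_at + 0.3

RELEASE_ANCHORED_OK = survives?(expiry_from_release(conn), borrow_at, finish_at)
START_ANCHORED_OK   = survives?(expiry_from_start(conn), borrow_at, finish_at)

puts "release-anchored deadline survives: #{RELEASE_ANCHORED_OK}"
puts "start-anchored deadline survives:   #{START_ANCHORED_OK}"
```

In this model the borrow at t=3.9 passes the "not yet expired" check under both schemes, but only the start-anchored deadline (t=4.0) falls inside the request, which would match a "Using cached connection" line followed immediately by "Closing connection". Whether this is what actually happens in Puppet is unconfirmed.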
I've added extra debugging lines locally to try to get more information. I will update this ticket once I've got more data.