Status: Ready for Engineering
Affects Version/s: SERVER 7.0.3
Fix Version/s: None
Component/s: Puppet Server
Template: PUP Bug Template
Method Found: Customer Feedback
Zendesk Ticket IDs: 44196
Zendesk Ticket Count: 1
QA Risk Assessment: Needs Assessment
When processing agent requests such as fact submissions, catalog requests,
or report uploads, Puppet Server creates several copies of the request data
as part of processing. However, many of these copies outlive their useful
context and are retained in memory until a response is delivered to the
client and the request is closed.
The following reproduction case examines report submission, which creates
at least 7 copies of the request data by the time the report is handed
off to PuppetDB. This behavior magnifies the impact of large run reports
and makes it easier for a single agent or group of agents to exhaust the
memory available to Puppet Server.
- Install Puppet Server 7 on a CentOS 7 node:
- Install PuppetDB 7 and configure it as a report processor:
- Next, configure Puppet Server to use one JRuby instance with the default allocation of 512 MB of RAM and configure JRuby to produce extra debugging information in heap dumps:
- Generate certificates for a test node and configure it to recursively purge a deep directory tree in order to generate a large report:
- Install mitmproxy and configure it to dump the Puppet Server heap when a report is submitted to PuppetDB:
- Run puppet agent to enforce the resource and submit a report (this will take about 5 minutes):
- Analyze the *.hprof file written to /tmp
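The heap-dump step can be sketched as a small mitmproxy addon. This is only an illustration of the idea, not the script used for this ticket. It assumes the proxy sits between Puppet Server and PuppetDB, that report submissions are POSTs to /pdb/cmd/v1 with command=store_report in the query string, that jmap from the JDK is on the PATH, and that the Puppet Server PID is supplied through a hypothetical PUPPETSERVER_PID environment variable. The class and helper names are made up for this sketch.

```python
import os
import subprocess
import time

# Assumption: PuppetDB's command endpoint; adjust if your deployment differs.
PDB_COMMAND_PATH = "/pdb/cmd/v1"


def is_report_submission(method, path, command):
    """Return True for a POST to the command endpoint that stores a report."""
    return (
        method == "POST"
        and path.startswith(PDB_COMMAND_PATH)
        and command == "store_report"
    )


def jmap_command(pid, dump_path):
    """Build the jmap invocation that writes a binary heap dump for pid."""
    return ["jmap", f"-dump:format=b,file={dump_path}", str(pid)]


class HeapDumpOnReport:
    """mitmproxy addon: dump the Puppet Server heap when a report passes by."""

    def request(self, flow):
        req = flow.request
        if not is_report_submission(
            req.method, req.path, req.query.get("command", "")
        ):
            return
        # Hypothetical variable; set it to the Puppet Server JVM PID beforehand.
        pid = int(os.environ["PUPPETSERVER_PID"])
        dump_path = f"/tmp/puppetserver-{int(time.time())}.hprof"
        # Runs before the request is forwarded, so the request data is still
        # live on the Puppet Server heap while the dump is taken.
        subprocess.run(jmap_command(pid, dump_path), check=True)


# mitmproxy picks up addons from this module-level list.
addons = [HeapDumpOnReport()]
```

If saved as, say, dump_heap.py, a script like this would be loaded with mitmdump -s dump_heap.py. jmap generally has to run as the same user as the target JVM.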
At the time data is being handed off to PuppetDB, 8 of the 10 largest objects on the heap are related to the report processing request:
- A java.lang.String instance containing the HTTP request body submitted by the puppet agent and received by Java. The content of this string is UTF-16 encoded, which means it uses twice the memory a UTF-8 encoded string would need to store the same ASCII data. Retains 39,835,016 bytes.
- An org.jruby.RubyString instance containing a copy of the HTTP request body after conversion from Java to a Puppet::Network::HTTP::Request. Retains 21,909,328 bytes.
- An org.jruby.RubyHash instance representing the report data after the Puppet::Network::HTTP::Request body is parsed to create a Puppet::Transaction::Report instance. Retains 26,189,328 bytes.
- An org.jruby.RubyArray instance holding the log entries of the report. Created when the Puppet::Transaction::Report instance is duplicated before processing by PuppetDB. Retains 8,705,088 bytes.
- An org.jruby.RubyHash instance representing a copy of the report data, transformed by the PuppetDB report processor. Retains 27,387,848 bytes.
- An org.jruby.RubyString instance created by serializing the above hash to JSON for submission to PuppetDB. Retains 13,334,192 bytes.
- An org.jruby.RubyString instance created by duplicating the above string and adding some metadata. Used solely for computing a PuppetDB command checksum. Retains 13,334,272 bytes.
- A com.puppetlabs.http.client.RequestOptions instance used to make the actual POST request to PuppetDB that contains a copy of the above strings as the request body. The request body in this object is a java.lang.String which also pays the UTF-16 tax. Retains 26,668,368 bytes.
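The UTF-16 cost mentioned for the java.lang.String copies can be illustrated with a few lines of Python. The sample body below is made up for the illustration. The final check uses the figures from this ticket: the first String retains exactly twice the size of the submitted report.

```python
# Hypothetical ASCII-only JSON fragment standing in for a report body.
sample_body = '{"certname": "agent.example.com", "status": "changed"}' * 1000
assert sample_body.isascii()

# Each ASCII character takes 1 byte in UTF-8 and 2 bytes in UTF-16.
utf8_size = len(sample_body.encode("utf-8"))
utf16_size = len(sample_body.encode("utf-16-le"))  # "-le" avoids a BOM prefix

print(f"UTF-8:  {utf8_size} bytes")
print(f"UTF-16: {utf16_size} bytes ({utf16_size / utf8_size:.1f}x)")

# Figures from this ticket: submitted report size vs. the first String copy.
report_bytes = 19_917_508
first_string_retained = 39_835_016
assert first_string_retained == 2 * report_bytes
```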
End result: a 19,917,508 byte report submission by the agent is magnified to 177,142,440 bytes of memory usage for the Puppet Server by the time the data is handed off to PuppetDB and the request starts closing out, roughly 9 times the size of the original submission.
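As a quick cross-check, the retained sizes listed above can be totalled and compared with the size of the original submission. The dictionary keys are shorthand labels chosen for this sketch. Summing the eight listed figures gives 177,363,440 bytes, about 221,000 bytes above the total quoted here. Retained sizes reported by a heap analyzer are best read as approximations, and both totals work out to a magnification of about 8.9x.

```python
# Retained sizes in bytes, as listed in this ticket (labels are shorthand).
retained_bytes = {
    "java.lang.String request body": 39_835_016,
    "RubyString request body": 21_909_328,
    "RubyHash parsed report": 26_189_328,
    "RubyArray report logs": 8_705_088,
    "RubyHash PuppetDB payload": 27_387_848,
    "RubyString JSON payload": 13_334_192,
    "RubyString checksum input": 13_334_272,
    "RequestOptions POST body": 26_668_368,
}

report_bytes = 19_917_508  # size of the report submitted by the agent

total = sum(retained_bytes.values())
ratio = total / report_bytes

print(f"total retained: {total:,} bytes")
print(f"magnification:  {ratio:.1f}x")
```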
Puppet Server should retain as few copies of large data blocks as possible while serving agent requests.
Dig through this information and create tickets describing any work we can do to streamline this.