Details
Bug
Status: Closed
Major
Resolution: Fixed
None
None
None
Coremunity
Platform Core KANBAN
Customer Feedback
Major
3 - 25-50% of Customers
3 - Serious
4 - $$$$$
Many users would like to use recurse. Having it lead to poor performance isn't good.
Bug Fix
Description
We noticed the following today on one of our production clusters: https://cloudup.com/cgJm3eD4wu4. Essentially, Puppet agent has been using gradually more CPU and memory over time and has been taking longer and longer to execute. On some nodes it has been using 2 GB of memory during each run.
Digging in, we discovered why. We have a configuration like the following:
file {
  "/var/db/":
    ensure  => directory,
    group   => "dbuser",
    owner   => "dbuser",
    recurse => true;
}
It turns out that the /var/db directory on this machine sometimes includes temporary files, because the data system uses this directory. This means there are a few new files in /var/db every time we run Puppet agent.
We discovered today that each time Puppet sees a file, it writes a line to /var/lib/puppet/state/state.yaml. Since there were always new files each time Puppet ran, this file grew and grew. It's about 125 MB on some of our machines. With a state.yaml file of that size, Puppet agent took this long:
real 6m31.792s
user 5m56.434s
sys 0m33.867s
After deleting /var/lib/puppet/state/state.yaml it dropped to:
real 0m25.232s
user 0m16.073s
sys 0m4.112s
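To check whether other nodes are affected, something like the following could be used. This is only a rough sketch in Python, not part of Puppet: the function name `summarize_state` is hypothetical, and it assumes that state.yaml stores each resource as an unindented top-level key such as `File[/var/db/some-temp-file]:`, which matches what we saw but may differ between Puppet versions.

```python
import os
import sys


def summarize_state(path):
    """Return (size_bytes, total_entries, file_entries) for a state.yaml-style file.

    Assumption: every resource is recorded as an unindented top-level YAML key
    ending in ':' (for example 'File[/var/db/tmp1]:'). Indented lines are
    treated as attributes of the preceding resource and are not counted.
    """
    size_bytes = os.path.getsize(path)
    total_entries = 0
    file_entries = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Skip blank lines, indented attribute lines, comments,
            # and the YAML document marker.
            if not line.strip() or line[0] in " \t#" or line.startswith("---"):
                continue
            if line.rstrip().endswith(":"):
                total_entries += 1
                # Keys may or may not be quoted by the YAML emitter.
                if line.lstrip("\"'").startswith("File["):
                    file_entries += 1
    return size_bytes, total_entries, file_entries


# Only runs when a path is given on the command line, e.g.:
#   python check_state.py /var/lib/puppet/state/state.yaml
if __name__ == "__main__" and len(sys.argv) > 1:
    size, total, files = summarize_state(sys.argv[1])
    print("size: %.1f MB" % (size / (1024.0 * 1024.0)))
    print("top-level entries: %d (File resources: %d)" % (total, files))
```

Running this before and after a few agent runs should show whether the number of File entries keeps climbing on a given node. The line scan is deliberate: it avoids loading a very large YAML document into memory just to count its keys.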
We've removed our use of `recurse` for now, since it wasn't really necessary in this case, but we consider this a dangerous situation and a bug in Puppet: as it stands, we have to make sure Puppet never manages a set of files that keeps gaining new members over time. Happy to provide more details if it's helpful.