Puppet / PUP-9577

Large numbers of facts cause slow performance



    • Type: Bug
    • Status: Accepted
    • Priority: Normal
    • Resolution: Unresolved
    • Team: Froyo
    • Severity: Major


      Puppet Version: mix of 5.5.8 and 6.3.0
      Puppet Server Version: 6.2.1
      OS Name/Version: CentOS 7

      Many of our hosts have relatively large numbers of facts (~6k). We've been seeing performance issues on our masters that appear to be related to memory pressure, and we believe the fact count is the cause.

      I hacked up the full_catalog benchmark to use an exported copy of facts from our environment, and saw that catalog compilation times went up by 1.1s (about a 50% increase over the unmodified test).

      Looking at the memory profile, I see what appears to be a very significant increase in the number of objects allocated.
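The allocation growth can be quantified outside the full benchmark harness by counting object allocations around the fact-parsing step with Ruby's stdlib GC counters. A rough sketch; the fact sets below are synthetic stand-ins, not the attached facts.json:

```ruby
require 'json'

# Count how many objects are allocated while the block runs
# (stdlib only, no profiler gems needed)
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Synthetic fact sets standing in for the real facts.json (names are made up)
small_json = JSON.generate((1..100).map  { |i| ["fact_#{i}", "value_#{i}"] }.to_h)
large_json = JSON.generate((1..6000).map { |i| ["fact_#{i}", "value_#{i}"] }.to_h)

small_allocs = allocations_during { JSON.parse(small_json) }
large_allocs = allocations_during { JSON.parse(large_json) }
puts "100 facts:  #{small_allocs} objects allocated"
puts "6000 facts: #{large_allocs} objects allocated"
```

Parsing alone only accounts for part of the growth, of course; the catalog compiler then copies these values into scopes, which is where the memory profiles above come in.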

      Is this behavior expected?  I wouldn't really expect 6k facts to cause such a dramatic increase here.


      I've attached a few things here:

      • mem_profile_defaultfacts -> Memory profile output from an unmodified full_catalog benchmark
      • mem_profile_6kfacts -> Memory profile output from the modified benchmark with a high fact count.
      • heap_profiles.zip -> Output from heap_dump benchmarks, named as above
      • cpu_profiles.zip -> Output from profile benchmarks
      • facts.json -> Sanitized version of our facts


      I had just changed full_catalog/benchmarker.rb to load the exported facts: `facts = Puppet::Node::Facts.new("testing", JSON.parse(File.read("/root/facts")))` ... my Ruby skills are terrible.

      I should note that we don't care about 99% of these facts, but I don't see any way to filter them. For example, all of the IXXXXX network interface facts are useless to us, along with most of the entries under /dev/mapper/.
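Absent a built-in filter, the unwanted facts could be dropped with a regex blocklist before they reach the compiler. A minimal sketch; the patterns and fact names are illustrative examples, not from our environment:

```ruby
# Hypothetical blocklist: drop any fact whose name matches one of these patterns
BLOCKLIST = [
  /\Ainterfaces_/,                 # per-interface facts we never use
  %r{\Amountpoints_/dev/mapper},   # /dev/mapper entries
].freeze

def filter_facts(facts)
  facts.reject { |name, _value| BLOCKLIST.any? { |re| re.match?(name) } }
end

facts = {
  "os_family"                     => "RedHat",
  "interfaces_eth0_mtu"           => 1500,
  "mountpoints_/dev/mapper/vg-lv" => { "size" => "10G" },
}
filter_facts(facts)  # => { "os_family" => "RedHat" }
```

Whether this belongs in Facter itself (as the regex-based blocklist mentioned below) or in a wrapper around fact submission is exactly the open question.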

      I don't know whether the performance issues are fixable here, or if I'm asking for a regex-based blocklist in Facter, or what. My suspicion is that the number of template calls is adding up:


      During benchmarks, each template evaluation takes 0.01s (compared to 0.00s with a smaller fact count); multiplied across all the templates we use, that seems pretty significant.
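Back-of-the-envelope, that per-template cost compounds quickly. Assuming a hypothetical 300 template evaluations per compile (our real count isn't in this report):

```ruby
per_template_overhead = 0.01   # seconds per template with ~6k facts, from the benchmark
template_calls        = 300    # hypothetical number of template evaluations per compile
added = per_template_overhead * template_calls
puts "~#{added.round(2)} s added per catalog compile"
```

Even at a few hundred templates, that's multiple seconds of extra wall time on every compile.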



        1. 2019-03-21_12-50-30-gc.log.0.current
          207 kB
        2. cpu_profiles.zip
          5.64 MB
        3. cpu_usage.png
          23 kB
        4. facts.json
          221 kB
        5. fsl.patch
          0.6 kB
        6. heap_profiles.zip
          15.94 MB
        7. mem_profile_6kfacts
          83 kB
        8. mem_profile_defaultfacts
          87 kB
        9. repro.rb
          1 kB
        10. test3.patch
          6 kB

        Assignee: Unassigned
        Reporter: devicenull (Brian Rak)