  1. Puppet
  2. PUP-5646

fqdn_rand in real world - only 336 of possible 1440 daily cron slots ever used.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Normal
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: PUP 4.4.0
    • Component/s: None
    • Labels:
      None
    • Template:
    • Story Points:
      1
    • Sprint:
      Language 2016-01-27, Language 2016-02-10
    • Release Notes:
      Bug Fix
    • Release Notes Summary:
      The fqdn_rand function, when used to produce a series of random values for a node in a given range, did not spread those values evenly over the range. This has been improved by lengthening the salt used to produce the series.

      As a consequence, resources that depend on these values (cron entries, resources whose titles contain a generated random number, etc.) may be reported as changed the first time a version containing this fix is used.

      Description

      Take this very typical code for a once-per-day cron job, applied to 50,000 hosts named
      1.example.org, 2.example.org, ...

      cron { '/bin/true':
        minute => fqdn_rand(60),
        hour   => fqdn_rand(24),
      }
      

      There are 24 * 60 = 1440 cron slots in a day, but in reality, even with 50,000 hosts,
      only 336 slots are ever used.

      The current fqdn_rand code is something like:

      seed = Digest::MD5.hexdigest(h).hex
      hour = Puppet::Util.deterministic_rand(seed,24)
      min  = Puppet::Util.deterministic_rand(seed,60)
      

      Our suspicion is that somewhere there is a construct like
      N % 24 and N % 60, which leaves the minute as the only
      significant value, with the hour always predictable from the minute. This
      cannot be the whole picture, because it does not explain 336.
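
The arithmetic behind that doubt can be checked directly. This is a minimal sketch of the modulo hypothesis only, assuming both values are derived from one shared integer N. It does not describe what Puppet or Ruby actually do.

```ruby
require 'set'

# If hour = N % 24 and minute = N % 60 for the same N, the pair depends
# only on N % lcm(24, 60), so one full period covers every reachable pair.
period = 24.lcm(60)
pairs = (0...period).map { |n| [n % 24, n % 60] }.to_set

puts period      # 120
puts pairs.size  # 120
```

A pure modulo relationship would therefore allow 120 slots, not the 336 observed.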

      Ignoring the underlying cause for now, the following change, which
      includes the max number in the seed, helps things
      significantly. It breaks the relationship between fqdn_rand(60) and
      fqdn_rand(24), and we now get a flat distribution
      with all 1440 cron slots in use.

      seed = Digest::MD5.hexdigest("#{h}:24").hex
      hour = Puppet::Util.deterministic_rand(seed,24)
      seed = Digest::MD5.hexdigest("#{h}:60").hex
      min  = Puppet::Util.deterministic_rand(seed,60)
      

      The above would be a trivial patch to fqdn_rand.rb.
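
The two seeding schemes can be compared with a small simulation. This is a sketch, not the attached test_fqdn.rb. It assumes that h is simply the host name and that Puppet::Util.deterministic_rand behaves like seeding a fresh Random and drawing one integer; the local deterministic_rand and used_slots helpers are stand-ins written for this example.

```ruby
require 'digest/md5'
require 'set'

# Stand-in for Puppet::Util.deterministic_rand (assumption: it seeds a
# fresh generator with `seed` and draws one integer below `max`).
def deterministic_rand(seed, max)
  Random.new(seed).rand(max)
end

# Returns [slots used with one shared seed, slots used with salted seeds].
def used_slots(host_count)
  shared = Set.new
  salted = Set.new
  (1..host_count).each do |i|
    h = "#{i}.example.org"

    # Existing scheme: the same seed feeds both calls.
    seed = Digest::MD5.hexdigest(h).hex
    shared << [deterministic_rand(seed, 24), deterministic_rand(seed, 60)]

    # Alternate scheme: the max value is part of the seed.
    hour_seed = Digest::MD5.hexdigest("#{h}:24").hex
    min_seed  = Digest::MD5.hexdigest("#{h}:60").hex
    salted << [deterministic_rand(hour_seed, 24), deterministic_rand(min_seed, 60)]
  end
  [shared.size, salted.size]
end

shared_count, salted_count = used_slots(50_000)
puts "shared seed: #{shared_count} of 1440 slots used"
puts "salted seed: #{salted_count} of 1440 slots used"
```

Counting distinct (hour, minute) pairs in a Set keeps the comparison independent of how many hosts land in each slot.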

      Please find attached:

      • test_fqdn.rb, which simulates 50,000 hosts with the existing and the alternate seed methods.
      • results.dat and results.png: data and gnuplot plot for the existing Puppet code.
      • newresults.dat and newresults.png: data and gnuplot plot for the alternate method above.

      Each .dat file is a 24 by 60 matrix holding the number of occurrences of each cron slot.
      The plots similarly have x and y axes of 24 hours by 60 minutes, with a scatter point for each cron slot in use.

      results.png clearly shows most slots being empty (at zero). newresults.png
      shows a much flatter distribution.

      It would of course be nice to understand where 336 comes from.
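
One model that does reproduce the figure, offered as a hypothesis rather than a confirmed diagnosis: assume the generator bounds each raw draw by masking its low bits (5 bits for a maximum of 24, 6 bits for 60) and retries with the next draw when the masked value is out of range. The bounded helper and this masking scheme are assumptions for illustration and are not taken from the Puppet or Ruby source. With a shared seed, both calls see the same raw draws, so the reachable (hour, minute) pairs can be enumerated:

```ruby
require 'set'

# Hypothetical model: mask each raw draw to `bits` bits and accept the
# first masked value that is below `max`.
def bounded(draws, bits, max)
  mask = (1 << bits) - 1
  draws.each do |d|
    v = d & mask
    return v if v < max
  end
  nil # not resolved within the draws considered
end

slots = Set.new
# Only the low 6 bits of a raw draw matter here, so enumerate every
# combination of two consecutive 6-bit draws.
(0...64).to_a.repeated_permutation(2) do |draws|
  hour = bounded(draws, 5, 24)
  min  = bounded(draws, 6, 60)
  slots << [hour, min] if hour && min
end

puts slots.size  # 336
```

Under this model, 48 of the 60 minutes fix the hour outright and the remaining 12 minutes can pair with any of the 24 hours, giving 48 + 12 * 24 = 336.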

      We have tested with both Ruby 1.8 and Ruby 2.0 and found no difference in behavior.

        Attachments

        1. newresults.dat (4 kB)
        2. newresults.png (17 kB)
        3. random_nums.txt (48 kB)
        4. results.dat (3 kB)
        5. results.png (11 kB)
        6. test_fqdn.rb (2 kB)

              People

              • Assignee:
                Unassigned
                Reporter:
                traylenator Steve Traylen