Puppet Server / SERVER-1704

JRuby pool lock request will block indefinitely for a stalled interpreter

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Normal
    • Resolution: Done
    • Affects Version/s: SERVER 2.7.1
    • Fix Version/s: SERVER 5.0.0
    • Component/s: File Sync, Puppet Server
    • Labels: None
    • Team: Systems Engineering
    • Story Points: 5
    • Sprint: SE 2017-02-08, SE 2017-02-22, SE 2017-03-08
    • Release Notes: Bug Fix
    • Release Notes Summary:
      If a single JRuby instance in the JRuby pool was hung and a request to lock all instances was made, the resulting lock attempt would stall indefinitely. This effectively deadlocked all of Puppet Server and required manual intervention to release the lock. This scenario occurred when a JRuby instance was blocked on an external call (such as a call to a process that slept indefinitely) and the file sync service attempted to lock the JRuby pool in order to perform a code sync.

      This has been resolved by adding a timeout to the pool lock request; when the timeout expires, an exception is thrown. This prevents Puppet Server from completely locking up when a single JRuby instance is stalled.
    • QA Risk Assessment: Needs Assessment

      Description

      When the Puppet Server JRuby pool lock is requested, notably by the File Sync client during code deployment, all incoming requests for a JRuby instance are blocked, and the call to lock-pool itself blocks until all currently borrowed instances are returned to the pool. If a single JRuby instance is stalled waiting for an operation that will never complete, access to the entire pool remains locked until some action is taken to clear the stalled instance.
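
The blocking behavior described above can be illustrated with a small toy model. This is only a sketch in Python, not Puppet Server's actual implementation: the `ToyPool` class and its method names are hypothetical, and a counter stands in for real JRuby instances. It shows that a lock request without a timeout cannot complete while any instance is still borrowed.

```python
import threading


class ToyPool:
    """Hypothetical stand-in for a JRuby pool; not Puppet Server code."""

    def __init__(self, size):
        self._cond = threading.Condition()
        self._size = size
        self._available = size
        self._lock_requested = False

    def borrow(self):
        with self._cond:
            # New borrows wait while a lock is pending or no instance is free.
            while self._lock_requested or self._available == 0:
                self._cond.wait()
            self._available -= 1

    def give_back(self):
        with self._cond:
            self._available += 1
            self._cond.notify_all()

    def lock_pool(self):
        with self._cond:
            self._lock_requested = True
            # No timeout: wait until every borrowed instance has come back.
            while self._available < self._size:
                self._cond.wait()


pool = ToyPool(size=2)
pool.borrow()  # simulates the request stuck on a call that never returns

locked = threading.Event()


def request_lock():
    pool.lock_pool()
    locked.set()


threading.Thread(target=request_lock, daemon=True).start()

# The lock request is still pending while the instance remains borrowed.
stalled = not locked.wait(timeout=0.2)

# Only returning the borrowed instance lets the lock request finish.
pool.give_back()
completed = locked.wait(timeout=2)
print(stalled, completed)
```

In this model, if `give_back()` were never called, `lock_pool()` would wait forever and every later `borrow()` would wait behind it, which mirrors the stall reported here.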

      Reproduction Case

      • Install PE 2016.5.1, enable Code Manager, and set up a user account that can execute puppet code deploy.
      • Add the following node definition to the site manifest of the production environment and deploy it via Code Manager:

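        node "blocking.test" {
          # generate() runs the command on the server during catalog
          # compilation, so this call never returns.
          $will_never_return = generate('/bin/sleep', 'infinity')
        }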

      • Generate a certificate that matches the added node definition: puppet cert generate blocking.test
      • Execute an agent run using the generated certificate; it will time out:

        puppet agent -t --certname=blocking.test --configtimeout=15s
        

      • Trigger a code deployment via: puppet code deploy --all --wait
      • When the deployment finishes, try a normal agent run via puppet agent -t

      Outcome

      The call to puppet agent -t, along with any other agent request that requires a JRuby instance, will hang until the JRuby borrow timeout kicks in and fails the request. A request to /status/v1/services/pe-jruby-metrics?level=debug will show that a JRuby pool lock has been requested, but there is still a borrowed instance working on catalog compilation for blocking.test.
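
As a convenience, the status URL mentioned above can be assembled as follows. This is a minimal sketch: the hostname `puppet.example.com` is a placeholder, and port 8140 is assumed to be the port Puppet Server listens on in this installation. Actually fetching the URL also requires TLS settings appropriate to the installation, which this sketch does not attempt.

```python
from urllib.parse import urlencode, urlunsplit


def jruby_metrics_url(host, port=8140, level="debug"):
    """Build the pe-jruby-metrics status URL (host and port are assumptions)."""
    path = "/status/v1/services/pe-jruby-metrics"
    query = urlencode({"level": level})
    return urlunsplit(("https", f"{host}:{port}", path, query, ""))


url = jruby_metrics_url("puppet.example.com")  # placeholder hostname
print(url)
# https://puppet.example.com:8140/status/v1/services/pe-jruby-metrics?level=debug
```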

      Expected Outcome

      The request for a JRuby pool lock should terminate with an error message after a reasonable timeout instead of blocking incoming requests to the pool indefinitely.
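
The expected behavior can be sketched with a toy pool whose lock request gives up after a deadline. This is an illustration only, not the fix that shipped: `ToyPool`, `PoolLockTimeout`, and the timeout value are hypothetical, and clearing the pending lock on timeout so that other requests can proceed is an assumption of this sketch.

```python
import threading


class PoolLockTimeout(Exception):
    """Hypothetical error raised when the pool lock is not acquired in time."""


class ToyPool:
    """Hypothetical stand-in for a JRuby pool; not Puppet Server code."""

    def __init__(self, size):
        self._cond = threading.Condition()
        self._size = size
        self._available = size
        self._lock_requested = False

    def borrow(self):
        with self._cond:
            # New borrows wait while a lock is pending or no instance is free.
            while self._lock_requested or self._available == 0:
                self._cond.wait()
            self._available -= 1

    def give_back(self):
        with self._cond:
            self._available += 1
            self._cond.notify_all()

    def lock_pool(self, timeout):
        with self._cond:
            self._lock_requested = True
            # wait_for returns False if the predicate is still false at timeout.
            all_returned = self._cond.wait_for(
                lambda: self._available == self._size, timeout
            )
            if not all_returned:
                # Assumption: drop the pending lock so new borrows can proceed.
                self._lock_requested = False
                self._cond.notify_all()
                raise PoolLockTimeout(f"pool lock not acquired within {timeout}s")


pool = ToyPool(size=2)
pool.borrow()  # simulates the stalled instance that is never returned

try:
    pool.lock_pool(timeout=0.1)
    outcome = "locked"
except PoolLockTimeout:
    outcome = "timed out"

pool.borrow()  # succeeds at once: the failed lock no longer blocks the pool
print(outcome)
```

With the stalled instance never returned, the lock request fails with an error after the timeout instead of holding up every later request for a JRuby instance.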

      People

      • Assignee: Joe Pinsonault
      • Reporter: Charlie Sharpsteen