[PA-1743] puppet-agent stuck on sched_yield Created: 2017/12/15  Updated: 2018/02/14  Resolved: 2018/02/12

Status: Closed
Project: Puppet Agent
Component/s: None
Affects Version/s: puppet-agent 5.3.3
Fix Version/s: puppet-agent 5.3.5, puppet-agent 5.4.0

Type: Bug
Priority: Normal
Reporter: Phil Oester
Assignee: Unassigned
Resolution: Fixed
Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS 7.4
puppet-agent 5.3.3


Issue Links:
Relates
Template:
Team: Platform OS
Sprint: Platform OS Kanban
Method Found: Needs Assessment
Release Notes: Bug Fix
Release Notes Summary: Apply an upstream Ruby patch to resolve a lockup in Exec resources
QA Risk Assessment: Needs Assessment

 Description   

Using PA 5.3.3, we are experiencing the sched_yield busyloop documented here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=876377

Example showing excessive CPU time for a process that has been running for 11 days:

root 27413 1 99 Dec04 ? 11-01:52:23 puppet agent: applying configuration
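
For anyone trying to spot affected agents, the TIME column in the `ps -ef` line above is cumulative CPU time in `[DD-]HH:MM:SS` form. The following is a minimal Python sketch, not part of the original report, that flags `puppet agent` processes whose CPU time exceeds a threshold. The function names and the one-hour default threshold are assumptions chosen for illustration, and the parser assumes the eight-column `ps -ef` layout shown above.

```python
import re

# ps prints cumulative CPU time as [DD-]HH:MM:SS; the shorter MM:SS form
# is tolerated as well.
_CPU_TIME = re.compile(r"^(?:(?:(\d+)-)?(\d+):)?(\d+):(\d+)$")


def cpu_seconds(field):
    """Convert a ps TIME field such as '11-01:52:23' to seconds."""
    match = _CPU_TIME.match(field)
    if match is None:
        raise ValueError("unrecognised ps time field: %r" % field)
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def stuck_agents(ps_output, threshold=3600):
    """Return (pid, cpu_seconds) for puppet agent lines at or above threshold.

    Assumes `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD.
    The threshold (in seconds) is a hypothetical default, not a Puppet setting.
    """
    hits = []
    for line in ps_output.splitlines():
        parts = line.split(None, 7)
        if len(parts) < 8 or not parts[1].isdigit():
            continue  # skips the header row and malformed lines
        if "puppet agent" not in parts[7]:
            continue
        try:
            used = cpu_seconds(parts[6])
        except ValueError:
            continue
        if used >= threshold:
            hits.append((int(parts[1]), used))
    return hits
```

Feeding it the captured output of `ps -ef` returns a list of candidate PIDs. High CPU time alone does not prove the sched_yield loop, so a hit is only a hint to inspect the process further, for example with `strace -p <pid>`.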



 Comments   
Comment by Phil Oester [ 2017/12/20 ]

Also, the bug report at upstream ruby:

https://bugs.ruby-lang.org/issues/13794

Comment by Wiebe Verweij [ 2017/12/22 ]

We are experiencing the same issue on Debian Stretch, also with Puppet 5.3.3. It happens about 2 to 4 times a week, each time on a different, seemingly random machine. The only error we see in the Puppet logs when this happens is an exceeded-timeout error for an exec command.

I can reproduce it with the sched_yield_loop.rb script attached from the ruby issue tracker and the following command:

`while nice -n19 /opt/puppetlabs/puppet/bin/ruby sched_yield_loop.rb; do :; done`

This commit should fix it, but it looks like it won't be included until Ruby 2.5. Maybe it could be backported to the version shipped with Puppet?
https://github.com/ruby/ruby/commit/b860f06413fa7db83c39fe7572982cc1c26ca1e6#diff-9f35429de5515dc369bd39e14ef2ab85
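
The shell loop above never returns once a run locks up, which makes it awkward to use for checking a patched build. A bounded variant can be sketched in Python as follows. This is an illustration under stated assumptions rather than a tested reproduction harness: `first_hang`, the attempt count, and the timeout value are all hypothetical, and a run that exceeds the timeout is only presumed to be stuck in the busy loop.

```python
import subprocess


def first_hang(cmd, attempts=100, timeout=30.0):
    """Run cmd repeatedly; return the 1-based attempt that timed out, or None.

    A timeout is treated as a presumed hang. subprocess.run kills the child
    before raising TimeoutExpired, so no spinning process is left behind.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            return attempt
        if result.returncode != 0:
            break  # mirror the shell loop, which stops on a failing run
    return None
```

With the command from this comment, the call would be `first_hang(["nice", "-n19", "/opt/puppetlabs/puppet/bin/ruby", "sched_yield_loop.rb"])`. A result of `None` only means no hang was observed within the chosen number of attempts.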

Comment by Matthias Baur [ 2018/01/17 ]

Having the same issue. Any update on this?

OS: Gentoo/Ubuntu (14|16).04
Puppet Version: 5.3.3
Ruby Version: 2.4.2p198

Comment by Matthias Baur [ 2018/01/22 ]

As we're also using the Puppet Ruby for r10k, this also affects our deployments. Even worse, I think it also affects the Puppetserver, as we're currently seeing strange performance degradation from time to time.

Comment by Kenn Hussey [ 2018/02/12 ]

Branan Riley, please add release notes for this issue, if needed. Thanks!

Generated at Sat Aug 24 21:50:41 PDT 2019 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.