SERVER-246 introduced a timeout for JRuby pool borrow attempts. That work, however, did not directly address what the external user experience should be when one of these timeouts is hit while handling an incoming agent request. Presumably, the exception thrown by the borrow attempt is translated into an HTTP failure response on the wire. The internal exception message is "Error: Attempt to borrow a JRuby instance from the pool timed out". This message should probably be more descriptive, e.g., by explaining what happened and what the user might be able to do to address the issue.
This ticket would cover:
1) Evaluate what the current behavior is for a failure.
2) Investigate what would be involved in catching the internal exception and rethrowing it as a more user-friendly failure in the HTTP response.
3) Solicit opinions - maybe from UX - on what a more appropriate message for a consumer of the HTTP response would be.
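As a rough illustration of item 2, the Ruby sketch below shows one way an internal borrow-timeout exception could be caught at the request boundary and mapped to a friendlier HTTP response. Everything in it is hypothetical: `BorrowTimeoutError`, `borrow_instance`, `handle_request`, the 503 status, and the message wording are assumptions made for illustration, not Puppet Server's actual code or API.

```ruby
# Hypothetical stand-in for the internal exception raised on a borrow timeout.
class BorrowTimeoutError < StandardError; end

# Hypothetical user-facing text; the real wording is what item 3 would decide.
FRIENDLY_MESSAGE = 'The server is too busy to handle this request right now: ' \
                   'no JRuby instance became available before the borrow ' \
                   'timeout expired. Retry later, or ask the server ' \
                   'administrator to raise the number of JRuby instances ' \
                   'or the borrow timeout.'

# Simulates a borrow attempt; `available` stands in for real pool state.
def borrow_instance(available)
  unless available
    raise BorrowTimeoutError,
          'Error: Attempt to borrow a JRuby instance from the pool timed out'
  end
  :jruby_instance
end

# Catches the internal exception at the request boundary and returns a
# [status, headers, body] triple instead of letting it surface as a raw 500.
def handle_request(available)
  instance = borrow_instance(available)
  [200, { 'Content-Type' => 'text/plain' }, "handled by #{instance}"]
rescue BorrowTimeoutError
  # 503 is an assumption here; the ticket does not settle on a status code.
  [503, { 'Content-Type' => 'text/plain' }, FRIENDLY_MESSAGE]
end
```

Calling `handle_request(false)` in this sketch yields the friendly message rather than the raw exception text. Whether a different status code or wording is more appropriate is exactly what items 1 and 3 are meant to settle.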
For reference, the following is what appears on the agent side when the JRuby pool borrow times out:
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: Error 500 on SERVER: Internal Server Error: java.lang.IllegalStateException: Error: Attempt to borrow a JRuby instance from the pool timed out
The full stack trace that follows this exception message is printed in the Puppet Server log.
The agent run continues after receiving this warning/error, likely using cached state from the previous run.
First, the setup:
- Install a custom "sleep" function on the Server:
- Sleep 15 seconds in a manifest:
- Use only 1 JRuby instance, and lower the Puppet Server borrow timeout to 10 seconds:
Now, trigger the timeout:
1. Borrow the only JRuby directly with curl (since you can't do two simultaneous agent runs):
2. Immediately after the curl command, do an agent run:
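The contention that this reproduction relies on can also be illustrated without a Puppet Server. The Ruby sketch below is a stand-in and not the actual commands: `TinyPool`, `PoolTimeout`, and the message text are hypothetical, and the 15-second sleep and 10-second borrow timeout are scaled down to 1.5 and 1.0 seconds so that it runs quickly.

```ruby
# Hypothetical error type for this sketch only.
class PoolTimeout < StandardError; end

# Minimal single-purpose pool: tracks how many instances are free and makes
# borrowers wait, up to a timeout, until one is returned.
class TinyPool
  def initialize(size)
    @free = size
    @lock = Mutex.new
    @cond = ConditionVariable.new
  end

  # Blocks until an instance is free or `timeout` seconds have elapsed.
  def borrow(timeout)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @lock.synchronize do
      while @free.zero?
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        raise PoolTimeout, 'borrow timed out' if remaining <= 0
        @cond.wait(@lock, remaining)
      end
      @free -= 1
    end
  end

  # Returns an instance to the pool and wakes one waiting borrower.
  def give_back
    @lock.synchronize do
      @free += 1
      @cond.signal
    end
  end
end

pool = TinyPool.new(1)   # only one instance, as in the setup
ready = Queue.new        # lets the main thread know the instance is taken

# Plays the role of the curl request: holds the only instance while "sleeping".
holder = Thread.new do
  pool.borrow(1.0)
  ready << true
  sleep 1.5              # scaled-down 15-second manifest sleep
  pool.give_back
end

ready.pop                # wait until the holder really has the instance

# Plays the role of the agent run: its borrow waits, then times out.
result = begin
  pool.borrow(1.0)       # scaled-down 10-second borrow timeout
  :borrowed
rescue PoolTimeout => e
  e.message
end

holder.join
puts result
```

Because the holder keeps the only instance longer than the second borrower is willing to wait, the second borrow fails, which mirrors the agent-side 500 error shown earlier.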
Risk Assessment: Low
Severity: Low. Only affects the error message we throw when we time out a JRuby request.