We currently limit the number of commands that we process concurrently based only on a configuration setting and/or the number of CPU cores. We don't take into account memory usage at all.
As long as this is the case, clients can crash PuppetDB with an OutOfMemoryError simply by submitting several simultaneous commands with very large payloads.
There are a few options we could consider for taking memory usage into account when deciding how many commands to process in parallel. For example, if we can peek at an ActiveMQ message and determine its size before reading it from the queue, we could use that information to decide whether to start processing it yet.
One simple idea: keep an atom whose value is the sum of the original sizes of all messages currently being processed. A configuration setting would cap that sum at some upper bound, and new command-processor threads would block before starting work on a message until the atom's value fell below the threshold.
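A minimal sketch of what that atom-based throttle might look like. All of the names here (`in-flight-bytes`, `max-in-flight-bytes`, `with-memory-budget`) are hypothetical, not existing PuppetDB code, and the cap and poll interval are arbitrary placeholders:

```clojure
(def max-in-flight-bytes
  ;; Hypothetical configuration setting: upper bound on the total size of
  ;; messages being processed at once.
  (* 256 1024 1024))

(def in-flight-bytes
  ;; Sum of the original sizes of all messages currently being processed.
  (atom 0))

(defn acquire!
  "Block until msg-size can be admitted without exceeding the cap, then
  add it to the in-flight total."
  [msg-size]
  (loop []
    (let [current @in-flight-bytes]
      (if (or (zero? current) ; always admit one message, even an oversized one
              (<= (+ current msg-size) max-in-flight-bytes))
        ;; Try to claim the budget; retry if another thread raced us.
        (when-not (compare-and-set! in-flight-bytes current (+ current msg-size))
          (recur))
        ;; Over budget: wait and re-check.
        (do (Thread/sleep 50)
            (recur))))))

(defn release! [msg-size]
  (swap! in-flight-bytes - msg-size))

(defn with-memory-budget
  "Run f once msg-size fits within the budget, releasing it afterward."
  [msg-size f]
  (acquire! msg-size)
  (try (f)
       (finally (release! msg-size))))
```

The `zero?` check keeps a message larger than the cap from deadlocking the queue: it gets admitted alone instead of waiting forever. A `java.util.concurrent.Semaphore` sized in bytes would be another way to get the same effect without polling.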
If we wanted to get fancier, we could build on the observation that processing a command probably requires about 4x the message size in RAM, and block the next message whenever:

(MAX_HEAP - 100MB) - (4 * CURRENT_VALUE_OF_ATOM) < (4 * NEXT_MESSAGE_SIZE)
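That check could be sketched roughly as follows. The 4x multiplier and 100MB reserve come straight from the formula above; `admit?` and its arguments are hypothetical names, and the in-flight byte count is assumed to be tracked by the atom described earlier:

```clojure
(def processing-multiplier
  ;; Assumed: processing a command needs roughly 4x the message size in RAM.
  4)

(def reserved-bytes
  ;; Headroom kept free for everything else the JVM is doing.
  (* 100 1024 1024))

(defn max-heap-bytes
  "Maximum heap the JVM will attempt to use (the -Xmx value)."
  []
  (.maxMemory (Runtime/getRuntime)))

(defn admit?
  "True if the next message's estimated memory need fits in the heap
  left over after the reserve and the estimated cost of in-flight work."
  [in-flight-bytes next-message-size]
  (let [available (- (max-heap-bytes)
                     reserved-bytes
                     (* processing-multiplier in-flight-bytes))]
    (>= available (* processing-multiplier next-message-size))))
```

A command processor would call `(admit? @in-flight-bytes size)` before dequeuing, and wait or retry when it returns false.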
These are obviously tricky and a bit risky, and I admittedly have not thought through them very far... but it seems like we could probably come up with something that would make it significantly more difficult to OOM the server.