[PDB-700] java.io.IOException: Frame size larger than max allowed 100 MB Created: 2014/06/04  Updated: 2015/08/12  Resolved: 2015/07/21

Status: Closed
Project: PuppetDB
Component/s: None
Affects Version/s: PDB 2.0.0
Fix Version/s: PDB 2.3.7, PDB 3.0.2

Type: Bug
Priority: Critical
Reporter: Bernd Zeimetz
Assignee: Russell Mull
Resolution: Fixed
Votes: 0
Labels: puppetdb
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

puppetdb-2.0.0-1.el6.noarch from the puppetlabs yum repo,
java version "1.7.0_55"
OpenJDK Runtime Environment (rhel-2.4.7.1.el6_5-x86_64 u55-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)


Issue Links:
Relates
Template:
Story Points: 3
Sprint: 20140604 to 20140618, 20140618 to 20140702, PuppetDB 2015-07-29

 Description   

We are seeing the following backtrace after puppetdb has been running for ~5-15 minutes, forcing us to restart puppetdb frequently to keep it working. After a restart, outstanding queued tasks are applied without any problems. I think the problem was introduced with the upgrade to 2.0.0, but because we added a largish number of new hosts at about the same time, I'm not 100% sure about that.

2014-06-04 16:25:28,323 ERROR [o.a.a.b.r.c.AbstractStoreCursor] org.apache.activemq.broker.region.cursors.QueueStorePrefetch@66ed66e4:com.puppetlabs.puppetdb.commands,batchResetNeeded=false,storeHasMessages=true,size=2207,cacheEnabled=false - Failed to fill batch
java.lang.RuntimeException: java.io.IOException: Frame size of 194 MB larger than max allowed 100 MB
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:280) ~[puppetdb.jar:na]
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.reset(AbstractStoreCursor.java:113) ~[puppetdb.jar:na]
        at org.apache.activemq.broker.region.cursors.StoreQueueCursor.reset(StoreQueueCursor.java:157) [puppetdb.jar:na]
        at org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1766) [puppetdb.jar:na]
        at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:1995) [puppetdb.jar:na]
        at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1488) [puppetdb.jar:na]
        at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122) [puppetdb.jar:na]
        at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43) [puppetdb.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
        at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
Caused by: java.io.IOException: Frame size of 194 MB larger than max allowed 100 MB
        at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:277) ~[puppetdb.jar:na]
        at org.apache.activemq.store.kahadb.KahaDBStore.loadMessage(KahaDBStore.java:1016) ~[puppetdb.jar:na]
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore$4.execute(KahaDBStore.java:556) ~[puppetdb.jar:na]
        at org.apache.kahadb.page.Transaction.execute(Transaction.java:769) ~[puppetdb.jar:na]
        at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.recoverNextMessages(KahaDBStore.java:545) ~[puppetdb.jar:na]
        at org.apache.activemq.store.ProxyMessageStore.recoverNextMessages(ProxyMessageStore.java:106) ~[puppetdb.jar:na]
        at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:97) ~[puppetdb.jar:na]
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:277) ~[puppetdb.jar:na]
        ... 10 common frames omitted



 Comments   
Comment by Ken Barber [ 2014/07/08 ]

Closed with release 2.1.0.

Comment by Russell Mull [ 2015/07/17 ]

I don't think this patch actually fixed this problem. There are two places where we supply a connection string to amq: for command ingestion (the producer) and for the workers (the consumer). We changed it for only the producer, but this error and PE-11069 both show a stack trace indicating that the problem is in the consumer.
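
To illustrate the point about the two connection strings, here is a minimal sketch of building both from one helper so that they cannot drift apart. It is not the actual PuppetDB code. The ticket does not say which option the patch changed, so the use of ActiveMQ's OpenWire option `wireFormat.maxFrameSize`, the base URI `vm://localhost`, the 200 MB value, and the helper name `amq_connection_uri` are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical limit: 200 MB, chosen only because it exceeds the 194 MB
# frame reported in the stack trace. Not a value taken from the ticket.
MAX_FRAME_SIZE_BYTES = 200 * 1024 * 1024


def amq_connection_uri(base="vm://localhost", max_frame_size=MAX_FRAME_SIZE_BYTES):
    """Build a broker connection URI that carries the frame-size option.

    `base` is an assumed broker address. `wireFormat.maxFrameSize` is an
    OpenWire wire-format option documented by ActiveMQ; whether it is the
    option the patch touched is an assumption.
    """
    params = {"wireFormat.maxFrameSize": max_frame_size}
    return f"{base}?{urlencode(params)}"


# Both sides go through the same helper, so a change to the option
# reaches the producer and the consumer together.
producer_uri = amq_connection_uri()  # command ingestion
consumer_uri = amq_connection_uri()  # command-processing workers

print(producer_uri)
print(consumer_uri)
```

Routing both call sites through a single function is one way to avoid the situation described above, where the setting was updated on the producer side only.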

Generated at Sun Jun 16 16:34:26 PDT 2019 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.