[PDB-1205] facts gc is too slow Created: 2015/01/30  Updated: 2015/06/23  Resolved: 2015/06/23

Status: Closed
Project: PuppetDB
Component/s: None
Affects Version/s: PDB 2.2.2
Fix Version/s: None

Type: Bug Priority: Normal
Reporter: Wyatt Alt Assignee: Unassigned
Resolution: Duplicate Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to PDB-1031 ERROR: update or delete on table "fac... Closed
relates to PDB-1124 PuppetDB high CPU usage Closed
Template:

 Description   

We have multiple customers experiencing issues with slow facts garbage collection (see PE-7392 and PDB-1124). From a user's logs,

< 2015-01-05 12:41:51.008 CET >STATEMENT: COMMIT                                                                                                                                                                    
< 2015-01-05 12:41:51.009 CET >WARNING:  there is already a transaction in progress                                                                                                                                  
< 2015-01-05 12:41:51.016 CET >WARNING:  there is already a transaction in progress                                                                                                                                  
< 2015-01-05 12:41:51.075 CET >WARNING:  there is already a transaction in progress                                                                                                                                  
< 2015-01-05 12:41:51.196 CET >LOG:  duration: 13591.153 ms  execute S_20: DELETE FROM fact_paths fp                                                                                                                 
                                        WHERE fp.id in ( SELECT fp.id                                                                                                                                                
                                                         FROM fact_paths fp                                                                                                                                          
                                                              inner join fact_values fv on fp.id = fv.path_id                                                                                                        
                                                              inner join facts f on fv.id = f.fact_value_id                                                                                                          
                                                         WHERE fp.id in ( select fv.path_id from fact_values fv where fv.id in ($1,$2,$3,$4) )                                                                       
                                                         GROUP BY fp.id                                                                                                                                              
                                                         HAVING COUNT(fv.id) = 1)                                                                                                                                    
< 2015-01-05 12:41:51.196 CET >DETAIL:  parameters: $1 = '34861202', $2 = '34861204', $3 = '34861203', $4 = '34861205'                                                                                               
< 2015-01-05 12:41:51.234 CET >ERROR:  could not serialize access due to concurrent update 

The customer is using a replicated postgres setup with separate read/write dbs so the "could not serialize access due to concurrent update" errors are somewhat expected, but they may be being exacerbated by the slow gc because they are incredibly frequent in the logs. The error that says "there is already a transaction in progress" may be caused by nested transactions; I think Ken Barber believes that this should not happen using jdbc.

This may be tied to known issues with fact_values (PDB-1031)



 Comments   
Comment by Chuck Schweizer [ 2015/03/31 ]

I am seeing this on a puppetdb server with 34,800 nodes. Storeconfigs is disabled.

puppetdb 2.3.0
postgres 9.2.7

                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Delete on fact_paths fp  (cost=203758.02..203786.98 rows=273 width=38)
   ->  Hash Join  (cost=203758.02..203786.98 rows=273 width=38)
         Hash Cond: (fp.id = "ANY_subquery".id)
         ->  Seq Scan on fact_paths fp  (cost=0.00..21.46 rows=546 width=14)
         ->  Hash  (cost=203751.19..203751.19 rows=546 width=40)
               ->  Subquery Scan on "ANY_subquery"  (cost=203738.91..203751.19 rows=546 width=40)
                     ->  HashAggregate  (cost=203738.91..203745.73 rows=546 width=16)
                           Filter: (count(fv.id) = 1)
                           ->  Hash Join  (cost=2813.07..203243.91 rows=98999 width=16)
                                 Hash Cond: (f.fact_value_id = fv.id)
                                 ->  Seq Scan on facts f  (cost=0.00..165657.36 rows=9008936 width=8)
                                 ->  Hash  (cost=2651.01..2651.01 rows=12965 width=16)
                                       ->  Nested Loop  (cost=15.75..2651.01 rows=12965 width=16)
                                             ->  Merge Join  (cost=15.75..37.54 rows=6 width=16)
                                                   Merge Cond: (fp.id = fv.path_id)
                                                   ->  Index Only Scan using fact_paths_pkey on fact_paths fp  (cost=0.00..21.45 rows=546 width=8)
                                                   ->  Sort  (cost=15.71..15.72 rows=1 width=8)
                                                         Sort Key: fv.path_id
                                                         ->  HashAggregate  (cost=15.69..15.70 rows=1 width=8)
                                                               ->  Index Scan using fact_values_pkey on fact_values fv  (cost=0.00..15.68 rows=6 width=8)
                                                                     Index Cond: (id = ANY ('{2445984,2445982,1426425,2196261,2196248,2445983}'::bigint[]))
                                             ->  Index Scan using fact_values_path_id_value_key on fact_values fv  (cost=0.00..344.12 rows=9146 width=16)
                                                   Index Cond: (path_id = fp.id)
(23 rows)

2015-03-31 18:25:03 UTC|puppetdb|33639|DELETE|10.0.0.50|puppet LOG:  duration: 46422.002 ms  execute <unnamed>: DELETE FROM fact_paths fp
	                                      WHERE fp.id in ( SELECT fp.id
	                                                       FROM fact_paths fp
	                                                            inner join fact_values fv on fp.id = fv.path_id
	                                                            inner join facts f on fv.id = f.fact_value_id
	                                                       WHERE fp.id in ( select fv.path_id from fact_values fv where fv.id in ($1,$2,$3,$4,$5) )
	                                                       GROUP BY fp.id
	                                                       HAVING COUNT(fv.id) = 1)
2015-03-31 18:25:03 UTC|puppetdb|33639|DELETE|10.0.0.50|puppet DETAIL:  parameters: $1 = '2445776', $2 = '2445774', $3 = '2445777', $4 = '2445775', $5 = '1406719'

                               List of relations
 Schema |         Name          | Type  |   Owner    |    Size    | Description 
--------+-----------------------+-------+------------+------------+-------------
 puppet | catalog_resources     | table | puppetdb   | 8192 bytes | 
 puppet | catalogs              | table | puppetdb   | 8192 bytes | 
 puppet | certnames             | table | puppetdb   | 3024 kB    | 
 puppet | edges                 | table | puppetdb   | 8192 bytes | 
 puppet | environments          | table | puppetdb   | 16 kB      | 
 puppet | fact_paths            | table | puppetdb   | 168 kB     | 
 puppet | fact_values           | table | puppetdb   | 216 MB     | 
 puppet | facts                 | table | puppetdb   | 591 MB     | 
 puppet | factsets              | table | puppetdb   | 6176 kB    | 
 puppet | latest_reports        | table | puppetdb   | 8192 bytes | 
 puppet | report_statuses       | table | puppetdb   | 8192 bytes | 
 puppet | reports               | table | puppetdb   | 8192 bytes | 
 puppet | resource_events       | table | puppetdb   | 8192 bytes | 
 puppet | resource_params       | table | puppetdb   | 8192 bytes | 
 puppet | resource_params_cache | table | puppetdb   | 8192 bytes | 
 puppet | schema_migrations     | table | puppetdb   | 40 kB      | 
 puppet | value_types           | table | puppetdb   | 40 kB      | 
(17 rows)

Comment by Wyatt Alt [ 2015/03/31 ]

Hey Chuck Schweizer, we actually just released PuppetDB 2.3.1 today, which is partly geared toward addressing this issue. Would you mind trying an upgrade?

Comment by Chuck Schweizer [ 2015/03/31 ]

Wyatt Alt, Looks like puppetDB 2.3.1 breaks in my environment because we don't use the public schema.

https://tickets.puppetlabs.com/browse/PDB-1363?jql=project%20%3D%20PDB

Comment by Wyatt Alt [ 2015/03/31 ]

Chuck Schweizer yes, thanks for reporting that. We'll get a fix out as quick as we can.

Comment by Chuck Schweizer [ 2015/03/31 ]

I manually fixed the schema issue in the JAR file and I can CONFIRM that the performance issue with fact replacement is fixed in my 38,000 node environment.

Comment by Wyatt Alt [ 2015/03/31 ]

Chuck Schweizer Really appreciate you taking the time, that's a very valuable datapoint.

Comment by Wyatt Alt [ 2015/06/23 ]

duplicates PDB-1031

Generated at Sun Oct 20 19:35:19 PDT 2019 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.