Uploaded image for project: 'PuppetDB'
  1. PuppetDB
  2. PDB-4769

active_nodes CTE does a costly HashAggregate

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: PDB 6.11.1, PDB 6.11.2, PDB 5.2.19
    • Component/s: None
    • Labels:
      None
    • Template:
    • Team:
      HA
    • Release Notes:
      Bug Fix
    • Release Notes Summary:
      Fixes a bug that caused PuppetDB to use a CTE that materialized a large table, slowing queries
    • QA Risk Assessment:
      Needs Assessment

      Description

      Before

      EXPLAIN ANALYZE WITH inactive_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NOT NULL OR expired IS NOT NULL)) SELECT fs.certname AS certname, env.environment AS environment, fs.key AS name, fs.value AS value FROM (select certname,        environment_id,        (jsonb_each((stable||volatile))).*  from factsets) fs LEFT JOIN environments env ON fs.environment_id = env.id WHERE ((fs.certname = 'wine-underworld.delivery.puppetlabs.net') AND ((fs.certname) in (SELECT certname FROM  ( (SELECT inactive_nodes.certname AS certname FROM inactive_nodes) )  sub)));
                                                                                    QUERY PLAN
      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
       Nested Loop Semi Join  (cost=28.12..70.31 rows=100 width=128) (actual time=0.416..0.416 rows=0 loops=1)
         CTE inactive_nodes
           ->  Seq Scan on certnames  (cost=0.00..15.20 rows=520 width=32) (actual time=0.004..0.004 rows=0 loops=1)
                 Filter: ((deactivated IS NOT NULL) OR (expired IS NOT NULL))
                 Rows Removed by Filter: 1
         ->  Hash Right Join  (cost=12.93..40.42 rows=100 width=128) (actual time=0.373..0.392 rows=130 loops=1)
               Hash Cond: (env.id = fs.environment_id)
               ->  Seq Scan on environments env  (cost=0.00..22.00 rows=1200 width=40) (actual time=0.005..0.006 rows=1 loops=1)
               ->  Hash  (cost=11.68..11.68 rows=100 width=104) (actual time=0.362..0.362 rows=130 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 34kB
                     ->  Subquery Scan on fs  (cost=0.15..11.68 rows=100 width=104) (actual time=0.258..0.312 rows=130 loops=1)
                           ->  Result  (cost=0.15..10.68 rows=100 width=104) (actual time=0.257..0.299 rows=130 loops=1)
                                 ->  ProjectSet  (cost=0.15..8.68 rows=100 width=72) (actual time=0.254..0.280 rows=130 loops=1)
                                       ->  Index Scan using factsets_certname_idx on factsets  (cost=0.15..8.17 rows=1 width=104) (actual time=0.013..0.013 rows=1 loops=1)
                                             Index Cond: (certname = 'wine-underworld.delivery.puppetlabs.net'::text)
         ->  CTE Scan on inactive_nodes  (cost=0.00..11.70 rows=3 width=32) (actual time=0.000..0.000 rows=0 loops=130)
               Filter: (certname = 'wine-underworld.delivery.puppetlabs.net'::text)
       Planning Time: 0.254 ms
       Execution Time: 0.480 ms
      

      After

      (19 rows)pe-puppetdb=# EXPLAIN ANALYZE WITH inactive_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NULL OR expired IS NULL)) SELECT fs.certname AS certname, env.environment AS environment, fs.key AS name,
       fs.value AS value FROM (select certname,        environment_id,        (jsonb_each((stable||volatile))).*  from factsets) fs LEFT JOIN environments env ON fs.environment_id = env.id WHERE ((fs.certname = 'wine-u
      nderworld.delivery.puppetlabs.net') AND ((fs.certname) in (SELECT certname FROM  ( (SELECT inactive_nodes.certname AS certname FROM inactive_nodes) )  sub)));
                                                                                    QUERY PLAN
      ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
       Nested Loop  (cost=28.24..56.75 rows=100 width=128) (actual time=0.404..0.445 rows=130 loops=1)
         CTE inactive_nodes
           ->  Seq Scan on certnames  (cost=0.00..15.20 rows=5 width=32) (actual time=0.010..0.010 rows=1 loops=1)
                 Filter: ((deactivated IS NULL) OR (expired IS NULL))
         ->  HashAggregate  (cost=0.11..0.12 rows=1 width=32) (actual time=0.016..0.016 rows=1 loops=1)
               Group Key: inactive_nodes.certname
               ->  CTE Scan on inactive_nodes  (cost=0.00..0.11 rows=1 width=32) (actual time=0.012..0.013 rows=1 loops=1)
                     Filter: (certname = 'wine-underworld.delivery.puppetlabs.net'::text)
         ->  Hash Right Join  (cost=12.93..40.42 rows=100 width=128) (actual time=0.386..0.406 rows=130 loops=1)
               Hash Cond: (env.id = fs.environment_id)
               ->  Seq Scan on environments env  (cost=0.00..22.00 rows=1200 width=40) (actual time=0.009..0.010 rows=1 loops=1)
               ->  Hash  (cost=11.68..11.68 rows=100 width=104) (actual time=0.372..0.372 rows=130 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 34kB
                     ->  Subquery Scan on fs  (cost=0.15..11.68 rows=100 width=104) (actual time=0.259..0.315 rows=130 loops=1)
                           ->  Result  (cost=0.15..10.68 rows=100 width=104) (actual time=0.258..0.302 rows=130 loops=1)
                                 ->  ProjectSet  (cost=0.15..8.68 rows=100 width=72) (actual time=0.256..0.283 rows=130 loops=1)
                                       ->  Index Scan using factsets_certname_idx on factsets  (cost=0.15..8.17 rows=1 width=104) (actual time=0.017..0.017 rows=1 loops=1)
                                             Index Cond: (certname = 'wine-underworld.delivery.puppetlabs.net'::text)
       Planning Time: 0.240 ms
       Execution Time: 0.542 ms
      (20 rows)
      

      In a large database, for a reports query, this HashAggregate step took more than two thirds the total execution time.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              austin.blatt Austin Blatt
              Reporter:
              austin.blatt Austin Blatt
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Zendesk Support