Uploaded image for project: 'PuppetDB'
  1. PuppetDB
  2. PDB-5051

Slow queries from PuppetDB to PostgreSQL after migration 4->7

    XMLWordPrintable

Details

    • Bug
    • Status: Open
    • Normal
    • Resolution: Unresolved
    • None
    • None
    • PuppetDB
    • None
    • Needs Assessment

    Description

      We are in the process of upgrading our infrastructure from Puppet 4 to Puppet 7 and we've run to an issue with slow queries to PostgreSQL created by Puppetdb.
      It takes between 30 and 60s to execute the query on PostgreSQL server when it's executed manually and freezes the whole database server when there are plenty of nodes plugged in.

      During the query execution one of CPU cores is 100% consumed on the PSQL server.

      Migration process:
      Puppetdb 4.4.0 -> Puppetdb 5.2.19 (versions earlier than 5.2.9 failed while starting Puppetdb service - possibly migration issues) -> Puppetdb 6.14.0 -> Puppetdb 7.1.0
      PostgreSQL 9.4 -> 13 (although tried with version 9.6 and there is no difference/the performance is even worse)

       

      Curl with query execution:

      curl 127.0.0.1:8080/pdb/query/v4 -H 'Content-Type:application/json' -XPOST -d '{"query": ["from", "facts", ["and", ["=", "name", "ipaddress"], ["in", "certname", ["extract", "certname", ["select_resources", ["and", ["=", ["fact", "project"], "project_name"], ["=", ["fact", "server_class"], "server_class_name"], ["=", ["fact", "server_type"], "server_type_name"]]]]]]]}' 

      where project/server_class/server_type are taken from ENC

      Query that is executed on PostgreSQL server:

      WITH inactive_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NOT NULL AND deactivated > '2021-02-09T12:42:14.899Z') OR (expired IS NOT NULL and expired > '2021-02-09T12:42:14.899Z')), not_active_nodes AS (SELECT certname FROM certnames WHERE (deactivated IS NOT NULL OR expired IS NOT NULL)) SELECT fs.certname AS certname, env.environment AS environment, fs.key AS name, fs.value AS value FROM (select certname, environment_id, 'ipaddress'::text as key, (stable||volatile)->'ipaddress' as value from factsets where (stable||volatile) ? 'ipaddress') fs LEFT JOIN environments env ON fs.environment_id = env.id WHERE (((fs.key = 'ipaddress') AND ((fs.certname) in (SELECT certname FROM ( (SELECT c.certname AS certname FROM catalog_resources resources INNER JOIN certnames ON resources.certname_id = certnames.id INNER JOIN catalogs c ON c.certname = certnames.certname LEFT JOIN environments e ON c.environment_id = e.id LEFT JOIN resource_params_cache rpc ON rpc.resource = resources.resource WHERE (((c.certname) in (SELECT certname FROM ( (SELECT certnames.certname AS certname FROM factsets fs LEFT JOIN environments ON fs.environment_id = environments.id LEFT JOIN certnames ON fs.certname = certnames.certname WHERE (fs.stable||fs.volatile) @> '{"project": "project_name"}') ) sub)) AND ((c.certname) in (SELECT certname FROM ( (SELECT certnames.certname AS certname FROM factsets fs LEFT JOIN environments ON fs.environment_id = environments.id LEFT JOIN certnames ON fs.certname = certnames.certname WHERE (fs.stable||fs.volatile) @> '{"server_class":"server_class_name"}') ) sub)) AND ((c.certname) in (SELECT certname FROM ( (SELECT certnames.certname AS certname FROM factsets fs LEFT JOIN environments ON fs.environment_id = environments.id LEFT JOIN certnames ON fs.certname = certnames.certname WHERE (fs.stable||fs.volatile) @> '{"server_type":"server_type_name"}') ) sub)))) ) sub))) AND NOT ((fs.certname) in (SELECT certname FROM ( (SELECT not_active_nodes.certname AS certname FROM not_active_nodes) ) sub)));

      explain of the query:

       Nested Loop Left Join  (cost=19.97..1767.19 rows=1 width=118)
         Join Filter: (factsets.environment_id = env.id)
         ->  Nested Loop Semi Join  (cost=19.97..1766.11 rows=1 width=53)
               ->  Seq Scan on factsets  (cost=17.22..64.32 rows=5 width=53)
                     Filter: ((NOT (hashed SubPlan 1)) AND ((stable || volatile) ? 'ipaddress'::text))
                     SubPlan 1
                       ->  Seq Scan on certnames certnames_4  (cost=0.00..17.20 rows=8 width=22)
                             Filter: ((deactivated IS NOT NULL) OR (expired IS NOT NULL))
               ->  Nested Loop Semi Join  (cost=2.75..340.35 rows=1 width=176)
                     ->  Nested Loop Semi Join  (cost=2.20..339.48 rows=1 width=132)
                           ->  Nested Loop  (cost=1.65..338.62 rows=1 width=88)
                                 ->  Nested Loop Semi Join  (cost=1.10..7.98 rows=1 width=96)
                                       ->  Nested Loop  (cost=0.55..7.12 rows=1 width=52)
                                             Join Filter: (factsets.certname = c.certname)
                                             ->  Index Scan using certnames_transform_certname_key on certnames  (cost=0.28..6.69 rows=1 width=30)
                                                   Index Cond: (certname = factsets.certname)
                                             ->  Index Scan using catalogs_certname_idx on catalogs c  (cost=0.28..0.41 rows=1 width=30)
                                                   Index Cond: (certname = certnames.certname)
                                       ->  Nested Loop  (cost=0.55..0.85 rows=1 width=44)
                                             Join Filter: (fs.certname = certnames_1.certname)
                                             ->  Index Scan using factsets_certname_idx on factsets fs  (cost=0.28..0.48 rows=1 width=30)
                                                   Index Cond: (certname = c.certname)
                                                   Filter: ((stable || volatile) @> '{"project": "project_name"}'::jsonb)
                                             ->  Index Only Scan using certnames_transform_certname_key on certnames certnames_1  (cost=0.28..0.36 rows=1 width=22)
                                                   Index Cond: (certname = c.certname)
                                 ->  Index Scan using catalog_resources_pkey1 on catalog_resources resources  (cost=0.55..318.82 rows=1182 width=29)
                                       Index Cond: (certname_id = certnames.id)
                           ->  Nested Loop  (cost=0.55..0.85 rows=1 width=44)
                                 Join Filter: (fs_2.certname = certnames_3.certname)
                                 ->  Index Scan using factsets_certname_idx on factsets fs_2  (cost=0.28..0.48 rows=1 width=30)
                                       Index Cond: (certname = c.certname)
                                       Filter: ((stable || volatile) @> '{"server_type": "server_type_name"}'::jsonb)
                                 ->  Index Only Scan using certnames_transform_certname_key on certnames certnames_3  (cost=0.28..0.36 rows=1 width=22)
                                       Index Cond: (certname = c.certname)
                     ->  Nested Loop  (cost=0.55..0.85 rows=1 width=44)
                           Join Filter: (fs_1.certname = certnames_2.certname)
                           ->  Index Scan using factsets_certname_idx on factsets fs_1  (cost=0.28..0.48 rows=1 width=30)
                                 Index Cond: (certname = c.certname)
                                 Filter: ((stable || volatile) @> '{"server_class": "server_class_name"}'::jsonb)
                           ->  Index Only Scan using certnames_transform_certname_key on certnames certnames_2  (cost=0.28..0.36 rows=1 width=22)
                                 Index Cond: (certname = c.certname)
         ->  Seq Scan on environments env  (cost=0.00..1.03 rows=3 width=40)
      (42 rows)

      Query execution time on Puppetdb 4 is ~1s.
      Issue appears even if migrating just from Puppetdb 4 to Puppetdb 5 and seems to improve a little in Puppetdb7, but still it's really slow

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              kaswob Kamil Swoboda
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Zendesk Support