[PDB-3857] Fix fact path GC incompatibility with Postgres 10 Created: 2018/02/26  Updated: 2019/04/22  Resolved: 2019/01/08

Status: Closed
Project: PuppetDB
Component/s: PuppetDB
Affects Version/s: PDB 5.2.0
Fix Version/s: PDB 5.2.7, PDB 6.0.2, PDB 6.2.0

Type: Bug Priority: Normal
Reporter: Sebastian Reitenbach Assignee: Rob Browning
Resolution: Fixed Votes: 5
Labels: resolved-issue-added
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Puppet 5.4.0

postgresql-server and client: 10.1

puppetdb: 5.2.0

ruby: 2.4.3

java: jdk-1.8.0.144

OS/Version: OpenBSD 6.2-current amd64

 

 

{{cat bootstrap.cfg }}
# This file is used by the application framework (trapperkeeper) to
# determine what services should be loaded at boot time.
# For more info, see:
# https://github.com/puppetlabs/trapperkeeper/wiki/Bootstrapping

# Web Server
puppetlabs.trapperkeeper.services.webserver.jetty9-service/jetty9-service

# Webrouting
puppetlabs.trapperkeeper.services.webrouting.webrouting-service/webrouting-service

# TK status
puppetlabs.trapperkeeper.services.metrics.metrics-service/metrics-webservice
puppetlabs.trapperkeeper.services.status.status-service/status-service
puppetlabs.trapperkeeper.services.scheduler.scheduler-service/scheduler-service

# PuppetDB Services
puppetlabs.puppetdb.cli.services/puppetdb-service
puppetlabs.puppetdb.command/command-service
puppetlabs.puppetdb.pdb-routing/maint-mode-service
puppetlabs.puppetdb.pdb-routing/pdb-routing-service
puppetlabs.puppetdb.config/config-service# NREPL
puppetlabs.trapperkeeper.services.nrepl.nrepl-service/nrepl-service# Dashboard redirect: remove to disable
puppetlabs.puppetdb.dashboard/dashboard-redirect-service

 

{{}}{{cat database.ini }}
[database]
classname = org.postgresql.Driver
subprotocol = postgresq
subname = //localhost:5432/puppetdb

username = puppetdb
password = password
gc-interval = 60
log-slow-statements = 10
syntax_pgs = true
node-ttl = 0s
node-purge-ttl = 0s
report-ttl = 14d
conn-max-age = 60
conn-keep-alive = 45
conn-lifetime = 0


Issue Links:
Relates
relates to PDB-4347 Don't use a static temp table name fo... Resolved
relates to PDB-3924 Avoid temp file flood during fact pat... Closed
Template:
Team: PuppetDB
Sprint: Hopper/Triage
Method Found: Needs Assessment
Release Notes: Enhancement
Release Notes Summary: PuppetDB should no longer be incompatible with PostgreSQL versions 10 and newer.
QA Risk Assessment: Needs Assessment

 Description   

Puppet Version: 5.4.0
Puppet Server Version: ---
OS Name/Version: OpenBSD 6.2-current

Since the update of puppetdb to 5.2.0 (prior version was 5.1.3), I see following exception every hour in the puppetdb logs, seems to have to do with regular garbage collection:

 

2018-02-26 21:53:10,731 WARN [p.p.q.engine] The event-counts entity is experimental and may be altered or removed in the future.
2018-02-26 21:53:45,261 INFO [p.p.c.services] Starting sweep of stale reports (threshold: 14 days)
2018-02-26 21:53:45,280 INFO [p.p.c.services] Finished sweep of stale reports (threshold: 14 days)
2018-02-26 21:53:45,281 INFO [p.p.c.services] Starting gc packages
2018-02-26 21:53:45,285 INFO [p.p.c.services] Finished gc packages
2018-02-26 21:53:45,285 INFO [p.p.c.services] Starting database garbage collection
2018-02-26 21:53:45,334 ERROR [p.p.c.services] Error during garbage collection
java.sql.BatchUpdateException: Batch entry 0 with recursive live_paths(key, path, value) as (select key, key as path, value from (select (jsonb_each(stable||volatile)).* from factsets) as base_case union all select sub_path as key, sub_paths.path||'#~'||sub_path as path, sub_value as value from (select * from (select path, case jsonb_typeof(value) when 'object' then (jsonb_each(value)).key when 'array' then generate_series(0, jsonb_array_length(value - 1))::text end as sub_path, case jsonb_typeof(value) when 'object' then (jsonb_each(value)).value when 'array' then jsonb_array_elements(value) end as sub_value from live_paths) as candidates where candidates.sub_path is not null) as sub_paths) delete from fact_paths fp where not exists (select 1 from live_paths where live_paths.path = fp.path) was aborted. Call getNextException to see the cause.
{{ at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:133)}}
{{ at org.postgresql.core.v3.QueryExecutorImpl$1.handleError(QueryExecutorImpl.java:419)}}
{{ at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2004)}}
{{ at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:360)}}
{{ at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:1019)}}
{{ at com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125)}}
{{ at com.zaxxer.hikari.pool.HikariProxyStatement.executeBatch(HikariProxyStatement.java)}}
{{ at clojure.java.jdbc$execute_batch.invokeStatic(jdbc.clj:439)}}
{{ at clojure.java.jdbc$execute_batch.invoke(jdbc.clj:432)}}
{{ at clojure.java.jdbc$db_do_commands$fn__21875.invoke(jdbc.clj:748)}}
{{ at clojure.java.jdbc$db_transaction_STAR_.invokeStatic(jdbc.clj:662)}}
{{ at clojure.java.jdbc$db_transaction_STAR_.invoke(jdbc.clj:598)}}
{{ at clojure.java.jdbc$db_transaction_STAR_.invokeStatic(jdbc.clj:611)}}
{{ at clojure.java.jdbc$db_transaction_STAR_.invoke(jdbc.clj:598)}}
{{ at clojure.java.jdbc$db_do_commands.invokeStatic(jdbc.clj:747)}}
{{ at clojure.java.jdbc$db_do_commands.invoke(jdbc.clj:732)}}
{{ at puppetlabs.puppetdb.jdbc$do_commands.invokeStatic(jdbc.clj:38)}}
{{ at puppetlabs.puppetdb.jdbc$do_commands.doInvoke(jdbc.clj:33)}}
{{ at clojure.lang.RestFn.invoke(RestFn.java:408)}}
{{ at puppetlabs.puppetdb.scf.storage$delete_unused_fact_paths.invokeStatic(storage.clj:1127)}}
{{ at puppetlabs.puppetdb.scf.storage$delete_unused_fact_paths.invoke(storage.clj:1123)}}
{{ at puppetlabs.puppetdb.scf.storage$garbage_collect_BANG_$fn_28745$fn_28748.invoke(storage.clj:1439)}}
{{ at puppetlabs.puppetdb.jdbc$with_transacted_connection_fn$fn_22474$fn_22475.invoke(jdbc.clj:483)}}
{{ at clojure.java.jdbc$db_transaction_STAR_.invokeStatic(jdbc.clj:628)}}
{{ at clojure.java.jdbc$db_transaction_STAR_.invoke(jdbc.clj:598)}}
{{ at puppetlabs.puppetdb.jdbc$with_transacted_connection_fn$fn__22474.invoke(jdbc.clj:482)}}
{{ at puppetlabs.puppetdb.jdbc$eval22448$retry_sql_STAR__22453$fn22454$fn_22455.invoke(jdbc.clj:454)}}
{{ at puppetlabs.puppetdb.jdbc$eval22448$retry_sql_STAR__22453$fn_22454.invoke(jdbc.clj:453)}}
{{ at puppetlabs.puppetdb.jdbc$eval22448$retry_sql_STAR___22453.invoke(jdbc.clj:444)}}
{{ at puppetlabs.puppetdb.jdbc$with_transacted_connection_fn.invokeStatic(jdbc.clj:480)}}
{{ at puppetlabs.puppetdb.jdbc$with_transacted_connection_fn.invoke(jdbc.clj:469)}}
{{ at puppetlabs.puppetdb.scf.storage$garbage_collect_BANG_$fn__28745.invoke(storage.clj:1438)}}
{{ at puppetlabs.puppetdb.scf.storage.proxy$java.lang.Object$Callable$7da976d4.call(Unknown Source)}}
{{ at com.codahale.metrics.Timer.time(Timer.java:101)}}
{{ at puppetlabs.puppetdb.scf.storage$garbage_collect_BANG_.invokeStatic(storage.clj:1432)}}
{{ at puppetlabs.puppetdb.scf.storage$garbage_collect_BANG_.invoke(storage.clj:1429)}}
{{ at puppetlabs.puppetdb.cli.services$garbage_collect_BANG_.invokeStatic(services.clj:189)}}
{{ at puppetlabs.puppetdb.cli.services$garbage_collect_BANG_.invoke(services.clj:180)}}
{{ at puppetlabs.puppetdb.cli.services$eval36422$clean_up_36427$fn36431$fn_36441.invoke(services.clj:259)}}
{{ at puppetlabs.puppetdb.cli.services.proxy$java.lang.Object$Callable$7da976d4.call(Unknown Source)}}
{{ at com.codahale.metrics.Timer.time(Timer.java:101)}}
{{ at puppetlabs.puppetdb.cli.services$eval36422$clean_up_36427$fn_36431.invoke(services.clj:258)}}
{{ at puppetlabs.puppetdb.cli.services$eval36422$clean_up__36427.invoke(services.clj:220)}}
{{ at puppetlabs.puppetdb.cli.services$collect_garbage.invokeStatic(services.clj:373)}}
{{ at puppetlabs.puppetdb.cli.services$collect_garbage.invoke(services.clj:368)}}
{{ at puppetlabs.puppetdb.cli.services$start_puppetdb$fn__36498.invoke(services.clj:434)}}
{{ at clojure.lang.AFn.run(AFn.java:22)}}
{{ at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
{{ at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)}}
{{ at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)}}
{{ at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)}}
{{ at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
{{ at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
{{ at java.lang.Thread.run(Thread.java:748)}}{{}}

Desired Behavior:

there should be no exception.

 

Actual Behavior:

the exception every hour.



 Comments   
Comment by Russell Mull [ 2018/02/28 ]

Hi Sebastian, thanks for the bug report!

Can you provide any correlated logs from postgresql? I'd expect to see something related to this at the same frequency.

Comment by Sebastian Reitenbach [ 2018/02/28 ]

indeed I get errors logged there as well, should have provided them with the initial report:

 

2018-03-01 05:53:50.658 CET [89665] ERROR: set-returning functions are not allowed in CASE at character 559
2018-03-01 05:53:50.658 CET [89665] HINT: You might be able to move the set-returning function into a LATERAL FROM item.
2018-03-01 05:53:50.658 CET [89665] STATEMENT: with recursive live_paths(key, path, value) as (select key, key as path, value from (select (jsonb_each(stable||volatile)).* from factsets) as base_case union all select sub_path as key, sub_paths.path||'#~'||sub_path as path, sub_value as value from (select * from (select path, case jsonb_typeof(value) when 'object' then (jsonb_each(value)).key when 'array' then generate_series(0, jsonb_array_length(value - 1))::text end as sub_path, case jsonb_typeof(value) when 'object' then (jsonb_each(value)).value when 'array' then jsonb_array_elements(value) end as sub_value from live_paths) as candidates where candidates.sub_path is not null) as sub_paths) delete from fact_paths fp where not exists (select 1 from live_paths where live_paths.path = fp.path)

Comment by Russell Mull [ 2018/03/01 ]

Are you running postgresql 10? It looks like the behaviour around this kind of query may have changed between 9.6 (which we developed against) and 10.

Comment by Sebastian Reitenbach [ 2018/03/01 ]

Russell Mull

yes, as I gave it in the Environment info, I'm running PostgreSQL 10.1.

 

another note: I updated from 5.1.3, which did not bring these errors/warnings.

But reading the CHANGELOG, I guess the garbage collection was redone/update in that version, and now that's new.

 

thanks

Sebastian

Comment by Russell Mull [ 2018/03/01 ]

Ah, I missed that the first time I looked.
PDB 5.2 has a new way of storing facts, which incorporates this new GC query.

I think you're probably ok for this query to just fail for the time being; it cleans up records for fact names which are no longer in use; so this is only really an issue if you're doing some really crazy stuff with custom facts.

Comment by Sebastian Reitenbach [ 2018/03/01 ]

Russell Mull thanks for the info, I don't do funny things with custom facts

I think then I feel safe and will actually update the official OpenBSD pupptedb package for OpenBSD -current

 

Comment by Tim Meusel [ 2018/09/24 ]

Hey people. Is there an ETA for this?

Comment by Rob Browning [ 2018/09/24 ]

While we don't have an firm commitments yet, if there don't end up being a lot of additional issues, I'd expect us to have support by say 6.1 or 6.2, and we definitely plan to fix the fact-path-gc temp file issue for 6.1.

 

Generated at Tue Oct 15 02:00:39 PDT 2019 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.