  1. PuppetDB
  2. PDB-2487

Allow for a "resource-events-ttl" to reduce the number of days of events that are stored


Details

    • Reviewed
    • 2 - 5-25% of Customers
    • 3 - Serious
    • 4 - $$$$$
    • This mostly impacts large customers and those who run everything in noop. It inflates the resource_events table in PDB, which can use up disk space and grow the table to a size where queries from the event inspector no longer complete.

      There is a terrible workaround using a cron job with a delete statement directly against this table.
    • 32574,35133
    • 2
    • Enhancement
    • A configuration parameter, resource-events-ttl, has been added. Its value is rounded up to the nearest day (e.g. 14h rounds up to 1d).

      When the TTL expires, the table containing that day's events is dropped, so there is no need to vacuum the resource_events table.
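
      The rounding rule in the note above can be illustrated with integer ceiling division. This is a minimal sketch, not PuppetDB's implementation: the helper name ttl_hours_to_days is hypothetical, and it assumes the TTL has already been converted to a whole number of hours.

```shell
#!/usr/bin/env bash
# Illustration only: round a TTL given in hours up to whole days.
# ttl_hours_to_days is a hypothetical helper, not part of PuppetDB.
ttl_hours_to_days() {
  local hours=$1
  # Integer ceiling division: any partial day counts as a full day.
  echo $(( (hours + 23) / 24 ))
}

ttl_hours_to_days 14   # prints 1
ttl_hours_to_days 24   # prints 1
ttl_hours_to_days 25   # prints 2
```

      Under this rule a 14h TTL behaves like 1d, which fits the note that events are dropped one whole day's table at a time.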

    Description

      Currently we store report-ttl days of events in the resource_events table in the puppetdb database.

      Some customers have performance issues with the API endpoints that read from resource_events. These issues could be mitigated or resolved by reducing the number of days of events stored in that table.

      Customers who want to keep more reports, such as 30-60 days' worth, may not want to keep that many days of events, since events tend to be most useful for watching things that changed recently.

      It would be good to have an option to separate these concerns and allow customers to tune retention in production to their own preferences.

      Side note: I'm not tied to the name "resource-events-ttl"; I just figured people would know what I meant by reading that.

      Delete Query:

      DELETE FROM resource_events 
      WHERE timestamp < NOW() - INTERVAL '1 days';
      

      Bash code:

      echo "DELETE FROM resource_events WHERE timestamp < NOW() - INTERVAL '1 days';" > /tmp/delete_resource_events.sql
      su - pe-postgres -s /bin/bash -c "/opt/puppetlabs/server/bin/psql -d pe-puppetdb -f /tmp/delete_resource_events.sql"
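
      For anyone who needs the workaround with a retention period other than one day, the statement can be generated from a parameter. This is only a sketch: build_delete_sql and RETENTION_DAYS are hypothetical names, and the 7-day default is an arbitrary choice for illustration.

```shell
#!/usr/bin/env bash
# Sketch: generate the workaround's DELETE statement for a chosen retention.
# build_delete_sql and RETENTION_DAYS are hypothetical names for illustration.
build_delete_sql() {
  local days=$1
  # Accept only a positive integer so the value is safe to place in the SQL text.
  if ! [[ $days =~ ^[1-9][0-9]*$ ]]; then
    echo "retention must be a positive integer number of days" >&2
    return 1
  fi
  echo "DELETE FROM resource_events WHERE timestamp < NOW() - INTERVAL '${days} days';"
}

# Default to 7 days unless the caller sets RETENTION_DAYS (assumed default).
RETENTION_DAYS=${RETENTION_DAYS:-7}
build_delete_sql "$RETENTION_DAYS"
```

      The output can be redirected to /tmp/delete_resource_events.sql and run with the psql command above, for example from a cron job.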
      

      Another thought

      If implemented, would resource-events-ttl have its own GC API command, or would it just fall under report-ttl? It could probably fall under report-ttl, but it should run before the delete from reports does.

            People

              robert.roland Robert Roland
              nick.walker Nick Walker
              Votes: 7
              Watchers: 30
