  1. PuppetDB
  2. PDB-5135

PuppetDB should warn about resource titles that exceed Postgres index sizes

Details

    • Improvement
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • None
    • PDB 7.6.0, PDB 6.19.0
    • PuppetDB
    • None
    • Ghost
    • 5
    • ghost-28.07.2021, ghost-11.08.2021
    • Enhancement
    • The resource_events_resource_*z partitions have a multicolumn index, resource_events_resource_timestamp_xxxxxz, over timestamp, title and type. Entries in this index are limited to 2712 bytes on Postgres versions up to 11. Starting with Postgres 12, the limit is 8 bytes lower (2704 bytes). Resource events that exceed the limit cause PDB to fail to insert the row, without much information about which resource caused the error or where it is defined. This PR adds log messages with those details to make debugging easier. Two messages are logged: one when the index entry is close to the limit (between 2500 and 2704 bytes) and one when the limit is exceeded (over 2704 bytes).
    • Needs Assessment
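
The two log thresholds described in the release note can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not taken from the PuppetDB source. Treating both ends of the "between 2500 and 2704" range as inclusive is an assumption.

```python
# Thresholds quoted in the release note; the names are hypothetical.
NEAR_LIMIT_BYTES = 2500
MAX_INDEX_ROW_BYTES = 2704


def classify_index_row_size(size_bytes: int) -> str:
    """Classify an estimated index row size against the two thresholds."""
    if size_bytes > MAX_INDEX_ROW_BYTES:
        return "exceeded"    # Postgres would reject the insert
    if size_bytes >= NEAR_LIMIT_BYTES:
        return "near_limit"  # the insert still succeeds, but is worth a warning
    return "ok"
```

Using the lower Postgres 12 limit for every version keeps the check conservative on older servers.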

    Description

      When PuppetDB inserts data into Postgres, the constraints of the database can cause errors to be raised.

      A typical example is that Postgres disallows the use of the null byte, "\0", in strings while UTF-8 generally tolerates it. Another constraint comes into play when large data values are inserted into columns that have database indexes:

      Any data type that can be sorted into a well-defined linear order can be indexed by a btree index. The only limitation is that an index entry cannot exceed approximately one-third of a page (after TOAST compression, if applicable).

      https://www.postgresql.org/docs/11/btree-intro.html

      This constraint is typically encountered with resource title values in catalogs or reports and results in an error similar to the following being raised from the storage attempt:

      2021-05-11T23:59:47.225Z ERROR [p.p.command] [14,654,263] [store report] Retrying after attempt 0 for node.hostname.example, due to: org.postgresql.util.PSQLException: ERROR: index row size 2720 exceeds maximum 2712 for index "resource_events_resource_timestamp_20210511z"
      

      This error is somewhat useful in that it indicates which node tripped the condition. But the error does not help identify which data value needs to be corrected.
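
For triage, the fields that the error does expose (row size, maximum and index name) can be pulled out of a log line with a regular expression. This is only a sketch: the function name is hypothetical, and the optional "btree version" group is an assumption about how newer Postgres releases word the same message.

```python
import re

# Matches the Postgres message shown in the log line above. The optional
# "btree version N" part is an assumption about newer Postgres wording.
INDEX_ROW_ERROR = re.compile(
    r"index row size (?P<size>\d+) exceeds "
    r"(?:btree version \d+ )?maximum (?P<maximum>\d+) "
    r'for index "(?P<index>[^"]+)"'
)


def parse_index_row_error(line: str):
    """Return size, maximum and index name, or None if the line does not match."""
    match = INDEX_ROW_ERROR.search(line)
    if match is None:
        return None
    return {
        "size": int(match["size"]),
        "maximum": int(match["maximum"]),
        "index": match["index"],
    }
```

The result identifies the partition index involved, but it still cannot say which resource title was too long, which is the gap this ticket describes.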

      For resource titles, PuppetDB should check input lengths against the 2712 byte maximum and emit a warning or error that includes:

      • The certname of the node that produced the data
      • The type of the resource
      • The manifest file and line number where the resource was defined
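
A rough sketch of the requested check follows. All names are hypothetical, and the code is not PuppetDB's implementation. It measures the title in UTF-8 bytes rather than characters, because the Postgres limit applies to bytes. The title alone is only a lower bound on the index row size, since the index also covers other columns and per-row overhead.

```python
from typing import Optional

# Limit quoted in the error message above; the constant name is hypothetical.
TITLE_LIMIT_BYTES = 2712


def title_warning(certname: str, resource_type: str, title: str,
                  file: str, line: int,
                  limit: int = TITLE_LIMIT_BYTES) -> Optional[str]:
    """Return a warning string if the title's UTF-8 size exceeds the limit."""
    size = len(title.encode("utf-8"))  # bytes, not characters
    if size <= limit:
        return None
    return (
        f"Resource title of {size} bytes exceeds the index limit of "
        f"{limit} bytes: certname={certname} type={resource_type} "
        f"file={file} line={line}"
    )
```

A title made of multi-byte characters can exceed the limit while its character count stays well below 2712, which is why the byte length is the value to compare.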

          People

            sebastian.miclea Sebastian Miclea
            chuck Charlie Sharpsteen