[PDB-3928] Facts blacklist should support providing a fact name regex to blacklist facts matching a pattern
Created: 2018/05/30  Updated: 2018/11/13  Resolved: 2018/10/18

Status: Closed
Project: PuppetDB
Component/s: None
Affects Version/s: None
Fix Version/s: PDB 5.1.z, PDB 5.2.6, PDB 6.0.1

Type: Improvement
Priority: Normal
Reporter: Nick Walker
Assignee: Zachary Kent
Resolution: Done
Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
relates to FACT-1858 Provide an option to disable legacy f... Needs Information
Team: PuppetDB
Sprint: Hopper/Triage
CS Priority: Major
CS Frequency: 3 - 25-50% of Customers
CS Severity: 4 - Major
CS Business Value: 4 - $$$$$
CS Impact: There is a significant performance boost available for customers if they block unneeded facts from being sent for storage. Without the ability to use regexes, it is not reasonable to blacklist every possible name for network and block device facts, since there could be hundreds of them depending on installation size. Customers are unlikely to need these facts in general.
Release Notes: New Feature
Release Notes Summary:
Add support for fact blacklist regexes. Omit facts whose name
completely matches any of the expressions provided. Add a
"facts-blacklist-type" database configuration option which defaults to
"literal", producing the existing behavior, but can be set to "regex"
to indicate that the facts-blacklist items are Java patterns.
QA Risk Assessment: Needs Assessment


The Problem

The puppet agent still sends legacy facts to the puppet master, which forwards them on to PuppetDB. However, there are few good reasons to store the legacy facts in addition to the modern structured facts.

Take a look at the list of legacy facts here: https://puppet.com/docs/facter/3.9/core_facts.html

Notice that a lot of them include a network interface name or some other host-specific component, which bloats PuppetDB's fact storage with unique fact names. This makes lists of facts, such as those used to populate autocomplete in the console UI, larger and slower than they need to be.

Suggested Improvement

We already have a facts-blacklist in PuppetDB, which could be used to work around this problem if it allowed a regex match on the fact name, so that we could exclude facts whose names begin with `mtu_` or `blockdevice_`.
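
As a rough illustration of the matching semantics described in the Release Notes Summary (a fact is omitted when its name completely matches any of the Java patterns), here is a minimal sketch. The class and method names (`FactBlacklistSketch`, `compile`, `isBlacklisted`) are hypothetical and are not PuppetDB code. The sketch assumes "completely matches" corresponds to `Matcher.matches()`. Under that assumption, a bare prefix such as `mtu_` would not exclude `mtu_eth0`, so the pattern would need a trailing wildcard.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch; not taken from PuppetDB's implementation.
class FactBlacklistSketch {

    // Compile each blacklist entry as a Java regular expression.
    static List<Pattern> compile(List<String> entries) {
        return entries.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    // Assumption: a fact is dropped only when its WHOLE name matches a pattern,
    // which is what Matcher.matches() checks (as opposed to find()).
    static boolean isBlacklisted(String factName, List<Pattern> patterns) {
        for (Pattern p : patterns) {
            if (p.matcher(factName).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Pattern> prefixesOnly = compile(Arrays.asList("mtu_", "blockdevice_"));
        List<Pattern> withWildcards = compile(Arrays.asList("mtu_.*", "blockdevice_.*"));

        System.out.println(isBlacklisted("mtu_eth0", prefixesOnly));              // false
        System.out.println(isBlacklisted("mtu_eth0", withWildcards));             // true
        System.out.println(isBlacklisted("blockdevice_sda_size", withWildcards)); // true
        System.out.println(isBlacklisted("networking", withWildcards));           // false
    }
}
```

If the behavior is as assumed, a regex blacklist meant to drop the legacy per-interface and per-device facts would list patterns such as `mtu_.*` and `blockdevice_.*`, with "facts-blacklist-type" set to "regex".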


This isn't a full solution. Really, we should either stop sending legacy facts from the agent to the master OR stop forwarding legacy facts from the master on to PuppetDB. Since those changes may have unintended consequences or may simply be breaking changes, this improvement gives customers who want to take the initiative a way to keep these facts from being stored in PuppetDB.

Comment by Kenn Hussey [ 2018/09/12 ]

Zachary Kent please provide release notes for this issue if needed, thanks!

Comment by Zachary Kent [ 2018/09/12 ]

Kenn Hussey we ran into last-minute complications with the release and this issue. As a result, this will make it into the next y or possibly a z release. The fix version has been updated.

Comment by Joel Weierman [ 2018/10/24 ]

Any chance this will get into 2018.1.5?

Generated at Tue Aug 20 09:58:01 PDT 2019 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.