[SERVER-1270] HOCON auth.conf, regex backslashes need to be escaped Created: 2016/04/16  Updated: 2016/05/25  Resolved: 2016/04/25

Status: Closed
Project: Puppet Server
Component/s: DOCS
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Normal
Reporter: Matthew Gyurgyik Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified



Expectation: provide ruby regex string to path parameter, without needing to escape backslashes.

Actual Results: A ruby regex that does not have backslashes escaped will cause the server to crash.

Authorization block I'm working with. In this example I have the escaped the backslashes and puppetserver is happy.

            match-request: {
                path: "/puppet/v3/catalog/((\\w*?)(?:\\d+|-\\w+|_|\\.)[^/]*)$"
                type: regex
                method: [get, post]
            allow: [ "$2_cluster", "$2_shared", "$1" ]
            name: "my catalog"
            sort-order: 300

Puppet Server fails to start when using

path: "/puppet/v3/catalog/((\w*?)(?:\d+|-\w+|_|\.)[^/]*)$"

Exception in thread "main" com.typesafe.config.ConfigException$Parse: /etc/puppetlabs/puppetserver/conf.d/auth.conf: 7: Reserved character '\' is not allowed outside quotes (if you intended '\' (Reserved character '\' is not allowed outside quotes) to be part of the value for 'path', try enclosing the value in double quotes)

If I use double and single quotes together, I get a slightly different error

path: "'/puppet/v3/catalog/((\w*?)(?:\d+|-\w+|_|\.)[^/]*)$'"

Exception in thread "main" com.typesafe.config.ConfigException$Parse: /etc/puppetlabs/puppetserver/conf.d/auth.conf: 28: in value for key 'path': backslash followed by 'w', this is not a valid escape sequence (quoted strings use JSON escaping, so use double-backslash \\ for literal backslash)

This seems consistent with the HOCON documentation https://github.com/typesafehub/config/blob/master/HOCON.md#unchanged-from-json

quoted strings are in the same format as JSON strings

JSON strings must be escaped

My non-developer/sysadmin suggestion would be to automatically escape backslash before sending the config file to tk-load-config for the path parameter.

Comment by Kylo Ginsberg [ 2016/04/19 ]

Past Haus thoughts on this?

Comment by Kevin Corcoran [ 2016/04/20 ]

Matthew Gyurgyik - After you escape your regex as the documentation suggests, does it work as you expect it to?

Comment by Jeremy Barlow [ 2016/04/20 ]

I've played around with some regular expressions that use "\w" and "\d" with the HOCON auth.conf format that Puppet Server supports. As long as I've escaped the backslashes - e.g., "\\w" and "\\d" - the server will startup properly and the server will match incoming request paths properly against the path from the corresponding auth.conf rule - e.g., an alphanumeric character will match up to the "\\w" and a digit will match up to the "\\d". From a functional perspective, I'm not seeing any issues with the current implementation.

My non-developer/sysadmin suggestion would be to automatically escape backslash before sending the config file to tk-load-config for the path parameter.

Coming from the custom format that the Ruby-based auth.conf supported to the HOCON format that Puppet Server's auth.conf supports, I can see where the idea of having to escape the regexes to conform to HOCON/JSON escaping rules may seem jarring. That said, though, it would make me really nervous to try to compensate for this with special handling for this one setting. Escaping the backslash from the file on disk before the HOCON parser touches it would seem problematic in situations where the original value from the file on disk had been specifically escaped to take advantage of HOCON (from JSON's) internal de-escaping behavior. For example, if someone needed to include a Unicode character in the regex path and had formatted the rule in the file on disk with "\u0061", re-escaping the backslash to "\\u0061" would cause the effective path that would be used by the server to be "\u0061" rather than the latin small 'a' character. Any more sophisticated logic that we'd try to add in to only escape the original value for some characters but not others would seem to lead to cases where the results would sometimes not meet a specific user's expectations.

My inclination is to say that escaping characters per the expectations of the HOCON format needs to be done for the path setting, just as it is for any other HOCON setting that Puppet Server uses, and not try to put any extra compensation into Puppet Server's HOCON auth.conf handling for this.

/CC Eric Sorenson to chime in if he feels differently about this.

Comment by Matthew Gyurgyik [ 2016/04/20 ]

Kevin Corcoran If I escape my regex, everything works as expected (see my first example).

Jeremy Barlow When I see

type => 'regex'

my expectation, as a user, is to provide a regular expression without needing to escape. I don't think I've ever encountered a config fire where I need to escape my regular expressions. Regular expressions can be hard to read to begin with and toss in extra escaping they can be even harder. With this said, I understand your arguments. At minimum I think the documentation for puppetserver auth.conf should be updated to denote escaping is required.

Comment by Kevin Corcoran [ 2016/04/21 ]

I agree that the documentation should be updated.

Comment by Jeremy Barlow [ 2016/04/25 ]

Thanks, Matthew Gyurgyik, for the feedback. I agree that it does seem onerous to have to escape the regex characters where it hadn't been necessary prior to the use of HOCON. I think if HOCON ever adds a native way to express a regex data type for a setting that we should consider utilizing that for this configuration file in Puppet Server. Where we only have the option to represent a setting value within a string type in HOCON for now, I think sticking with the HOCON escaping rules for consistency with other HOCON settings is probably the best approach.

Garrett Guillotte currently has a PR up against the puppetserver repo which will update our documentation to more explicitly call out the need to escape characters within the regex path of a rule - https://github.com/puppetlabs/puppetserver/pull/1023. Hopefully, that will be published to our docs web site soon.

I'm going to close this issue under the assumption that the current implementation is working as expected. /CC Eric Sorenson should he see the need for us to follow up on this further. Thanks again!

Comment by Jeremy Barlow [ 2016/04/26 ]

Matthew Gyurgyik, it was pointed out by Michael Smith that it might be possible to use a triple-quote syntax to represent the regex in the path in HOCON. I tried that out, and it seems to work just as he suggested.

So, for the example you wrote in your description, I tried:

path: """/puppet/v3/catalog/((\w*?)(?:\d+|-\w+|_|\.)[^/]*)$"""

Even without quoting each individual backslash within the regex, my Puppet Server was able to load and utilize that path as equivalent to the form of the path where each internal backslash was escaped:

path: "/puppet/v3/catalog/((\\w*?)(?:\\d+|-\\w+|_|\\.)[^/]*)$"

Does the triple-quote form work for you? Maybe that would be a more reasonable approach for you to use - and one we could document as an alternative (/CC Garrett Guillotte)?

Comment by Matthew Gyurgyik [ 2016/04/26 ]

Jeremy Barlow, triple-quote do the trick. Thanks Michael Smith!

+1 For documenting this.

Comment by Chris Price [ 2016/04/26 ]

Jeremy Barlow we should perhaps also think about whether there'd be a sane way to expose the triple-quote support in the puppet module(s) (either the tk-auth one or the puppetlabs-hocon one, or both).

Comment by Jeremy Barlow [ 2016/04/26 ]

Jeremy Barlow we should perhaps also think about whether there'd be a sane way to expose the triple-quote support in the puppet module(s) (either the tk-auth one or the puppetlabs-hocon one, or both).

Yeah, that might make sense. Seems like if we were going to do that we should figure out a good way to configure it at the puppetlabs-hocon layer first - for the most general benefit - before figuring out how we might want to expose that from the puppet_authorization module configuration.

For what it's worth, someone could just write a puppet_authorization::rule in Puppet code containing the unescaped form of the regex ...

puppet_authorization::rule {...
  match_request_path   => '/puppet/v3/catalog/((\w*?)(?:\d+|-\w+|_|\.)[^/]*)$',

... but have that be generated via the ruby-hocon gem into the HOCON-escaped form in the auth.conf HOCON file:

"path" : "/puppet/v3/catalog/((\\w*?)(?:\\d+|-\\w+|_|\\.)[^/]*)$",

So at least it would make the source authoring a little easier, although it would still be more onerous to see the escaping in the generated file.

Generated at Sun Jan 26 22:50:42 PST 2020 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.