[SERVER-345] Fixup usages of cacert / localcacert in master Created: 2015/02/09  Updated: 2015/12/17  Resolved: 2015/03/27

Status: Closed
Project: Puppet Server
Component/s: DOCS, Puppet Server
Affects Version/s: SERVER 1.0.2
Fix Version/s: SERVER 1.0.8, SERVER 2.1.0

Type: Bug Priority: Normal
Reporter: Jeremy Barlow Assignee: Erik Dasher
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to SERVER-346 More surgical update of hostcrl from ... Accepted
relates to SERVER-498 Need to create some documentation for... Closed
relates to SERVER-500 Consider failing Puppet Server startu... Closed
Template:
Epic Link: Green: Puppet Server 1.0.8 / PE 3.8
Sub-team: emerald
Story Points: 2
Sprint: Server Emerald 2015-03-18, Server Emerald 2015-04-01
QA Contact: Erik Dasher

 Description   

Puppet Server likely isn't handling the use of the cacert / localcacert settings correctly in two cases:

1) The retrieve-ca-cert! function in certificate_authority.clj unconditionally copies the file at the "cacert" setting to the file at the "localcacert" setting.

If a pre-existing file were located at the "localcacert" location, that file would be overwritten from the content of the file at the "cacert" location. This approach would be problematic if a user were to have intended to load multiple CA certificates into the file at the "localcacert" location - e.g., to allow clients issued from different CAs to access the master - since the file at the "cacert" location would presumably only have the CA service's certificate and not the other allowed CA certificates.

Instead, it would seem better for this logic to only copy the file if the file at the "localcacert" location did not already exist.

2) Via the init-webserver! function in puppet_server_config_core.clj, the value of the "cacert" setting from puppet.conf is propagated up to the webserver in the event that no value for "ssl-ca-cert" is present in the webserver configuration.

Assuming that the "cacert" setting is intended to denote the location where the Puppet Server CA service obtains its own certificate, it would seem more appropriate for the init-webserver! function to use the "localcacert" setting to populate the list of CA certificates that the webserver would use to validate clients.

------

From the comments below, I pulled up these items to be covered in the work for this ticket:

1) Change the implementation of retrieve-ca-cert! in certificate_authority.clj to not copy the cacert over the top of the localcacert if there's already a file at the localcacert location.

This would avoid the problem of the user's "external CA" cert possibly being stomped by the Puppet CA service if the "external CA" cert happened to be populated before the Puppet Server were started and the Puppet CA were still enabled as the master tries to retrieve the CA cert during initialization. This doesn't solve the issue of the Puppet CA going ahead and initializing its own certs, which isn't ultimately what the user wants to have happen.

2) Change the init-webserver! function in puppet_server_config_core.clj to, for the case that Jetty webserver ssl settings are being overridden from core Ruby Puppet settings, override the Jetty webserver ssl-ca-cert setting with the core Ruby Puppet localcacert setting.


Risk assesment: High (automated test needed)
Probability: Medium-High; Users have encountered issues already for this.
Severity: Medium. If a user brings up a puppet CA on a master where the user wanted to use an external puppet CA, the local CA will stomp on the certificate from the external CA.



 Comments   
Comment by Chris Price [ 2015/03/02 ]

Ran into this with a user in IRC today. In the case where you are disabling the Puppet Server CA and trying to use an existing CA from elsewhere, then we try to initialize Jetty via the 'cacert' setting, which points to 'ssl/ca/ca_crt.pem', which won't ever exist on a puppet server instance that has had the CA disabled from the get-go. Prioritizing it for 1.1.

Comment by Jeremy Barlow [ 2015/03/02 ]

In this case, did the user not, in addition to disabling the Puppet Server CA, add the ssl-ca-cert setting to the puppet-server webserver.conf file (documented at https://docs.puppetlabs.com/puppetserver/1.0/external_ca_configuration.html#web-server-configuration)? The ssl-ca-cert setting in the webserver.conf, if specified, should trump any CA location settings from puppet.conf.

Comment by Chris Price [ 2015/03/02 ]

That was the solution we eventually got him to, once we figured out what was going wrong... and it worked fine. Just wasn't entirely intuitive to get him there.

Especially since the first run of the server had the CA enabled, so it generated new (incompatible) certs for him and he had to determine that he needed to go delete them.

Comment by Jeremy Barlow [ 2015/03/02 ]

Just for clarification, on the second point above, I wasn't suggesting that the behavior of inferring the location of the CA cert from puppet.conf when the webserver.conf's "ssl-ca-cert" setting is not set would go away completely. Instead, I was suggesting it be changed to:

if the "ssl-ca-cert" setting is defined in the webserver.conf
then
  use the "ssl-ca-cert" setting for the CA cert location
else
  use the "ssl_server_ca_auth" setting for the CA cert location

The implementation in Ruby Puppet settings would allow for a unique value to be specified in the puppet.conf file for "ssl_server_ca_auth". If no explicit value is set for "ssl_server_ca_auth", however, Ruby Puppet will return the value that the "cacert" puppet.conf setting holds when the value for "ssl_server_ca_auth" is queried.

I'm not sure if the changes suggested on this ticket would fundamentally improve the problem that the user had encountered. Or maybe the external CA documentation just needs to be improved?

Ideally, we could move away from this setting override stuff completely and just say that the value from the "ssl-ca-cert" setting in the webserver.conf file is the only one that Puppet Server will consider for the CA certificate that it configures the Jetty webserver to use. One problem with doing that, though, is that the Puppet Server package does not configure a value for "ssl-ca-cert" by default and the Jetty SSL server configuration needs to boot with some CA certificate. We went down the path of not configuring "ssl-X" values in the default webserver.conf file and, if not later set by the user, inferring them from the puppet.conf file instead because the "ssl-X" settings in puppet.conf had support for more dynamic values, like interpolating default values based on the certname of the running host, whereas the values in the webserver.conf are all literal. May not be worth revisiting that part of the implementation until another major version boundary where the SSL settings in webserver.conf can more cleanly be separated from the ones in puppet.conf.

Comment by Jeremy Barlow [ 2015/03/02 ]

Assuming we end up doing this one, it may make sense to try to do some or all of SERVER-346 at the same time. It is similar in the sense that it refers to a setting that we're not handling the overrides for properly although it is specific to CRLs and has some unique work associated with it - the surgical replacement of a Puppet CRL without affecting other CRLs that may be in the hostcrl file.

Comment by Chris Price [ 2015/03/02 ]

We can break this to a different ticket if you like, but all that I was suggesting w/rt this user is that if we used 'localcacert' instead of 'cacert', he would not have had a problem.

Comment by Jeremy Barlow [ 2015/03/03 ]

but all that I was suggesting w/rt this user is that if we used 'localcacert' instead of 'cacert', he would not have had a problem.

Ah, my bad. Above I had written:

If no explicit value is set for "ssl_server_ca_auth", however, Ruby Puppet will return the value that the "cacert" puppet.conf setting holds when the value for "ssl_server_ca_auth" is queried.

The "ssl_server_ca_auth" setting, if not explicitly set, falls back to the "localcacert" setting, not the "cacert" setting.

So, I think both points on this ticket would apply to this user:

1) If the user had placed a Custom CA cert at the "localcacert" location but had not yet disabled the CA service at startup, the "localcacert" should no longer be replaced by the "cacert" one – although a superfluous file would still be generated at the "cacert" location because the CA service had not been explicitly disabled yet.

2) Assuming no value for "ssl_server_ca_auth" had been set, the Jetty server would now be populated with the cert at the "localcacert" location instead of the one at the "cacert" location.

Given the above, I wouldn't see a need for a new ticket to be filed since this one should cover the use case.

Comment by Chris Price [ 2015/03/03 ]

One more clarifying comment:

I think that the best-case scenario for a user that is wanting to disable the CA is that we can educate them to do an agent run on the box in question before ever starting Puppet Server. In this case, they will have populated localcacert by pulling down the correct CA cert from the "real" CA. They will have no file at the 'cacert' path.

If, at this point, they attempt to start Puppet Server w/o disabling the CA, the server should (correctly) fail to start, log an error message about the pem files being an inconsistent state, and Do No Harm in terms of creating bad certs or overriding any existing ones.

If, next, they modify bootstrap.cfg to disable the CA, then the server will fail to start again because the jetty override code is looking in the 'cacert' path and there is no file there. This is really where we landed with the user in IRC the other day; we had to put the settings directly into webserver.conf because there was no 'cacert' file.

If we can get the change through to have the override fall through to 'localcacert' (either directly or through the intermediate setting that you describe, I trust you've done your homework on that), then I think the user experience described above could be considered fairly reasonable for now.

(I think we're on the same page in terms of direction of this work, just wanted to add a more explicit note about the precise use case that we ran into.)

Comment by Jeremy Barlow [ 2015/03/03 ]

If, at this point, they attempt to start Puppet Server w/o disabling the CA, the server should (correctly) fail to start, log an error message about the pem files being an inconsistent state, and Do No Harm in terms of creating bad certs or overriding any existing ones.

I think this highlights some extra work that would need to be done which my description isn't currently capturing.

If the Puppet Server CA is enabled, the CA will try to generate CA certificates if all of its "required files" are missing. From
https://github.com/puppetlabs/puppet-server/blob/puppet-server-1.0.3/src/clj/puppetlabs/puppetserver/certificate_authority.clj#L135-L137, this list includes cacert, cacrl, cakey, cert-inventory, and serial. In the situation you mentioned here, I think the Puppet Server CA would generate the CA certificate-related files, which ultimately would end up being bogus.

We could augment this by failing to startup or generate any CA related files if the file at the ssl_server_ca_auth file – and/or hostcrl? – file already exists. If this were the case, we might presume that the user was setting those files up manually in order to do an external CA configuration setup and so erroring out of the install with an informative message could help them clarify intent - e.g., if you're planning an external CA configuration, disable the CA service, otherwise remove these files and restart the puppetserver...

Comment by Chris Price [ 2015/03/03 ]

I thought that Nate Wolfe had put in changes that would cause it to fail fast if any of the master or ca pem files already existed, but not all of them? That seems like the simplest / best path...

I would like to try really hard to avoid over-complicating this issue since all of this stuff should be going away in the next major anyway...

Comment by Nate Wolfe [ 2015/03/03 ]

I thought that Nate Wolfe had put in changes that would cause it to fail fast if any of the master or ca pem files already existed, but not all of them?

Yes, but not for all of the files, only a select few, which can be seen here: https://github.com/puppetlabs/puppet-server/blob/master/src/clj/puppetlabs/puppetserver/certificate_authority.clj#L132

Comment by Chris Price [ 2015/03/03 ]

I think the key is that it looks like that code breaks up the validation into two separate pieces, one for master, and one for CA. So that might mean that if the user had done an agent run, and thus had the relevant master certs and localcacert stuff in place, before they ever started puppet server... and then they started it, with the CA enabled by mistakey, that it would try to generate the cacert et al, and you'd end up in a super weird state with half 'legit' certs and half bogus (CA). Which is not awesome.

That said, the worst of the problems that this user encountered would have been solved by simply replacing the 'cacert' in the jetty override code with 'localcacert'. That change would take significantly less time than the amount of time that has been invested in this comment thread so far So I'm starting to lean towards just breaking that out into a separate ticket, prioritizing that, and de-prioritizing this... if we don't come around to a pretty simple approach for moving forward on this one soon.

Comment by Jeremy Barlow [ 2015/03/04 ]

So that might mean that if the user had done an agent run, and thus had the relevant master certs and localcacert stuff in place, before they ever started puppet server...

I'm a bit confused as to how this would happen exactly. For master certs / localcacert to be present, wouldn't either the Puppet Server have had to be started (and, therefore, the still enabled Puppet CA service) and agent run against it or someone have manually (i.e., outside of the Puppet agent process) have dropped the "legit" certs / localcacert in place on the master's file system? I suppose how the files get there may not be as important, just wanted to make sure I wasn't missing some critical context here...

and then they started it, with the CA enabled by mistakey, that it would try to generate the cacert et al, and you'd end up in a super weird state with half 'legit' certs and half bogus (CA). Which is not awesome.

Definitely agree with that. Jetty would end up using a CA cert generated by the Puppet CA to authenticate clients but would be serving up an "external CA derived" cert to the client for it to validate. Most likely, the server would end up rejecting client requests because those clients' certs would not have been issued from the Puppet CA.

I've been thinking that the work for this ticket could come down to:

1) Change the implementation of retrieve-ca-cert! in certificate_authority.clj to not copy the cacert over the top of the localcacert if there's already a file at the localcacert location.

This would avoid the problem of the user's "external CA" cert possibly being stomped by the Puppet CA service if the "external CA" cert happened to be populated before the Puppet Server were started and the Puppet CA were still enabled as the master tries to retrieve the CA cert during initialization. This doesn't solve the issue of the Puppet CA going ahead and initializing its own certs, which isn't ultimately what the user wants to have happen.

2) Change the init-webserver! function in puppet_server_config_core.clj to, for the case that Jetty webserver ssl settings are being overridden from core Ruby Puppet settings, override the Jetty webserver ssl-ca-cert setting with the core Ruby Puppet ssl_server_ca_auth setting.

ssl_server_ca_auth falls back to localcacert when ssl_server_ca_auth is not defined in core Ruby Puppet. Targeting ssl_server_ca_auth for the override would allow for a user to configure multiple CA certificates to use when authenticating clients whereas localcacert only allows for one, presumably that from which the server's own certificate was issued. localcacert is an overloaded setting in that it also gets used by the master for authenticating servers when the master makes SSL client calls. The user should be able to define unique CA certificate stores for ssl_server_ca_auth (server authenticating clients) vs. ssl_client_ca_auth (client authenticating servers), where a difference in the content between the two is desired.

3) Add :localcacert to the list of required-ca-files that certificate_authority.clj uses to decide whether to generate the CA files or not.

If we did this and a file had already been populated in the :localcacert location but not all of the other CA files had been put in place, we'd fail the service startup with a "partial-state-error" - with no potentially confusing CA-related certificate files having been created. (Although the CA directories would still have been created because that currently happens right before the required files check).

From a coupling perspective, it would feel a little odd to do the :localcacert check in the CA service implementation since this feels like more of a master service thing. I expect this will all get reworked in a future CA service implementation anyway, though.

I'd prefer to keep this ticket about #1 and #2 above. If the simple change for #3 above seems workable, we could cover that for this ticket as well. But otherwise, I'd probably create a new ticket to cover the work related to avoiding generation of certificate-related files if :localcacert was pre-populated.

In terms of priority, Chris Price, are you suggesting that what #1 and #2 would cover would be worth including in Puppet Server 1.1 / PE 3.8 but that the superfluous certificate file generation, if we decide to separate it, could be lower priority / potentially done later?

Comment by Chris Price [ 2015/03/04 ]
So that might mean that if the user had done an agent run, and thus had the relevant master certs and localcacert stuff in place, before they ever started puppet server...

I'm a bit confused as to how this would happen exactly. For master certs / localcacert to be present, wouldn't either the Puppet Server have had to be started (and, therefore, the still enabled Puppet CA service) and agent run against it or someone have manually (i.e., outside of the Puppet agent process) have dropped the "legit" certs / localcacert in place on the master's file system? I suppose how the files get there may not be as important, just wanted to make sure I wasn't missing some critical context here...

In an external CA use case where the external CA is another Puppet CA, it's common for users to do an agent run on the new master machine before starting the actual server. The key here is that in a multi-master scenario, there is usually a 'master of masters', which is where the agents on all of the other masters are configured to point to. This is a very common deployment pattern from what I understand.

This agent run will result in certs being generated with the fqdn of the machine, and the file paths where those certs get deployed will correspond with where Puppet Server will be looking for its 'master cert' on startup. We also get a copy of the localcacert via the agent run.

w/rt the rest of the implementation details, here's what I care about:

  • We land a change in 1.1 that will cause the Jetty overrides to end up pointing to 'localcacert' instead of 'cacert' for the use case described above, and
  • We don't add on more than a couple of extra story points worth of work beyond that

I'll defer to you and Nate Wolfe on deciding the rest of the implementation details as long as those two criteria are met.

Comment by Jeremy Barlow [ 2015/03/09 ]

Thanks for the clarification on the use case, Chris Price. That all makes sense to me. I pulled the info from my last comment up into the issue description as I think that will cover the basic work that we would need for this ticket. I agree that, regarding the work to avoid CA artifacts being inappropriately / unnecessarily written, we should minimize the additional implementation work as much as possible. Nate Wolfe, feel free to chime in with any parts of the description that you think need more work.

Comment by Justin May [ 2015/03/10 ]

So, as far as I can tell, the ssl_server_ca_auth setting is supposed to point to a single PEM file which contains multiple CA certs. Does this sound right?

Comment by Justin May [ 2015/03/10 ]

And it looks like that roll-over behavior doesn't happen in the Puppet config code, but somewhere else in the CA code. So we will need to mimic this behavior:

[root@localhost ~]# puppet config print localcacert
/var/lib/puppet/ssl/certs/ca.pem
[root@localhost ~]# puppet config print  ssl_server_ca_auth

Comment by Nate Wolfe [ 2015/03/11 ]

3) Add :localcacert to the list of required-ca-files that certificate_authority.clj uses to decide whether to generate the CA files or not.

Yeah, we should add :localcacert to the required-master-files - the CA doesn't have a :localcacert.
I just verified this addition behaves as expected.

Also, while you're in there you should change the datatype from a vector to a set, which it should've been in the first place.
For example,

(def required-master-files [:hostprivkey :hostcert :localcacert])

should become

(def required-master-files #{:hostprivkey :hostcert :localcacert})

I also verified this change is acceptable.

Comment by Jeremy Barlow [ 2015/03/11 ]

RE: the ssl_server_ca_auth setting, I had misinterpreted how the default was derived. I had been thinking that if the value of the ssl_server_ca_auth setting were obtained from Puppet directly that if a user had not supplied a value for it via the CLI or puppet.conf, that the value returned would be the same as the one that localcacert would return.

This "fallback" logic, however, isn't wired into the setting at all. Instead, the fallback occurs via the WEBrick server configuration. In the following lines of code, the SSL configuration for WEBrick includes references to localcacert and ssl_server_ca_auth:

https://github.com/puppetlabs/puppet/blob/3.7.4/lib/puppet/network/http/webrick.rb#L123-L129

When the value is queried via the SSL configuration, the "fallback" is performed:

https://github.com/puppetlabs/puppet/blob/3.7.4/lib/puppet/ssl/configuration.rb#L29

This would mean that the ssl_server_ca_auth setting is only used in the context of WEBrick configuration, not Rack (e.g., Passenger) configuration. Given that, I don't think it makes sense to try to have ssl_server_ca_auth be respected by Puppet Server.

Puppet Server would still allow for users to make a distinction between the set of CA certificates that Puppet Server uses for authenticating clients connecting to it vs. the set of CA certificates that Puppet Server uses for authenticating servers for outbound connections it makes. A distinct value for ssl-ca-cert in the Puppet Server "webserver" configuration could be set for authenticating clients, see:

https://github.com/puppetlabs/trapperkeeper-webserver-jetty9/blob/trapperkeeper-webserver-jetty9-1.1.1/doc/jetty-config.md#ssl-ca-cert

The value from puppet.conf's localcacert, having a different value from ssl-ca-cert, would still be used by Puppet Server in the context of making connections to external servers.

Arguably, Puppet Server could consider the ssl_client_ca_auth setting, if non-empty, in place of localcacert in the context of making connections to external servers. I'm inclined to forego that for now and see if it becomes a significant issue.

With respect to this ticket, I'm going to remove the ssl_server_ca_auth language from the description. The major change, then, would be that ssl-ca-cert would be overridden from localcacert, not considering ssl_server_ca_auth.

Comment by Jeremy Barlow [ 2015/03/14 ]

Upon review of the PR, we decided to bail on the idea of adding :localcacert to required-ca-files as well. Doing this would have been problematic if the CA files had been pre-generated, outside puppet-server's startup process. In that scenario, it would be possible for the localcacert to not exist but for the other CA files to exist - which would have failed the required-ca-files check.

I think this could still be done safely if we were to change the logic to only additionally fail if localcacert does exist but any of the other CA files did not exist at startup - given this would still seem to be an invalid scenario. Doing this would have involved a bit more logic, though. In the interest of getting the rest of the work for this ticket into the forthcoming Puppet Server release, we decided to forego the work to fail startup if localcacert exists but other CA files do not. I removed the associated language from the ticket description.

Comment by Jeremy Barlow [ 2015/03/17 ]

PR was merged and corresponding full integration test suite job against the Puppet Server stable branch was green - https://jenkins.puppetlabs.com/job/platform_puppet-server_integration-system_full-stable/191/. Moving to "testing".

Comment by Erik Dasher [ 2015/03/20 ]

Probability: Medium-High; Users have encountered issues already for this. Severity: Medium. If a user brings up a puppet CA on a master where the user wanted to use an external puppet CA, the local CA will stomp on the certificate from the external CA. Over all risk is viewed as medium-high. QA will perform functional review, and needs to consider additional automated testing.

QA needs to at least validate that when there is an external puppet CA (and cert), that starting the puppet CA locally doesn't destroy the certificate from the external CA.

Long Term we need additional documentation that discusses setting up LEI and using a central puppet CA that many masters work on so that our users are better informed about how to setup this configuration. QA to either find the ticket or generate the ticket to docs.

Long Term, we need to consider making this easier to configure for our users, possibly by providing a script that simplifies some of the operations.

Comment by Erik Dasher [ 2015/03/27 ]

Validated that an empty ca.pem file in the localcacert location is not overwritten on puppetserver start. (puppetserver errors out and fails to start, as expected.)
Validated that a different ca.pem file (with an entirely different key, CN, etc etc) is not overwritten on puppetserver start.
Tested against 1.0.4.SNAPSHOT.2015.03.24T0146 on RHEL 7.

Comment by Erik Dasher [ 2015/03/27 ]

The above validation was performed after a conversation with Jeremy Barlow explaining that this is all that needs to be validated for this ticket.

Generated at Sat Sep 19 12:35:07 PDT 2020 using Jira 8.5.2#805002-sha1:a66f9354b9e12ac788984e5d84669c903a370049.