Prevent saving invalid configuration on deploy #155

jandeschuttere · 2025-08-18T08:46:36Z

Hi 👋

Just a minor change in the interaction with the router, I kept it simple.

Recently we had a bad configuration on our deployment. The proxy was able to prevent any of the bad configuration to be used or to influence any of the running services, yet we noticed the state was updated to reflect the bad configuration.

If there would have been a restart of the proxy for any particular reason this bad configuration might have caused the system to not boot up properly.

jandeschuttere · 2025-09-08T07:10:29Z

@kevinmcconnell just as a friendly reminder, want to be able to close this on my end as well.

kevinmcconnell · 2025-09-12T12:00:34Z

Thanks for the ping, @jandeschuttere! I finally got a chance to look into this...

If there are ways to get bad configuration into the loaded service, then I think we need to prevent that happening in the first place, rather than just avoid saving the state at that point. If the configuration change is still loaded, then a subsequent save could cause it to be written out to the state later (for example, a second service being deployed).

This probably means changing how UpdateOptions works so that attempts to initialise before mutating its fields. That way if part of that inialisation fails, it can return before it makes any changes.

Looking at it now, this situation seems like it would apply to cert paths and custom error page paths. Was the bad configuration change you ran into one of those?

jandeschuttere · 2025-09-15T07:31:13Z

Correct, we accidentally had loaded expired certs on the deployment.

I have not revisited the code right now but the check to identify whether or not the certs are valid was further down the chain. The reason I went for this approach, and not the validation - iirc, was because this might meant a bigger rewrite. I can have a look at it again if you think this would be the path forward?

Do you know if there was a particular reason this was done so late in the "game", when the state already was mutated? Are there code paths that could lead this check to be used on other events (like f.i. a restart of sorts)?

What's the point of view on whether or not a reboot should prevent the service to be reachable in case of an expired certificate? Is it allowed and should there be proper error logging/notifications, or should it be prevented by all means?

kevinmcconnell · 2025-09-15T14:51:39Z

The reason I went for this approach, and not the validation - iirc, was because this might meant a bigger rewrite. I can have a look at it again if you think this would be the path forward?

Great, yes I think that would be good. What I had in mind, in the short term, is just reordering what UpdateOptions does, so that if there's an error initializing it will return without mutating the service. So something along the lines of:

func (s *Service) UpdateOptions(options ServiceOptions, targetOptions TargetOptions) error {
	err := s.initialize()
	if err != nil {
		return err
	}

	s.options = options
	s.targetOptions = targetOptions
	return nil
}

(I haven't tested this out, but it seems like something like that should correct the current problem.)

Do you know if there was a particular reason this was done so late in the "game", when the state already was mutated? Are there code paths that could lead this check to be used on other events (like f.i. a restart of sorts)?

I don't think it was intentional. Originally the only part that could fail a deployment was the healthcheck stage, and so the rest of the operation proceeded after that state succeeds.

What's the point of view on whether or not a reboot should prevent the service to be reachable in case of an expired certificate? Is it allowed and should there be proper error logging/notifications, or should it be prevented by all means?

I think in the case of manual certs, we should boot the service as long as we can load the certificate. If it's expired, that's something that should be corrected by the operator, but the proxy shouldn't refuse to boot the service on restart just because of it. However trying to deploy a new service version and referencing an expired custom certificate, we should probably reject that deployment. It seems like a difference between validating the arguments on deploy, and using whatever state has been previously accepted.

What do you think -- does that sound reasonable to you?

jandeschuttere · 2025-09-19T13:11:44Z

internal/server/service.go

 	return nil
 }

+func (s *Service) validateOptions(options ServiceOptions) error {


I'm a bit sad on having to do this, but I assume the way it was set up was because of a certain desire on how the code is structured. An alternative I'd preferred was passing the options until they have been verified as working, and at that point set up the full state, which feels closer to what I assume the initialize was desired to be doing.

The way that was constructed could lead to the witnessed invalid state, hence the validation which duplicates that logic.

If you feel like there's merit in changing the logic of initialize to do just that, let me know, I am willing to go the extra mile on making those changes. I was hesitant on doing so because I think (assume! - and respect -) this is more of a Ruby type of styling and I don't want to break that convention.

Ah, yes, I don't think we need to construct these objects twice.

What I had in mind was what you're describing: perform the initialize step before we change any of the fields on the Service. Only if initialize returns without error would we then update the state of the Service.

So it does mean passing the options into initialize, yes, and also calling it earlier in UpdateOptions. And if it's clearer, have initialize return everything it creates, so that updating the state is all in one place. So perhaps it'd look more like:

certManager, middleware, err := s.initialize(options) if err != nil { return err } s.certManager = certManager s.middleware = middleware s.options = options s.targetOptions = targetOptions

It's totally fine to change initialize as necessary to do this -- it's not really trying to follow a particular convention.

Does that make sense?

Ok great to hear and thank you for bringing clarity.

I've chosen the route of passing the options to the initialize to keep the dependency clear, yet allowing that call to still initialize the state of the instance within the instance itself. I did like the simple return of only the error which was already there and I did not yet see a need myself of passing down that knowledge/responsibility.

what do you think of my last pushed update?

This looks great now! Thanks for taking another pass at it, @jandeschuttere!

I'm thinking we can update the TargetOptions after the initialize as well, but I'll take a look at that bit separately. In any case, this fixes the issue that you had observed.

jandeschuttere force-pushed the prevent-saving-invalid-state-on-deploy branch from 8eee415 to 0fc85a0 Compare September 19, 2025 13:01

jandeschuttere commented Sep 19, 2025

View reviewed changes

Prevent saving invalid configuration on deploy

28deaa4

jandeschuttere force-pushed the prevent-saving-invalid-state-on-deploy branch from 0fc85a0 to 28deaa4 Compare September 19, 2025 14:51

jandeschuttere requested a review from kevinmcconnell September 19, 2025 14:55

kevinmcconnell merged commit b62efc1 into basecamp:main Sep 22, 2025
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Prevent saving invalid configuration on deploy #155

Prevent saving invalid configuration on deploy #155

Uh oh!

jandeschuttere commented Aug 18, 2025

Uh oh!

jandeschuttere commented Sep 8, 2025

Uh oh!

kevinmcconnell commented Sep 12, 2025

Uh oh!

jandeschuttere commented Sep 15, 2025 •

edited

Loading

Uh oh!

kevinmcconnell commented Sep 15, 2025

Uh oh!

jandeschuttere Sep 19, 2025

Uh oh!

kevinmcconnell Sep 19, 2025

Uh oh!

jandeschuttere Sep 19, 2025

Uh oh!

kevinmcconnell Sep 22, 2025

Uh oh!

Uh oh!

Uh oh!

Prevent saving invalid configuration on deploy #155

Prevent saving invalid configuration on deploy #155

Uh oh!

Conversation

jandeschuttere commented Aug 18, 2025

Uh oh!

jandeschuttere commented Sep 8, 2025

Uh oh!

kevinmcconnell commented Sep 12, 2025

Uh oh!

jandeschuttere commented Sep 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kevinmcconnell commented Sep 15, 2025

Uh oh!

jandeschuttere Sep 19, 2025

Choose a reason for hiding this comment

Uh oh!

kevinmcconnell Sep 19, 2025

Choose a reason for hiding this comment

Uh oh!

jandeschuttere Sep 19, 2025

Choose a reason for hiding this comment

Uh oh!

kevinmcconnell Sep 22, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jandeschuttere commented Sep 15, 2025 •

edited

Loading