
Conversation


@dcantah commented Sep 2, 2025

Today it's possible for the daemon to receive the container exit event after a user has already invoked `container delete`. The aftermath of that unfortunate scenario is that you'd be met with an error stating we can't delete the container because it is running. This isn't all that common unless someone is doing kill/stop+delete in a script, but it's still an issue that needs to be fixed. The true source of truth for state is the sandbox service hosting the container, so if we were guaranteed that it is still alive, we could query it and take action based on any mismatch between the current snapshot state and what the sandbox service is telling us.

To make this possible I've introduced a new RPC on the sandbox service named `shutdown`. Shutdown's purpose is to run some cleanup code (if there is any) and then wind down the process. Prior to this RPC the sandbox service would basically just call exit(0) of its own accord after the container exited. Now this process is driven by the APIServer instead, which I like a little bit more. The real win here, though, is that as long as we haven't sent the shutdown RPC we know the process is alive (barring a bug), so we can query the state of the container the whole time.
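A minimal sketch of the idea, using hypothetical names (`SandboxClient`, `ContainerState`, `state(id:)`, and `shutdown(id:)` are illustrative stand-ins, not the project's actual API): the API server asks the still-running sandbox service for the container's real state before deciding whether the delete can proceed, and only afterwards tells it to shut down.

```swift
// Hypothetical sketch only: none of these names are the project's real API.
enum ContainerState {
    case running
    case stopped
}

protocol SandboxClient {
    /// Query the live container state from the sandbox service, which stays
    /// alive until it is explicitly told to shut down.
    func state(id: String) async throws -> ContainerState
    /// The new RPC: run any remaining cleanup, then wind the sandbox
    /// service process down (replacing the old self-initiated exit(0)).
    func shutdown(id: String) async throws
}

enum DeleteError: Error {
    case stillRunning(String)
}

func delete(id: String, sandbox: SandboxClient) async throws {
    // Even if the exit event hasn't reached the daemon yet, the sandbox
    // service can report the container's real state.
    guard case .stopped = try await sandbox.state(id: id) else {
        throw DeleteError.stillRunning(id)
    }
    // ... remove the container's on-disk state here ...

    // Teardown of the sandbox service is now driven from the API server
    // side rather than the service exiting on its own.
    try await sandbox.shutdown(id: id)
}
```

The point is only that the delete path can trust a live query instead of the daemon's last snapshot of the container's state.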

@dcantah marked this pull request as ready for review September 2, 2025 17:29
networks: []
)
await self.setContainer(id, snapshot, context: context)
if !autoRemove {
Contributor

perhaps a one-line explanation of why we don't invoke this for the auto-remove case? who's responsible for cleanup?

return
}

if !force {
Contributor

nit: perhaps L243-L258 in an else, and then 237-240 can go away in favor of 259-262?


Task {
do {
try await Task.sleep(for: .seconds(3))
Contributor

If we're in the .shuttingDown state, the state going back to the containers helper will be .stopped, right? Is there any issue with the state being that way for 3 secs while the launch agent is still running? I'm thinking of a new container with the same name being created/run immediately.

@dcantah marked this pull request as draft September 9, 2025 05:57