APIServer: Make container deletes a little more robust #567
+191
−112
Today it's possible for the daemon to receive the container exit event after a user has already invoked `container delete`. When that happens, the delete fails with an error stating that we can't delete the container because it is running. This isn't common unless someone is doing kill/stop+delete in a script, but it is still an issue that needs to be fixed. The true source of truth for state is the sandbox service hosting the container, so if we are guaranteed that it is still alive, we can query it and act on any mismatch between the current snapshot state and what the sandbox service tells us.

To make this possible I've introduced a new RPC on the sandbox service named shutdown. Its purpose is to run some cleanup code (if there is any) and then wind down the process. Prior to this RPC, the sandbox service would basically just call exit(0) of its own accord after the container exited. Now this process is driven by the APIServer instead, which I like a little bit more. The true win, though, is that as long as we haven't sent the shutdown RPC we know the process is alive (barring a bug), so we can query the state of the container the whole time.
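To illustrate the flow described above, here is a rough sketch in Python, not the code from this PR. Every name in it (`SandboxService`, `APIServer`, `State`, `ContainerRunningError`, `container_exited`) is hypothetical and exists only for illustration; the real services talk over RPC rather than direct method calls. It assumes delete consults the sandbox service only when the snapshot claims the container is running, and that the APIServer sends shutdown once it decides to remove the container.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


class ContainerRunningError(Exception):
    """Raised when delete is requested for a container that is truly running."""


class SandboxService:
    """Hypothetical stand-in for the sandbox service hosting one container."""

    def __init__(self) -> None:
        self._state = State.RUNNING
        self.alive = True

    def container_exited(self) -> None:
        # The container exits, but the service no longer exits on its own.
        self._state = State.STOPPED

    def state(self) -> State:
        if not self.alive:
            raise RuntimeError("sandbox service has already been shut down")
        return self._state

    def shutdown(self) -> None:
        # Run any cleanup, then wind the process down.
        self.alive = False


@dataclass
class APIServer:
    """Hypothetical stand-in for the daemon that owns snapshot state."""

    snapshots: dict = field(default_factory=dict)
    sandboxes: dict = field(default_factory=dict)

    def create(self, container_id: str) -> SandboxService:
        sandbox = SandboxService()
        self.sandboxes[container_id] = sandbox
        self.snapshots[container_id] = State.RUNNING
        return sandbox

    def delete(self, container_id: str) -> None:
        if self.snapshots[container_id] is State.RUNNING:
            # The snapshot may be stale because the exit event has not
            # arrived yet, so ask the source of truth.
            live = self.sandboxes[container_id].state()
            if live is State.RUNNING:
                raise ContainerRunningError(container_id)
            self.snapshots[container_id] = live
        # The APIServer, not the sandbox service, drives the shutdown.
        self.sandboxes.pop(container_id).shutdown()
        del self.snapshots[container_id]
```

In this sketch, a delete that races ahead of the exit event succeeds because the stale snapshot is reconciled against the sandbox service first, while a delete of a container that is actually running is still rejected and leaves the sandbox service alive.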