Extract permission node creation 8201 #8963

tkalir · 2025-04-14T13:35:10Z

Related to #8201

Change Description

Automating the documentation for endpoint permissions requires decoupling the permission objects from the endpoint code. This PR extracts the permission objects to a separate file.
Since the permission code is sensitive, I opted for a straightforward refactor as a first step. This also makes it possible to test the next changes in the permission code separately from the endpoint code.
The functions are named after the logAction operation names when available (e.g. copy_objects -> CopyObjectsPermissions).

Testing Details

When refactoring I used the IDE's "Extract Method" feature for consistency and to minimize room for error. I also ran the tests locally (excluding the Hadoop-related tests, which I had trouble running locally).
Let me know if you think additional coverage is required.


arielshaqed

Hi @tkalir ,

THANKS! Overall this change looks fine. I am curious how this will continue. At some point you will need to get strings from all of these permissions, and you will need something in memory that holds all usable operations, and the knowledge that it matches the list of all your funcs. This will still be an implementation issue.

As an alternative: it seems that you have a small number of truly different "*Permissions" funcs. Could we instead implement something like

// MakeGroupPermissions returns a constructor for group-scoped nodes bound to one action.
func MakeGroupPermissions(action string) func(groupID string) permissions.Node {
	return func(groupID string) permissions.Node {
		return permissions.Node{
			Permission: permissions.Permission{
				Action:   action,
				Resource: permissions.GroupArn(groupID),
			},
		}
	}
}

// A "few" other similar Make*Permissions, say MakeObjectPermissions, MakePolicyPermissions, and similar.

Now you might store a Make*Permissions func in a map where the index is the operation name. And then you load in controller.go a permissions func, and then call that func. With a few tweaks we will be able to use that map to print documentation.
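A minimal sketch of that map, assuming simplified stand-ins for the `permissions` types. The `groupArn` helper, the operation names used as keys, and the action strings are placeholders for illustration, not the project's real values:

```go
package main

import (
	"fmt"
	"sort"
)

// Permission and Node are simplified stand-ins for the types in pkg/permissions.
type Permission struct {
	Action   string
	Resource string
}

type Node struct {
	Permission Permission
}

// groupArn is a hypothetical stand-in for permissions.GroupArn.
func groupArn(groupID string) string { return "group/" + groupID }

// makeGroupPermissions returns a constructor bound to a single action.
func makeGroupPermissions(action string) func(groupID string) Node {
	return func(groupID string) Node {
		return Node{Permission: Permission{Action: action, Resource: groupArn(groupID)}}
	}
}

// groupOperations maps an operation name to its permission constructor.
// Keys and actions are illustrative only.
var groupOperations = map[string]func(groupID string) Node{
	"deleteGroup":      makeGroupPermissions("auth:DeleteGroup"),
	"listGroupMembers": makeGroupPermissions("auth:ReadGroup"),
}

// operationNames lists the registered operations in sorted order,
// which is the kind of walk a documentation printer could start from.
func operationNames() []string {
	names := make([]string, 0, len(groupOperations))
	for name := range groupOperations {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Controller side: load the func for an operation, then call it.
	node := groupOperations["deleteGroup"]("admins")
	fmt.Println(node.Permission.Action, node.Permission.Resource)

	// Documentation side: enumerate what is registered.
	fmt.Println(operationNames())
}
```

Note that the map here only holds single-argument constructors; operations that need a different number of arguments would need their own map or a shared parameter struct.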

WDYT?

arielshaqed · 2025-04-17T08:45:20Z

pkg/permissions/operationPermissions.go

@@ -0,0 +1,1210 @@
+package permissions


This is a comment about the filename. Unfortunately filenames can be really tough. We dislike capitals in filenames for two reasons:

The existing Go code base uses underscores not uppercase.

Right now we have only one such file, which is auto-generated by a tool:

❯ find ./pkg/ ./cmd/ -name '*[A-Z]*.go'
./pkg/metastore/hive/gen-go/hive_metastore/GoUnusedProtection__.go

Uppercase can actively be harmful because of bad defaults on MacOS. By default, MacOS creates a case-insensitive but case-preserving filesystem when you set it up. So this file will also be accessible by names operationpermissions.go, opeRATionPermissions.go, and others. But as soon as someone does that by mistake, every other OS fails to find that file. We have had bad experience here. So we try to avoid uppercase where it matters.
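To make the failure mode concrete, here is a small sketch of a check for names that would collide on a case-insensitive filesystem. It is an illustration only: `hasUppercase` and `caseCollisions` are hypothetical helpers, and `strings.ToLower` is just an approximation of how a real filesystem folds case.

```go
package main

import (
	"fmt"
	"strings"
)

// hasUppercase reports whether a file name contains any uppercase letter.
func hasUppercase(name string) bool {
	return name != strings.ToLower(name)
}

// caseCollisions groups names that differ only by case. Only groups with
// more than one member are returned, keyed by the lowercased name.
func caseCollisions(names []string) map[string][]string {
	groups := make(map[string][]string)
	for _, n := range names {
		key := strings.ToLower(n)
		groups[key] = append(groups[key], n)
	}
	for key, members := range groups {
		if len(members) < 2 {
			delete(groups, key) // deleting during range is safe in Go
		}
	}
	return groups
}

func main() {
	names := []string{"operationPermissions.go", "operationpermissions.go", "controller.go"}
	for _, n := range names {
		fmt.Printf("%s uppercase=%v\n", n, hasUppercase(n))
	}
	fmt.Println(caseCollisions(names))
}
```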

BUT please make this change only after we approve all other issues on the review. Otherwise GitHub tends to lose the comments when you rename a file. 😢

Got it, thanks for the explanation. I'll rename the file after the rest is fine.

tkalir · 2025-04-20T14:55:15Z

@arielshaqed thanks for reviewing!

My initial thought was doing something similar to the draft here.
The TL;DR version is along the lines of mapping each operation name to a permissions object that has the resource template (e.g. "group/{groupID}") in the resource field. The doc generation part will use this object as is, and when used in the endpoint flow it will call a "fill" function that substitutes the placeholders with the actual values.
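A rough sketch of what such a "fill" function could look like, assuming placeholders are written as `{name}`. The function name `fillTemplate` and the choice to return an error on a missing value are assumptions for illustration, not part of the draft:

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches {name} where name consists of letters only.
var placeholder = regexp.MustCompile(`\{([A-Za-z]+)\}`)

// fillTemplate substitutes every {name} placeholder with values[name].
// It returns an error if any placeholder has no value, so that a partially
// filled resource string is never used for an authorization check.
func fillTemplate(template string, values map[string]string) (string, error) {
	var missing []string
	out := placeholder.ReplaceAllStringFunc(template, func(m string) string {
		name := m[1 : len(m)-1] // strip the surrounding braces
		v, ok := values[name]
		if !ok {
			missing = append(missing, name)
			return m
		}
		return v
	})
	if len(missing) > 0 {
		return "", fmt.Errorf("no value for placeholders %v in %q", missing, template)
	}
	return out, nil
}

func main() {
	// Endpoint flow: fill the template with request data.
	filled, err := fillTemplate("group/{groupID}", map[string]string{"groupID": "admins"})
	fmt.Println(filled, err)

	// A missing value is reported instead of silently passing through.
	_, err = fillTemplate("group/{groupID}", map[string]string{})
	fmt.Println(err)
}
```

The doc generation flow would skip `fillTemplate` entirely and print the template string as is.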

If I understand your suggestion correctly, the doc-gen code is supposed to call the function that MakeGroupPermissions (etc) returns, which means it needs to know how many arguments this function expects. It's not impossible to solve, but I think the template solution is easier in this case.

Unrelated question - there's an automatic check that fails this PR for not closing an issue. I'd rather split the code into multiple PRs (I also want to extract the log_action strings to enums before making the map object, which I don't think belongs here). In that case, this PR won't close the original issue. Should I create another issue for this PR to close?
Thanks

arielshaqed

Thanks!

Firstly, very sorry about dropping this PR. I completely missed it, and that is wrong. I will change my workflow around emails from GitHub in order to see if I can prevent this in future.

Thanks for the detailed work! I find this PR quite hard to review; there are unfortunately many changed lines. And I am certain it was even harder to write. Mistakes here are, by definition, security bugs - so I will have to automate this part. To that end - I see that you changed some operation names. This makes automating the check impossible. Could you please ensure that names are kept? I commented in 2 places.

I am also unsure how this helps us auto-generate documentation. That is probably because I do not understand the next step. Is the plan to process permissions/operationPermissions somehow and extract permissions from there?

arielshaqed · 2025-06-05T10:34:08Z

pkg/api/controller.go

-			Resource: permissions.GroupArn(groupID),
-		},
-	}) {
+	if !c.authorize(w, r, permissions.ListGroupUsersPermissions(groupID)) {


Can we keep the same names, please?

arielshaqed · 2025-06-05T10:34:24Z

pkg/api/controller.go

-			Resource: permissions.GroupArn(groupID),
-		},
-	}) {
+	if !c.authorize(w, r, permissions.RemoveUserFromGroupPermissions(groupID)) {


Also here and elsewhere in the file, of course.


tkalir · 2025-06-24T18:18:35Z

Hi @arielshaqed thanks for the review.

Honestly, I didn't realize that the function names in the controller should match the operationIds, so as I mentioned, I named the functions after the logAction strings. I corrected that now.
Regarding how this PR helps: my next step is to construct the operationId -> permission-node map, where the resource parts of the nodes are templates. So the docgen flow can use these nodes as-is, and the controller flow will fill these templates with the request data.

I intend to replace the code in the operationPermissions functions to use and fill the nodes from the map.
I did not include it in this PR because I thought it would be even harder to review and verify.
Additionally, if the current operationPermissions functions are correct, it will be easy to verify that the new versions (that use the map) return the same results.

Part of my assumption was that this PR would be hard to review and verify automatically, so I tried to make the changes as straightforward as possible (extracting code from each controller function to a helper function). I am open to suggestions on how to make such PRs more reviewable in the future.
I am also curious to see how you plan to automate the verification that the pre-PR and post-PR code behave the same; I didn't think of a good way to do so myself.

Thanks!

arielshaqed · 2025-07-04T15:58:03Z

Hi @tkalir ,
Apologies for another delay - I've been unwell. Better now, so will reply by Sunday.

arielshaqed · 2025-07-06T18:22:21Z

I will write a script to extract permissions used for each call. I think that may give us confidence we're doing the right thing here.

arielshaqed · 2025-07-07T06:04:35Z

Here is a more concrete proposal. The chief driving force here is reliability: reduce the risk of human error when extracting the correct permissions by automating it a bit. So I wrote a dumb script that just manages to extract the permissions. Because this is a forklift (we replace one thing with another, but we do not plan on doing the same replacement again), it is allowed for the script to fail and say that it failed in some cases. This is better than making a silent error.

After wasting some time and burning the planet using Gemini Code (pro tip: it doesn't work), I hacked together the attached Python script and ran it on controller.go. It manages to extract permissions for most of the API functions - we can do the remaining 4 manually. Because of reasonably strict error checking in the script, I feel that the automation here is trustworthy.

I think next up we can extract permissions. We will need to extract permissions to a more usable object. For instance, if we had an interface Allower with a method Allowed(permissionParams, r http.Request, w *http.Response) then we could have a map element "create_presign_multipart_upload" -> AllowByPath(...).

An implementation of AllowByPath would be something like this:

type AllowByPath struct {
	Controller *Controller // assumed field: Allowed below calls a.Controller
	Action     string
}

// Allowed checks the action against the object ARN built from the request params.
func (a *AllowByPath) Allowed(params PermissionParams, r http.Request, w *http.Response) bool {
	return a.Controller.AuthorizeCallback(w, r, permissions.Node{
		Permission: permissions.Permission{
			Action:   a.Action,
			Resource: permissions.ObjectArn(params.Repository, params.Path),
		},
	})
}

Once we've done that, we can generate documentation by adding another method to Allower. And we can order the documentation correctly by ordering all the function names output in permissions.out.
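As a sketch of that last step, assuming the extra method is called `Describe` and that the desired order is available as a slice of function names (for instance, read from permissions.out). Both the method name and the output format are hypothetical:

```go
package main

import "fmt"

// Documenter is the hypothetical extra method on Allower used for docs.
type Documenter interface {
	Describe() string
}

// AllowByPath mirrors the struct above, reduced to what documentation needs.
type AllowByPath struct {
	Action string
}

// Describe renders the permission in terms of its parameters, not their values.
func (a AllowByPath) Describe() string {
	return a.Action + " on the object at {repository}/{path}"
}

// docLines renders one line per operation, in the given order. It fails
// loudly when an operation has no registered allower, instead of skipping it.
func docLines(order []string, allowers map[string]Documenter) ([]string, error) {
	lines := make([]string, 0, len(order))
	for _, name := range order {
		a, ok := allowers[name]
		if !ok {
			return nil, fmt.Errorf("operation %q has no registered allower", name)
		}
		lines = append(lines, name+": "+a.Describe())
	}
	return lines, nil
}

func main() {
	// Action strings are illustrative placeholders.
	allowers := map[string]Documenter{
		"create_presign_multipart_upload": AllowByPath{Action: "fs:WriteObject"},
		"get_object":                      AllowByPath{Action: "fs:ReadObject"},
	}
	order := []string{"get_object", "create_presign_multipart_upload"}

	lines, err := docLines(order, allowers)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```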

WDYT?

permissions.out.gz
extract_permissions.py.gz

tkalir · 2025-07-08T22:06:37Z

Thanks, this looks like a solid idea.

My main concern with this approach is how to verify the code's correctness after the change.
In the last PR, I relied on the IDE's refactoring tool to reduce the chance of human error when extracting the nodes' code into functions. With this approach, the IDE won't be as much help. While this approach does result in far fewer new functions, I think the new objects + the changes in the calling code still result in many changes that would be hard to check manually.

One option for mitigating that could be verifying that Authorize() receives the same input when the same endpoints are called with the same parameters. To try this, I called LinkPhysicalAddress() with a fake Controller object whose Authorize() logs the permission object, and it worked.
So what I could do is:

1. create a fake controller that logs Authorize()'s input to a file
2. call all endpoint handlers
3. write the new code
4. call the endpoint handlers again
5. check that the outputs are the same

I suspect the test code would have to change a little for each endpoint, so the test code won't be small, but it is doable.
It might be overkill, but since this touches security code across all handlers, it might be justified. What do you think? Is there a simpler way to test this change?
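The record-and-compare idea could be sketched roughly as follows. This is a simplified illustration: the real Authorize() takes a response writer and a request, whereas the hypothetical `recorder` here is keyed by an endpoint name and keeps results in memory rather than in a file.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Permission and Node are simplified stand-ins for the permissions types.
type Permission struct {
	Action   string
	Resource string
}

type Node struct {
	Permission Permission
	Nodes      []Node
}

// recorder plays the role of the fake controller: it records every node
// passed to Authorize instead of enforcing anything.
type recorder struct {
	calls map[string][]Node // endpoint name -> nodes seen, in call order
}

func newRecorder() *recorder {
	return &recorder{calls: make(map[string][]Node)}
}

// Authorize records the node and always allows the request.
func (r *recorder) Authorize(endpoint string, n Node) bool {
	r.calls[endpoint] = append(r.calls[endpoint], n)
	return true
}

// diff returns, sorted, the endpoints whose recorded nodes differ between runs.
func diff(before, after *recorder) []string {
	var changed []string
	for name, nodes := range before.calls {
		if !reflect.DeepEqual(nodes, after.calls[name]) {
			changed = append(changed, name)
		}
	}
	for name := range after.calls {
		if _, ok := before.calls[name]; !ok {
			changed = append(changed, name)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	before, after := newRecorder(), newRecorder()
	node := Node{Permission: Permission{Action: "fs:WriteObject", Resource: "repo/r/object/p"}}

	before.Authorize("linkPhysicalAddress", node) // run against the old code
	after.Authorize("linkPhysicalAddress", node)  // run against the new code

	fmt.Println("changed endpoints:", diff(before, after))
}
```

An empty result means both versions of the code asked for the same permissions with the same inputs; it says nothing about endpoints that were never called.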

arielshaqed · 2025-07-14T11:43:41Z

Since we have reproducible automation, I suggest we:

1. Review the Python script (I wrote it, so either you review it or I find a third person).
2. Sample 10 of its outputs.
3. Go on from there.

(Of course change the script if you need to tweak its output).

tkalir · 2025-07-17T14:14:21Z

I do believe the script works as intended - but my concern is also with the boilerplate that the new code will require, which the script won't help with verifying.
But we can postpone this discussion to after I write the code.
(Or, if you prefer, I can start with doing only the flows involving AllowByPath to make the review more manageable)

tkalir · 2025-07-24T07:25:07Z

Hi @arielshaqed ,

Working on the code, I noticed that some endpoints pass a callback to AuthorizeCallback, which will need to be passed to Allowers. It can certainly work, but it made me wonder if we can simplify the design.

This suggestion assumes that the only real difference between Allowers is the Node object they use; let me know if you think that might change.
Otherwise, I suggest storing in the map objects like these:

type PermissionDescriptor interface {
	Permission(params PermissionParams) permissions.Node
	// <doc generation function>
}

type PathPermissionDescriptor struct {
	Action string
}

func (p *PathPermissionDescriptor) Permission(params PermissionParams) permissions.Node {
	return permissions.Node{
		Permission: permissions.Permission{
			Action:   p.Action,
			Resource: permissions.ObjectArn(params.Repository, params.Path),
		},
	}
}

Then, instead of calling authorize() directly, we can wrap it in a function that receives OperationId and all the relevant params, and have it handle the map and call authorize(). This way there's less chance that changes to the flow will ripple to every endpoint.

What do you think?

arielshaqed · 2025-07-24T07:34:00Z

Hi @tkalir ,

Really glad we're back on this!

Sorry, I am not sure about this part of your suggestion; I am probably missing your point in all of this.

	return permissions.Node{
		Permission: permissions.Permission{
			Action:   p.Action,
			Resource: permissions.ObjectArn(params.Repository, params.Path),
		},
	}

There are many more types of Nodes, in actual use! I mean, many Nodes refer to other ARNs than object ARNs, and we sometimes use "AND" Nodes.

I think you refer to more than what I read -- and would be glad if you could help me understand.

tkalir · 2025-07-24T13:12:01Z

Hi, I'll explain a bit more.

This suggestion is similar to your last one - having all objects in the map share an interface (PermissionDescriptor), with the same number of implementations. Meaning, every time the permission node is the same ARN structure with a different action, they share an implementation. Times when AND is used may require a different implementation each.

Comparing to the Allowers design, it's just about splitting a bit differently what's in the interface implementation and what's in the calling code. Instead of having each implementation calling AuthorizeCallback with a different node object, I am having each implementation returning a different node, which the calling code passes to Authorize().

Some more examples:

type BranchPermissionDescriptor struct {
	Action string
}

func (b *BranchPermissionDescriptor) Permission(params PermissionParams) permissions.Node {
	return permissions.Node{
		Permission: permissions.Permission{
			Action:   b.Action,
			Resource: permissions.BranchArn(params.Repository, params.Branch),
		},
	}
}

example using AND:

type CreateRepoPermissionDescriptor struct {}

func (c *CreateRepoPermissionDescriptor) Permission(params PermissionParams) permissions.Node {
	return permissions.Node{
		Type: permissions.NodeTypeAnd,
		Nodes: []permissions.Node{
			{
				Permission: permissions.Permission{
					Action:   permissions.CreateRepositoryAction,
					Resource: permissions.RepoArn(params.Repository),
				},
			},
			{
				Permission: permissions.Permission{
					Action:   permissions.AttachStorageNamespaceAction,
					Resource: permissions.StorageNamespace(params.StorageNamespace),
				},
			},
		},
	}
}

(PermissionDescriptor seems too long, if we use this design we can go with "OpPermission", "PermissionSpec" or something similar)
Makes sense?
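For the doc generation side, one possible way to consume such nodes is to walk the tree and collect the leaf permissions. The sketch below assumes simplified stand-in types and only handles plain and AND nodes; `requiredPermissions` is a hypothetical helper, and the resource strings are placeholders.

```go
package main

import "fmt"

// NodeType, Permission and Node are simplified stand-ins for pkg/permissions.
type NodeType int

const (
	NodeTypeNode NodeType = iota // a leaf holding a single permission
	NodeTypeAnd                  // all child nodes must be allowed
)

type Permission struct {
	Action   string
	Resource string
}

type Node struct {
	Type       NodeType
	Permission Permission
	Nodes      []Node
}

// requiredPermissions walks the tree depth-first and returns the leaf
// permissions in order. For AND nodes this is the full set a caller needs.
func requiredPermissions(n Node) []Permission {
	if n.Type == NodeTypeNode {
		return []Permission{n.Permission}
	}
	var out []Permission
	for _, child := range n.Nodes {
		out = append(out, requiredPermissions(child)...)
	}
	return out
}

func main() {
	// Shaped like the CreateRepo example, with template-style resources.
	createRepo := Node{
		Type: NodeTypeAnd,
		Nodes: []Node{
			{Permission: Permission{Action: "fs:CreateRepository", Resource: "repository/{repository}"}},
			{Permission: Permission{Action: "fs:AttachStorageNamespace", Resource: "namespace/{storageNamespace}"}},
		},
	}
	for _, p := range requiredPermissions(createRepo) {
		fmt.Printf("%s on %s\n", p.Action, p.Resource)
	}
}
```

Other node types, if any are in use, would need their own rendering, since a flat list would no longer describe them correctly.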

arielshaqed · 2025-07-31T11:33:59Z

Thanks for the explanation - this really makes a lot of sense! So most of the logic I thought you needed to put in the PermissionDescriptor, you actually put into the factory (constructor) functions Permission, right?

You've obviously put a great deal of understanding into our requirements, even those I did not phrase formally. And I think this proposal offers a great path forward.

I am asking for some clarifications, to be sure that I understand and to try to avoid some blockers. I really think we can manage all of these, either this way or another way.

IIUC this means PermissionParams will have many more fields, for branches, source objects (for copy), repositories, users, groups, tags, and all that. That sounds great, I had not thought of it like that. If I now understand correctly, my only request is that PermissionParams fields will be pointers to strings not strings: I would like code in controller.go to be very clear about the difference between "I don't have a parameter value to give here" (nil) and "I have a parameter value to give here, and it is the empty string" (a pointer to an empty string). I realize it's probably OK because empty strings probably don't make sense for any of the parameters. But... maybe they do? Maybe an empty path component could make sense in future? If we do it with pointers we will never have to worry about it.

Creating a pointer to a string can be unpleasant in Go, particularly when you have a literal or a computed string. We use swag.String(...) which does that for us; you could use that in your upcoming code too.
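A small sketch of the nil versus empty-string distinction, assuming a reduced `PermissionParams` with only two fields. The local `strPtr` helper does the same job as swag.String and is used only to keep the example free of external imports:

```go
package main

import "fmt"

// PermissionParams is a hypothetical, reduced version with pointer fields.
type PermissionParams struct {
	Repository *string
	Path       *string
}

// strPtr returns a pointer to a copy of s, like swag.String.
func strPtr(s string) *string { return &s }

// describe makes the three possible states of a parameter explicit.
func describe(p *string) string {
	switch {
	case p == nil:
		return "absent"
	case *p == "":
		return "present but empty"
	default:
		return fmt.Sprintf("present: %q", *p)
	}
}

func main() {
	params := PermissionParams{
		Repository: strPtr("example-repo"),
		Path:       strPtr(""), // an explicitly empty path
	}
	fmt.Println("repository:", describe(params.Repository))
	fmt.Println("path:", describe(params.Path))

	var unset PermissionParams // nothing provided
	fmt.Println("unset path:", describe(unset.Path))
}
```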

tkalir · 2025-08-10T16:33:36Z

Hi, I was sick for a few days so it got delayed.
Thanks for the feedback!

Regarding where the logic is located, to make sure we are on the same page, my suggestion is to replace Authorize() with a function like this

func (c *Controller) authorizeReq(w http.ResponseWriter, r *http.Request, params PermissionParams, operationId string, opts *AuthorizeOpts) bool {
	desc := GetPermissionDescriptor(operationId)
	callback := writeError
	if opts != nil && opts.callback != nil {
		callback = opts.callback
	}
	return c.authorizeCallback(w, r, desc.Permission(params), callback)
}
// currently custom callback is needed only in ~2 endpoints, but having opts reduces 
// the chance of needing to change the function's signature in the future

GetPermissionDescriptor just looks in the map for a PermissionDescriptor object, using the operationId as key.
The map is populated by (for the examples from my last message)

"CreatePresignMultipartUpload" -> &BranchPermissionDescriptor{Action: permissions.WriteObjectAction}
"CreateRepo" -> &CreateRepoPermissionDescriptor{}

So not a lot of logic overall.
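One detail worth sketching is what the lookup does for an operationId that is not in the map: indexing a Go map with a missing key yields a nil interface, and calling a method on it panics. The version below returns an error instead, so the caller can deny the request. The name `lookupDescriptor`, the plain-string params, the key, and the resource format are all assumptions for illustration:

```go
package main

import "fmt"

// Simplified stand-ins for the permissions types.
type Permission struct {
	Action   string
	Resource string
}

type Node struct {
	Permission Permission
}

// PermissionParams uses plain strings here to keep the sketch short.
type PermissionParams struct {
	Repository string
	Branch     string
}

type PermissionDescriptor interface {
	Permission(params PermissionParams) Node
}

type BranchPermissionDescriptor struct {
	Action string
}

func (b *BranchPermissionDescriptor) Permission(params PermissionParams) Node {
	return Node{Permission: Permission{
		Action:   b.Action,
		Resource: "repository/" + params.Repository + "/branch/" + params.Branch, // placeholder format
	}}
}

// descriptors maps operationId to descriptor. The entry is illustrative.
var descriptors = map[string]PermissionDescriptor{
	"createBranch": &BranchPermissionDescriptor{Action: "fs:CreateBranch"},
}

// lookupDescriptor returns an error for unknown operations rather than nil,
// so that a missing registration cannot turn into a nil-pointer panic.
func lookupDescriptor(operationID string) (PermissionDescriptor, error) {
	d, ok := descriptors[operationID]
	if !ok {
		return nil, fmt.Errorf("no permission descriptor for operation %q", operationID)
	}
	return d, nil
}

func main() {
	d, err := lookupDescriptor("createBranch")
	if err != nil {
		fmt.Println("deny:", err)
		return
	}
	fmt.Println(d.Permission(PermissionParams{Repository: "repo", Branch: "main"}))

	_, err = lookupDescriptor("createBranhc") // typo in the key
	fmt.Println("deny:", err)
}
```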

Some additional points:

as you said, PermissionParams will use multiple fields for tags, branches, etc. I think that currently there won't be a use for having nil values there, but future-proofing won't harm us, I'll do as you suggested.
which keys to use in the map - I suggest using the operationId names, like createPresignMultipartUpload. When generating the docs we will need to fetch info from the swagger (like the endpoint url for the operation), and the swagger uses operationIds.
There are a few possibilities:
a - using hard coded strings as keys in the map
b - generate operationId constants from the swagger and use them as keys
c - generate Operation objects from the swagger, which will include the operationIds and endpoint urls (so parsing the swagger in the doc gen part won't be necessary). If we want to create operation->logAction mapping, the logAction names can be added as custom fields in the swagger, for the times where getting logAction from operationId is not trivial.
In this option we use <operation>.id as key.

A or B will be easier, as this issue doesn't necessitate dealing with the logAction names, but tell me what you think.

arielshaqed · 2025-08-11T06:52:39Z

Glad to hear you're better!

I'm happy with any of the options. Personally I do not see much benefit in consts / string enums / int enums compared to just copying the operation ID. I mean, it basically means writing stuff like

func GetObject(...params) {
    something.Permissions(GetObjectActionName, ...)
    // ...
}

rather than

func GetObject(...params) {
    something.Permissions("get-object", ...)
    // ...
}

The minor advantage to the first is that it catches some errors at compile-time rather than at run-time. But:

If you write "geet-object" with an error, then any test of the code will fail. And I really hope we test our changes by running them at least once 🙀
A much more common error for me is to copy-paste and end up with "put-object" instead of "get-object". And then I'd be as likely to write PutObjectActionName.

So no strong feelings here.
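To illustrate the two kinds of error, here is a sketch of a run-time check that reports operation names that are used but not registered. The `unregistered` helper and the names are hypothetical; the point is that it catches the typo, while the copy-paste mistake passes the check with either literals or constants.

```go
package main

import (
	"fmt"
	"sort"
)

// unregistered returns, sorted, the names in used that are not registered.
func unregistered(used []string, registered map[string]bool) []string {
	var missing []string
	for _, name := range used {
		if !registered[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	registered := map[string]bool{"get-object": true, "put-object": true}

	// A typo is caught the first time the check (or any test) runs.
	fmt.Println(unregistered([]string{"get-object", "geet-object"}, registered))

	// A copy-paste error uses a valid name, so nothing is reported.
	fmt.Println(unregistered([]string{"put-object", "put-object"}, registered))
}
```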

arielshaqed

(empty comment to make it your turn to review)

tkalir · 2025-08-11T07:52:35Z

thanks!

I think another benefit of having constants is that it makes it easier to find where in the code the string is used (for this flow's purpose). I mean, the operationId might appear in multiple places in the code base, but the constant appears only in this flow.
However, since you don't have a preference and this task is quite big as it is, I think I'll go for the simpler solution.

I think my next step will be doing a PR for migrating the Path-using endpoints to this method, and then we can talk testing with concrete examples.

tkalir added 2 commits April 9, 2025 15:01

extracting operation permission nodes from controller

5e5a1f2

Merge branch 'master' into extract-permission-node-creation-8201

30166b1

tkalir marked this pull request as draft April 14, 2025 13:35

tkalir added 2 commits April 14, 2025 16:57

renaming GetCreatePresignMultipartUploadPermissions to CreatePresignM…

1309b79

…ultipartUploadPermissions

renaming operationPermissions.go functions to fit with operation names

aed75a1

tkalir marked this pull request as ready for review April 15, 2025 22:04

arielshaqed self-requested a review April 16, 2025 15:09

arielshaqed reviewed Apr 17, 2025


tkalir added 2 commits April 20, 2025 15:40

Merge branch 'master' into extract-permission-node-creation-8201

f72f8a6

fixing linter errors in operationPermissions.go

3651b7d

arielshaqed reviewed Jun 5, 2025


tkalir added 3 commits June 24, 2025 17:40

Merge branch 'refs/heads/master' into extract-permission-node-creatio…

86fcf95

…n-8201

changing operationPermissions.go functions from logAction to operatio…

112e560

…nId names

changing GetCredentialsForUserPermissions -> GetCredentialsPermissions

00506ca

tkalir requested a review from arielshaqed August 10, 2025 16:34

arielshaqed reviewed Aug 11, 2025


tkalir mentioned this pull request Aug 12, 2025

refactoring object-operation endpoints to use permission descriptors #9416

Open
