Use a queue for execution #5389
base: master
Conversation
There's still a lot to do here for compatibility and performance, but the initial benchmark results are surprisingly good (even with the terrible memory overhead described below). Benchmarked on `ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-darwin22]`:

```diff
Calculating -------------------------------------
graphql-ruby: 142 resolvers
-   399.549 (± 8.5%) i/s (2.50 ms/i) - 2.000k in 5.048521s
+   538.453 (± 3.9%) i/s (1.86 ms/i) - 2.695k in 5.013324s
graphql-ruby: 140002 resolvers
-     0.517 (± 0.0%) i/s (1.93 s/i) - 3.000 in 5.797333s
+     0.761 (± 0.0%) i/s (1.31 s/i) - 4.000 in 5.272811s

- Total allocated: 10149304 bytes (70353 objects)
+ Total allocated: 13431224 bytes (97389 objects)
```

So that's a 34% or 47% speedup. It's using 30% more memory, because it's currently the Simplest Thing That Could Possibly Work. Presumably refactoring it to avoid those repeated allocations will make it even faster!
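As a sanity check, the quoted percentages fall straight out of the i/s and allocation numbers above (the allocation growth works out closer to 32% than 30%). A quick Ruby snippet to verify:

```ruby
# Double-check the percentages quoted above directly from the
# benchmark-ips output: i/s before vs. after, and bytes allocated.
def pct_change(before, after)
  ((after / before) - 1.0) * 100.0
end

small_speedup = pct_change(399.549, 538.453)          # 142 resolvers
large_speedup = pct_change(0.517, 0.761)              # 140002 resolvers
memory_growth = pct_change(10_149_304.0, 13_431_224.0)

puts format("small: +%.1f%%, large: +%.1f%%, memory: +%.1f%%",
            small_speedup, large_speedup, memory_growth)
# => small: +34.8%, large: +47.2%, memory: +32.3%
```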
I haven't pushed a commit here for a while but I'm still planning on landing this somehow. Here's a quick brain-dump in case anyone is curious:
```ruby
def run
  # TODO: unify the initialization of lazies_at_depth
  @lazies_at_depth ||= Hash.new { |h, k| h[k] = [] }
```
I suppose the disadvantage of this is that we always have to take a lazy pathway for running the dataloader. At present, you can run the dataloader inline and then return directly if there are no lazies enqueued. With this, you'd always need to run the dataloader out-of-band to both run jobs and resolve lazies.
I speak in the context of my own interests. This is certainly simpler, and I presume better for the library.
Yeah, that's a good point -- on this branch, performance is better than baseline after arranging things this way. But I haven't run benchmarks on #5422 yet. I will!
A quick look at the benchmarks revealed that the initial pass at this work should not use the new FlatDataloader (which captures procs but doesn't use Fibers to run them), and instead should have lazy resolution implemented inside the existing NullDataloader (which `yield`s immediately to any work given to it). This keeps performance the same: #5422 (comment)
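To make the distinction concrete, here is a toy contrast between the two behaviors described above. The class names are invented for illustration; these are not graphql-ruby's classes:

```ruby
# A "null"-style loader yields to work immediately; a "flat"-style
# loader captures procs and only runs them when drained out-of-band.
# (Invented names -- not graphql-ruby's NullDataloader/FlatDataloader.)
class NullishLoader
  def append_job(&block)
    # No queue, no Fibers: run the given work right away.
    block.call
  end
end

class FlattishLoader
  def initialize
    @jobs = []
  end

  def append_job(&block)
    @jobs << block # captured, but not run yet
  end

  def run
    @jobs.shift.call until @jobs.empty?
  end
end

log = []
NullishLoader.new.append_job { log << :ran_inline }

flat = FlattishLoader.new
flat.append_job { log << :ran_later }
# Only :ran_inline has happened so far; the flat loader's job waits
# until the loader is drained:
flat.run
p log # => [:ran_inline, :ran_later]
```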
```ruby
  end
end
```
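As a side note, the depth-grouped hash memoized in the hunk above behaves like this toy model: the default block means any depth key can be appended to without initialization, and draining the smallest key first resolves shallow lazies before deeper ones. (The resolver names here are made up for illustration.)

```ruby
# Toy model of a depth-grouped lazies hash: Hash.new's default block
# lazily creates an empty array for any depth on first access.
lazies_at_depth = Hash.new { |h, k| h[k] = [] }

# Pretend these are deferred resolver results enqueued during execution:
lazies_at_depth[2] << -> { :post_title }
lazies_at_depth[1] << -> { :viewer }
lazies_at_depth[2] << -> { :post_body }

resolved = []
until lazies_at_depth.empty?
  depth = lazies_at_depth.keys.min # shallowest pending depth first
  lazies_at_depth.delete(depth).each { |lazy| resolved << lazy.call }
end

p resolved # => [:viewer, :post_title, :post_body]
```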
```ruby
def run_isolated
```
What, basically, does `run_isolated` do? I remember it's used in the context of mutations, so I presume it's a serial-execution concern? In Cardinal we address serial constraints by splitting the root into separate execution scopes that run serially, effectively treating it as N separate executions that happen to run in sequence.
It's used in a couple of hacky places where, for legacy reasons ([which ones??]), we need a dataloader-enabled code block to return right away, without running any other enqueued work.
Yeah, it seems like initializing a new dataloader, using it for one thing, then discarding it would do. I'll give that a try sometime.
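A hypothetical sketch of that "throwaway loader" idea (invented names, not graphql-ruby's API): build a fresh loader, run one block plus whatever it enqueues, then discard the loader, leaving any outer loader's queue untouched.

```ruby
# Sketch: a fresh, single-use work queue per isolated run.
class TinyLoader
  def initialize
    @queue = []
  end

  def append_job(&block)
    @queue << block
  end

  def run
    @queue.shift.call until @queue.empty?
  end

  # Run one block on a brand-new loader, drain only that loader's jobs,
  # and return the block's value right away.
  def self.with_isolated_run
    loader = new
    result = yield(loader)
    loader.run
    result
  end
end

value = TinyLoader.with_isolated_run do |loader|
  loader.append_job { :enqueued_side_effect }
  :immediate_result
end
p value # => :immediate_result
```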
```ruby
end
```

```diff
- attr_accessor :graphql_dead
+ attr_accessor :graphql_dead, :was_scoped
```
What is `was_scoped`...? I've seen it floating around but never read back to figure out what it does, but I'm curious about it.
It implements this feature: https://graphql-ruby.org/authorization/scoping.html#bypassing-object-level-authorization
I'm feeling really inspired by @gmac's proof-of-concept over at https://github.com/gmac/graphql-cardinal/, so I thought I'd try again at refactoring execution to use a queue instead of recursive method calls. I have tried before (#4967, #4968, somewhat #3998, #4935) and given up along the way, but I'm going to try again 😅
TODO:

- `continue_value`, `resolve_list_item`
- `call_method_on_directives`
- `ResolveTypeStep` into `ResultHash`
- `authorized_new` lazy usage into `ResultHash`
- `after_lazy` usages and rebuild them using queue steps
- (`Multiplex`, `Interpreter`, `Dataloader`)
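For anyone skimming, the core idea of the refactor can be sketched roughly like this (invented names, not graphql-ruby's internals): instead of resolvers calling each other recursively, each unit of work is pushed onto an explicit queue and drained in a flat loop, so the Ruby call stack stays shallow no matter how deeply nested the query is.

```ruby
# Minimal sketch of queue-driven execution replacing recursion.
class StepQueue
  def initialize
    @steps = []
  end

  def enqueue(name, &block)
    @steps << [name, block]
  end

  def run
    trace = []
    until @steps.empty?
      name, block = @steps.shift
      trace << name
      block.call(self) # a step may enqueue further steps
    end
    trace
  end
end

queue = StepQueue.new
queue.enqueue(:resolve_root) do |q|
  q.enqueue(:resolve_child) do |q2|
    q2.enqueue(:resolve_leaf) { |_| }
  end
end

trace = queue.run
p trace # => [:resolve_root, :resolve_child, :resolve_leaf]
```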