Skip to content

Extremely slow order_by of nested object while querying nested relationships #8977

@carlosbaraza

Description

@carlosbaraza

Version Information

Server Version: 2.11.2
CLI Version (for CLI related issue): 2.11.2

Environment

MacOS / Linux

What is the current behaviour?

Sorting by a nested table column and at the same time joining big tables causes Hasura to fetch all records, serialise them and then sort, which is extremely slow. Even if a limit and offset are set to paginate the results.

Example query which takes 15 seconds with our dataset:

query ProjectUsers {
  project(
    order_by: {
      users: {
        name: desc
      }
    }
    offset: 0
    limit: 10
  ) {
    id
    user {
      companies {
        id
      }
    }
  }
}

What is the expected behaviour?

I would expect the query to be much faster.

Please provide any traces or logs that could help here.

Analyze performance

Aggregate  (cost=7269481.60..7269481.61 rows=1 width=32)
  ->  Limit  (cost=7269481.45..7269481.48 rows=10 width=46)
        ->  Sort  (cost=7269481.45..7269521.37 rows=15967 width=46)
              Sort Key: users.name DESC
              ->  Nested Loop Left Join  (cost=447.13..7269136.41 rows=15967 width=46)
                    ->  Seq Scan on project  (cost=0.00..638.67 rows=15967 width=32)
                    ->  Nested Loop Left Join  (cost=447.13..455.20 rows=1 width=46)
                          ->  Limit  (cost=0.29..8.30 rows=1 width=442)
                                ->  Index Scan using users_pkey on users  (cost=0.29..8.30 rows=1 width=442)
                                      Index Cond: (id = project.user_id)
                          ->  Aggregate  (cost=446.84..446.85 rows=1 width=32)
                                ->  Seq Scan on user_company  (cost=0.00..446.83 rows=1 width=16)
                                      Filter: (users.id = user_id)
                                SubPlan 3
                                  ->  Result  (cost=0.00..0.01 rows=1 width=32)
                          SubPlan 1
                            ->  Result  (cost=0.00..0.01 rows=1 width=32)
                    SubPlan 2
                      ->  Result  (cost=0.00..0.01 rows=1 width=32)
JIT:
  Functions: 27
  Options: Inlining true, Optimization true, Expressions true, Deforming true

Generated SQL:

SELECT
  coalesce(
    json_agg(
      "root"
      ORDER BY
        "root.or.user.pg.name" DESC NULLS FIRST
    ),
    '[]'
  ) AS "root"
FROM
  (
    SELECT
      row_to_json(
        (
          SELECT
            "_e"
          FROM
            (
              SELECT
                "_root.base"."id" AS "id",
                "_root.or.user"."user" AS "user"
            ) AS "_e"
        )
      ) AS "root",
      "_root.or.user"."root.or.user.pg.name" AS "root.or.user.pg.name"
    FROM
      (
        SELECT
          *
        FROM
          "public"."project"
        WHERE
          ('true')
      ) AS "_root.base"
      LEFT OUTER JOIN LATERAL (
        SELECT
          row_to_json(
            (
              SELECT
                "_e"
              FROM
                (
                  SELECT
                    "_root.or.user.ar.user.companies"."companies" AS "companies"
                ) AS "_e"
            )
          ) AS "user",
          "_root.or.user.base"."name" AS "root.or.user.pg.name"
        FROM
          (
            SELECT
              *
            FROM
              "public"."users"
            WHERE
              (("_root.base"."user_id") = ("id"))
            LIMIT
              1
          ) AS "_root.or.user.base"
          LEFT OUTER JOIN LATERAL (
            SELECT
              coalesce(json_agg("companies"), '[]') AS "companies"
            FROM
              (
                SELECT
                  row_to_json(
                    (
                      SELECT
                        "_e"
                      FROM
                        (
                          SELECT
                            "_root.or.user.ar.user.companies.base"."id" AS "id"
                        ) AS "_e"
                    )
                  ) AS "companies"
                FROM
                  (
                    SELECT
                      *
                    FROM
                      "public"."user_company"
                    WHERE
                      (
                        ("_root.or.user.base"."id") = ("user_id")
                      )
                  ) AS "_root.or.user.ar.user.companies.base"
              ) AS "_root.or.user.ar.user.companies"
          ) AS "_root.or.user.ar.user.companies" ON ('true')
      ) AS "_root.or.user" ON ('true')
    ORDER BY
      "root.or.user.pg.name" DESC NULLS FIRST
    LIMIT
      10 OFFSET 0
  ) AS "_root"

Any possible solutions?

My workaround is to do two queries, one fetching only the ids and then another to fetch all the rest of the data:

  1. Get the page ids (~60ms)
query ProjectUserIds {
  project(
    order_by: {
      users: {
        name: desc
      }
    }
    offset: 0
    limit: 10
  ) {
    id
  }
}
  1. Get all the details for the ids in the current page (~200ms)
query ProjectUserDetails {
  project(
    where: {
      id: { _in: $ids }
    }
  ) {
    user { companies { id } }
    ...other data
  }
}
  1. Sort in the application layer again by the ids returned in the first query.

This process is much faster. The first query takes 60ms, compared to 15s. The second query takes 200ms. The only problem is that currently there are two roundtrips. If this process was done by hasura internally, these queries would be much faster. It could be possible to enable a mode where hasura internally runs two queries instead of the row_to_json performance bottleneck.

Can you identify the location in the source code where the problem exists?

The order_by sorting happens after a Nested Loop Left Join that is serialised to JSON.

If the bug is confirmed, would you be willing to submit a PR?

I don't know Haskell.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions