
Conversation

mrnicegyu11 (Member)

What do these changes do?

  • Higher CPU limit for director-v0
  • Higher CPU limit for the registry, and no node-label placement constraints for it (sketched below)
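
In compose terms, the change has roughly this shape (an illustrative sketch only; the exact service definitions, values, and constraints are in the diffs discussed below):

```yaml
services:
  registry:                  # director-v0 gets an analogous CPU-limit bump
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: '6'          # raised from '2'; see the review discussion below
      placement:
        constraints: []      # node-label placement constraints dropped here
```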

Related issue/s

Related PR/s

Checklist

  • I tested and it works

mrnicegyu11 and others added 30 commits October 15, 2024 16:18
Merge remote-tracking branch 'upstream/main'
[Kubernetes] Introduce on-prem persistent Storage (Longhorn) 🎉 (ITISFoundation#979)

* Introduce longhorn chart

* Further longhorn configuration

* Longhorn: further settings configuration

* Fix longhorn configuration bugs

Extra: introduce longhorn pv values for portainer

* Add comment for Longhorn deletion

* Further longhorn configuration

* Add README.md for Longhorn with FAQ

* Update Longhorn readme

* Update readme

* Further LH configuration

* Update LH's Readme

* Update Longhorn Readme

* Improve LH's Readme

* LH: Reduce reserved default disk space to 5%

Since we use a dedicated disk for LH, we can go ahead with 5%

* Use values to set Longhorn storage class

* Update LH's Readme

* LH Readme: add requirements reference

* PR Review: bring back portainer s3 pv

* LH: decrease portainer volume size
Merge remote-tracking branch 'upstream/main'
mrnicegyu11 marked this pull request as ready for review August 19, 2025 09:53
```diff
 resources:
   limits:
     memory: 1G
-    cpus: '2'
+    cpus: '6'
```
Collaborator

Why this number? What is the reasoning behind it?
Do we take the number of CPUs available on the machines into account?

Member Author

The number of available CPUs is not taken into account, and AFAIK this is correct; it does not matter (as much). [Talk to me for clarification ;)]

The number is an estimate based on Prometheus observations on osparc-master, e.g. the PromQL query `sum(clamp_max(rate(container_cpu_cfs_throttled_seconds_total{image=~"(registry:.*)|(traefik:.*)|(.*itisfoundation.*)|(.*director:.*)"}[2m]) > 0.2,3)) by (container_label_com_docker_swarm_service_name)`.
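
For readability, here is the same query broken out; the annotations are my reading of it, not the author's:

```promql
sum(
  clamp_max(
    rate(container_cpu_cfs_throttled_seconds_total{
      image=~"(registry:.*)|(traefik:.*)|(.*itisfoundation.*)|(.*director:.*)"
    }[2m])   # per-container CPU throttling rate (throttled CPU-seconds per second)
    > 0.2,   # keep only containers throttled above 0.2 CPU-seconds/s
    3        # cap each series at 3 so outliers don't dominate the sum
  )
) by (container_label_com_docker_swarm_service_name)
```

Roughly: how much CPU each swarm service is being denied by its CFS limit, summed per service.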

Member

Nevertheless, @mrnicegyu11, if you set a limit above what is actually available, the container never starts.

Member Author

OK, then I vote for removing the limit altogether. I can see that on master it was being throttled by more than 7 (additional) CPUs at a time, so 8 CPUs would be the right number. But that is too high to put into the limits section if what @sanderegg says is correct. @YuryHrytsuk @sanderegg

Member Author

Optional: Test CPU Limit
Optional: Benchmark with and without registry CPU limit

sanderegg (Member) left a comment

Thanks. For the registry: that will not go on the worker nodes where we run services in prod, right?

YuryHrytsuk (Collaborator) commented Aug 19, 2025

Is there an issue where we track all these efforts so we can have a summary?

mrnicegyu11 (Member Author)

> Thanks. For the registry: that will not go on the worker nodes where we run services in prod, right?

@sanderegg good point, they would right now. Can you provide me with a label that autoscaled machines have (by which they can be identified), so I can exclude at least those? Thx

sanderegg (Member) commented Aug 19, 2025

> > Thanks. For the registry: that will not go on the worker nodes where we run services in prod, right?
>
> @sanderegg good point, they would right now. Can you provide me with a label that autoscaled machines have (by which they can be identified), so I can exclude at least those? Thx

@mrnicegyu11 These labels are defined in osparc-config, so I guess it should at least not go anywhere where there are dynamic sidecars.

mrnicegyu11 (Member Author) commented Aug 19, 2025

Please note that this PR is currently blocking the master CD (CD is disabled until this PR is merged).
Update: unblocked.

mrnicegyu11 (Member Author)

> > > Thanks. For the registry: that will not go on the worker nodes where we run services in prod, right?
> >
> > @sanderegg good point, they would right now. Can you provide me with a label that autoscaled machines have (by which they can be identified), so I can exclude at least those? Thx
>
> @mrnicegyu11 These labels are defined in osparc-config, so I guess it should at least not go anywhere where there are dynamic sidecars.

I have added constraints along those lines.

Comment on lines 118 to 119:

```yaml
- node.labels.gpu!=true
- node.labels.dynamicsidecar!=true
```
YuryHrytsuk (Collaborator) commented Aug 20, 2025

Why remove the OPS constraint node.labels.ops==true and introduce negative constraints instead? Is there anything special about this service that prevents it from following the general convention (the ops label in this case)?
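
For reference, the convention being referred to looks like this (a minimal sketch, assuming the usual swarm compose placement syntax in this repo; the diff below shows it being removed):

```yaml
deploy:
  placement:
    constraints:
      - node.labels.ops==true   # positive selection: run only on ops-labeled nodes
```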

```diff
@@ -115,11 +115,12 @@ services:
       parallelism: 1
     placement:
       constraints:
-        - node.labels.ops==true
```
mrnicegyu11 (Member Author) commented Aug 20, 2025

Chore (follow-up, on all deployments):

  • use a .master. docker compose file or j2 instead of negative labels
  • or a spread-constraint (see the sketch below)
  • or: consider the manager machine
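
As an illustration of the spread-constraint option (swarm compose syntax; the label key here is illustrative, not from this PR):

```yaml
deploy:
  placement:
    preferences:
      - spread: node.labels.ops   # spread tasks evenly across nodes, grouped by this label's value
```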

Member Author

Optional: test / pre-commit (minor)

mrnicegyu11 and others added 9 commits August 20, 2025 09:24
…on#1182)

* wip

* Add csi-s3 and have portainer use it

* Change request @YuryHrytsuk: 1GB max portainer volume size

* Arch Linux Certificates Customization

* Fix pgsql exporter failure

* [Kubernetes] Introduce on-prem persistent Storage (Longhorn) 🎉  (ITISFoundation#979)

* Experimental: Try to add tracing to simcore-traefik on master

* Fixes ITISFoundation/osparc-simcore#7363

* Arch Linux Certificates Customization - 2

* Upgrade registry, add tracing

* revert accidental commit

---------

Co-authored-by: Dustin Kaiser <[email protected]>
Co-authored-by: YH <[email protected]>