
Commit 644c4b3

Merge pull request #4327 from zac-nixon/main
doc updates for scaling, IMDS usage
2 parents 9ff2282 + 97214a5 commit 644c4b3

4 files changed: +115, -1 lines changed

docs/deploy/configurations.md

Lines changed: 2 additions & 1 deletion
@@ -114,7 +114,8 @@ Currently, you can set only 1 namespace to watch in this flag. See [this Kuberne
| webhook-cert-dir | string | /tmp/k8s-webhook-server/serving-certs | The directory that contains the server key and certificate |
| webhook-cert-file | string | tls.crt | The server certificate name |
| webhook-key-file | string | tls.key | The server key name |
| alb-gateway-max-concurrent-reconciles | int | 3 | Maximum number of concurrently running reconcile loops for ALB gateways, if enabled |
| nlb-gateway-max-concurrent-reconciles | int | 3 | Maximum number of concurrently running reconcile loops for NLB gateways, if enabled |
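For illustration, the snippet below shows one way these flags could be passed as container arguments in the controller Deployment; the cluster name and values are placeholders, not defaults from the install bundle.

```
# Illustrative container spec only; the defaults for both flags are 3.
spec:
  containers:
    - name: aws-load-balancer-controller
      args:
        - --cluster-name=my-cluster
        - --alb-gateway-max-concurrent-reconciles=6
        - --nlb-gateway-max-concurrent-reconciles=6
```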

### disable-ingress-class-annotation
`--disable-ingress-class-annotation` controls whether to disable new usage of the `kubernetes.io/ingress.class` annotation.

docs/deploy/installation.md

Lines changed: 2 additions & 0 deletions
@@ -127,6 +127,8 @@ If you're not setting up IAM roles for service accounts, apply the IAM policies
curl -o iam-policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.13.4/docs/install/iam_policy.json
```

When using this option, IMDS *must* be enabled. The controller retrieves the instance credentials using IMDS. Use IRSA to avoid relying on IMDS.
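For reference, a minimal sketch of the IRSA alternative: annotate the controller's ServiceAccount with an IAM role so credentials come from the web identity token exchange instead of IMDS. The role ARN below is a placeholder.

```
# Illustrative ServiceAccount for IRSA; the role ARN is a placeholder.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: aws-load-balancer-controller
  namespace: kube-system
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/AmazonEKSLoadBalancerControllerRole
```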
## Special IAM cases

### You only want the LBC to add and remove IPs to already existing target groups:

docs/deploy/scaling.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
# Scaling your controller deployment

The AWS Load Balancer Controller (LBC) implements a standard Kubernetes controller. The controller reads changes from the cluster API server, calculates an intermediate representation (IR) of your AWS resources, then ensures the actual AWS resources match the IR state. The controller can perform CRUD operations to ensure the Kubernetes and AWS resources stay in sync. This page is meant to 1/ inform users about some LBC internals and 2/ help users get higher performance out of their LBC.

As of writing, the controller uses a high-availability deployment model in an active-passive mode. When running multiple replicas of the LBC, only one replica is responsible for talking to AWS to change the state of resources. The extra replicas are able to assist with webhook invocations, e.g. for object validation or mutation, but will not change the state of any resources within AWS unless the active leader replica relinquishes the leader lock. Generally, it is recommended to run at least two replicas for fast fail-over on leadership changes. During leadership changes, there is a 15-second to 2-minute stoppage of CRUD operations that can lead to state drift between your cluster and AWS resources. Another benefit of running multiple replicas is to alleviate some load from the leader replica, as more replicas mean fewer webhook invocations on the leader replica.
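As a small illustration, and assuming you install the controller with its Helm chart, running two replicas can be as simple as setting the replica count in your values file (the value name may differ by chart version):

```
# values.yaml (illustrative): run two replicas so a standby can take over the
# leader lock quickly if the active replica is disrupted.
replicaCount: 2
```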

## Resource Allocation

By default, the provided installation bundle sets the CPU and memory requests / limits to:

```
resources:
  limits:
    cpu: 200m
    memory: 500Mi
  requests:
    cpu: 100m
    memory: 200Mi
```

These limits are sized for the default threading model the LBC uses, which is:

- 3 threads for Ingress management (ALB)
- 3 threads for Service management (NLB)
- 3 threads for ALB Gateway management (IF ENABLED)
- 3 threads for NLB Gateway management (IF ENABLED)
- 3 threads for TargetGroupBinding management (Target Registration for ALB / NLB)

For 99.9% of use-cases, these values are enough. When managing a large number of resources, the thread counts should be tuned and, in turn, the memory and CPU resources should be tuned. Here's a general formula:

**This formula is just a suggestion, and many workloads might perform differently. It's important to load test your exact scenario.**

For every 200 Ingresses your controller manages, add three additional Ingress threads.

For every 400 Services your controller manages, add three additional Service threads.

For every 100 TargetGroupBindings, add three additional TargetGroupBinding threads.

**Gateway thread management still needs research.**

A good formula for setting the CPU request / limit is to add 50m per 10 threads added.

A good formula for setting the memory request / limit is to add 100Mi per 10 threads added.

Use these controller flags to update the thread pools:
```
--targetgroupbinding-max-concurrent-reconciles
--service-max-concurrent-reconciles
--ingress-max-concurrent-reconciles
--alb-gateway-max-concurrent-reconciles
--nlb-gateway-max-concurrent-reconciles
```
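To make the formula concrete, here is a worked example. Assume a cluster with roughly 600 Ingresses and 200 TargetGroupBindings; the numbers, rounding, and Deployment layout below are illustrative only and should be validated with load testing.

```
# Illustrative only: ~600 Ingresses -> 3 + (600/200)*3 = 12 Ingress threads,
# ~200 TargetGroupBindings -> 3 + (200/100)*3 = 9 TargetGroupBinding threads.
# 15 extra threads -> roughly +75m CPU and +150Mi memory over the defaults.
spec:
  containers:
    - name: aws-load-balancer-controller
      args:
        - --ingress-max-concurrent-reconciles=12
        - --targetgroupbinding-max-concurrent-reconciles=9
      resources:
        limits:
          cpu: 300m
          memory: 700Mi
        requests:
          cpu: 200m
          memory: 400Mi
```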

**Important**

When adding more threads, the LBC will call AWS APIs more often. See the next section for how to raise your AWS API limits to accommodate more threads.

## API throttling

There are multiple layers of API throttling to consider.

### Kubernetes API <-> LBC

Cluster administrators may configure the interaction between the Kubernetes API server and the LBC using this document:
[Kubernetes Throttling](https://kubernetes.io/docs/concepts/cluster-administration/flow-control/)

### LBC <-> AWS APIs

The LBC uses client-side throttling, and the AWS APIs use server-side throttling.

This document describes the AWS API throttling mechanisms:
[AWS API Throttling](https://aws.amazon.com/blogs/mt/managing-monitoring-api-throttling-in-workloads/)

#### Client-side throttling

The LBC implements client-side throttling by default, to preserve AWS API throttle volume for other processes that may need to communicate with AWS. By default, this is the client-side throttling configuration:

````
Elastic Load Balancing v2:RegisterTargets|DeregisterTargets=4:20,Elastic Load Balancing v2:.*=10:40
````

To decipher what this means, let's break it down. We are setting the ELBv2 APIs (the ELB APIs the controller talks to) to limit the controller to four register / deregister calls per second, with a token bucket allowance that allows spikes of up to 20 tps. The other rule (10:40) limits the overall calls to the ELBv2 APIs, no matter the API invoked. The overall allowance is 10 calls per second, with a burst allowance of 40 tps.
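If you need more client-side headroom after adding threads, the defaults can be overridden. The sketch below assumes the controller exposes an `--aws-api-throttle` flag that accepts the same `serviceID:operationRegex=rate:burst` format shown above; check the controller configuration options for your version before relying on it.

```
# Illustrative only: double the default client-side limits for the ELBv2 APIs.
spec:
  containers:
    - name: aws-load-balancer-controller
      args:
        - "--aws-api-throttle=Elastic Load Balancing v2:RegisterTargets|DeregisterTargets=8:40,Elastic Load Balancing v2:.*=20:80"
```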

#### AWS Server-side throttling

AWS allows server-side throttling limit increases for valid use-cases. Cut a support ticket with your use-case if you see throttling within the controller. Make sure to increase the client-side throttles when a limit increase is granted.

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ nav:
- Subnet Discovery: deploy/subnet_discovery.md
- Security Group Management: deploy/security_groups.md
- Pod Readiness Gate: deploy/pod_readiness_gate.md
- Scaling your LBC: deploy/scaling.md
- Upgrade:
- Migrate v1 to v2: deploy/upgrade/migrate_v1_v2.md
- Guide:
