Commit c6862de

Add guide on how to debug network traffic (#780)
* Add guide on how to debug network traffic
* typo
* typo
* whitespace
* Hint on HTTP2
* Update modules/guides/pages/debug-network-traffic.adoc

Co-authored-by: Nick <[email protected]>

---------

Co-authored-by: Nick <[email protected]>
1 parent 10245d4 commit c6862de

6 files changed: +155 -0 lines changed

modules/guides/nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 * xref:index.adoc[]
 ** xref:custom-images.adoc[]
+** xref:debug-network-traffic.adoc[]
 ** xref:providing-resources-with-pvcs.adoc[]
 ** xref:running-stackable-in-an-airgapped-environment.adoc[]
 ** xref:viewing-and-verifying-sboms.adoc[]

modules/guides/pages/debug-network-traffic.adoc

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
= Debug network traffic
:description: Capture and analyze network traffic between Pods, including TLS-encrypted communication.
:tcpdump: https://www.tcpdump.org/
:mitmproxy: https://www.mitmproxy.org/

You likely know this problem: some tool is behaving strangely, and you need to debug the traffic (often HTTP/HTTPS or DNS) between Kubernetes Pods.
If the tool were running on a local machine, you would simply start {tcpdump}[`tcpdump`] and inspect the traffic.
You might also use {mitmproxy}[`mitmproxy`] as an HTTPS proxy that re-encrypts the HTTPS traffic, so that it becomes readable.

However, in a containerized environment things are a bit more complicated.
This guide explains how to capture and inspect the traffic anyway.

You need a few things:

1. A sidecar running {tcpdump}[`tcpdump`] that captures the traffic into a file.
2. If TLS (e.g. HTTPS) traffic is involved, the product needs to be configured so that it writes the TLS session keys into a file.
   The key log can be used afterwards to decrypt the TLS traffic.
3. Wireshark, to make it easier to inspect the captured traffic.
   You can give it the TLS key log and it automatically decrypts the TLS traffic.

== Simple usage

If you only care about unencrypted communication, you can use this snippet to dump all traffic using {tcpdump}[`tcpdump`].

[source,yaml]
----
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: trino
spec:
  coordinators:
    podOverrides:
      spec:
        containers:
          - name: tcpdump
            image: nicolaka/netshoot
            command: ["/bin/bash"]
            args:
              - -c
              # If the dump grows too big, you can use regular tcpdump filters here
              # to filter the captured traffic
              - tcpdump -i any -w /tmp/tcpdump.pcap
----
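
The filter hint in the comment above can be made concrete.
As a sketch, the last `args` entry could be replaced by one of the following commands.
The port numbers (`8443` for HTTPS, `53` for DNS) are assumptions for illustration and depend on your setup.

[source,bash]
----
# Assumption: the product serves HTTPS on port 8443, adjust this to your setup.
# Capture only traffic from or to that port.
tcpdump -i any -w /tmp/tcpdump.pcap 'tcp port 8443'

# Capture only DNS traffic (UDP and TCP on port 53).
tcpdump -i any -w /tmp/tcpdump.pcap 'port 53'

# Combine both filters.
tcpdump -i any -w /tmp/tcpdump.pcap 'tcp port 8443 or port 53'
----

The filter expression is evaluated at capture time, so packets that do not match never reach the capture file.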

=== Attach without restart

You can also use something like `kubectl debug trino-coordinator-default-0 -it --image=nicolaka/netshoot -c tcpdump` to attach a debug container to a Pod without restarting it.
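
As a sketch of how such a debug container can be put to work, the command to run can be passed after `--`.
The DNS filter `port 53` is only an example, and the container name `tcpdump` must not already be in use in the Pod (for instance by the sidecar from the snippet above).

[source,bash]
----
# Print the DNS traffic of the Pod to the terminal.
# -nn disables the resolution of host names and port names.
# Stop the capture with Ctrl+C.
kubectl debug trino-coordinator-default-0 -it --image=nicolaka/netshoot -c tcpdump -- tcpdump -i any -nn port 53
----

Keep in mind that an ephemeral debug container cannot be removed from the Pod again; it stays listed in the Pod until the Pod is deleted.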

== TLS decryption usage

Let's make things a bit more interesting using a real-world example.
Assume Superset is behaving strangely and we want to debug the network traffic from Superset to Trino, which uses HTTPS.

As of Java 21, the JVM does not respect the `SSLKEYLOGFILE` environment variable and does not seem to support writing a TLS key log.
We therefore use a third-party Java agent called https://github.com/neykov/extract-tls-secrets[extract-tls-secrets] for that.

[source,yaml]
----
apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCluster
metadata:
  name: trino
spec:
  coordinators:
    envOverrides:
      SSLKEYLOGFILE: /tmp/sslkeys.log
    podOverrides:
      spec:
        # As we cannot add a curl command to the Trino startup script, we add an initContainer
        # that downloads the needed jar for us
        initContainers:
          - name: download-java-agent
            image: nicolaka/netshoot # We only need curl, reusing the same image for quicker pulls
            command: ["/bin/bash"]
            args:
              - -c
              - curl -L -o /jar/extract-tls-secrets.jar https://github.com/neykov/extract-tls-secrets/releases/download/v4.0.0/extract-tls-secrets-4.0.0.jar
            volumeMounts:
              - name: jar
                mountPath: /jar
        containers:
          - name: tcpdump
            image: nicolaka/netshoot
            command: ["/bin/bash"]
            args:
              - -c
              # If the dump grows too big, you can use regular tcpdump filters here
              # to filter the captured traffic
              - tcpdump -i any -w /tcpdump/tcpdump.pcap
            volumeMounts:
              - name: tcpdump
                mountPath: /tcpdump
          - name: trino
            volumeMounts:
              - name: jar
                mountPath: /jar
        volumes:
          - name: jar
            emptyDir: {}
          # As the dump can grow quite big, we use a dedicated emptyDir for it
          - name: tcpdump
            emptyDir: {}
    jvmArgumentOverrides:
      add:
        - -javaagent:/jar/extract-tls-secrets.jar=/tmp/sslkeys.log
----

The Trino coordinator Pod now captures all traffic into `/tcpdump/tcpdump.pcap` and writes the TLS key log to `/tmp/sslkeys.log`.
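
Before copying anything, it can be worth checking that both files actually exist.
The following is a minimal sketch that reuses the Pod name, container names and paths from the example above.

[source,bash]
----
# Check that the capture file exists and how big it currently is
kubectl exec trino-coordinator-default-0 -c tcpdump -- ls -lh /tcpdump/tcpdump.pcap

# Check that the Java agent has written a TLS key log
kubectl exec trino-coordinator-default-0 -c trino -- ls -lh /tmp/sslkeys.log
----

Note that `tcpdump` buffers its output, so the capture file can lag behind the live traffic.
Adding the `-U` flag to the `tcpdump` call makes it write each packet as soon as it is captured.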

Use the following command to copy the files to your local machine:

[source,bash]
----
kubectl cp trino-coordinator-default-0:/tcpdump/tcpdump.pcap -c tcpdump tcpdump.pcap && kubectl cp trino-coordinator-default-0:/tmp/sslkeys.log -c trino sslkeys.log
----
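
A quick sanity check of the copied key log can save a confusing Wireshark session.
The helper below is a hypothetical sketch (the name `count_keylog_entries` is not part of any tool) and assumes the key log uses the NSS key log format, in which every non-empty line that does not start with `#` holds one secret.

[source,bash]
----
# Hypothetical helper: count the secrets in a TLS key log file.
# Assumption: NSS key log format, one secret per line, comment lines start with '#'.
count_keylog_entries() {
  # -v inverts the match, -c prints the number of remaining lines
  grep -c -v -E '^(#|[[:space:]]*$)' "$1"
}
----

Call it as `count_keylog_entries sslkeys.log`.
A result of `0` suggests that no TLS handshake was recorded, in which case Wireshark has nothing to decrypt the traffic with.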

To inspect the traffic in Wireshark, run:

[source,bash]
----
wireshark -o tls.keylog_file:./sslkeys.log tcpdump.pcap
----

From here on, normal Wireshark usage applies.
For example, in the case of Trino we want to see all `POST /v1/statement` HTTPS calls.
You can filter for them using `http.request.method == POST && http.request.uri == "/v1/statement"`:

image::debug-network-traffic/1.png[]

The packet explorer at the bottom shows that the HTTP packet was actually TLS encrypted.

image::debug-network-traffic/2.png[]

To follow the entire HTTP stream, right-click on the packet and select `Follow` -> `HTTP Stream`.

image::debug-network-traffic/3.png[]

You now see the entire Superset -> Trino conversation, in this case the following SQL query:

[source,sql]
----
SELECT date_trunc('day', CAST(tpep_pickup_datetime AS TIMESTAMP)) AS __timestamp, AVG(duration_min) AS "Average trip duration"
FROM demo.ny_taxi_data GROUP BY date_trunc('day', CAST(tpep_pickup_datetime AS TIMESTAMP)) ORDER BY "Average trip duration" DESC
LIMIT 10000
----

image::debug-network-traffic/4.png[]
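
If you prefer the terminal over the GUI, the same display filter can be applied with `tshark`, the command line companion of Wireshark.
This is a sketch that assumes `tshark` is installed on your machine (on some platforms it is a separate package) and reuses the file names from the `kubectl cp` step.

[source,bash]
----
# Read the capture (-r), point tshark to the TLS key log (-o)
# and only print packets matching the display filter (-Y)
tshark -r tcpdump.pcap -o tls.keylog_file:./sslkeys.log \
  -Y 'http.request.method == POST && http.request.uri == "/v1/statement"'
----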

== Follow-up tips

1. You can filter the packets in the {tcpdump}[`tcpdump`] call to reduce the size of the capture file.
2. If you do this on a production setup, keep in mind that the dump might contain sensitive data and that the TLS keys can be used to decrypt all TLS traffic of this Pod!
3. If the product uses HTTP/2 (or newer), you need to use a Wireshark filter such as `http2.headers.path == "/nifi-api/flow/current-user"`.
