Sometimes it is hard to analyse what is happening at the networking level inside your pods deployed in OpenShift or Kubernetes.
How can you debug and analyse the network traffic to your application to solve issues more quickly and effectively? And how can you keep using the well-known Wireshark tool as always?
We will use tcpdump to capture a so-called PCAP (packet capture) file that contains the pod’s network traffic. This PCAP file can then be loaded into a tool like Wireshark to analyze the traffic and, in this case, the RESTful communication of a service running in a pod.
This tcpdump will be running in a sidecar container beside our app container within our pod.
A sidecar container is a container that is running in the same pod as the actual service/application and is able to provide additional functionality to the service/application.
Deploying the sidecar
- Create a new project for testing purposes:
$ oc new-project test-delete-rcarrata
- Deploy an example application for testing:
$ oc new-app django-psql-example
$ oc get pod
NAME                           READY   STATUS              RESTARTS   AGE
django-psql-example-1-build    0/1     Completed           0          3m4s
django-psql-example-1-deploy   0/1     Completed           0          74s
django-psql-example-1-j4w28    1/1     Running             0          65s
django-psql-example-2-deploy   0/1     ContainerCreating   0          4s
postgresql-1-2q9h7             1/1     Running             0          2m49s
postgresql-1-deploy            0/1     Completed           0          2m57s
- Fetch the deploymentconfig of the django-psql-example:
$ oc get dc django-psql-example -o yaml > django-psql-tcpdump.yaml
- Add the container that will run tcpdump to the deploymentconfig:
- name: tcpdump
  image: corfr/tcpdump
  command:
  - /bin/sleep
  - infinity
- For the django app, the sidecar goes into the containers list of the pod spec, next to the existing app container:
spec:
  containers:
  - name: tcpdump
    image: corfr/tcpdump
    command:
    - /bin/sleep
    - infinity
  - env:
    - name: DATABASE_SERVICE_NAME
      value: postgresql
This spins up an additional sidecar container in which you can run tcpdump to capture, for later analysis, the packets that the django container sends and receives (remember that the tcpdump and django containers are in the same pod, so they share the same network namespace).
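If you prefer not to edit the exported YAML by hand, the same change can probably be expressed as a JSON patch. The following is only a sketch: the function names `tcpdump_sidecar_patch` and `add_tcpdump_sidecar` are made up for this example, and the patch appends the sidecar at the end of the containers list instead of the beginning, which should not matter for the capture.

```shell
#!/usr/bin/env bash
# Sketch: add the tcpdump sidecar with a JSON patch instead of editing the YAML.
# Function names are hypothetical; image and command come from the snippet above.

# Print a JSON patch that appends the sidecar to the pod template's containers.
tcpdump_sidecar_patch() {
  cat <<'EOF'
[{"op": "add", "path": "/spec/template/spec/containers/-", "value": {"name": "tcpdump", "image": "corfr/tcpdump", "command": ["/bin/sleep", "infinity"]}}]
EOF
}

# Apply the patch to the deploymentconfig given as first argument.
add_tcpdump_sidecar() {
  local dc_name="$1"
  oc patch "dc/${dc_name}" --type=json -p "$(tcpdump_sidecar_patch)"
}
```

Calling `add_tcpdump_sidecar django-psql-example` would then patch the deploymentconfig in the current project.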
- Apply the modified deploymentconfig that includes the sidecar:
$ oc apply -f django-psql-tcpdump.yaml
deploymentconfig.apps.openshift.io/django-psql-example configured
$ oc get pod -w
NAME                           READY   STATUS              RESTARTS   AGE
django-psql-example-1-build    0/1     Completed           0          3m24s
django-psql-example-1-deploy   0/1     Completed           0          94s
django-psql-example-2-deploy   1/1     Running             0          24s
django-psql-example-2-gfws6    0/2     ContainerCreating   0          8s
postgresql-1-2q9h7             1/1     Running             0          3m9s
postgresql-1-deploy            0/1     Completed           0          3m17s
django-psql-example-2-gfws6    0/2     ContainerCreating   0          8s
django-psql-example-2-gfws6    1/2     Running             0          10s
django-psql-example-2-gfws6    2/2     Running             0          14s
django-psql-example-2-deploy   0/1     Completed           0          30s
django-psql-example-2-deploy   0/1     Completed           0          30s
Capturing and analyzing traffic
With the sidecar deployed and running, we can now start capturing data.
- Log in to the tcpdump container and try to start a capture:
~ $ oc rsh -c tcpdump django-psql-example-2-gfws6
~ $ tcpdump -s 0 -n -w /tmp/example.pcap
tcpdump: eth0: You don't have permission to capture on that device
(socket: Operation not permitted)
What happened? Because of the Security Context Constraints (SCCs), tcpdump cannot capture packets on eth0: the container does not have the required SCC permissions.
- To avoid that, you need cluster-admin rights to grant the anyuid SCC to the default Service Account of the namespace:
oc adm policy add-scc-to-user anyuid -z default -n `oc project -q` --as=system:admin
IMPORTANT: This can cause a security issue, because any pod in the namespace can then run as root. Be careful and only do this in testing namespaces, or in namespaces that are controlled by the cluster-admin and where everyone is aware that these security restrictions are disabled.
- Roll out the deploymentconfig again so that the pod is redeployed with the new SCC:
$ oc rollout latest dc/django-psql-example
- Inside the tcpdump container of the newly deployed pod (the django-psql-example sidecar), run tcpdump:
$ tcpdump -s 0 -n -w /tmp/example.pcap
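Because the rollout creates a new pod, the pod name used earlier no longer applies. A small helper can look the name up instead of copying it from `oc get pod`. This is a sketch under the assumption that the pods carry the label `deploymentconfig=<name>`, which is what a deploymentconfig normally sets. The function names are invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: find the running pod of a deploymentconfig and open a shell in its
# tcpdump sidecar. Function names are hypothetical.

# Print the name of the first running pod labelled with the given dc name.
# Assumption: the pods carry the label deploymentconfig=<dc name>.
tcpdump_pod_name() {
  local dc_name="$1"
  oc get pod -l "deploymentconfig=${dc_name}" \
    --field-selector=status.phase=Running \
    -o jsonpath='{.items[0].metadata.name}'
}

# Open a remote shell in the tcpdump container of that pod.
open_tcpdump_shell() {
  local dc_name="$1"
  oc rsh -c tcpdump "$(tcpdump_pod_name "${dc_name}")"
}
```

With this in place, `open_tcpdump_shell django-psql-example` should drop you into the sidecar, where the tcpdump command above can be run.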
- Generate requests to the application so that they are captured by the tcpdump sidecar:
$ curl django-psql-example-test-delete-rcarrata.apps.ocp4.rcarrata.com -I
HTTP/1.1 200 OK
Server: gunicorn/19.4.5
Date: Thu, 27 Feb 2020 19:43:32 GMT
Content-Type: text/html; charset=utf-8
X-Frame-Options: SAMEORIGIN
Content-Length: 18255
Set-Cookie: 320587f6606431b421a7ed809db87323=ec4dec0bb6e99d5a3aaed6dd165eaa51; path=/; HttpOnly
Cache-control: private
- Press Control+C to stop tcpdump and see how many packets were captured:
$ tcpdump -s 0 -n -w /tmp/example.pcap
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
^C574 packets captured
574 packets received by filter
0 packets dropped by kernel
- Copy example.pcap to your local machine (use the name of the pod in which you ran the capture):
$ oc cp -c tcpdump django-psql-example-2-gfws6:/tmp/example.pcap example.pcap
- Examine the pcap with Wireshark… and voilà! You can analyse your network traffic!
$ wireshark example.pcap
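As an alternative to writing the file in the pod and copying it afterwards, the capture can usually be streamed straight into a local Wireshark. This is a sketch, not a tested recipe for every setup: it assumes `wireshark` is installed locally and that the sidecar is allowed to capture, and the function name `stream_to_wireshark` is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: stream packets from the tcpdump sidecar into a local Wireshark.
# The function name is hypothetical.

# tcpdump: -U flushes each packet as it arrives, -w - writes pcap data to stdout.
# wireshark: -k starts capturing immediately, -i - reads from stdin.
stream_to_wireshark() {
  local pod="$1"
  oc exec "${pod}" -c tcpdump -- tcpdump -s 0 -n -U -w - | wireshark -k -i -
}
```

For example, `stream_to_wireshark django-psql-example-2-gfws6` would show the packets live; closing Wireshark ends the pipeline.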
This is very useful for debugging and for spotting connectivity and application issues in the communication with external systems or with other pods.
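Since the anyuid grant is only meant for test namespaces, it can make sense to revoke it once the capture is done. A minimal sketch, assuming you still have the required admin rights; the function name `revoke_anyuid` is invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: remove the anyuid SCC from the default service account again.
# The function name is hypothetical.

# First argument: the namespace that was granted anyuid for the capture.
revoke_anyuid() {
  local namespace="$1"
  oc adm policy remove-scc-from-user anyuid -z default -n "${namespace}"
}
```

For the project used in this post that would be `revoke_anyuid test-delete-rcarrata`. Pods that are already running keep their current settings until they are redeployed.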
NOTE: Opinions expressed in this blog are my own and do not necessarily reflect that of the company I work for.