
Retina

Jacob_baek 2024. 3. 22. 17:42

Introducing Retina

Retina is a tool that helps capture network traffic and collect and store metrics.

  • Metrics (collected by injecting eBPF programs and stored in a persistent form)
  • Captures (one-off, tcpdump-style captures that can be customized)

Key links related to Retina

Architecture

https://retina.sh/docs/intro#extendable-architecture

Install

An official Helm chart is not published. To install Retina, clone the official Retina GitHub repository and use the source:

$ make helm-install
$ ## or, to install via the operator:
$ make helm-install-with-operator

Under the hood, the make target runs a helm install of the following form:

helm upgrade --install retina ./deploy/manifests/controller/helm/retina/ \
        --namespace kube-system \
        --set image.repository=ghcr.io/microsoft/retina/retina-agent \
        --set image.initRepository=ghcr.io/microsoft/retina/retina-init \
        --set image.tag=v0.0.1 \
        --set operator.tag=v0.0.1 \
        --set image.pullPolicy=Always \
        --set logLevel=info \
        --set os.windows=true \
        --set operator.enabled=true \
        --set operator.enableRetinaEndpoint=true \
        --set operator.repository=ghcr.io/microsoft/retina/retina-operator \
        --skip-crds \
        --set enabledPlugin_linux="\[dropreason\,packetforward\,linuxutil\,dns\,packetparser\]"
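The long `--set` list can equivalently be kept in a values file. The sketch below is assembled purely from the flags shown above (not from the full chart schema, so verify key names against the chart); `--skip-crds` stays a command-line flag:

```yaml
# retina-install-values.yaml - equivalent of the --set flags above
image:
  repository: ghcr.io/microsoft/retina/retina-agent
  initRepository: ghcr.io/microsoft/retina/retina-init
  tag: v0.0.1
  pullPolicy: Always
logLevel: info
os:
  windows: true
operator:
  enabled: true
  enableRetinaEndpoint: true
  repository: ghcr.io/microsoft/retina/retina-operator
  tag: v0.0.1
# the escaped \[..\,..\] --set value unescapes to this plain string
enabledPlugin_linux: '[dropreason,packetforward,linuxutil,dns,packetparser]'
```

With this file the command shrinks to `helm upgrade --install retina ./deploy/manifests/controller/helm/retina/ -n kube-system -f retina-install-values.yaml --skip-crds`.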

The following Deployment and DaemonSet resources are deployed:

$ k get ds -n kube-system -l k8s-app=retina
NAME               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR              AGE
retina-agent       3         3         2       3            2           kubernetes.io/os=linux     3m17s
retina-agent-win   0         0         0       0            0           kubernetes.io/os=windows   3m17s

monitoring

Add an additionalScrapeConfigs entry to the kube-prometheus-stack values:

$ cat retina-values.yaml
prometheus:
  prometheusSpec:
    additionalScrapeConfigs: |
      - job_name: "retina-pods"
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_container_name]
            action: keep
            regex: retina(.*)
          - source_labels:
              [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            separator: ":"
            regex: ([^:]+)(?::\d+)?
            target_label: __address__
            replacement: ${1}:${2}
            action: replace
          - source_labels: [__meta_kubernetes_pod_node_name]
            action: replace
            target_label: instance
        metric_relabel_configs:
          - source_labels: [__name__]
            action: keep
            regex: (.*)

https://github.com/microsoft/retina/blob/main/deploy/prometheus/values.yaml
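The `__address__` relabel rule above is the subtle part: the regex `([^:]+)(?::\d+)?` captures only the host portion of the discovered address, and the replacement `${1}:${2}` appends the `prometheus.io/port` annotation value. A local simulation of that rewrite (sample values are hypothetical):

```shell
# Simulate the relabel rule: keep the host part of __address__, then
# append the pod's prometheus.io/port annotation value.
address="10.224.0.4:9999"    # __address__ as discovered by the pod SD
annotation_port="18080"      # prometheus.io/port annotation value
host="${address%%:*}"        # strip any existing port, like ([^:]+)(?::\d+)?
echo "${host}:${annotation_port}"
```

This prints `10.224.0.4:18080`, i.e. the scrape target Prometheus ends up using.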

$ helm upgrade -f retina-values.yaml kube-prom-stack prometheus-community/kube-prometheus-stack -n monitoring

How to use

retina binary

Install the kubectl-retina binary by following the link below.
(It can be installed via krew, or downloaded and placed somewhere on your PATH.)

Captures run as Jobs, so no separate CRD installation is required.
If running the command directly with sensitive values such as a blob SAS token is not an option,
captures can also be created through the CRD (https://retina.sh/docs/captures/).
In other words, installing the kubectl-retina binary alone is enough to run captures and upload them to blob storage via kubectl.
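For the manual install path, it helps to know that kubectl discovers plugins simply as `kubectl-<name>` executables on PATH. A minimal local sketch of that mechanism, using a stub binary (paths and the stub itself are illustrative):

```shell
# kubectl resolves "kubectl retina" to a kubectl-retina executable on PATH.
# Demonstrate with a stub in a temporary directory:
mkdir -p /tmp/retina-demo/bin
printf '#!/bin/sh\necho "retina stub"\n' > /tmp/retina-demo/bin/kubectl-retina
chmod +x /tmp/retina-demo/bin/kubectl-retina
PATH="/tmp/retina-demo/bin:$PATH"
command -v kubectl-retina   # resolves to the stub we just created
```

Replacing the stub with the real downloaded binary is all the "manual" install amounts to.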

A capture file can be written to a hostPath on the node with the following command:

$ kubectl retina capture create --name capturetestjacob0 --namespace default --pod-selectors="app=nginx-sample"  --host-path /mnt/capture
ts=2024-06-04T10:58:53.947+0900 level=info caller=capture/create.go:243 msg="The capture duration is set to 1m0s"
ts=2024-06-04T10:58:53.947+0900 level=info caller=capture/create.go:289 msg="The capture file max size is set to 100MB"
ts=2024-06-04T10:58:53.947+0900 level=info caller=utils/capture_image.go:56 msg="Using capture workload image ghcr.io/microsoft/retina/retina-agent:v0.0.11 with version determined by CLI version"
ts=2024-06-04T10:58:53.948+0900 level=info caller=capture/crd_to_job.go:201 msg="HostPath is not empty" HostPath=/mnt/capture
ts=2024-06-04T10:58:54.083+0900 level=info caller=capture/crd_to_job.go:876 msg="The Parsed tcpdump filter is \"\""
ts=2024-06-04T10:58:54.109+0900 level=info caller=capture/create.go:369 msg="Packet capture job is created" namespace=default capture job=capturetestjacob0-2xbnf
ts=2024-06-04T10:58:54.109+0900 level=info caller=capture/create.go:125 msg="Please manually delete all capture jobs"
NAMESPACE   CAPTURE NAME        JOBS                      COMPLETIONS   AGE
default     capturetestjacob0   capturetestjacob0-2xbnf   0/1           1s

Capture files can also be uploaded to a blob in an Azure Storage Account.

$ kubectl retina capture create --namespace default --pod-selectors="app=nginx-sample" --blob-upload="https://azstorageaccountname.blob.core.windows.net/captureblob?sp=raw&st=xxxxxxxxxxxx" --name capturetestbyjacob1
ts=2024-06-04T10:55:48.130+0900 level=info caller=capture/create.go:243 msg="The capture duration is set to 1m0s"
ts=2024-06-04T10:55:48.130+0900 level=info caller=capture/create.go:289 msg="The capture file max size is set to 100MB"
ts=2024-06-04T10:55:48.264+0900 level=info caller=utils/capture_image.go:56 msg="Using capture workload image ghcr.io/microsoft/retina/retina-agent:v0.0.11 with version determined by CLI version"
ts=2024-06-04T10:55:48.266+0900 level=info caller=capture/crd_to_job.go:224 msg="BlobUpload is not empty"
ts=2024-06-04T10:55:48.331+0900 level=info caller=capture/crd_to_job.go:876 msg="The Parsed tcpdump filter is \"\""
ts=2024-06-04T10:55:48.349+0900 level=info caller=capture/create.go:369 msg="Packet capture job is created" namespace=default capture job=capturetestbyjacob1-2cz48
ts=2024-06-04T10:55:48.349+0900 level=info caller=capture/create.go:125 msg="Please manually delete all capture jobs"
ts=2024-06-04T10:55:48.349+0900 level=info caller=capture/create.go:127 msg="Please manually delete capture secret" namespace=default secret name=capture-blob-upload-secretsjvwh
NAMESPACE   CAPTURE NAME          JOBS                        COMPLETIONS   AGE
default     capturetestbyjacob1   capturetestbyjacob1-2cz48   0/1           0s

Once complete, captures can be checked with the capture list subcommand provided by kubectl-retina, and they are also visible as plain Jobs. (As the log output above notes, capture Jobs and the temporary blob-upload secret are not cleaned up automatically and must be deleted manually.)

$ kubectl retina capture list
NAMESPACE   CAPTURE NAME           JOBS                         COMPLETIONS   AGE
default     retina-capture-2vqh6   retina-capture-2vqh6-cf4kg   1/1           4m30s
default     retina-capture-mb862   retina-capture-mb862-jzns9   1/1           2m16s
default     retina-capture-tckbw   retina-capture-tckbw-swlt7   1/1           4m42s

$ kubectl get jobs
NAME                         COMPLETIONS   DURATION   AGE
retina-capture-2vqh6-cf4kg   1/1           70s        4m17s
retina-capture-mb862-jzns9   1/1           71s        2m3s
retina-capture-tckbw-swlt7   1/1           80s        4m29s

Captures work as shown above; checking the storage account confirms that a tar file has been created.

Captured tar files can be retrieved via an Azure storage account, a hostPath, or a PVC,
and S3-compatible storage is supported as well.

metric

A variety of metrics can be collected by enabling additional plugins, e.g. packetforward:
https://retina.sh/docs/metrics/plugins/packetforward

By default, metrics such as the following are exposed (here scraped directly from the agent on node port 18080):

root@aks-nodepool1-26826537-vmss00000A:/# curl 10.224.0.4:18080/metrics
# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter
certwatcher_read_certificate_errors_total 0
# HELP certwatcher_read_certificate_total Total number of certificate reads
# TYPE certwatcher_read_certificate_total counter
certwatcher_read_certificate_total 0
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 5.0899e-05
go_gc_duration_seconds{quantile="0.25"} 6.75e-05
go_gc_duration_seconds{quantile="0.5"} 0.0001773
go_gc_duration_seconds{quantile="0.75"} 0.000183199
go_gc_duration_seconds{quantile="1"} 0.000394699
go_gc_duration_seconds_sum 0.000980497
go_gc_duration_seconds_count 6
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 30
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.21.8"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 4.1930528e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 6.376612e+07
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.551113e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 198985
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 5.046112e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 4.1930528e+07
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 8.97024e+06
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 4.4802048e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 368726
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 2.53952e+06
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 5.3772288e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.7111824243566203e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 567711
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 2400
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 15600
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 482496
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 505176
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 8.2977432e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 611673
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 753664
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 753664
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 6.2255626e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 9
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 1.4
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 52
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.11316992e+08
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.71118218107e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.338843136e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19
# HELP rest_client_request_duration_seconds Request latency in seconds. Broken down by verb, and host.
# TYPE rest_client_request_duration_seconds histogram
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="0.005"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="0.025"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="0.1"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="0.25"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="0.5"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="1"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="2"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="4"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="8"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="15"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="30"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="60"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET",le="+Inf"} 2
rest_client_request_duration_seconds_sum{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET"} 0.039509378
rest_client_request_duration_seconds_count{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/api",verb="GET"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="0.005"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="0.025"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="0.1"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="0.25"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="0.5"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="1"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="2"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="4"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="8"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="15"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="30"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="60"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET",le="+Inf"} 2
rest_client_request_duration_seconds_sum{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET"} 0.021485134
rest_client_request_duration_seconds_count{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/apis",verb="GET"} 2
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="0.005"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="0.025"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="0.1"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="0.25"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="0.5"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="1"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="2"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="4"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="8"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="15"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="30"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="60"} 1
rest_client_request_duration_seconds_bucket{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET",le="+Inf"} 1
rest_client_request_duration_seconds_sum{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET"} 0.001631895
rest_client_request_duration_seconds_count{host="https://karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443/version",verb="GET"} 1
# HELP rest_client_request_size_bytes Request size in bytes. Broken down by verb and host.
# TYPE rest_client_request_size_bytes histogram
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="64"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="256"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="512"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="1024"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="4096"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="16384"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="65536"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="262144"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="1.048576e+06"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="4.194304e+06"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="1.6777216e+07"} 5
rest_client_request_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="+Inf"} 5
rest_client_request_size_bytes_sum{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET"} 0
rest_client_request_size_bytes_count{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET"} 5
# HELP rest_client_requests_total Number of HTTP requests, partitioned by status code, method, and host.
# TYPE rest_client_requests_total counter
rest_client_requests_total{code="200",host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",method="GET"} 5
# HELP rest_client_response_size_bytes Response size in bytes. Broken down by verb and host.
# TYPE rest_client_response_size_bytes histogram
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="64"} 0
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="256"} 0
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="512"} 1
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="1024"} 1
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="4096"} 1
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="16384"} 3
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="65536"} 5
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="262144"} 5
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="1.048576e+06"} 5
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="4.194304e+06"} 5
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="1.6777216e+07"} 5
rest_client_response_size_bytes_bucket{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET",le="+Inf"} 5
rest_client_response_size_bytes_sum{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET"} 84242
rest_client_response_size_bytes_count{host="karpenter2-karpenter2test-6fb462-rcp6iqyi.hcp.koreacentral.azmk8s.io:443",verb="GET"} 5
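The scrape above is dominated by Go runtime and client-go metrics; Retina's own plugin metrics can be filtered out with grep. The `networkobservability_` prefix used below is an assumption based on the Retina metrics docs (verify against your version), and the sample payload is illustrative:

```shell
# Narrow a scrape down to the plugin metrics. Sample stand-in payload:
cat <<'EOF' > /tmp/sample-metrics.txt
go_goroutines 30
networkobservability_forward_count{direction="ingress"} 1234
networkobservability_drop_count{reason="IPTABLE_RULE_DROP"} 7
EOF
grep -c '^networkobservability_' /tmp/sample-metrics.txt   # count of plugin metrics
```

Against a live agent the same filter would be e.g. `curl -s 10.224.0.4:18080/metrics | grep '^networkobservability_'`.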

capture

Delivering capture files to Azure Blob Storage.

kubectl-retina

It can be used as a CLI tool; when used, the captured packet dump files are left on the Kubernetes nodes.

$ kubectl-retina
Usage:
   [command]

Available Commands:
  capture     Retina Capture - capture network traffic
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  version     Show version

Flags:
  -h, --help   help for this command

Use " [command] --help" for more information about a command.

https://retina.sh/docs/captures/cli

With three nodes, three Jobs are created (one per node):

$ kubectl retina capture list --namespace capture
NAMESPACE   CAPTURE NAME           JOBS                                                                               COMPLETIONS   AGE
capture     retina-capture-n4h8m   retina-capture-n4h8m-ffkpp,retina-capture-n4h8m-q75gj,retina-capture-n4h8m-zwfpp   3/3           5h40m
$ k get job -A
NAMESPACE   NAME                         COMPLETIONS   DURATION   AGE
capture     retina-capture-n4h8m-ffkpp   1/1           84s        5h40m
capture     retina-capture-n4h8m-q75gj   1/1           84s        5h40m
capture     retina-capture-n4h8m-zwfpp   1/1           87s        5h40m
$ k get pod -n capture
NAME                               READY   STATUS      RESTARTS   AGE
retina-capture-n4h8m-ffkpp-x8kr9   0/1     Completed   0          5h40m
retina-capture-n4h8m-q75gj-tddqc   0/1     Completed   0          5h40m
retina-capture-n4h8m-zwfpp-nf5dz   0/1     Completed   0          5h40m

Accessing a node directly (e.g. via node-shell) shows the tar file at the following path:

root@aks-nodepool1-26826537-vmss000006:/mnt/capture# ls -al
total 464
drwxr-xr-x 2 root root   4096 Mar 22 03:03 .
drwxr-xr-x 4 root root   4096 Mar 22 03:01 ..
-rw-r--r-- 1 root root 464849 Mar 22 03:03 retina-capture-n4h8m-aks-nodepool1-26826537-vmss000006-20240322030207UTC.tar.gz
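Archives are named `<capture-name>-<node>-<timestamp>UTC.tar.gz`. A dummy archive to illustrate listing its contents with standard tar tools (the real archive holds the pcap plus metadata files; exact contents vary by version):

```shell
# Build and list a stand-in capture archive locally
mkdir -p /tmp/capture-demo && cd /tmp/capture-demo
echo demo > dummy.pcap                                # placeholder pcap
tar -czf retina-capture-demo.tar.gz dummy.pcap        # same .tar.gz format
tar -tzf retina-capture-demo.tar.gz                   # list archive contents
```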

custom build

https://retina.sh/docs/contributing/developing#environment-config
LLVM must be installed beforehand.

$ sudo apt install llvm clang -y

The build can then be run with:

$ make retina-binary
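Since the build compiles eBPF programs with clang/llvm, a quick pre-flight check of the toolchain saves a confusing build failure. A small sketch (tool names taken from the apt install line above):

```shell
# Check the eBPF toolchain is on PATH before building; prints a hint
# instead of failing hard when something is missing.
for tool in clang llvm-strip; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing (sudo apt install llvm clang)"
  fi
done
```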

Advantages (personal take)

  • Not tied to a specific CNI: because it uses eBPF, it works regardless of the CNI in use.
  • It can dump packets and ship them to a variety of destinations.
  • Beyond plain packet dumps, it exposes a range of network-level information, from socket state to ARP and iptables details.
