In GKE, I have a pod with two containers. They use the same image, and the only difference is that I pass them slightly different flags. One works fine, the other goes in a crash cycle. How can I debug the cause of the failure?
My pod definition:
apiVersion: v1
kind: ReplicationController
metadata:
  name: doorman-client
spec:
  replicas: 10
  selector:
    app: doorman-client
  template:
    metadata:
      name: doorman-client
      labels:
        app: doorman-client
    spec:
      containers:
      - name: doorman-client-proportional
        resources:
          limits:
            cpu: 10m
        image: gcr.io/google.com/doorman/doorman-client:v0.1.1
        command:
        - client
        - -port=80
        - -count=50
        - -initial_capacity=15
        - -min_capacity=5
        - -max_capacity=2000
        - -increase_chance=0.1
        - -decrease_chance=0.05
        - -step=5
        - -resource=proportional
        - -addr=$(DOORMAN_SERVICE_HOST):$(DOORMAN_SERVICE_PORT_GRPC)
        - -vmodule=doorman_client=2
        - --logtostderr
        ports:
        - containerPort: 80
          name: http

      - name: doorman-client-fair
        resources:
          limits:
            cpu: 10m
        image: gcr.io/google.com/doorman/doorman-client:v0.1.1
        command:
        - client
        - -port=80
        - -count=50
        - -initial_capacity=15
        - -min_capacity=5
        - -max_capacity=2000
        - -increase_chance=0.1
        - -decrease_chance=0.05
        - -step=5
        - -resource=fair
        - -addr=$(DOORMAN_SERVICE_HOST):$(DOORMAN_SERVICE_PORT_GRPC)
        - -vmodule=doorman_client=2
        - --logtostderr
        ports:
        - containerPort: 80
          name: http
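Creating that controller gives me ten pods, each with one container running and one restarting. For the record, this is how I'm checking (using the app label from the spec above):

kubectl get pods -l app=doorman-client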
kubectl describe gives me the following:
6:06 [0] (szopa szopa-macbookpro):~/GOPATH/src/github.com/youtube/doorman$ kubectl describe pod doorman-client-tylba
Name:                           doorman-client-tylba
Namespace:                      default
Image(s):                       gcr.io/google.com/doorman/doorman-client:v0.1.1,gcr.io/google.com/doorman/doorman-client:v0.1.1
Node:                           gke-doorman-loadtest-d75f7d0f-node-k9g6/10.240.0.4
Start Time:                     Sun, 21 Feb 2016 16:05:42 +0100
Labels:                         app=doorman-client
Status:                         Running
Reason:
Message:
IP:                             10.128.4.182
Replication Controllers:        doorman-client (10/10 replicas created)
Containers:
  doorman-client-proportional:
    Container ID:       docker://0bdcb8269c5d15a4f99ccc0b0ee04bf3e9fd0db9fd23e9c0661e06564e9105f7
    Image:              gcr.io/google.com/doorman/doorman-client:v0.1.1
    Image ID:           docker://a603248608898591c84216dd3172aaa7c335af66a57fe50fd37a42394d5631dc
    QoS Tier:
      cpu:              Guaranteed
    Limits:
      cpu:              10m
    Requests:
      cpu:              10m
    State:              Running
      Started:          Sun, 21 Feb 2016 16:05:42 +0100
    Ready:              True
    Restart Count:      0
    Environment Variables:
  doorman-client-fair:
    Container ID:       docker://92fea92f1307b943d0ea714441417d4186c5ac6a17798650952ea726d18dba68
    Image:              gcr.io/google.com/doorman/doorman-client:v0.1.1
    Image ID:           docker://a603248608898591c84216dd3172aaa7c335af66a57fe50fd37a42394d5631dc
    QoS Tier:
      cpu:              Guaranteed
    Limits:
      cpu:              10m
    Requests:
      cpu:              10m
    State:              Running
      Started:          Sun, 21 Feb 2016 16:06:03 +0100
    Last Termination State:     Terminated
      Reason:           Error
      Exit Code:        0
      Started:          Sun, 21 Feb 2016 16:05:43 +0100
      Finished:         Sun, 21 Feb 2016 16:05:44 +0100
    Ready:              False
    Restart Count:      2
    Environment Variables:
Conditions:
  Type          Status
  Ready         False
Volumes:
  default-token-ihani:
    Type:       Secret (a secret that should populate this volume)
    SecretName: default-token-ihani
Events:
  FirstSeen  LastSeen  Count  From                                               SubobjectPath                                 Reason     Message
  ─────────  ────────  ─────  ────                                               ─────────────                                 ──────     ───────
  29s        29s       1      {scheduler }                                                                                     Scheduled  Successfully assigned doorman-client-tylba to gke-doorman-loadtest-d75f7d0f-node-k9g6
  29s        29s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  implicitly required container POD             Pulled     Container image "gcr.io/google_containers/pause:0.8.0" already present on machine
  29s        29s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  implicitly required container POD             Created    Created with docker id 5013851c67d9
  29s        29s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  implicitly required container POD             Started    Started with docker id 5013851c67d9
  29s        29s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-proportional}  Created    Created with docker id 0bdcb8269c5d
  29s        29s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-proportional}  Started    Started with docker id 0bdcb8269c5d
  29s        29s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Created    Created with docker id ed0928176958
  29s        29s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Started    Started with docker id ed0928176958
  28s        28s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Created    Created with docker id 0a73290085b6
  28s        28s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Started    Started with docker id 0a73290085b6
  18s        18s       1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Backoff    Back-off restarting failed docker container
  8s         8s        1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Started    Started with docker id 92fea92f1307
  29s        8s        4      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Pulled     Container image "gcr.io/google.com/doorman/doorman-client:v0.1.1" already present on machine
  8s         8s        1      {kubelet gke-doorman-loadtest-d75f7d0f-node-k9g6}  spec.containers{doorman-client-fair}          Created    Created with docker id 92fea92f1307
As you can see, the exit code is zero and the termination reason is just "Error", which is not very useful.
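In case it matters, this is how I would pull the failing container's own logs (my understanding is that -c selects the container within the pod and --previous shows the output of the last terminated instance):

kubectl logs doorman-client-tylba -c doorman-client-fair --previous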
I tried:
reordering the container definitions (the first one always runs, the second always fails);
changing the ports to different values (no effect);
changing the port names to different values (no effect).
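Since the two containers differ only in their flags, I'd also like to poke around inside the one that stays up and compare. I assume something like this would work, provided the image actually ships a shell (which I haven't verified):

# Open a shell in the healthy container (-c picks the container within the pod).
kubectl exec -it doorman-client-tylba -c doorman-client-proportional -- sh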