Activity Feed
- Posted Re: how to parse Events in splunk for more useful dashboard panels on Dashboards & Visualizations. 01-11-2022 04:13 AM
- Posted Re: how to parse Events in splunk for more useful dashboard panels on Dashboards & Visualizations. 01-11-2022 12:23 AM
- Posted Re: how to parse Events in splunk for more useful dashboard panels on Dashboards & Visualizations. 01-09-2022 09:37 PM
- Posted Re: how to parse Events in splunk for more useful dashboard panels on Dashboards & Visualizations. 01-06-2022 07:11 AM
- Karma Re: how to parse Events in splunk for more useful dashboard panels for gcusello. 01-06-2022 07:11 AM
- Posted Re: how to parse Events in splunk for more useful dashboard panels on Dashboards & Visualizations. 01-03-2022 07:13 AM
- Karma Re: how to parse Events in splunk for more useful dashboard panels for gcusello. 01-03-2022 07:01 AM
- Karma Re: how to parse Events in splunk for more useful dashboard panels for isoutamo. 01-03-2022 07:01 AM
- Posted Re: how to parse Events in splunk for more useful dashboard panels on Dashboards & Visualizations. 12-30-2021 12:17 AM
- Posted how to parse Events in splunk for more useful dashboard panels on Dashboards & Visualizations. 12-29-2021 11:10 PM
Topics I've Started
01-11-2022
04:13 AM
Thanks @gcusello. Just to be clear, my firm doesn't allow installing external apps to achieve this. Should I instead ask for a search query that works in secure-splunk itself? As I explained earlier, I just need stats for pod health from the events already present in Splunk.
01-11-2022
12:23 AM
Thanks @isoutamo, I have already got each field extracted. My challenge now is filtering out the exact events related to pods and building stats of pod/container failures and creations in Splunk, like we get from 'kubectl get events'. @gcusello, are there any apps/TAs to analyze and monitor k8s logs? All I need is a stats/table of events related to pods (failures, created, ImagePullBackOff, etc.) and of kind=ReplicaSet/StatefulSet/Deployment. I have attached the raw events from one of the Kubernetes clusters.
01-09-2022
09:37 PM
Appreciate any suggestions on this please.
01-06-2022
07:11 AM
Here are some events with Failed/SuccessfulCreate. The challenge is that we need to filter these out and build stats of the 'Failed/SuccessfulCreate' events of kind=ReplicaSet/StatefulSet/Deployment/DaemonSet. Attached are the raw events from one of the Kubernetes clusters. The basic idea is to get stats of pod/container failures and creations in Splunk, like we get from 'kubectl get events'.

<135>Jan 6 10:39:26 control1.ai1-dev.dd.k8s.c0.ms.com kubernetes.var.log.containers.ku: namespace_name=openshift-kube-controller-manager, container_name=kube-controller-manager, pod_name=kube-controller-manager-control1.ai1-dev.dd.k8s.c0.ms.com, message=I0106 10:38:56.512561 1 event.go:291] "Event occurred" object="clp-monitoring/loki-distributed-gateway-6bcfd9dc99" kind="ReplicaSet" apiVersion="apps/v1" type="Warning" reason="FailedCreate" message="Error creating: admission webhook \"endorse-validating-webhook.ai1-dev.dd.k8s.c0.ms.com\" denied the request: Denying image infra1.kod.ms.com:5000/nginxinc/nginx-unprivileged:1.19-alpine from unrecognized image registry infra1.kod.ms.com:5000."

<135>Jan 6 10:39:26 control1.ai1-dev.dd.k8s.c0.ms.com kubernetes.var.log.containers.ku: namespace_name=openshift-kube-controller-manager, container_name=kube-controller-manager, pod_name=kube-controller-manager-control1.ai1-dev.dd.k8s.c0.ms.com, message=I0106 10:38:56.500812 1 event.go:291] "Event occurred" object="loki-distributed/loki-loki-distributed-gateway-599d76c47c" kind="ReplicaSet" apiVersion="apps/v1" type="Warning" reason="FailedCreate" message="Error creating: pods \"loki-loki-distributed-gateway-599d76c47c-\" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{1001160000}: 1001160000 is not an allowed group spec.containers[0].securityContext.runAsUser: Invalid value: 1001160000: must be in the ranges: [1001040000, 1001049999]]"

<135>Jan 6 10:39:26 control1.ai1-dev.dd.k8s.c0.ms.com kubernetes.var.log.containers.ku: namespace_name=openshift-kube-controller-manager, container_name=kube-controller-manager, pod_name=kube-controller-manager-control1.ai1-dev.dd.k8s.c0.ms.com, message=I0106 10:38:56.499675 1 event.go:291] "Event occurred" object="loki-distributed/loki-loki-distributed-distributor-c886b96fc" kind="ReplicaSet" apiVersion="apps/v1" type="Warning" reason="FailedCreate" message="Error creating: pods \"loki-loki-distributed-distributor-c886b96fc-\" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{1001160000}: 1001160000 is not an allowed group spec.containers[0].securityContext.runAsUser: Invalid value: 1001160000: must be in the ranges: [1001040000, 1001049999]]"

<135>Jan 6 10:36:51 control1.app9.hz.k8s.c0.ms.com kubernetes.var.log.containers.ku: namespace_name=openshift-kube-controller-manager, container_name=kube-controller-manager, pod_name=kube-controller-manager-control1.app9.hz.k8s.c0.ms.com, message=I0106 10:36:10.686055 1 event.go:291] "Event occurred" object="tigera-dex/tigera-dex-9d895b785" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: tigera-dex-9d895b785-9jdgv"

<135>Jan 6 10:19:08 control3.stepping-stone1-dev.dd.k8s.c0.ms.com kubernetes.var.log.containers.ku: namespace_name=openshift-kube-controller-manager, container_name=kube-controller-manager, pod_name=kube-controller-manager-control3.stepping-stone1-dev.dd.k8s.c0.ms.com, message=I0106 10:18:48.721499 1 event.go:291] "Event occurred" object="git-mirror/git-mirror-morgan-stanley-cloud-git-mirror-0" kind="Pod" apiVersion="v1" type="Warning" reason="FailedAttachVolume" message="AttachVolume.Attach failed for volume \"pvc-9361ced0-07fe-4212-9e7d-9efdc6369fd0\" : CSINode dd9002c17n1.nodes.c0.ms.com does not contain driver csi.trident.netapp.io"

<135>Jan 6 14:04:23 control3.ai2-dev.dd.k8s.c0.ms.com fluentd: docker:{"container_id"=>"cd60f994892219216651d53275d0eb4a1d1fee53cfd6f4ba50c48711297ee0d3"} kubernetes:{"container_name"=>"kube-controller-manager", "namespace_name"=>"openshift-kube-controller-manager", "pod_name"=>"kube-controller-manager-control3.ai2-dev.dd.k8s.c0.ms.com", "pod_id"=>"8429ce46-b305-4691-9258-98a7acb24e39", "host"=>"control3.ai2-dev.dd.k8s.c0.ms.com", "master_url"=>"https://kubernetes.default.svc", "namespace_id"=>"13d0f6f3-67a7-4f90-90b5-20f0311a4c9c", "namespace_labels"=>{"openshift_io/cluster-monitoring"=>"true", "openshift_io/run-level"=>"0"}, :flat_labels=>["app=kube-controller-manager", "kube-controller-manager=true", "revision=15"]} message:I0106 14:04:21.415065 1 event.go:291] "Event occurred" object="cps/prometheus-xiaomin-test-o11y-prometheus-server-6c65f45c79" kind="ReplicaSet" apiVersion="apps/v1" type="Warning" reason="FailedCreate" message="Error creating: pods \"prometheus-xiaomin-test-o11y-prometheus-server-6c65f45c79-\" is forbidden: unable to validate against any security context constraint: [provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{65534}: 65534 is not an allowed group pod.metadata.annotations.seccomp.security.alpha.kubernetes.io/pod: Forbidden: seccomp may not be set spec.containers[0].securityContext.runAsUser: Invalid value: 65535: must be in the ranges: [1000840000, 1000849999] pod.metadata.annotations.container.seccomp.security.alpha.kubernetes.io/o11y-prometheus-server: Forbidden: seccomp may not be set]" level:unknown hostname:control3.ai2-dev.dd.k8s.c0.ms.com pipeline_metadata:{"collector"=>{"ipaddr4"=>"10.85.166.220", "inputname"=>"fluent-plugin-systemd", "name"=>"fluentd", "received_at"=>"2022-01-06T14:04:22.323401+00:00", "version"=>"1.7.4 1.6.0"}} @timestamp:2022-01-06T14:04:21.415092+00:00 viaq_index_name:infra-write viaq_msg_id:ZThjZjliMzYtZWY4NS00N2FmLWE5MTgtOGRmMTY4NWQ1MmMw
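
The filter-then-count step asked for in this post can be prototyped outside Splunk before turning it into a search. The following is only a sketch in Python: the names `WORKLOAD_KINDS`, `CREATE_REASONS`, and `create_stats` are made up for illustration, and the event strings are trimmed copies of the raw events in this post (only the key="value" tail is kept). Inside Splunk, once `kind` and `reason` are extracted as fields, the same idea would presumably be a filter on those two fields followed by `stats count by kind reason`.

```python
import re
from collections import Counter

# Kinds and reasons named in the post; adjust to taste (illustrative constants).
WORKLOAD_KINDS = {"ReplicaSet", "StatefulSet", "Deployment", "DaemonSet"}
CREATE_REASONS = {"FailedCreate", "SuccessfulCreate"}

# Matches key="value" pairs; the value may contain backslash-escaped quotes.
KEY_VALUE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def create_stats(raw_events):
    """Count create events per (kind, reason) for workload kinds only."""
    counts = Counter()
    for raw in raw_events:
        fields = dict(KEY_VALUE.findall(raw))
        kind, reason = fields.get("kind"), fields.get("reason")
        if kind in WORKLOAD_KINDS and reason in CREATE_REASONS:
            counts[(kind, reason)] += 1
    return counts


# Trimmed copies of events from the post (syslog prefix and long messages dropped).
events = [
    'object="clp-monitoring/loki-distributed-gateway-6bcfd9dc99" kind="ReplicaSet" '
    'apiVersion="apps/v1" type="Warning" reason="FailedCreate"',
    'object="loki-distributed/loki-loki-distributed-gateway-599d76c47c" kind="ReplicaSet" '
    'apiVersion="apps/v1" type="Warning" reason="FailedCreate"',
    'object="tigera-dex/tigera-dex-9d895b785" kind="ReplicaSet" apiVersion="apps/v1" '
    'type="Normal" reason="SuccessfulCreate" message="Created pod: tigera-dex-9d895b785-9jdgv"',
    'object="git-mirror/git-mirror-morgan-stanley-cloud-git-mirror-0" kind="Pod" '
    'apiVersion="v1" type="Warning" reason="FailedAttachVolume"',
]

stats = create_stats(events)
for (kind, reason), count in sorted(stats.items()):
    print(kind, reason, count)
```

The Pod-level `FailedAttachVolume` event is deliberately excluded here because its kind is not one of the workload kinds listed in the question.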
01-03-2022
07:13 AM
@gcusello @isoutamo Thanks a lot. It helped to filter out the following info: views into failed mounts and types of failures. However, I'm scratching my head over how to get the following info from the events, and I haven't found any clue for filtering it out:
- Realtime views around created/started containers/pods and failures
- Realtime views on image pulls, success, backoffs, failures, denies
How can I attach the events list CSV file here?
12-30-2021
12:17 AM
Thanks much @gcusello. Yeah, your understanding is correct. I need to extract the required fields (namespace, last seen, type, reason, object, message) from the sample log and use those fields to create different new panels. The search query you provided seems promising and helpful. Does the query below look good?

index=log-135473-prod event.go NOT "l0.ms.com" | rex field=_raw "^\<\d+\>(?<last_seen>\w+\s+\d+\s+\d+:\d+:\d+).*namespace_name\=(?<namespace_name>[^,]+),\s+container_name\=(?<container_name>[^,]+),\s+pod_name\=(?<pod_name>[^,]+),\s+message\=(?<message1>[^\]]+).*object\=\"(?<object>[^\"]+)\"\s+kind\=\"(?<kind>[^\"]+)\"\s+apiVersion\=\"(?<apiVersion>[^\"]+)\"\s+type\=\"(?<type>[^\"]+)\"\s+reason\=\"(?<reason>[^\"]+)\"\s+message\=\"(?<message2>[^\"]+)\""

Also, how can I get new fields named cluster_namespace=openshift-logging and cluster_podname=elasticsearch-im-infra from the field below, treating "/" as the separator?

object="openshift-logging/elasticsearch-im-infra"
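
The split on "/" asked about here is a two-group regex. The sketch below tries the pattern in Python; `split_object` is a hypothetical helper name, and the field names `cluster_namespace` and `cluster_podname` are the ones requested in the post. In Splunk the same pattern would presumably go into a second `rex` that reads from the already-extracted field, something like `rex field=object "^(?<cluster_namespace>[^/]+)/(?<cluster_podname>.+)$"` (note that Python spells named groups `(?P<name>...)`).

```python
import re

# Everything before the first "/" is the namespace, the rest is the object name.
OBJECT_PATTERN = re.compile(r'^(?P<cluster_namespace>[^/]+)/(?P<cluster_podname>.+)$')


def split_object(obj):
    """Split 'namespace/name' into its two parts.

    Returns (None, obj) when there is no "/" separator, on the assumption
    that such a value names an object without a namespace prefix.
    """
    match = OBJECT_PATTERN.match(obj)
    if match is None:
        return None, obj
    return match.group("cluster_namespace"), match.group("cluster_podname")


cluster_namespace, cluster_podname = split_object("openshift-logging/elasticsearch-im-infra")
print(cluster_namespace)  # openshift-logging
print(cluster_podname)    # elasticsearch-im-infra
```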
12-29-2021
11:10 PM
Currently it's difficult to parse out the details of cluster events in Splunk to enable more useful dashboard panels. I'm looking for suggestions on how to extract, from the Splunk event.go events, the columns that we would see when we run "oc get events" on a cluster: namespace, last seen, type, reason, object, message. Once we can extract those fields and make them available as variables for Splunk stats/tables/timechart, we can put some useful panels together to gauge plant health:
- Realtime views around created/started containers/pods and failures
- Realtime views around job start/failure/complete
- Realtime views into failed mounts and types of failures
- Realtime views on image pulls, success, backoffs, failures, denies
I'd appreciate help with any docs/leads and high-level ideas to achieve this, please.

Sample Events:

Time: 12/30/21 1:59:07.000 AM
Event: <135>Dec 30 06:59:07 9000n2.nodes.com kubernetes.var.log.containers.ku: namespace_name=openshift-kube-controller-manager, container_name=kube-controller-manager, pod_name=kube-controller-manager-9000n2.nodes.com, message=I1230 06:58:56.139184 1 event.go:291] "Event occurred" object="openshift-logging/elasticsearch-im-infra" kind="CronJob" apiVersion="batch/v1beta1" type="Warning" reason="FailedNeedsStart" message="Cannot determine if job needs to be started: too many missed start time (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew"
host = laas-agent-log-forwarder-6dddb6d69c-95t4b source = /namespace/openshift-kube-controller-manager sourcetype = ocpprod.stepping-infra-openshift-kube-controller-manager:application

Time: 12/30/21 1:59:07.000 AM
Event: <135>Dec 30 06:59:07 9000n2.nodes.com kubernetes.var.log.containers.ku: namespace_name=openshift-kube-controller-manager, container_name=kube-controller-manager, pod_name=kube-controller-manager-9000n2.nodes.com, message=I1230 06:58:56.133312 1 event.go:291] "Event occurred" object="openshift-logging/elasticsearch-im-audit" kind="CronJob" apiVersion="batch/v1beta1" type="Warning" reason="FailedNeedsStart" message="Cannot determine if job needs to be started: too many missed start time (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew"
host = laas-agent-log-forwarder-6dddb6d69c-95t4b source = /namespace/openshift-kube-controller-manager sourcetype = ocpprod.stepping-infra-openshift-kube-controller-manager:application
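
As a way to check the extraction logic before writing the Splunk search, the field pull can be tried on one of the sample events in plain Python. This is a sketch under assumptions, not a Splunk feature: `parse_event`, `LAST_SEEN`, and `KEY_VALUE` are illustrative names, the syslog timestamp after the `<135>` tag is taken as "last seen", and the namespace is assumed to be the part of `object` before the "/".

```python
import re

# First sample event from the post (host/source/sourcetype metadata left out).
SAMPLE = (
    '<135>Dec 30 06:59:07 9000n2.nodes.com kubernetes.var.log.containers.ku: '
    'namespace_name=openshift-kube-controller-manager, '
    'container_name=kube-controller-manager, '
    'pod_name=kube-controller-manager-9000n2.nodes.com, '
    'message=I1230 06:58:56.139184 1 event.go:291] "Event occurred" '
    'object="openshift-logging/elasticsearch-im-infra" kind="CronJob" '
    'apiVersion="batch/v1beta1" type="Warning" reason="FailedNeedsStart" '
    'message="Cannot determine if job needs to be started: too many missed '
    'start time (> 100). Set or decrease .spec.startingDeadlineSeconds or '
    'check clock skew"'
)

# Syslog timestamp that follows the <priority> tag (assumed to serve as "last seen").
LAST_SEEN = re.compile(r'^<\d+>(\w+\s+\d+\s+\d+:\d+:\d+)')
# key="value" pairs; the value may contain backslash-escaped quotes.
KEY_VALUE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_event(raw):
    """Return the `oc get events` style columns for one event.go line, or None."""
    timestamp = LAST_SEEN.match(raw)
    pairs = dict(KEY_VALUE.findall(raw))
    if timestamp is None or "object" not in pairs:
        return None
    namespace, _, name = pairs["object"].partition("/")
    return {
        "namespace": namespace,
        "last_seen": timestamp.group(1),
        "type": pairs.get("type"),
        "reason": pairs.get("reason"),
        "kind": pairs.get("kind"),
        "object": name,
        "message": pairs.get("message"),
    }


row = parse_event(SAMPLE)
for column, value in row.items():
    print(f"{column}: {value}")
```

Matching generic key="value" pairs instead of one long positional pattern keeps the sketch tolerant of field order, at the cost of only seeing the quoted fields.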