Splunk Cloud Platform

Drop messages when using OTEL collector in Kubernetes

d_kazakov
Observer

Hello everyone!

I'm using the Splunk OpenTelemetry Collector to send logs from Kubernetes to Splunk via the HEC input. It runs as a DaemonSet.

The collector is deployed via Helm Chart: https://github.com/signalfx/splunk-otel-collector-chart

I would like to exclude logs containing a specific string, for example "Connection reset by peer", but I cannot find a configuration option that does this. It looks like processors can do it:

https://opentelemetry.io/docs/collector/configuration/#processors

There is also a default OpenTelemetry configuration in the chart, but I cannot figure out how to add a filter to it:

https://github.com/signalfx/splunk-otel-collector-chart/blob/main/helm-charts/splunk-otel-collector/...
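Based on the filter processor docs, I'd guess it needs to go somewhere like this in the values file, but I'm not sure this is right (untested sketch; the `agent.config` override path and the OTTL `IsMatch` condition are my assumptions):

```yaml
agent:
  config:
    processors:
      # hypothetical processor name; drops any log record whose body matches
      filter/drop-reset:
        logs:
          log_record:
            - 'IsMatch(body, ".*Connection reset by peer.*")'
```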

Has anyone run into this, or do you have any advice for this case?


dhimanv
Loves-to-Learn Lots

Thank you @d_kazakov for the response.

I was looking for a solution where, if a log entry contains a specific string, that entire log entry is excluded from being pushed to the Splunk indexer.

Let me check whether this solution works in that case or needs to be altered.


d_kazakov
Observer

In this case, you can update filters like this:

gateway:
  enabled: true
  resources:
    requests:
      cpu: 100m
      memory: 500Mi
    limits:
      memory: 500Mi
  replicaCount: 1
  config:
    processors:
      filter/filter:
        logs:
          log_record:
            - 'IsMatch(body, ".*bot.*")'
    service:
      pipelines:
        logs:
          processors:
            - filter/filter

This way, when data comes through the gateway it is filtered, and all log entries with "bot" in the body are removed. (The filter processor drops a record when a condition evaluates to true.)

BTW, the previous configuration must also go under the gateway section.


dhimanv
Loves-to-Learn Lots

I am also looking for something like this. Has anyone tried it, and did it work?


d_kazakov
Observer

Hey, dhimanv!

I've managed to achieve it; a Splunk OnDemand Services request helped with this issue. There are a couple of options, but in my case these filters worked to cut some fields from the JSON body and reduce the amount of GB we ingest:

logsCollection:
  containers:
    enabled: true
    useSplunkIncludeAnnotation: true
    extraOperators:
      - type: router
        default: noop-router
        routes:
          - expr: body contains "timestamp" and attributes.log matches "^{.*}$"
            output: remove-nginx-keys
          - expr: body contains "timestamp" and attributes.log matches "^{.*}\\n$"
            output: remove-nginx-keys
      - type: json_parser
        id: remove-nginx-keys
        parse_from: attributes.log
        parse_to: attributes.log
      - type: remove
        field: 'attributes.log.cf_ray'
        on_error: send
      - type: remove
        field: 'attributes.log.proxyUpstreamName'
        on_error: send
      - type: remove
        field: 'attributes.log.proxyAlternativeUpstreamName'
        on_error: send
      - type: remove
        field: 'attributes.log.upstreamAddr'
        on_error: send
      - type: remove
        field: 'attributes.log.upstreamStatus'
        on_error: send
      - type: remove
        field: 'attributes.log.requestID'
        on_error: send
      - id: noop-router
        type: noop
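Relatedly, if the goal is to drop entire entries at the agent (as in the original question) rather than strip fields, the stanza `filter` operator can be added to the same operator chain; a minimal sketch, assuming the "Connection reset by peer" string from the original post:

```yaml
logsCollection:
  containers:
    enabled: true
    extraOperators:
      # filter operator: entries whose expression evaluates to true are dropped
      - type: filter
        expr: 'body contains "Connection reset by peer"'
```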
 
So the JSON goes from:
{
  "timestamp": "2023-12-20T10:05:17+00:00",
  "requestID": "ID",
  "proxyUpstreamName": "service-name",
  "proxyAlternativeUpstreamName": "",
  "upstreamStatus": "200",
  "upstreamAddr": "IP:4444",
  "Host": "DNS",
  "httpRequest": {
    "requestMethod": "POST",
    "requestUrl": "/request",
    "status": 200,
    "requestSize": "85",
    "responseSize": "14",
    "userAgent": "Google",
    "remoteIp": "IP",
    "referer": "",
    "latency": "0.003 s",
    "protocol": "HTTP/2.0"
  },
  "cf_ray": "1239kvksad2139kc923"
}
 
To:
 
{
  "Host": "web.web.eu",
  "httpRequest": {
    "latency": "0.092 s",
    "protocol": "HTTP/1.1",
    "referer": "referer",
    "remoteIp": "IP",
    "requestMethod": "GET",
    "requestSize": "834",
    "requestUrl": "/request",
    "responseSize": "133",
    "status": 200,
    "userAgent": "agent"
  },
  "timestamp": "2023-12-20T10:05:08+00:00"
}
 
 
Hope this helps!
 