We are having difficulty excluding OTEL logs based on fields whose names are in camelCase or whose values contain special characters. Fields without capitalization or special-character values can be parsed out, but the others cannot.
Here is the receiver configuration we are looking at (attached as YAML; the key portion follows).
filelog/kube-apiserver-audit-log:
  include:
    - /var/log/kubernetes/kube-apiserver.log
  include_file_name: false
  include_file_path: true
  operators:
    - id: extract-audit-group
      type: regex_parser
      regex: '\s*\"resourceGroup\"\s*\:\s*\"(?P<extracted_group>[^\"]+)\"\s*'
    - id: filter-group
      type: filter
      expr: 'attributes.extracted_beta == "batch"'
    - id: remove-extracted-group
      type: remove
      field: attributes.extracted_group
    - id: extract-audit-api
      type: regex_parser
      regex: '\"level\"\:\"(?P<extracted_audit_beta>[^\"]+)\"'
    - id: filter-api
      type: filter
      expr: 'attributes.extracted_audit_beta == "Metadata"'
    - id: remove-extracted-api
      type: remove
      field: attributes.extracted_api
    - id: extract-audit-verb
      type: regex_parser
      regex: '\"verb\"\:\"(?P<extracted_verb>[^\"]+)\"'
    - id: filter-verb
      type: filter
      expr: 'attributes.extracted_verb == "watch" || attributes.extracted_verb == "list"'
    - id: remove-extracted-verb
      type: remove
      field: attributes.extracted_verb
The comparison on the resourceGroup field is failing, while the comparisons on verb and level are succeeding.
Here is an example log that would be pulled in.
{"apiVersion":"batch/v1","component":"sync-agent","eventType":"MODIFIED","kind":"CronJob","level":"info","msg":"sent event","name":"agentupdater-workload","namespace":"vmware-system-tmc","resourceGroup":"batch","resourceType":"cronjobs","resourceVersion":"v1","time":"2024-03-14T18:17:11Z"}
Hey @padresman
Will try your example. Gotta be very careful that the fields in your expressions match the capture group names you use, as each captured value is stored in attributes.<capture group name> by default.
Also, make sure to select the Golang flavor when testing on regex101, though your regex appears to be fine.
Also, it's wise to iterate and NOT remove the fields you create, so you can see what they look like when they arrive at Splunk. That can help make sure your value is what you think it is.
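For that kind of iteration, a stripped-down operator chain might look like the sketch below. It reuses the regex and capture group name from your config and leaves out the filter and remove steps, so the extracted attribute stays on every record. Treat it as an untested starting point rather than a known-good config.

```yaml
operators:
  # Extract only: no filter or remove yet, so the captured value stays visible downstream.
  - id: extract-audit-group
    type: regex_parser
    regex: '\s*\"resourceGroup\"\s*\:\s*\"(?P<extracted_group>[^\"]+)\"\s*'
  # The captured value should show up as attributes.extracted_group.
  # Once it looks right, add the filter and remove operators back one at a time.
```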
This is a little confusing. I do not see special characters in field values in the provided sample. But I see a mismatch between operators about resourceGroup. I assume that extract-audit-group and filter-group are intended to match resourceGroup. Is this correct? In the following snippets, extract-audit-group extracts a variable named extracted_group, whereas filter-group calls for one named attributes.extracted_beta. Maybe filter-group should use extracted_group instead?
- id: extract-audit-group
  type: regex_parser
  regex: '\s*\"resourceGroup\"\s*\:\s*\"(?P<extracted_group>[^\"]+)\"\s*'
- id: filter-group
  type: filter
  expr: 'attributes.extracted_beta == "batch"'
Thanks for the response yuanliu, much appreciated, and sorry for the confusion. You're right that those fields should match up - it should look like the following:
- id: extract-audit-group
  type: regex_parser
  regex: '\"resourceGroup\"\:\"(?P<extracted_group>[^\"]+)\"'
- id: filter-group
  type: filter
  expr: 'attributes.extracted_group == "batch"'
- id: remove-extracted-group
  type: remove
  field: attributes.extracted_group
The id field can be named just about anything, so differences among the names there don't matter. We've gone through quite a few iterations of testing, which is why there was a discrepancy. What we have narrowed the problem down to in our testing is that either the camelCase field name is causing a regex issue, or special characters within a value are causing an issue (or both; my hunch is that it is the camelCase, but we haven't had success with either). Running these patterns through an RE2 regex tester gives the results we expect, but we do not get them from the actual deployed OTEL.
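Since the sample line above is a single JSON object, one way to take the regex out of the picture while testing would be to parse the line with the json_parser operator and filter on the parsed key directly. The sketch below is untested and assumes every line in the file is valid JSON; the id parse-json is just an illustrative name. Note that the filter operator drops the entries for which expr is true.

```yaml
operators:
  # Parse the whole JSON line; by default the keys are written to attributes,
  # keeping their original camelCase names.
  - id: parse-json
    type: json_parser
  # Entries matching this expression are dropped.
  - id: filter-group
    type: filter
    expr: 'attributes.resourceGroup == "batch"'
```

If this drops the expected entries, the camelCase key itself is not the obstacle, and the regex or its YAML quoting would be the next thing to look at.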