Splunk Enterprise

Transform ingested HEC JSON logs into regular logs

PT_crusher
Explorer

Looking for props.conf / transforms.conf configuration guidance.

The aim is to search logs coming from an HTTP Event Collector the same way we search regular logs. We don't want to search JSON on the search heads.

We're in the process of migrating from Splunk Forwarders to logging-operator in k8s. The thing is, the Splunk Forwarder reads log files and uses standard indexer discovery, whereas logging-operator collects stdout/stderr and must output to an HEC endpoint, meaning the logs arrive at the heavy forwarder as JSON.

We want to keep using Splunk the same way we have over the years and avoid adapting alerts, dashboards, etc. to the new JSON source.

OLD CONFIG AIMED AT THE INDEXERS (with the following config we get environment/site/node/team/pod as indexed fields, written at index time via WRITE_META)


[vm.container.meta]
# source: /data/nodes/env1/site1/host1/logs/team1/env1/pod_name/localhost_access_log.log
CLEAN_KEYS = 0
REGEX = \/.*\/.*\/(.*)\/(.*)\/(.*)\/.*\/(.*)\/.*\/(.*)\/
FORMAT = environment::$1 site::$2 node::$3 team::$4 pod::$5
SOURCE_KEY = MetaData:Source
WRITE_META = true
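
For reference, a transform like this only takes effect when props.conf on the parsing tier references it; a minimal companion stanza might look like the sketch below (the sourcetype name is illustrative, not taken from the original setup). Indexed fields written via WRITE_META are also normally declared in fields.conf on the search tier so they are treated as indexed fields.

# props.conf (sketch; sourcetype name is illustrative)
[tomcat:access]
TRANSFORMS-vm_container_meta = vm.container.meta

# fields.conf on the search tier (one stanza per indexed field: environment, site, node, team, pod)
[environment]
INDEXED = true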



SAMPLE LOG USING logging-operator


{
"log": "ts=2024-10-15T15:22:44.548Z caller=scrape.go:1353 level=debug component=\"scrape manager\" scrape_pool=kubernetes-pods target=http://1.1.1.1:8050/_api/metrics msg=\"Scrape failed\" err=\"Get \\\"http://1.1.1.1:8050/_api/metrics\\\": dial tcp 1.1.1.1:8050: connect: connection refused\"\n",
"stream": "stderr",
"time": "2024-10-15T15:22:44.548801729Z",
"environment": "env1",
"node": "host1",
"pod": "pod_name",
"site": "site1",
"team": "team1"
}
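
A rough ingest-time counterpart of the old transform, applied where the HEC data is parsed (the heavy forwarder), could pull the same five fields out of this JSON payload. This is only a sketch: the stanza name is made up, the regex assumes _raw is the JSON object shown above with the keys in that order (and that none of those key/value pairs also appear inside the "log" message), and it still has to be referenced from props.conf for the HEC sourcetype.

# transforms.conf on the heavy forwarder (sketch)
[k8s.container.meta]
SOURCE_KEY = _raw
# order-dependent regex over the JSON payload; kept close to the old REGEX/FORMAT style on purpose
REGEX = "environment"\s*:\s*"([^"]*)".*"node"\s*:\s*"([^"]*)".*"pod"\s*:\s*"([^"]*)".*"site"\s*:\s*"([^"]*)".*"team"\s*:\s*"([^"]*)"
FORMAT = environment::$1 node::$2 pod::$3 site::$4 team::$5
WRITE_META = true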



sainag_splunk
Splunk Employee

The only way to do this would be to use the /services/collector/raw endpoint.

I understand the desire to maintain your existing Splunk setup, but I would advise against using the raw endpoint (/services/collector/raw) to transform the JSON logs back into a regular log format. This approach would unnecessarily increase system load and complexity.

Instead, the best practice is to use the existing event endpoint (/services/collector/event) for ingesting data into Splunk. This is optimized for handling structured data like JSON and is more efficient.
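
Roughly, an event sent to /services/collector/event is wrapped in a small JSON envelope like the one below; the exact wrapping depends on how the logging-operator output is configured, and the sourcetype and values here are placeholders.

{
  "sourcetype": "kube:container:logs",
  "event": {
    "log": "ts=2024-10-15T15:22:44.548Z ... msg=\"Scrape failed\" ...",
    "stream": "stderr",
    "time": "2024-10-15T15:22:44.548801729Z",
    "environment": "env1",
    "node": "host1",
    "pod": "pod_name",
    "site": "site1",
    "team": "team1"
  }
}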

I recommend adjusting your alerts and dashboards to work with the new JSON structure from logging-operator. While this may require some initial effort, it's a more sustainable approach in the long run:

  1. Update your search queries to use JSON-aware field extraction, for example the spath command, or set KV_MODE = json in props.conf so the fields are extracted automatically at search time (see the sketch after this list).
  2. Modify dashboards to reference the new JSON field names.
  3. Adjust alerts to use the appropriate JSON fields and structure.
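
As a sketch of item 1, the JSON fields can be extracted per search with spath, or once for everyone by setting KV_MODE = json in props.conf on the search head tier (the index and sourcetype names below are placeholders):

# props.conf on the search head tier (sketch)
[kube:container:logs]
KV_MODE = json

Example search (works with KV_MODE = json, or with the explicit spath shown):

index=k8s sourcetype=kube:container:logs
| spath
| search environment=env1 team=team1
| table _time environment site node team pod log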





PickleRick
SplunkTrust

I don't think that's the issue here. The same payload sent to the /raw endpoint would end up looking the same. It's the source formatting the data differently than before.


PT_crusher
Explorer

The raw endpoint is not an option because it is not supported by the logging-operator.
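
Even without the /raw endpoint, the payload can still be reshaped at ingest time on the heavy forwarder so it gets indexed as a plain log line with the old indexed fields. The sketch below builds on the transform sketched earlier in the thread; the sourcetype name is illustrative, INGEST_EVAL needs a reasonably recent Splunk version, and json_extract support at ingest time should be verified for your release.

# props.conf on the heavy forwarder (sketch; sourcetype name is illustrative)
[kube:container:logs]
TRANSFORMS-k8s_reshape = k8s.container.meta, k8s_raw_from_json

# transforms.conf (order matters: extract the indexed fields before rewriting _raw)
[k8s_raw_from_json]
# assumption: json_extract is available in INGEST_EVAL on this Splunk version
INGEST_EVAL = _raw=json_extract(_raw, "log")

After this, the indexed event is the original container log line, so alerts and dashboards that key on environment/site/node/team/pod can stay as they are, at the cost of discarding the rest of the JSON envelope (stream, time, and so on).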
