Looking for props.conf / transforms.conf configuration guidance.
The aim is to search logs from an HTTP Event Collector the same way we search regular logs. We don't want to have to deal with JSON on the search heads.
We're in the process of migrating from Splunk Forwarders to logging-operator in k8s. The thing is, the Splunk Forwarder reads log files and uses standard indexer discovery, whereas logging-operator collects stdout/stderr and has to output to an HEC endpoint, meaning the logs arrive at the heavy forwarder as JSON.
We want to keep using Splunk the same way we have over the years and avoid adapting alerts/dashboards etc. to the new JSON source.
OLD CONFIG, AIMED AT THE INDEXERS (with the following transforms.conf stanza we get environment/site/node/team/pod extracted from the source path as fields available at search time)
[vm.container.meta]
# source: /data/nodes/env1/site1/host1/logs/team1/env1/pod_name/localhost_access_log.log
CLEAN_KEYS = 0
REGEX = \/.*\/.*\/(.*)\/(.*)\/(.*)\/.*\/(.*)\/.*\/(.*)\/
FORMAT = environment::$1 site::$2 node::$3 team::$4 pod::$5
SOURCE_KEY = MetaData:Source
WRITE_META = true
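For context, the stanza above lives in transforms.conf and only takes effect once it is referenced from props.conf on the indexing tier, roughly like this (the sourcetype stanza name is just an example, not our actual config):
# props.conf (example stanza name)
[localhost_access_log]
TRANSFORMS-container_meta = vm.container.meta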
SAMPLE LOG USING logging-operator
{
"log": "ts=2024-10-15T15:22:44.548Z caller=scrape.go:1353 level=debug component=\"scrape manager\" scrape_pool=kubernetes-pods target=http://1.1.1.1:8050/_api/metrics msg=\"Scrape failed\" err=\"Get \\\"http://1.1.1.1:8050/_api/metrics\\\": dial tcp 1.1.1.1:8050: connect: connection refused\"\n",
"stream": "stderr",
"time": "2024-10-15T15:22:44.548801729Z",
"environment": "env1",
"node": "host1",
"pod": "pod_name",
"site": "site1",
"team": "team1"
}
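Roughly, I would expect the equivalent transforms against the JSON payload to look something like this. This is an untested sketch: the sourcetype name is a placeholder, the regexes would need to match the actual payload, SOURCE_KEY defaults to _raw so it is omitted, and whether index-time transforms are applied to data arriving via the HEC event endpoint depends on how it enters the parsing pipeline.
# transforms.conf (sketch) - one stanza per field so the JSON key order does not matter
[hec.container.environment]
REGEX = "environment"\s*:\s*"([^"]+)"
FORMAT = environment::$1
WRITE_META = true

[hec.container.team]
REGEX = "team"\s*:\s*"([^"]+)"
FORMAT = team::$1
WRITE_META = true

# ...same pattern for site, node and pod

# props.conf (sketch) - sourcetype name is a placeholder for whatever the HEC input assigns
[kube:container]
TRANSFORMS-container_meta = hec.container.environment, hec.container.team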
The only way you can do this is to use the /services/collector/raw endpoint.
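For reference, pushing a plain (non-JSON) line through the raw endpoint looks roughly like this; the host, token, channel GUID and sourcetype are placeholders:
curl -k "https://hec.example.com:8088/services/collector/raw?channel=00000000-0000-0000-0000-000000000000&sourcetype=localhost_access_log" \
  -H "Authorization: Splunk <hec-token>" \
  -d 'raw log line exactly as it used to appear in the file'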
I understand the desire to maintain your existing Splunk setup, but I would advise against using the raw endpoint (/services/collector/raw) to transform the JSON logs back into a regular log format. This approach would add unnecessary load and complexity.
Instead, the best practice is to use the existing event endpoint (/services/collector/event) for ingesting data into Splunk. This is optimized for handling structured data like JSON and is more efficient.
I recommend adjusting your alerts and dashboards to work with the new JSON structure from logging-operator. While this may require some initial effort, it's a more sustainable approach in the long run.
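As a rough illustration only (the index and sourcetype names are made up, and this assumes the HEC sourcetype gets its JSON fields extracted at search time), an existing alert search might be adapted along these lines:
Old search (file-based source):
index=k8s sourcetype=localhost_access_log team=team1 pod=pod_name "connection refused"
New search (HEC/JSON source):
index=k8s sourcetype=kube:container team=team1 pod=pod_name log="*connection refused*"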
I don't think that's the issue here. The same payload sent to the /raw endpoint would end up looking the same. It's the source that's formatting the data differently than before.
The raw endpoint is not an option anyway, because it is not supported by the logging-operator.