Hello all,
We are sending JSON files through HEC (raw endpoint). Each file starts with some metadata (see below), and we want this metadata to be present in ALL events of that file. Basically, we want to avoid repeating the common data in every event in the JSON.
We already tried creating a regex that extracts some fields, but it adds those fields to one event only, not to all of them.
The JSON looks like this:
{
  "metadata": {
    "job_id": "11234",
    "project": "Platform",
    "variant": "default",
    "date": "26.06.2023"
  },
  "data": [
    {
      "ID": "1",
      "type": "unittest",
      "status": "SUCCESS",
      "identified": 123
    },
    {
      "ID": "2",
      "type": "unittest",
      "status": "FAILED",
      "identified": 500
    },
    {
      "ID": "3",
      "type": "unittest",
      "status": "SUCCESS",
      "identified": 560
    }
  ]
}
We want to "inject" the metadata attributes into each event, so we expect to have a table like this:
Metadata                                | Data
job_id  project   variant  date         | ID  type      status   identified
11234   Platform  default  26.06.2023   | 1   unittest  SUCCESS  123
11234   Platform  default  26.06.2023   | 2   unittest  FAILED   500
11234   Platform  default  26.06.2023   | 3   unittest  SUCCESS  560
Currently we use this configuration in props.conf:
[sepcial_sourcetype]
BREAK_ONLY_BEFORE_DATE =
DATETIME_CONFIG =
LINE_BREAKER = ((?<!"),|[\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Custom
description = Jenkins Job Configurations
pulldown_type = 1
disabled = false
SEDCMD-removeunwanted1 = s/\{\s*?"metadata(.*\s*)+?}//g
SEDCMD-remove_prefix = s/"data":\s*\[//g
SEDCMD-remove_suffix = s/\]\s*}//g
What should our props.conf and transforms.conf look like to accomplish this?
Although the current configuration splits the events and extracts the fields correctly, it discards the metadata block (because of SEDCMD-removeunwanted1). Even without that setting, the metadata would only be present in its own separate event and would not be replicated to all events.
We also saw in the following thread that sending custom metadata is not supported, although that would have been perfect for our use case: https://community.splunk.com/t5/Getting-Data-In/Does-the-HTTP-Event-Collector-API-support-events-with-arbitrary/m-p/216092
We already have a workaround where we edit the JSON so that each event contains the metadata, but this is not ideal: it requires preprocessing the file before sending it to Splunk, and every event then carries the same repeated data. So we are looking for a solution that can be handled by Splunk directly.
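For reference, the preprocessing workaround looks roughly like the sketch below. It is a minimal illustration, not our production script: it assumes the file has the shape shown above, with "data" as a JSON array, and the function name flatten_events is made up for this example.

```python
import json


def flatten_events(document: str) -> list[str]:
    """Merge the shared "metadata" object into every record under "data".

    Hypothetical helper: assumes the top-level keys are "metadata" (object)
    and "data" (array of objects), as in the sample file above.
    """
    parsed = json.loads(document)
    metadata = parsed.get("metadata", {})
    # One JSON string per record; record keys win if a name collides.
    return [json.dumps({**metadata, **record}) for record in parsed.get("data", [])]


sample = """
{
  "metadata": {"job_id": "11234", "project": "Platform",
               "variant": "default", "date": "26.06.2023"},
  "data": [
    {"ID": "1", "type": "unittest", "status": "SUCCESS", "identified": 123},
    {"ID": "2", "type": "unittest", "status": "FAILED", "identified": 500},
    {"ID": "3", "type": "unittest", "status": "SUCCESS", "identified": 560}
  ]
}
"""

# Print one self-contained event per line.
for line in flatten_events(sample):
    print(line)
```

Each printed line is a complete event that already carries job_id, project, variant and date, which is exactly the duplication we would like to avoid doing ourselves.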
Thanks for any hints!